Enforce scope-priority-based tie-breaking for overlapping auto-select runtimes #594

Description

@YouNeedCryDear

What to build

Runtime auto-selection must handle cases where multiple runtimes overlap on both their supported model format and their model size range. If overlapping runtimes have autoSelect: true, their priorities must be explicitly different. When a user creates an inference service without specifying a runtime, exactly one runtime, the one with the highest priority, must be auto-selected.
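The selection rule described above could be sketched roughly as follows. This is an illustration only, not the code in pkg/runtimeselector/: the `Candidate` type and `pickAutoSelected` function are hypothetical, and it assumes that a numerically larger priority value means higher priority, which the issue does not state.

```go
package main

// Candidate is a hypothetical, simplified view of a runtime that already
// matched the requested model format and size. It is not the project's type.
type Candidate struct {
	Name       string
	AutoSelect bool
	Priority   int32 // assumption: larger value means higher priority
}

// pickAutoSelected returns the auto-selectable candidate with the highest
// priority. The boolean is false when no candidate has AutoSelect set.
// If the webhook guarantees distinct priorities among overlapping runtimes,
// the result is unique; on a tie this sketch keeps the first one seen.
func pickAutoSelected(candidates []Candidate) (Candidate, bool) {
	var best Candidate
	found := false
	for _, c := range candidates {
		if !c.AutoSelect {
			continue // runtimes without autoSelect are never chosen implicitly
		}
		if !found || c.Priority > best.Priority {
			best, found = c, true
		}
	}
	return best, found
}
```

With candidates `a` (priority 1), `b` (priority 5), and `c` (priority 9 but autoSelect off), this sketch would pick `b`.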

Acceptance criteria

  • The ClusterServingRuntimeValidator and ServingRuntimeValidator webhooks must reject any set of runtimes that have autoSelect: true, overlap on both SupportedModelFormat and ModelSizeRange, and share the same priority. This applies across scopes as well, that is, between a ServingRuntime and a ClusterServingRuntime.
  • If a priority is omitted, the webhook should either apply a well-defined default or require that the priority be specified explicitly when an overlap occurs.
  • The runtime selector (pkg/runtimeselector/) honors this priority: when candidate runtimes overlap, it auto-selects only the one with the highest priority.
  • Existing webhook and selector unit tests are updated to verify the priority overlap rejection and priority-based selection logic.
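As a rough illustration of the overlap check in the criteria above, the sketch below rejects an incoming runtime whose priority collides with an overlapping auto-select runtime. All names here (`RuntimeSpec`, `SizeRange`, `overlaps`, `validatePriority`) are hypothetical simplifications and not the project's API. It also picks one of the two options the issue leaves open: an omitted priority is rejected on overlap rather than defaulted.

```go
package main

import "fmt"

// SizeRange is a hypothetical closed interval of supported model sizes.
type SizeRange struct{ Min, Max float64 }

// RuntimeSpec is a hypothetical, flattened view of a runtime for validation.
type RuntimeSpec struct {
	Name       string
	Kind       string // "ServingRuntime" or "ClusterServingRuntime"
	Format     string
	Sizes      SizeRange
	AutoSelect bool
	Priority   *int32 // nil when the priority was omitted
}

// overlaps reports whether two runtimes share a format and intersect in size.
func overlaps(a, b RuntimeSpec) bool {
	return a.Format == b.Format &&
		a.Sizes.Min <= b.Sizes.Max && b.Sizes.Min <= a.Sizes.Max
}

// validatePriority returns an error if incoming overlaps an existing
// auto-select runtime (of either kind) without a distinct explicit priority.
func validatePriority(incoming RuntimeSpec, existing []RuntimeSpec) error {
	if !incoming.AutoSelect {
		return nil
	}
	for _, other := range existing {
		if other.Name == incoming.Name && other.Kind == incoming.Kind {
			continue // skip the object itself on update
		}
		if !other.AutoSelect || !overlaps(incoming, other) {
			continue
		}
		if incoming.Priority == nil || other.Priority == nil {
			return fmt.Errorf("%s overlaps %s: priority must be set explicitly",
				incoming.Name, other.Name)
		}
		if *incoming.Priority == *other.Priority {
			return fmt.Errorf("%s overlaps %s with the same priority %d",
				incoming.Name, other.Name, *incoming.Priority)
		}
	}
	return nil
}
```

Comparing Kind as well as Name lets a ServingRuntime and a ClusterServingRuntime with the same name be checked against each other, which covers the cross-scope case. A real check would presumably also need to account for namespaces, which this sketch omits.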

Blocked by

None - can start immediately
