Skip OCI Object Storage download for Ready same-path artifacts #629
op109lvb wants to merge 2 commits into
}

func sameModelStoragePath(currentStorage *v1beta1.StorageSpec, candidateStorage *v1beta1.StorageSpec, modelRootDir string, destPath string) bool {
	if currentStorage == nil || candidateStorage == nil || currentStorage.Path == nil || candidateStorage.Path == nil {
Do we need to handle the case where getDestPath returns an empty path?
No need to add.
getDestPath already falls back to modelRootDir + "/" + storageUri when storage.path is empty, and reuse still requires os.Stat(destPath) to pass.
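The fallback and the existence check described in this reply can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `resolveDestPath` and `destPathExists` are hypothetical names, and the real getDestPath may differ in signature and details.

```go
package main

import "os"

// resolveDestPath is a hypothetical stand-in for getDestPath. It returns the
// explicit storage path when one is set, and otherwise falls back to
// modelRootDir + "/" + storageURI, as described in the reply above.
func resolveDestPath(storagePath *string, storageURI, modelRootDir string) string {
	if storagePath != nil && *storagePath != "" {
		return *storagePath
	}
	return modelRootDir + "/" + storageURI
}

// destPathExists reports whether destPath is present on the local filesystem.
// An empty path is rejected up front, so reuse never proceeds without a
// concrete destination.
func destPathExists(destPath string) bool {
	if destPath == "" {
		return false
	}
	_, err := os.Stat(destPath)
	return err == nil
}
```

With this shape, an empty storage path still yields a non-empty destination, and a destination that is missing on disk fails the `os.Stat` check, so the reuse path is not taken.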
baseModels, err := s.baseModelLister.List(labels.Everything())
if err == nil {
	for _, model := range baseModels {
		key := constants.GetModelConfigMapKey(model.Namespace, model.Name, false)
nit: maybe we can extract the part that is shared between the ClusterBaseModel and BaseModel lookups.
Thanks. I kept the BaseModel and ClusterBaseModel lookup loops separate because that matches the existing repo pattern: they are different CR types with different listers and namespace/key semantics. The shared comparison logic is already extracted into sameModelStoragePath, so I’d avoid a broader refactor in this focused PR.
we have this before f8d263f
The existing HF reuse path works because HF has a model-level commit SHA that model-agent can fetch before download and compare with the SHA recorded in the node ConfigMap. OCI Object Storage does not have the same model-level SHA in the current implementation. To prove that an existing OCI local copy matches, the current path has to do per-object size/MD5 validation, which is the expensive work this PR is trying to skip for copied CRs.
What this PR does
Speeds up copied model reconciliation when the copied CR uses OCI Object Storage and points to the same storage path as a model artifact that is already Ready on the same node.
For same-path object storage reuse, model-agent skips the expensive download/integrity-validation path instead of re-running file validation.
The implementation looks at this node’s Ready model entries and reuses only when the storage URI, path, schema path, storage key, parameters, and resolved destination path match exactly, and the local destination path already exists.
Why we need it
Copied model CRs can point to the same object storage path as an already Ready model on the same node. In that case, re-downloading or re-validating all files adds unnecessary reconciliation latency.
This speeds up copied model CR reconciliation while preserving safety:
How to test
Checklist
make test passes locally