OCPNODE-3880: Add criocredentialproviderconfig event handler #5487
QiWang19 wants to merge 1 commit into openshift:main
Skipping CI for Draft Pull Request.
/test all
865fa47 to 8a072e4
/test all
@QiWang19: This pull request references OCPNODE-3880 which is a valid jira issue.
Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.
8a072e4 to 75dfbfd
/test all
75dfbfd to 511fde6
/test all
511fde6 to 0ff0fe8
/test all
0ff0fe8 to 5d5008b
/test all
5d5008b to 7d79743
147fbf2 to 3027d6a
/test all
3027d6a to 9a477a8
/test all
9a477a8 to 58fa6bc
/test all
Actionable comments posted: 1
♻️ Duplicate comments (2)
install/0000_80_machine-config_00_rbac.yaml (1)
160-162: ⚠️ Potential issue | 🟠 Major
Scope `request-serviceaccounts-token-audience` to `serviceaccounts` only.
This rule still grants the custom verb across `resources: ["*"]`, which is broader than the credential-provider flow needs. Please fold it into the existing `serviceaccounts` rule instead of granting it cluster-wide on every core resource.
Suggested fix
      - apiGroups: [""]
        resources: ["serviceaccounts"]
    -   verbs: ["get", "list"]
    - - apiGroups: [""]
    -   resources: ["*"]
    -   verbs: ["request-serviceaccounts-token-audience"]
    +   verbs: ["get", "list", "request-serviceaccounts-token-audience"]
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@install/0000_80_machine-config_00_rbac.yaml` around lines 160 - 162, The manifest currently grants the custom verb "request-serviceaccounts-token-audience" to resources: ["*"] under apiGroups: [""], which is too broad; remove that standalone rule and instead add "request-serviceaccounts-token-audience" to the existing serviceaccounts rule (the rule that has apiGroups: [""], resources: ["serviceaccounts"] or resources: ["serviceaccounts", ...]) so the verb is scoped only to serviceaccounts; ensure you remove the resources: ["*"] entry and update the serviceaccounts rule's verbs array to include "request-serviceaccounts-token-audience" without duplicating other verbs.
pkg/controller/container-runtime-config/container_runtime_config_controller.go (1)
562-565: ⚠️ Potential issue | 🟠 Major
`maxRetries` still never becomes terminal here.
After `Forget(key)`, re-adding the item with `AddAfter` turns persistent failures into an endless 1-minute retry loop for every queue using this helper. That defeats the contract implied by `maxRetries`.
Suggested fix
      utilruntime.HandleError(err)
      klog.V(2).Infof("Dropping %s %q out of the queue: %v", resourceType, key, err)
      queue.Forget(key)
    - queue.AddAfter(key, 1*time.Minute)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/controller/container-runtime-config/container_runtime_config_controller.go` around lines 562 - 565, The code calls queue.Forget(key) and then queue.AddAfter(key, 1*time.Minute) which resets retry state and defeats the intended maxRetries behavior; instead, check the retry count (use queue.NumRequeues(key) or the workqueue.RateLimitingInterface helper) against maxRetries and only re-enqueue with AddAfter/AddRateLimited when the count is below maxRetries, otherwise call queue.Forget(key) and stop requeuing (log terminal failure). Update the error-handling block around utilruntime.HandleError and klog.V(2).Infof to implement this conditional requeue logic using the existing maxRetries variable.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@pkg/controller/container-runtime-config/container_runtime_config_controller.go`:
- Around line 1224-1228: The status update is only executed when the reconcile
actually applied changes (applied == true), so no-op reconciles don't advance
CRIOCredentialProviderConfig status; move or add a call to
syncCRIOCredentialProviderConfigStatusOnly(...) so it runs even when applied is
false (i.e., after the applied conditional), ensuring you invoke
syncCRIOCredentialProviderConfigStatusOnly(nil,
apicfgv1alpha1.ConditionTypeMachineConfigRendered,
apicfgv1alpha1.ReasonMachineConfigRenderingSucceeded) on successful
reconciliation/no-op so ObservedGeneration and the rendered condition are
advanced regardless of whether MachineConfigs were modified.
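A rough sketch of the ordering this inline comment asks for, under simplifying assumptions: `status` and `reconcile` below are hypothetical reductions, not the controller's types, and the real status sync goes through `syncCRIOCredentialProviderConfigStatusOnly` rather than direct field writes.

```go
package main

import "fmt"

// status is a hypothetical, reduced view of the CR status fields the comment mentions.
type status struct {
	ObservedGeneration int64
	Rendered           bool
}

// reconcile is a sketch: render reports whether MachineConfigs were modified.
// The status sync runs after a successful render whether or not anything changed.
func reconcile(generation int64, render func() (applied bool, err error), st *status) error {
	applied, err := render()
	if err != nil {
		return err
	}
	if applied {
		fmt.Println("MachineConfigs updated")
	}
	// Runs on the no-op path too, so ObservedGeneration keeps up with the spec.
	st.ObservedGeneration = generation
	st.Rendered = true
	return nil
}

func main() {
	st := &status{}
	noop := func() (bool, error) { return false, nil }
	if err := reconcile(2, noop, st); err != nil {
		fmt.Println("reconcile failed:", err)
		return
	}
	fmt.Printf("%+v\n", *st)
}
```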
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 5e364b1d-5e56-4d04-922d-2361caed86f3
📒 Files selected for processing (13)
- install/0000_80_machine-config_00_rbac.yaml
- manifests/machineconfigcontroller/clusterrole.yaml
- pkg/apihelpers/apihelpers.go
- pkg/controller/container-runtime-config/container_runtime_config_controller.go
- pkg/controller/container-runtime-config/container_runtime_config_controller_test.go
- pkg/controller/container-runtime-config/helpers.go
- pkg/controller/container-runtime-config/helpers_test.go
- pkg/daemon/constants/constants.go
- templates/arbiter/01-arbiter-kubelet/_base/units/kubelet.service.yaml
- templates/arbiter/01-arbiter-kubelet/on-prem/units/kubelet.service.yaml
- templates/master/01-master-kubelet/on-prem/units/kubelet.service.yaml
- templates/worker/01-worker-kubelet/on-prem/units/kubelet.service.yaml
- test/framework/envtest.go
🚧 Files skipped from review as they are similar to previous changes (6)
- manifests/machineconfigcontroller/clusterrole.yaml
- pkg/controller/container-runtime-config/helpers_test.go
- pkg/daemon/constants/constants.go
- test/framework/envtest.go
- pkg/apihelpers/apihelpers.go
- templates/arbiter/01-arbiter-kubelet/_base/units/kubelet.service.yaml
9b97140 to dc39060
Actionable comments posted: 3
♻️ Duplicate comments (2)
pkg/controller/container-runtime-config/container_runtime_config_controller.go (1)
562-565: ⚠️ Potential issue | 🟠 Major
Don't requeue from the terminal retry path.
After `maxRetries`, this still does `Forget` and then `AddAfter`, so broken objects never actually drop out of the queue and will retry forever every minute.
Suggested fix
      utilruntime.HandleError(err)
      klog.V(2).Infof("Dropping %s %q out of the queue: %v", resourceType, key, err)
      queue.Forget(key)
    - queue.AddAfter(key, 1*time.Minute)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/controller/container-runtime-config/container_runtime_config_controller.go` around lines 562 - 565, The terminal retry path currently calls utilruntime.HandleError(err), logs via klog.V(2).Infof, then calls queue.Forget(key) and queue.AddAfter(key, 1*time.Minute) causing infinite requeues; update the terminal path so after reaching maxRetries you call utilruntime.HandleError(err) and queue.Forget(key) (or just Forget and return) but do NOT call queue.AddAfter; locate the logic around maxRetries handling in container_runtime_config_controller.go where queue.Forget and queue.AddAfter are invoked and remove the AddAfter call so failed objects are not requeued forever.
pkg/controller/container-runtime-config/container_runtime_config_controller_test.go (1)
659-664: ⚠️ Potential issue | 🟡 Minor
Don't short-circuit the empty-spec verification path.
This still returns before any later assertions run, so `TestCrioCredentialProviderConfigCreateEmpty` can pass while silently masking mismatches in `criocpConfigVerifyOptions` for the non-cloud path.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/controller/container-runtime-config/container_runtime_config_controller_test.go` around lines 659 - 664, The test currently returns early when verifyOpts.expectMCNilContent is true, which skips later assertions and can mask mismatches for TestCrioCredentialProviderConfigCreateEmpty; change the block in the test that checks verifyOpts.expectMCNilContent so it only asserts that ignCfg.Storage.Files is empty (when expectMCNilContent is true) and do not return from the test; remove the early return (or scope it so only the specific check is skipped) so the remaining criocpConfigVerifyOptions-related assertions still execute.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@pkg/controller/container-runtime-config/container_runtime_config_controller_test.go`:
- Around line 685-692: The current loop sets allMatch to true when any
criocp.Spec.MatchImages entry is found in provider.MatchImages, so a partial
match passes; change the logic so allMatch reflects that every expected image is
present: initialize allMatch = true before the loop over
criocp.Spec.MatchImages, and for each mi check assert.Contains(t,
provider.MatchImages, string(mi)); if the check fails set allMatch = false and
break (or return) immediately. Use the same symbols (allMatch,
criocp.Spec.MatchImages, provider.MatchImages) so the test correctly requires
every expected image to be present.
In
`@pkg/controller/container-runtime-config/container_runtime_config_controller.go`:
- Around line 1208-1210: When handling CRIOCredentialProviderConfig overlaps in
syncCRIOCredentialProviderConfigStatusOnly, also clear the stale
ConditionTypeValidated/ReasonConfigurationPartiallyApplied when overlaps are
resolved: if len(overlappedEntries) == 0 call
syncCRIOCredentialProviderConfigStatusOnly with a success/validated condition
(e.g., apicfgv1alpha1.ConditionTypeValidated, ReasonConfigurationApplied or
similar) and a message indicating rendering succeeded so the previous
ConfigurationPartiallyApplied condition is removed; ensure you update the same
condition type (ConditionTypeValidated) rather than only setting it when
overlaps exist.
In `@pkg/controller/container-runtime-config/helpers.go`:
- Line 1383: The parameter name in function generateDropinUnitCredProviderConfig
is misspelled as generticCredProviderConfigPath; rename it to
genericCredProviderConfigPath in the function signature and update all internal
references and callers to use the corrected identifier so the symbol matches
expected spelling (ensure generateDropinUnitCredProviderConfig and any places
that call it compile after the rename).
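The `allMatch` point among the inline comments above (a partial match must not pass) can be illustrated with a small, generic helper. `containsAll` and the example image names are hypothetical and do not come from the test file; the sketch only shows the "every expected entry must be present" shape.

```go
package main

import "fmt"

// containsAll reports whether every expected entry appears in actual.
func containsAll(actual, expected []string) bool {
	seen := make(map[string]struct{}, len(actual))
	for _, a := range actual {
		seen[a] = struct{}{}
	}
	for _, e := range expected {
		if _, ok := seen[e]; !ok {
			return false // one missing entry is enough to fail
		}
	}
	return true
}

func main() {
	provider := []string{"registry.example.com/a", "registry.example.com/b"}
	fmt.Println(containsAll(provider, []string{"registry.example.com/a"}))
	fmt.Println(containsAll(provider, []string{"registry.example.com/a", "registry.example.com/c"}))
}
```

Starting from "true" and failing on the first miss is what distinguishes this from the any-match loop the comment flags.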
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 6b802971-f882-404a-af1a-cf29785423f4
📒 Files selected for processing (13)
- install/0000_80_machine-config_00_rbac.yaml
- manifests/machineconfigcontroller/clusterrole.yaml
- pkg/apihelpers/apihelpers.go
- pkg/controller/container-runtime-config/container_runtime_config_controller.go
- pkg/controller/container-runtime-config/container_runtime_config_controller_test.go
- pkg/controller/container-runtime-config/helpers.go
- pkg/controller/container-runtime-config/helpers_test.go
- pkg/daemon/constants/constants.go
- templates/arbiter/01-arbiter-kubelet/_base/units/kubelet.service.yaml
- templates/arbiter/01-arbiter-kubelet/on-prem/units/kubelet.service.yaml
- templates/master/01-master-kubelet/on-prem/units/kubelet.service.yaml
- templates/worker/01-worker-kubelet/on-prem/units/kubelet.service.yaml
- test/framework/envtest.go
🚧 Files skipped from review as they are similar to previous changes (8)
- pkg/daemon/constants/constants.go
- templates/arbiter/01-arbiter-kubelet/on-prem/units/kubelet.service.yaml
- install/0000_80_machine-config_00_rbac.yaml
- test/framework/envtest.go
- templates/arbiter/01-arbiter-kubelet/_base/units/kubelet.service.yaml
- templates/worker/01-worker-kubelet/on-prem/units/kubelet.service.yaml
- templates/master/01-master-kubelet/on-prem/units/kubelet.service.yaml
- pkg/apihelpers/apihelpers.go
dc39060 to d1eef7d
♻️ Duplicate comments (3)
install/0000_80_machine-config_00_rbac.yaml (1)
156-162: ⚠️ Potential issue | 🟠 Major
Scope `request-serviceaccounts-token-audience` to `serviceaccounts` only.
Granting this custom verb on `resources: ["*"]` gives `system:nodes` that capability across every core resource. The new credential-provider flow only needs it on `serviceaccounts`, so this should be merged into the preceding rule instead of expanding the resource scope cluster-wide.
Suggested fix
      rules:
      - apiGroups: [""]
        resources: ["serviceaccounts"]
    -   verbs: ["get", "list"]
    - - apiGroups: [""]
    -   resources: ["*"]
    -   verbs: ["request-serviceaccounts-token-audience"]
    +   verbs: ["get", "list", "request-serviceaccounts-token-audience"]
In OpenShift RBAC, should the custom verb `request-serviceaccounts-token-audience` be granted on `resources: ["serviceaccounts"]` rather than `resources: ["*"]`?
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@install/0000_80_machine-config_00_rbac.yaml` around lines 156 - 162, The second rule grants the custom verb "request-serviceaccounts-token-audience" across all core resources; narrow its scope by moving that verb into the prior rule that already targets apiGroups: [""] and resources: ["serviceaccounts"] so the combined rule lists verbs: ["get","list","request-serviceaccounts-token-audience"] (i.e., remove the separate rule with resources: ["*"] and ensure only serviceaccounts retain the custom verb).
pkg/controller/container-runtime-config/container_runtime_config_controller.go (2)
549-565: ⚠️ Potential issue | 🟠 Major
Stop requeueing once `maxRetries` is exhausted.
`queue.Forget(key)` followed by `queue.AddAfter(key, 1*time.Minute)` turns permanent failures into an endless 1-minute retry loop, so invalid objects never actually drop out of the queue despite the controller contract above.
Suggested fix
      utilruntime.HandleError(err)
      klog.V(2).Infof("Dropping %s %q out of the queue: %v", resourceType, key, err)
      queue.Forget(key)
    - queue.AddAfter(key, 1*time.Minute)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/controller/container-runtime-config/container_runtime_config_controller.go` around lines 549 - 565, The current handleQueueErr implementation requeues keys after maxRetries by calling queue.Forget(key) then queue.AddAfter(key, 1*time.Minute), creating an endless retry loop; modify the logic in Controller.handleQueueErr so that when queue.NumRequeues(key) >= maxRetries it only calls queue.Forget(key) (and utilruntime.HandleError(err) / klog as present) and does not call queue.AddAfter(key, 1*time.Minute) so the item is dropped permanently instead of being requeued.
1208-1228: ⚠️ Potential issue | 🟠 Major
Clear the stale `Validated=False` condition when overlaps disappear.
The validated condition is only updated inside the overlap branch. If a user fixes the conflicting `matchImages` later, this reconcile only writes `MachineConfigRendered=True`, so the old `ConditionTypeValidated` / partial-applied status remains stuck on the CR. Update `ConditionTypeValidated` on the non-overlap path too so `meta.SetStatusCondition` can replace the stale failure.
Suggested fix
      if len(overlappedEntries) > 0 {
          ctrl.syncCRIOCredentialProviderConfigStatusOnly(nil, apicfgv1alpha1.ConditionTypeValidated, apicfgv1alpha1.ReasonConfigurationPartiallyApplied, "CRIOCredentialProviderConfig has one or multiple entries that overlap with the original credential provider config. Skip rendering entries: %v.", overlappedEntries)
    + } else {
    +     ctrl.syncCRIOCredentialProviderConfigStatusOnly(nil, apicfgv1alpha1.ConditionTypeValidated, "")
      }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/controller/container-runtime-config/container_runtime_config_controller.go` around lines 1208 - 1228, The code only calls syncCRIOCredentialProviderConfigStatusOnly(..., apicfgv1alpha1.ConditionTypeValidated, apicfgv1alpha1.ReasonConfigurationPartiallyApplied, ...) inside the "overlap" branch, leaving a stale Validated=False when overlaps are later resolved; modify the non-overlap path (the branch that proceeds to call syncIgnitionConfig and returns nil/applied) to explicitly clear or set ConditionTypeValidated to a success state (e.g., call syncCRIOCredentialProviderConfigStatusOnly(nil, apicfgv1alpha1.ConditionTypeValidated, apicfgv1alpha1.ReasonConfigurationValidated, "validated") or similar) so meta.SetStatusCondition replaces the previous failure; add this call just before returning nil or after successful MachineConfig rendering in the code paths that use syncIgnitionConfig and when applied is true, referencing syncCRIOCredentialProviderConfigStatusOnly and ConditionTypeValidated to locate the insertion point.
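To make the stale-condition point concrete, here is a minimal sketch of replace-by-type condition handling. Everything in it is a simplified stand-in: `condition` only loosely mirrors `metav1.Condition`, `setCondition` imitates the replace-or-append behaviour the comment attributes to `meta.SetStatusCondition`, and the success reason `ConfigurationValidated` is a hypothetical name.

```go
package main

import "fmt"

// condition is a simplified stand-in for metav1.Condition.
type condition struct {
	Type, Status, Reason string
}

// setCondition replaces the entry with the same Type or appends a new one.
func setCondition(conds []condition, c condition) []condition {
	for i := range conds {
		if conds[i].Type == c.Type {
			conds[i] = c
			return conds
		}
	}
	return append(conds, c)
}

// validatedCondition derives the Validated condition on every reconcile,
// not only when overlaps exist. The success reason is a hypothetical name.
func validatedCondition(overlapped []string) condition {
	if len(overlapped) > 0 {
		return condition{"Validated", "False", "ConfigurationPartiallyApplied"}
	}
	return condition{"Validated", "True", "ConfigurationValidated"}
}

func main() {
	var conds []condition
	// First reconcile: one entry overlaps, so Validated is False.
	conds = setCondition(conds, validatedCondition([]string{"registry.example.com/overlap"}))
	// Second reconcile: the overlap is gone, so the same Type is overwritten.
	conds = setCondition(conds, validatedCondition(nil))
	fmt.Printf("%+v\n", conds)
}
```

Because the condition is written on both branches, the earlier failure is overwritten rather than left behind.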
🧹 Nitpick comments (1)
pkg/controller/container-runtime-config/container_runtime_config_controller.go (1)
67-71: Reuse the exported credential-provider path constants here.
These literals now duplicate `pkg/daemon/constants/constants.go`, so a later path change can silently desync controller rendering from the daemon-side policy wiring. Please build these from `constants.KubernetesCredentialProvidersDir` and `constants.KubeletCrioImageCredProviderConfPath` instead of hardcoding them again.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/controller/container-runtime-config/container_runtime_config_controller.go` around lines 67 - 71, Replace the hardcoded credential provider path constants in this file: instead of defining genericCredProviderConfigPath and kubeletCrioImageCredProviderConfPath as string literals, build them from the exported constants in pkg/daemon/constants: use constants.KubernetesCredentialProvidersDir to construct the generic credential provider config filename (generic-credential-provider.yaml) and use constants.KubeletCrioImageCredProviderConfPath for the kubelet crio image credential provider config; update the declarations where builtInLabelKey and these two path constants are defined so the controller reuses constants.KubernetesCredentialProvidersDir and constants.KubeletCrioImageCredProviderConfPath rather than duplicating the literal paths.
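A small sketch of what deriving the controller paths from shared constants might look like. The two exported constants are redeclared locally so the example is self-contained, and their values are placeholders, not the real paths in `pkg/daemon/constants`.

```go
package main

import (
	"fmt"
	"path"
)

// Stand-ins for the exported constants named in the review. The values are
// placeholders chosen for the example only.
const (
	KubernetesCredentialProvidersDir     = "/example/credential-providers"
	KubeletCrioImageCredProviderConfPath = "/example/kubelet/crio-image-credential-provider.yaml"
)

// Derived once from the shared constants instead of repeating string literals.
var (
	genericCredProviderConfigPath        = path.Join(KubernetesCredentialProvidersDir, "generic-credential-provider.yaml")
	kubeletCrioImageCredProviderConfPath = KubeletCrioImageCredProviderConfPath
)

func main() {
	fmt.Println(genericCredProviderConfigPath)
	fmt.Println(kubeletCrioImageCredProviderConfPath)
}
```

With this shape, changing the directory constant changes the derived path in one place.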
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 049b026b-d651-43da-9a25-0d4750052dd9
📒 Files selected for processing (13)
- install/0000_80_machine-config_00_rbac.yaml
- manifests/machineconfigcontroller/clusterrole.yaml
- pkg/apihelpers/apihelpers.go
- pkg/controller/container-runtime-config/container_runtime_config_controller.go
- pkg/controller/container-runtime-config/container_runtime_config_controller_test.go
- pkg/controller/container-runtime-config/helpers.go
- pkg/controller/container-runtime-config/helpers_test.go
- pkg/daemon/constants/constants.go
- templates/arbiter/01-arbiter-kubelet/_base/units/kubelet.service.yaml
- templates/arbiter/01-arbiter-kubelet/on-prem/units/kubelet.service.yaml
- templates/master/01-master-kubelet/on-prem/units/kubelet.service.yaml
- templates/worker/01-worker-kubelet/on-prem/units/kubelet.service.yaml
- test/framework/envtest.go
🚧 Files skipped from review as they are similar to previous changes (5)
- templates/arbiter/01-arbiter-kubelet/_base/units/kubelet.service.yaml
- pkg/apihelpers/apihelpers.go
- test/framework/envtest.go
- templates/master/01-master-kubelet/on-prem/units/kubelet.service.yaml
- pkg/controller/container-runtime-config/container_runtime_config_controller_test.go
/test verify
d1eef7d to 5d64533
♻️ Duplicate comments (4)
test/framework/envtest.go (1)
119-121: ⚠️ Potential issue | 🟡 Minor
Also wire this CRD through the shared envtest helpers.
`NewTestEnv` now installs `CRIOCredentialProviderConfig`, but `CheckCleanEnvironment`, `CleanEnvironment`, and `CreateObjects` still don't list/delete/create that type. Any envtest using the common helpers can leak these objects across cases and can't seed them through the shared setup path.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/framework/envtest.go` around lines 119 - 121, NewTestEnv installs CRIOCredentialProviderConfig but the shared helpers CheckCleanEnvironment, CleanEnvironment, and CreateObjects don't handle that type; update those helpers to include CRIOCredentialProviderConfig in their lists/operations so envtest correctly creates, verifies absence, and deletes these objects. Specifically, add the CRIOCredentialProviderConfig GVK/Kind into the type lists or switch cases used by CreateObjects, ensure CheckCleanEnvironment also checks for zero instances of CRIOCredentialProviderConfig, and have CleanEnvironment remove any CRIOCredentialProviderConfig resources found so tests don't leak them between cases.
pkg/controller/container-runtime-config/container_runtime_config_controller_test.go (1)
658-663: ⚠️ Potential issue | 🟡 Minor
Don't short-circuit the empty-spec verification path.
`expectMCNilContent` currently means “no files at all” and returns before the drop-in assertions run. That makes `TestCrioCredentialProviderConfigCreateEmpty` skip validation of the non-cloud drop-in case entirely, and it would also reject a valid drop-in-only MachineConfig.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/controller/container-runtime-config/container_runtime_config_controller_test.go` around lines 658 - 663, The test short-circuits when verifyOpts.expectMCNilContent is true, returning immediately after checking ignCfg.Storage.Files and thus skipping the subsequent drop-in assertions (affecting TestCrioCredentialProviderConfigCreateEmpty and similar cases); change the logic in the verification block so that you still assert len(ignCfg.Storage.Files) == 0 when verifyOpts.expectMCNilContent is true but do not return; allow execution to continue to the drop-in assertions (the code referencing verifyOpts.expectMCNilContent and ignCfg.Storage.Files should be updated to remove the early return and only perform the emptiness check).
pkg/controller/container-runtime-config/container_runtime_config_controller.go (2)
562-565: ⚠️ Potential issue | 🟠 Major
Don't requeue from the terminal branch.
`queue.Forget(key)` followed by `queue.AddAfter(key, 1*time.Minute)` makes `maxRetries` non-terminal again. Persistent failures will keep coming back forever instead of actually dropping out of the queue.
Suggested fix
          utilruntime.HandleError(err)
          klog.V(2).Infof("Dropping %s %q out of the queue: %v", resourceType, key, err)
          queue.Forget(key)
    -     queue.AddAfter(key, 1*time.Minute)
      }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/controller/container-runtime-config/container_runtime_config_controller.go` around lines 562 - 565, The terminal/error branch currently calls queue.Forget(key) and then queue.AddAfter(key, 1*time.Minute), which prevents retries from becoming terminal; remove the requeue call so terminal errors are dropped: in the block handling utilruntime.HandleError(err) and logging via klog.V(2).Infof, keep queue.Forget(key) (and return if applicable) but remove queue.AddAfter(key, 1*time.Minute) so the item truly stops being requeued.
1208-1228: ⚠️ Potential issue | 🟠 Major
Clear `ConditionTypeValidated` after conflicts are fixed.
This only writes the validated condition on the overlap path. Once the conflicting `matchImages` are removed, the old `Validated=False` / `ConfigurationPartiallyApplied` condition is left behind even though the next reconcile succeeds. Set the same condition type back to a success state in the no-overlap path as well.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/controller/container-runtime-config/container_runtime_config_controller.go` around lines 1208 - 1228, The validated condition is only set when overlaps exist; after overlaps are resolved you must clear/reset that ConditionTypeValidated to success so old Validated=False / ConfigurationPartiallyApplied isn't left behind. In the code path around overlappedEntries and before calling ctrl.syncIgnitionConfig (referencing overlappedEntries, ctrl.syncCRIOCredentialProviderConfigStatusOnly and syncIgnitionConfig), add a call to ctrl.syncCRIOCredentialProviderConfigStatusOnly(nil, apicfgv1alpha1.ConditionTypeValidated, <appropriate success reason constant e.g. apicfgv1alpha1.ReasonConfigurationValidated or a matching success reason>, "CRIOCredentialProviderConfig validated and no overlapping entries") so the validated condition is marked successful when no overlaps are present.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In
`@pkg/controller/container-runtime-config/container_runtime_config_controller_test.go`:
- Around line 658-663: The test short-circuits when
verifyOpts.expectMCNilContent is true, returning immediately after checking
ignCfg.Storage.Files and thus skipping the subsequent drop-in assertions
(affecting TestCrioCredentialProviderConfigCreateEmpty and similar cases);
change the logic in the verification block so that you still assert
len(ignCfg.Storage.Files) == 0 when verifyOpts.expectMCNilContent is true but do
not return—allow execution to continue to the drop-in assertions (the code
referencing verifyOpts.expectMCNilContent and ignCfg.Storage.Files should be
updated to remove the early return and only perform the emptiness check).
In
`@pkg/controller/container-runtime-config/container_runtime_config_controller.go`:
- Around line 562-565: The terminal/error branch currently calls
queue.Forget(key) and then queue.AddAfter(key, 1*time.Minute), which prevents
retries from becoming terminal; remove the requeue call so terminal errors are
dropped: in the block handling utilruntime.HandleError(err) and logging via
klog.V(2).Infof, keep queue.Forget(key) (and return if applicable) but remove
queue.AddAfter(key, 1*time.Minute) so the item truly stops being requeued.
- Around line 1208-1228: The validated condition is only set when overlaps
exist; after overlaps are resolved you must clear/reset that
ConditionTypeValidated to success so old Validated=False /
ConfigurationPartiallyApplied isn't left behind. In the code path around
overlappedEntries and before calling ctrl.syncIgnitionConfig (referencing
overlappedEntries, ctrl.syncCRIOCredentialProviderConfigStatusOnly and
syncIgnitionConfig), add a call to
ctrl.syncCRIOCredentialProviderConfigStatusOnly(nil,
apicfgv1alpha1.ConditionTypeValidated, <appropriate success reason constant e.g.
apicfgv1alpha1.ReasonConfigurationValidated or a matching success reason>,
"CRIOCredentialProviderConfig validated and no overlapping entries") so the
validated condition is marked successful when no overlaps are present.
In `@test/framework/envtest.go`:
- Around line 119-121: NewTestEnv installs CRIOCredentialProviderConfig but the
shared helpers CheckCleanEnvironment, CleanEnvironment, and CreateObjects don't
handle that type; update those helpers to include CRIOCredentialProviderConfig
in their lists/operations so envtest correctly creates, verifies absence, and
deletes these objects. Specifically, add the CRIOCredentialProviderConfig
GVK/Kind into the type lists or switch cases used by CreateObjects, ensure
CheckCleanEnvironment also checks for zero instances of
CRIOCredentialProviderConfig, and have CleanEnvironment remove any
CRIOCredentialProviderConfig resources found so tests don't leak them between
cases.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 41d65ca0-f596-478f-a2ac-0897802b2836
📒 Files selected for processing (13)
- install/0000_80_machine-config_00_rbac.yaml
- manifests/machineconfigcontroller/clusterrole.yaml
- pkg/apihelpers/apihelpers.go
- pkg/controller/container-runtime-config/container_runtime_config_controller.go
- pkg/controller/container-runtime-config/container_runtime_config_controller_test.go
- pkg/controller/container-runtime-config/helpers.go
- pkg/controller/container-runtime-config/helpers_test.go
- pkg/daemon/constants/constants.go
- templates/arbiter/01-arbiter-kubelet/_base/units/kubelet.service.yaml
- templates/arbiter/01-arbiter-kubelet/on-prem/units/kubelet.service.yaml
- templates/master/01-master-kubelet/on-prem/units/kubelet.service.yaml
- templates/worker/01-worker-kubelet/on-prem/units/kubelet.service.yaml
- test/framework/envtest.go
🚧 Files skipped from review as they are similar to previous changes (5)
- templates/master/01-master-kubelet/on-prem/units/kubelet.service.yaml
- pkg/apihelpers/apihelpers.go
- install/0000_80_machine-config_00_rbac.yaml
- templates/worker/01-worker-kubelet/on-prem/units/kubelet.service.yaml
- manifests/machineconfigcontroller/clusterrole.yaml
5d64533 to
137ce18
Compare
|
/verified by @QiWang19 |
|
@QiWang19: This PR has been marked as verified by DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
137ce18 to
c8f5763
Compare
|
/lgtm |
|
/retest-required |
ctrl.syncCRIOCredentialProviderConfigStatusOnly(err, apicfgv1alpha1.ConditionTypeMachineConfigRendered, apicfgv1alpha1.ReasonMachineConfigRenderingFailed, "could not generate CRIOCredentialProvider Ignition config: %v", err)
return err
}
if len(overlappedEntries) > 0 {
When overlaps exist, this sets ConditionTypeValidated to False with ReasonConfigurationPartiallyApplied. If the user later removes the conflicting entries from the CR, no code path resets this condition back to True. The stale warning persists indefinitely.
Do we need to add an else branch (or an unconditional call after the loop) that sets ConditionTypeValidated to True when len(overlappedEntries) == 0?
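One way to picture the reset being asked about, as a self-contained sketch. The `condition` type, `setCondition`, `reconcileValidated`, the `ConfigurationValidated` reason, and the message strings are all hypothetical; only the `Validated` condition type and the `ConfigurationPartiallyApplied` reason come from the discussion above. The idea is that every reconcile writes the `Validated` condition, so the no-overlap path overwrites a stale `False`.

```go
package main

import "fmt"

// condition is a simplified, hypothetical stand-in for a status condition.
type condition struct {
	Type    string
	Status  string
	Reason  string
	Message string
}

const (
	conditionTypeValidated = "Validated"
	reasonPartiallyApplied = "ConfigurationPartiallyApplied"
	// reasonValidated is an assumed success reason, not a confirmed API constant.
	reasonValidated = "ConfigurationValidated"
)

// setCondition replaces the condition with the same Type, or appends it.
func setCondition(conds []condition, c condition) []condition {
	for i := range conds {
		if conds[i].Type == c.Type {
			conds[i] = c
			return conds
		}
	}
	return append(conds, c)
}

// reconcileValidated writes the Validated condition on both paths, so a
// previous False value cannot outlive the overlap that caused it.
func reconcileValidated(conds []condition, overlappedEntries []string) []condition {
	if len(overlappedEntries) > 0 {
		return setCondition(conds, condition{
			Type:    conditionTypeValidated,
			Status:  "False",
			Reason:  reasonPartiallyApplied,
			Message: fmt.Sprintf("overlapping entries: %v", overlappedEntries),
		})
	}
	return setCondition(conds, condition{
		Type:    conditionTypeValidated,
		Status:  "True",
		Reason:  reasonValidated,
		Message: "no overlapping entries",
	})
}

func main() {
	var conds []condition
	// First reconcile sees a conflict, the second one runs after the user fixed it.
	conds = reconcileValidated(conds, []string{"registry.example.com/*"})
	conds = reconcileValidated(conds, nil)
	fmt.Printf("%+v\n", conds)
}
```

An unconditional write like this costs one extra status update per reconcile; comparing against the existing condition before updating would avoid that, at the price of a little more code.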
builtInLabelKey = "machineconfiguration.openshift.io/mco-built-in"
configMapName = "crio-default-container-runtime"
forceSyncOnUpgrade = "force-sync-on-upgrade"
genericCredProviderConfigPath = "/etc/kubernetes/credential-providers/generic-credential-provider.yaml"
Nit: kubeletCrioImageCredProviderConfPath duplicates constants.KubeletCrioImageCredProviderConfPath from pkg/daemon/constants/constants.go. Consider using the constants package instead of redefining the value here.
c8f5763 to
c953926
Compare
Signed-off-by: Qi Wang <qiwan@redhat.com>
c953926 to
18b41eb
Compare
|
@QiWang19: The following tests failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
|
/retest |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: haircommander, QiWang19, saschagrunert The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
- What I did
Implement criocredentialproviderconfig, which is used by the crio-credential-provider plugin to fetch private mirror image pull secrets from the secret object. The handler creates 97-pool-generated-credentialproviderconfig to roll out configuration to the file /etc/kubernetes/credential-providers/[platform]-credential-provider.yaml.
workflow: https://github.com/openshift/enhancements/blob/master/enhancements/api-review/criocredentialproviderconfig-for-namespace-scoped-mirror-authentication.md#workflow-description
- How to verify it
cluster CRIOCredentialProviderConfig resource, file updated with a new section name: crio-credential-provider
namespace: mynamespace
containers.image is from mirror source registry
journalctl _COMM=crio-credential on the scheduled node
- Description for the changelog
Summary by CodeRabbit
New Features
Chores / Defaults
Security & Permissions
Tests