[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: 2uasimojo

The full list of commands accepted by this bot can be found here. The pull request process is described here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing …

@2uasimojo: This pull request references HIVE-2391 which is a valid jira issue.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.
Force-pushed 29af27f to 5249404.
@jianping-shu this passed e2e-vsphere, so I reckon it's probably ready for you to take another stab at it!

/hold
Looks like I missed refactoring the preflight auth check for the new creds shape.
Force-pushed 5249404 to 39cf13e.
/hold cancel
apis/hive/v1/vsphere/machinepools.go (outdated)

```go
// Topology is the vSphere topology that will be used for virtual machines.
// If it is not present, a default value will be used.
// +optional
Topology *vsphere.Topology `json:"topology,omitempty"`
```
Part of the reason I switched to Topology here is that someone requested a "Folder" field (and it was present (alongside ResourcePool and TagIDs) on the Topology struct)
That is to say, this object should grow at least a Folder field (and override as necessary) alongside some of the other topology fields
As discussed offline:
In its current form, this PR is only keeping the overrides that already existed in the API. If we need to support more overrides and/or move the existing ones to per-fd-capable, we should do that under a separate card/PR.
If you can dig up "someone requested", let's get it into a new card.
Meanwhile, do you agree/accept leaving this as is?
```diff
@@ -81,7 +81,7 @@ func validateVSphereMachineSets(t *testing.T, mSets []*machineapi.MachineSet, ex
 	assert.Equal(t, int32(4), vsphereProvider.NumCPUs, "unexpected NumCPUs")
 	assert.Equal(t, int32(4), vsphereProvider.NumCoresPerSocket, "unexpected NumCoresPerSocket")
 	assert.Equal(t, int32(512), vsphereProvider.DiskGiB, "unexpected DiskGiB")
```
Per note near the top: we should be able to override the folder of the generated machineset
The new multi-creds changes LGTM. My only concern (as noted in review comments) is that there are some additional fields (at least …
Force-pushed 39cf13e to 896e851.
/hold for QE
Force-pushed 896e851 to 7996e22.
/test e2e security
e2e: infra flake
Allowable to override …
```go
}
if vsphere.DefaultDatastore == "" {
	allErrs = append(allErrs, field.Required(vspherePath.Child("defaultDatastore"), "must specify vSphere defaultDatastore"))
if len(vsphere.Infrastructure.VCenters) == 0 {
```
We need to check whether vsphere.Infrastructure != nil first; a CD with only the deprecated fields doesn't have Infrastructure.
Good find @jianping-shu.
I actually think we want to loosen this check a bit. If I understand Mark's code, it'll upconvert the CD platform section to the new shape... but only after it has already been stored in etcd*. I.e. when we hit this admission hook, it is possible and allowable for the CR to be in the old shape.
@dlom do you agree?
*Side note: a mutating webhook might have been appropriate instead -- though I'm not sure how that works for legacy-shape CRs already in etcd.
Force-pushed 7996e22 to 066a4e4, then 066a4e4 to ca2765c.
/test coverage
That's a weird one.
Force-pushed ca2765c to 0746f91.
/test e2e
Actual test passed; infra flake during must-gather. security is same old, same old; override when needed.
Followon addressing review from openshift#2731.

MachinePool:
- Removed `Topology` override
- Restored ResourcePool and TagIDs overrides
- Removed `osImage` detected from arbitrary master; using whatever's passed through from FD Topology (which defaults sanely if unset).

Deprovision:
- Changed `--vsphere-vcenter` StringVar to `--vsphere-vcenters` StringSliceVar

Platform Creds:
- Redesigned to take `vcenters`, a list of vcenter server/username/password, matching (and unmarshaling into) the corresponding chunk of metadata.json.

Docs:
- Updated install-config sample to zonal shape.
- Documented new creds shape.
Force-pushed 0746f91 to aa9dcc3.
Walkthrough

This PR refactors vSphere platform configuration in Hive by introducing a new Infrastructure-based nested structure to replace flat configuration fields, adds multi-vCenter support via a vCenters array, marks legacy fields as deprecated with conversion utilities, and updates CRDs, controllers, and validators to support both old and new schemas.
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes. Pre-merge checks: ✅ 4 passed, ❌ 1 failed (1 warning).
Updated for the issue @jianping-shu found where we were using the same … Latest push also includes a trivial rebase.
Actionable comments posted: 18
Note
Due to the large number of review comments, Critical, Major severity comments were prioritized as inline comments.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
pkg/creds/vsphere/vsphere.go (1)
18-20: ⚠️ Potential issue | 🟡 Minor — Docstring is outdated.

The docstring still mentions "configures VSphere credential environment variables" but the implementation now populates `metadata.VSphere.VCenters` instead.

📝 Suggested docstring update

```diff
-// ConfigureCreds loads secrets designated by the environment variables CLUSTERDEPLOYMENT_NAMESPACE,
-// CREDS_SECRET_NAME, and CERTS_SECRET_NAME and configures VSphere credential environment variables
-// and config files accordingly.
+// ConfigureCreds loads secrets designated by the environment variables CLUSTERDEPLOYMENT_NAMESPACE,
+// CREDS_SECRET_NAME, and CERTS_SECRET_NAME. It populates metadata.VSphere.VCenters with credentials
+// and installs VSphere certificates accordingly.
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/creds/vsphere/vsphere.go` around lines 18 - 20, Update the ConfigureCreds docstring to reflect the current behavior: instead of saying it "configures VSphere credential environment variables," state that ConfigureCreds loads secrets and populates metadata.VSphere.VCenters (and any related metadata fields) with the vCenter connection information; mention which environment variables it reads (CLUSTERDEPLOYMENT_NAMESPACE, CREDS_SECRET_NAME, CERTS_SECRET_NAME) and that it writes into metadata.VSphere.VCenters rather than setting env vars so readers can locate the changed behavior in the ConfigureCreds implementation.
🟡 Minor comments (6)
pkg/install/generate.go-892-903 (1)
892-903: ⚠️ Potential issue | 🟡 Minor — Consider handling the case where both VCenters and DeprecatedVCenter are empty.

If `VCenters` is an empty slice and `DeprecatedVCenter` is also empty, `joinedVCenters` will be an empty string, resulting in `--vsphere-vcenters ""` being passed to the deprovision command. This may cause unexpected behavior or unclear error messages downstream.

🛡️ Suggested validation

```diff
 if legacy {
 	var joinedVCenters string
 	if len(req.Spec.Platform.VSphere.VCenters) == 0 && req.Spec.Platform.VSphere.DeprecatedVCenter != "" {
 		joinedVCenters = req.Spec.Platform.VSphere.DeprecatedVCenter
 	} else {
 		// StringSliceVar accepts comma-delimited
 		joinedVCenters = strings.Join(req.Spec.Platform.VSphere.VCenters, ",")
 	}
+	if joinedVCenters == "" {
+		// This shouldn't happen if the ClusterDeprovision was created correctly,
+		// but log a warning for debugging purposes
+		// Consider returning an error or using a sensible default
+	}
 	args = []string{
 		"deprovision",
 		"vsphere",
 		"--vsphere-vcenters", joinedVCenters,
 	}
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/install/generate.go` around lines 892 - 903, The code builds joinedVCenters from req.Spec.Platform.VSphere.VCenters / DeprecatedVCenter and always appends "--vsphere-vcenters", joinedVCenters to args; handle the case where both are empty by either returning a validation error or omitting the flag: check req.Spec.Platform.VSphere.VCenters and req.Spec.Platform.VSphere.DeprecatedVCenter and if both are empty do one of two things (preferred: return an error from the generator indicating missing vCenter info; alternative: only append "--vsphere-vcenters" to args when joinedVCenters != ""), updating the logic around joinedVCenters and the args slice (referencing joinedVCenters, req.Spec.Platform.VSphere.VCenters, DeprecatedVCenter, and args) so we never pass an empty "--vsphere-vcenters" value.

docs/hiveutil.md-108-115 (1)

108-115: ⚠️ Potential issue | 🟡 Minor — The required-parameters table still reflects the pre-JSON flow.

The example below puts `apiVIPs` and `ingressVIPs` inside `--vsphere-platform-spec-json`, but this table still marks `--vsphere-api-vip` and `--vsphere-ingress-vip` as required. Please remove those rows or clarify they are only needed when the JSON omits the VIPs.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/hiveutil.md` around lines 108 - 115, The table in docs/hiveutil.md incorrectly lists --vsphere-api-vip and --vsphere-ingress-vip as required even though the JSON flow (VSPHERE_INSTALLER_PLATFORM_SPEC_JSON / --vsphere-platform-spec-json) can contain apiVIPs and ingressVIPs; update the table to either remove the two VIP rows or add a clarifying note that --vsphere-api-vip and --vsphere-ingress-vip are only required when VSPHERE_INSTALLER_PLATFORM_SPEC_JSON / --vsphere-platform-spec-json does not include apiVIPs/ingressVIPs, referencing the exact flags (--vsphere-platform-spec-json, --vsphere-api-vip, --vsphere-ingress-vip) so readers understand the conditional requirement.

config/crds/hive.openshift.io_clusterpools.yaml-1345-1348 (1)

1345-1348: ⚠️ Potential issue | 🟡 Minor — Fix the migration targets in the deprecation text.

These hints point users to `FailureDomains.Topology.Network` and `VCenters.Username`, but the schema here exposes `failureDomains[].topology.networks` and `vcenters[].user`. Since this text is surfaced by `kubectl explain`, the current wording sends users to fields that do not exist.

Also applies to: 1427-1430
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@config/crds/hive.openshift.io_clusterpools.yaml` around lines 1345 - 1348, Update the deprecation descriptions to reference the actual schema fields exposed: change the Network deprecation target from FailureDomains.Topology.Network to failureDomains[].topology.networks, and change the vcenter username deprecation target from VCenters.Username to vcenters[].user; locate the description blocks for the Network field (currently mentioning FailureDomains.Topology.Network) and the vcenter username field (the block that mentions VCenters.Username) and replace the incorrect dotted paths with the exact field names shown above so kubectl explain points to existing fields.

hack/app-sre/saas-template.yaml-2948-2953 (1)

2948-2953: ⚠️ Potential issue | 🟡 Minor — Fix the deprecated field path in the description.

The migration target says `FailureDomains.Topology.Network`, but the schema defines `topology.networks`. As written, `kubectl explain` points users at a field that does not exist.

Also applies to: 6925-6930
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@hack/app-sre/saas-template.yaml` around lines 2948 - 2953, Update the deprecated field path in the description for the network field: change the migration target from "FailureDomains.Topology.Network" to the actual schema path "FailureDomains.Topology.networks" so kubectl explain points to the correct field; locate the description text for the network field (symbol: network) and replace the incorrect path, and make the same change for the other occurrence referenced (the second network description block).

config/crds/hive.openshift.io_clusterdeployments.yaml-1684-1692 (1)

1684-1692: ⚠️ Potential issue | 🟡 Minor — `format: ip` annotation is misplaced on array type (same as `apiVIPs`).

Same issue as `apiVIPs`: the format annotation should be on the items type, not the array container.

Suggested fix

```diff
 ingressVIPs:
   description: |-
     IngressVIPs contains the VIP(s) for ingress. In dual stack clusters
     it contains an IPv4 and IPv6 address, otherwise only one VIP
-  format: ip
   items:
     type: string
+    format: ip
   maxItems: 2
   type: array
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@config/crds/hive.openshift.io_clusterdeployments.yaml` around lines 1684 - 1692, The schema for ingressVIPs wrongly applies "format: ip" to the array node; move the format annotation to the array items like done for apiVIPs so each string item has format: ip, i.e., remove or relocate the top-level format under ingressVIPs and add format: ip under ingressVIPs.items (the items.type: string node) to validate each IP entry correctly.

config/crds/hive.openshift.io_clusterdeployments.yaml-1329-1337 (1)

1329-1337: ⚠️ Potential issue | 🟡 Minor — Move `format: ip` from array to items schema level.

In OpenAPI 3.0, format validation for array items must be specified in the `items` schema, not on the array container. This issue originates in the Go source annotation and needs to be fixed there via kubebuilder markers.

Source fix location

In `vendor/github.com/openshift/installer/pkg/types/vsphere/platform.go` (lines 92-98), the kubebuilder annotation `+kubebuilder:validation:Format=ip` on the `APIVIPs []string` field should be adjusted so controller-gen generates the correct schema structure with format inside the items definition.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@config/crds/hive.openshift.io_clusterdeployments.yaml` around lines 1329 - 1337, The array-level `+kubebuilder:validation:Format=ip` must be applied to the element type so the generated OpenAPI puts format inside `items`; fix by defining a new named element type (e.g., type APIVIP string) with the kubebuilder annotation `+kubebuilder:validation:Format=ip` on that type, then change the field signature from `APIVIPs []string` to `APIVIPs []APIVIP` (keep existing min/max item annotations on the slice); update any imports/usage accordingly so controller-gen emits `items.format: ip`.
🧹 Nitpick comments (8)
pkg/controller/utils/vsphereutils/vsphereutils_test.go (1)
12-84: Add a case for the pre-populated `Infrastructure` path.

`ConvertDeprecatedFields` is also a no-op when `platform.Infrastructure` is already set, and that is the path native zonal configs will take. This test only exercises the legacy-conversion branch, so a future change could accidentally overwrite existing Infrastructure without any test failing.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/controller/utils/vsphereutils/vsphereutils_test.go` around lines 12 - 84, Add a new subtest in TestConvertDeprecatedFields that pre-populates hivevsphere.Platform.Infrastructure with a non-nil installervsphere.Platform value and asserts ConvertDeprecatedFields does not modify it: construct a platform with Infrastructure set (instead of only deprecated fields), set Expected to the same installervsphere.Platform instance, call ConvertDeprecatedFields(platform), require.NoError, and compare platform.Infrastructure to Expected to ensure the function is a no-op when platform.Infrastructure is already present (reference symbols: TestConvertDeprecatedFields, ConvertDeprecatedFields, hivevsphere.Platform.Infrastructure, installervsphere.Platform).

apis/hive/v1/clusterdeprovision_types.go (1)

135-140: Consider adding `omitempty` to the deprecated field.

The `DeprecatedVCenter` field lacks `omitempty`, which means it will always be serialized even when empty. For cleaner JSON output on new resources, consider:

📝 Suggested change

```diff
 // DeprecatedVCenter is the vSphere vCenter hostname.
 // Deprecated: use VCenters instead.
 // +optional
-DeprecatedVCenter string `json:"vCenter"`
+DeprecatedVCenter string `json:"vCenter,omitempty"`

 // VCenters are potentially multiple vCenter hostnames. Prefer this field over VCenter.
-VCenters []string `json:"vCenters"`
+VCenters []string `json:"vCenters,omitempty"`
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apis/hive/v1/clusterdeprovision_types.go` around lines 135 - 140, The DeprecatedVCenter struct field currently uses `json:"vCenter"` and will always be serialized even when empty; update its JSON tag to include `omitempty` (i.e., change the tag for DeprecatedVCenter on the struct to `json:"vCenter,omitempty"`) so empty deprecated values are omitted from JSON output while leaving VCenters unchanged; locate the DeprecatedVCenter field in the same struct definition (symbol: DeprecatedVCenter) and update only the tag to preserve backward compatibility.

apis/hive/v1/vsphere/platform.go (1)

14-16: Minor documentation inconsistency in deprecation comments.

The deprecation comments reference `Platform.ConvertDeprecatedFields`, but the actual function is `vsphereutils.ConvertDeprecatedFields` (in `pkg/controller/utils/vsphereutils/`). Consider updating to reference the correct location for clarity.

📝 Suggested documentation fix

```diff
 // VCenter is the domain name or IP address of the vCenter.
 // Deprecated: Please use Platform.Infrastructure instead
-// See also: Platform.ConvertDeprecatedFields
+// See also: vsphereutils.ConvertDeprecatedFields
 // +optional
 DeprecatedVCenter string `json:"vCenter,omitempty"`
```

Apply similar changes to the other deprecated field comments.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apis/hive/v1/vsphere/platform.go` around lines 14 - 16, Update the deprecation comment to reference the correct function name/location: replace the incorrect reference to Platform.ConvertDeprecatedFields with vsphereutils.ConvertDeprecatedFields (the actual function in pkg/controller/utils/vsphereutils) in the comment above the deprecated field; apply the same fix to any other deprecated-field comments that mention Platform.ConvertDeprecatedFields so they consistently point to vsphereutils.ConvertDeprecatedFields.

pkg/validating-webhooks/hive/v1/clusterdeployment_validating_admission_hook_test.go (1)

1901-1926: Consider making `expectedAllowed: false` explicit in test cases.

Several test cases (lines 1901-1906, 1907-1916, 1917-1926) rely on the default zero value of `expectedAllowed` being `false`. While functionally correct, making this explicit improves test readability and makes the expected behavior immediately clear.

📝 Suggested clarification

```diff
 {
 	name:      "vsphere platform cannot be de-zonal-ified",
 	oldObject: validVSphereClusterDeployment(),
 	newObject: deprecatedVSphereClusterDeployment(),
 	operation: admissionv1beta1.Update,
+	expectedAllowed: false,
 },
 {
 	name: "vsphere platform immutable if not being zonal-ified (deprecated shape)",
 	oldObject: func() *hivev1.ClusterDeployment {
 		cd := deprecatedVSphereClusterDeployment()
 		cd.Spec.Platform.VSphere.CredentialsSecretRef.Name = "changed"
 		return cd
 	}(),
 	newObject: deprecatedVSphereClusterDeployment(),
 	operation: admissionv1beta1.Update,
+	expectedAllowed: false,
 },
 {
 	name: "vsphere platform immutable if not being zonal-ified (zonal shape)",
 	oldObject: func() *hivev1.ClusterDeployment {
 		cd := validVSphereClusterDeployment()
 		cd.Spec.Platform.VSphere.Infrastructure.ClusterOSImage = "changed"
 		return cd
 	}(),
 	newObject: validVSphereClusterDeployment(),
 	operation: admissionv1beta1.Update,
+	expectedAllowed: false,
 },
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/validating-webhooks/hive/v1/clusterdeployment_validating_admission_hook_test.go` around lines 1901 - 1926, Add explicit expectedAllowed: false to the three test case entries named "vsphere platform cannot be de-zonal-ified", "vsphere platform immutable if not being zonal-ified (deprecated shape)", and "vsphere platform immutable if not being zonal-ified (zonal shape)" in the test table in clusterdeployment_validating_admission_hook_test.go; locate each case (they use oldObject/newObject with validVSphereClusterDeployment() or deprecatedVSphereClusterDeployment() and operation admissionv1beta1.Update) and add the field expectedAllowed: false to make the intended failure expectation explicit and improve readability.

pkg/clusterresource/builder_test.go (1)

145-175: Assert the generated vSphere `Infrastructure` too.

This fixture now exercises the new `VSphereCloudBuilder.Infrastructure` path, but the paired test still only checks Secret refs. A regression that drops or rewrites `cd.Spec.Platform.VSphere.Infrastructure` would still pass. Please add at least one assertion for the propagated `VCenters`, `FailureDomains`, and VIPs.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/clusterresource/builder_test.go` around lines 145 - 175, Add assertions in the test that verify the generated ClusterDeployment/ClusterSpec Platform.VSphere.Infrastructure was populated from the fixture: assert cd.Spec.Platform.VSphere.Infrastructure.VCenters contains an entry with Server "test", Port 123 and Datacenters ["test"]; assert Infrastructure.FailureDomains contains a domain with Name "test", Region "test", Zone "test" and Topology.Datacenter "test" (optionally check other Topology fields like ComputeCluster); and assert Infrastructure.APIVIPs and IngressVIPs equal ["192.168.0.2"] and ["192.168.0.3"] respectively. Locate the assertions near the existing Secret-ref checks that currently validate the VSphereCloudBuilder.Infrastructure path (use the test's cd or cluster result variable) and add strict equality checks for these fields.

contrib/pkg/createcluster/create.go (2)

772-831: Redundant credential defaulting loop.

Lines 824-831 duplicate the credential defaulting logic from lines 773-780. When using the JSON platform spec path, both loops execute on the same `platform.VCenters` slice. When using the legacy path, `platform.VCenters` is empty, making the second loop a no-op.

Consider removing one of the loops to avoid confusion.

♻️ Suggested consolidation

```diff
 if len(platformBytes) > 0 {
 	err = json.Unmarshal(platformBytes, &platform)
 	if err != nil {
 		return nil, fmt.Errorf("error decoding platform %s: %w", o.VSpherePlatformSpecJSON, err)
 	}
-
-	// Set credentials on VCenters if using new structure
-	for i := range platform.VCenters {
-		if platform.VCenters[i].Username == "" {
-			platform.VCenters[i].Username = vsphereUsername
-		}
-		if platform.VCenters[i].Password == "" {
-			platform.VCenters[i].Password = vspherePassword
-		}
-	}
 } else {
 	// ... legacy flags handling ...
 }
+// Set credentials on VCenters (applies to JSON path; no-op for legacy)
+for i := range platform.VCenters {
+	if platform.VCenters[i].Username == "" {
+		platform.VCenters[i].Username = vsphereUsername
+	}
+	if platform.VCenters[i].Password == "" {
+		platform.VCenters[i].Password = vspherePassword
+	}
+}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@contrib/pkg/createcluster/create.go` around lines 772 - 831, There are two identical loops setting platform.VCenters[i].Username/Password from vsphereUsername and vspherePassword (iterating platform.VCenters) which is duplicated; remove the redundant loop and keep a single credential-defaulting loop for platform.VCenters (use the existing loop that runs when the JSON platform spec is processed or consolidate into one after the platform population logic), ensuring the code still sets Username/Password for each entry in platform.VCenters and that legacy fields (platform.Deprecated*) remain untouched.
833-838: Single-element VIP slices may be incomplete for multi-VIP scenarios.

When `platform.APIVIPs` or `platform.IngressVIPs` are empty, the code defaults to a single-element slice from the legacy flags. This works for single-VIP setups but may not align with zonal deployments requiring multiple VIPs.

Consider whether this defaulting behavior should warn users or validate VIP count against failure domains.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@contrib/pkg/createcluster/create.go` around lines 833 - 838, The existing fallback that sets platform.APIVIPs = []string{o.VSphereAPIVIP} and platform.IngressVIPs = []string{o.VSphereIngressVIP} can produce an insufficient single-VIP configuration for multi-VIP (zonal) setups; update the code where platform.APIVIPs/IngressVIPs are defaulted to first legacy VIP to first check the configured failure domain count and either (a) log a clear warning (using the repo logger) when only one VIP is being injected but multiple failure domains exist, or (b) fail validation by returning an error indicating the user must supply one VIP per failure domain; reference platform.APIVIPs, platform.IngressVIPs, o.VSphereAPIVIP and o.VSphereIngressVIP in your change and perform the count comparison against the cluster failure domains before accepting the single-VIP fallback.

pkg/clusterresource/vsphere.go (1)

35-44: `NewVSphereCloudBuilderFromSecret` does not initialize Infrastructure.

This constructor only sets `Username`, `Password`, and `CACert`, leaving `Infrastructure` nil. Callers using this constructor will need to set `Infrastructure` separately, otherwise `GetCloudPlatform` and `addInstallConfigPlatform` will produce incomplete configurations.

Consider documenting this expectation or adding an `Infrastructure` parameter.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/clusterresource/vsphere.go` around lines 35 - 44, NewVSphereCloudBuilderFromSecret currently returns a VSphereCloudBuilder with Username, Password and CACert but leaves Infrastructure nil; change the constructor to accept an Infrastructure parameter (e.g. add infrastructure *configv1.Infrastructure to NewVSphereCloudBuilderFromSecret’s signature) and set the returned VSphereCloudBuilder.Infrastructure = infrastructure (or otherwise populate Infrastructure appropriately) so callers of GetCloudPlatform and addInstallConfigPlatform receive a fully-initialized VSphereCloudBuilder.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@config/crds/hive.openshift.io_clusterdeployments.yaml`:
- Around line 1637-1662: The CRD schema incorrectly sets format: ipv6 for the
gateway, ipAddrs, and nameservers properties (symbols: gateway, ipAddrs,
nameservers) which rejects IPv4; fix by removing the single format: ipv6 entries
for these fields so no single-format validation is enforced, or replace them
with a pattern-based validation that matches both IPv4 and IPv6/CIDR formats
(use a single regex for addresses/CIDRs), and/or adjust controller-gen
configuration so it does not collapse multiple +kubebuilder:validation:Format
annotations into a single format; update the YAML for the three properties
accordingly to allow both IPv4 and IPv6.
In `@config/crds/hive.openshift.io_clusterdeprovisions.yaml`:
- Around line 355-368: The schema currently marks vCenters as required which
blocks legacy objects that only set vCenter; make vCenters optional in the CRD
so backward-compatible ClusterDeprovision objects using the legacy vCenter field
are accepted, and move the validation that requires either a legacy vCenter or a
non-empty vCenters array out of the CRD and into higher-level validation logic
(controller/validator) that checks "vCenter OR non-empty vCenters" at runtime;
update/remove vCenters from the required list and ensure the controller enforces
the combined validation.
In `@config/crds/hive.openshift.io_clusterpools.yaml`:
- Around line 1263-1288: The schema for the host-networking fields is too
restrictive: update the gateway and nameservers properties to accept both IPv4
and IPv6 by replacing their single "format: ipv6" with an anyOf combining
"format: ipv4" and "format: ipv6" for the string items (refer to the gateway and
nameservers property definitions), and change ipAddrs from "format: ipv6" to a
CIDR-aware format by using "format: cidr" for the array items (refer to the
ipAddrs property) so the schema matches the documented examples and accepts
IPv4, IPv6 and CIDR values.
- Around line 955-963: The OpenAPI `format: ip` is applied to the array node for
apiVIPs and ingressVIPs so individual entries aren’t validated; update the CRD
source markers for spec.platform.vsphere.infrastructure.apiVIPs and
spec.platform.vsphere.infrastructure.ingressVIPs so that `format: ip` is moved
from the array level into the `items` schema (i.e., ensure items declares type:
string and format: ip) and remove the top-level format from the array
declaration so controller-gen emits per-element IP validation.
In `@contrib/pkg/deprovision/vsphere.go`:
- Around line 50-72: After calling vspherecreds.ConfigureCreds(client,
o.metadata) ensure the legacy-path guard: check
o.metadata.ClusterPlatformMetadata.VSphere (or o.metadata.VSphere) and that
o.metadata.VSphere.VCenters has length > 0; if not, return a clear error (e.g.,
wrap with "no vCenter configured: provide --vsphere-vcenters or use new-style
creds Secret") so the command fails fast when the legacy credentials path
produced an empty VCenters slice; place this validation immediately after
ConfigureCreds() and reference the
o.vCenters/ConfigureCreds/metadata.VSphere.VCenters symbols when making the
change.
In `@hack/app-sre/saas-template.yaml`:
- Around line 3083-3129: The vCenter credentials are being stored directly in
the CR spec via infrastructure.vcenters[*].user and
infrastructure.vcenters[*].password which leaks secrets; remove those fields
from the schema and instead reference an existing credentialsSecretRef (or add a
credentialsSecretRef field if missing) so the CR only contains a reference to a
Kubernetes Secret; update the vcenters schema (the object under vcenters → items
→ properties) to delete or mark user/password as deprecated and document using
credentialsSecretRef, and update any validation/required lists (and related
ClusterDeployment/ClusterPool references) to ensure credentials are retrieved
from the referenced Secret rather than stored in the CR.
- Around line 2605-2627: The failureDomains array allows duplicate names which
makes hosts[].failureDomain (and VSpherePlatformDeploymentZone references)
ambiguous; change the failureDomains list to a keyed map by adding Kubernetes
list-type metadata (e.g. add x-kubernetes-list-type: "map" and
x-kubernetes-list-map-keys: ["name"] on the failureDomains schema) so entries
are keyed by the name property, or alternatively add an explicit uniqueness
constraint keyed on name; apply the same change to the other occurrences you
noted (around the other specified ranges) so all
VSpherePlatformFailureDomainSpec definitions enforce unique name keys.
- Around line 4958-4972: The schema currently marks vCenters as required which
breaks legacy manifests that only set the deprecated vCenter; update the schema
so validation accepts either the singular vCenter or the plural vCenters (e.g.,
remove vCenters from the required list or replace the required array with a
oneOf/anyOf constraint that allows either vCenter (string) or vCenters (array of
string)); ensure you reference and adjust the vCenter and vCenters property
definitions and the required list/validation block so older ClusterDeprovision
manifests validate successfully.
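One possible shape for the either/or constraint, assuming the structural-schema rules in play (which permit `required` inside `oneOf` but forbid `type` there):

```yaml
oneOf:
- required:
  - vCenter
- required:
  - vCenters
properties:
  vCenter:
    type: string
  vCenters:
    type: array
    minItems: 1
    items:
      type: string
```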
- Around line 2835-2873: The CRD schema currently restricts gateway, ipAddrs
items, and nameservers items to format: ipv6 only; change each of those string
schemas (gateway, the items schema under ipAddrs, and the items schema under
nameservers) to accept both IPv4 and IPv6 by replacing the single "format: ipv6"
with an anyOf/oneOf that allows format: ipv4 and format: ipv6 (e.g., anyOf: -
type: string; format: ipv4 - type: string; format: ipv6), and apply the same
change to the second occurrence noted in the comment so validation accepts
either family while keeping examples intact.
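A sketch of the dual-family validation; note that where the field also carries CIDR values (as `ipAddrs` does), `format: cidr` is arguably a better fit than either bare-address format:

```yaml
gateway:
  type: string
  anyOf:
  - format: ipv4
  - format: ipv6
ipAddrs:
  type: array
  items:
    type: string
    format: cidr
```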
In `@pkg/controller/clusterdeployment/installconfigvalidation.go`:
- Around line 67-72: The credential check is inconsistent: hasCreds is set true
when either ic.Platform.VSphere.DeprecatedUsername OR DeprecatedPassword is
non-empty (using ||) but for each ic.Platform.VSphere.VCenters it requires both
vcenter.Username AND vcenter.Password (using &&); change the deprecated-path
check to require both DeprecatedUsername and DeprecatedPassword (use &&) so
hasCreds is true only when both credentials are present, matching the VCenters
validation logic (refer to hasCreds, ic.Platform.VSphere.DeprecatedUsername,
DeprecatedPassword, ic.Platform.VSphere.VCenters, vcenter.Username,
vcenter.Password).
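A minimal sketch of the suggested consistency fix, using hypothetical trimmed-down stand-in types rather than the installer's real install-config structs:

```go
package main

import "fmt"

// Stand-ins for the install-config vSphere shapes (hypothetical, trimmed).
type VCenter struct{ Username, Password string }

type VSpherePlatform struct {
	DeprecatedUsername string
	DeprecatedPassword string
	VCenters           []VCenter
}

// hasCreds requires BOTH deprecated fields (&&), matching the per-vCenter
// check, instead of treating a lone username or password as sufficient.
func hasCreds(p VSpherePlatform) bool {
	if p.DeprecatedUsername != "" && p.DeprecatedPassword != "" {
		return true
	}
	if len(p.VCenters) == 0 {
		return false
	}
	for _, vc := range p.VCenters {
		if vc.Username == "" || vc.Password == "" {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(hasCreds(VSpherePlatform{DeprecatedUsername: "admin"}))                               // false: password missing
	fmt.Println(hasCreds(VSpherePlatform{DeprecatedUsername: "admin", DeprecatedPassword: "secret"})) // true
	fmt.Println(hasCreds(VSpherePlatform{VCenters: []VCenter{{Username: "admin"}}}))                  // false: per-vCenter check wants both
}
```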
In `@pkg/controller/clusterpool/clusterpool_controller.go`:
- Around line 322-331: The migration updates deprecated vSphere fields which
changes the value hashed by calculatePoolVersion(), causing legacy members to
appear stale; fix by normalizing vSphere fields before any pool-version is
computed or by rewriting member annotations during the migration: either (A)
ensure vsphereutils.ConvertDeprecatedFields(clp.Spec.Platform.VSphere) is
invoked (on a DeepCopy of the ClusterPool as you already do) prior to any call
to calculatePoolVersion() so existing hashing uses the normalized Platform
(search for calculatePoolVersion()), or (B) after r.Update(ctx, clp) iterate the
ClusterPool members and overwrite their pool-version annotations with the
recalculated hash (use the same calculatePoolVersion() and the same annotation
key used elsewhere) so members and pool stay consistent.
In `@pkg/controller/machinepool/vsphereactuator.go`:
- Around line 51-65: The code builds an installertypes.InstallConfig named ic by
assigning cd.Spec.Platform.VSphere.Infrastructure directly to ic.VSphere, then
mutates ic.VSphere.FailureDomains which also mutates the shared
ClusterDeployment object; to fix, replace the direct assignment with a deep copy
of cd.Spec.Platform.VSphere.Infrastructure (use the type's DeepCopy method or
create a copy struct) before assigning to ic.VSphere so subsequent mutations to
failureDomain.Topology.ResourcePool and failureDomain.Topology.TagIDs (based on
pool.Spec.Platform.VSphere.ResourcePool and pool.Spec.Platform.VSphere.TagIDs)
only affect the local ic and not cd.
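The aliasing hazard and its fix can be shown with stand-in types; in the real code the copy would come from the generated DeepCopy method on the Infrastructure type, so this is only a sketch of the pattern:

```go
package main

import "fmt"

// Hypothetical trimmed shapes; ResourcePool stands in for the overridden topology fields.
type Topology struct{ ResourcePool string }
type FailureDomain struct{ Topology Topology }
type Infrastructure struct{ FailureDomains []FailureDomain }

// deepCopy mimics the generated DeepCopy: the slice (and its value-type
// elements) are duplicated, so mutations to the copy cannot leak back.
func deepCopy(in *Infrastructure) *Infrastructure {
	out := &Infrastructure{FailureDomains: make([]FailureDomain, len(in.FailureDomains))}
	copy(out.FailureDomains, in.FailureDomains)
	return out
}

func main() {
	cd := &Infrastructure{FailureDomains: []FailureDomain{{Topology{ResourcePool: "cd-pool"}}}}

	// Buggy pattern: pointer assignment aliases the ClusterDeployment's struct.
	ic := cd
	ic.FailureDomains[0].Topology.ResourcePool = "pool-override"
	fmt.Println(cd.FailureDomains[0].Topology.ResourcePool) // pool-override: cd was mutated

	// Fixed pattern: copy first, then apply the MachinePool override locally.
	cd.FailureDomains[0].Topology.ResourcePool = "cd-pool"
	icCopy := deepCopy(cd)
	icCopy.FailureDomains[0].Topology.ResourcePool = "pool-override"
	fmt.Println(cd.FailureDomains[0].Topology.ResourcePool) // cd-pool: unchanged
}
```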
In `@pkg/controller/utils/credentials.go`:
- Around line 84-117: The code currently succeeds if it validates only the
vcenters it could materialize; change the logic in the validation path that
builds the vcenters slice so the function fails if the resolved credential set
is incomplete: after constructing vcenters (both the "vcenters" yaml path and
the legacy projection path), ensure vcenters is non-empty and contains
credentials for every entry in cd.Spec.Platform.VSphere.Infrastructure.VCenters
(match by Server/VCenter name), returning an error if any configured VCenter is
missing or if required secret keys
(constants.UsernameSecretKey/constants.PasswordSecretKey) are absent; only then
iterate and call validateVSphereCredentials for each entry and return success.
In `@pkg/creds/vsphere/vsphere.go`:
- Around line 34-42: The old-style credentials branch assigns username/password
into metadata.VSphere.VCenters without checking metadata.VSphere for nil,
risking a panic; update the code handling the old-style creds (the block that
reads credsSecret.Data[constants.UsernameSecretKey] and
constants.PasswordSecretKey and loops over metadata.VSphere.VCenters) to first
verify metadata.VSphere is non-nil (and if it is nil either return a clear error
or initialize metadata.VSphere with an appropriate VCenters slice, matching the
post-zonal defensiveness), then proceed to set each VCenter's Username and
Password.
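The defensive shape the prompt asks for can be sketched with stand-in metadata types (the real ones are the installer's vsphere Metadata/VCenters structs), returning a clear error instead of panicking on a nil block:

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical trimmed shapes for the destroyer metadata.
type VCenterMeta struct{ Username, Password string }
type VSphereMeta struct{ VCenters []VCenterMeta }
type Metadata struct{ VSphere *VSphereMeta }

// applyLegacyCreds copies a single username/password onto every vCenter entry,
// guarding against a nil VSphere block instead of dereferencing it blindly.
func applyLegacyCreds(m *Metadata, user, pass string) error {
	if m.VSphere == nil || len(m.VSphere.VCenters) == 0 {
		return errors.New("no vCenter configured: provide --vsphere-vcenters or a new-style creds Secret")
	}
	for i := range m.VSphere.VCenters {
		m.VSphere.VCenters[i].Username = user
		m.VSphere.VCenters[i].Password = pass
	}
	return nil
}

func main() {
	err := applyLegacyCreds(&Metadata{}, "u", "p")
	fmt.Println(err != nil) // true: nil VSphere is rejected instead of panicking

	m := &Metadata{VSphere: &VSphereMeta{VCenters: []VCenterMeta{{}}}}
	fmt.Println(applyLegacyCreds(m, "u", "p") == nil, m.VSphere.VCenters[0].Username)
}
```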
In `@pkg/installmanager/installmanager.go`:
- Around line 739-752: The code iterates over
cd.Spec.Platform.VSphere.Infrastructure.VCenters without checking that
Infrastructure is non-nil which can panic; update the block that builds vcenters
(where vcenters is made and populated and where installertypes.ClusterMetadata /
installertypesvsphere.Metadata are constructed) to first check if
cd.Spec.Platform.VSphere.Infrastructure != nil and only range over
Infrastructure.VCenters when non-nil, otherwise leave vcenters as an empty slice
(or set VCenters to nil/empty) so creation of installertypes.ClusterMetadata and
installertypesvsphere.Metadata is safe.
In
`@pkg/validating-webhooks/hive/v1/clusterdeployment_validating_admission_hook.go`:
- Around line 671-678: The current change overwrites
oldObject.Spec.Platform.VSphere with the incoming struct which masks unrelated
immutable edits; instead, create a normalized copy of the old vSphere struct
(e.g., normalizedOldVSphere := oldObject.Spec.Platform.VSphere.DeepCopy() or
equivalent), set only its Infrastructure field to
cd.Spec.Platform.VSphere.Infrastructure, and pass that normalized copy into the
immutability check (hasChangedImmutableField or whatever call uses the old
value) rather than assigning cd.Spec.Platform.VSphere back onto
oldObject.Spec.Platform.VSphere; this preserves original credential/certificate
and other immutable fields for comparison while allowing the zonal
Infrastructure migration.
- Around line 535-540: The validation currently conflates legacy (pre-zonal) and
new shaped vSphere specs; change the logic so that when vsphere.Infrastructure
is nil or empty you validate the legacy topology fields (e.g. require
vsphere.DeprecatedVCenter and its legacy fields), and only when
vsphere.Infrastructure is provided/has entries enforce the new shape (require
vsphere.Infrastructure.VCenters). Update the conditional around
vsphere.DeprecatedVCenter / vsphere.Infrastructure to branch: if Infrastructure
== nil or len(VCenters)==0 -> run legacy-required checks (using
vsphere.DeprecatedVCenter and legacy field paths via vspherePath), else ->
require vsphere.Infrastructure.VCenters; keep references to
vsphere.DeprecatedVCenter, vsphere.Infrastructure,
vsphere.Infrastructure.VCenters and vspherePath to locate and adjust the checks.
In `@pkg/validating-webhooks/hive/v1/clusterpool_validating_admission_hook.go`:
- Around line 237-245: The early return inside the HIVE-2391 block
short-circuits update validation and skips validateClusterPoolPlatform; instead,
after detecting the legacy→zonal transition
(oldObject.Spec.Platform.VSphere.Infrastructure == nil &&
newObject.Spec.Platform.VSphere.Infrastructure != nil) keep the debug log but
call validateClusterPoolPlatform for the new object and return its
AdmissionResponse (or combine its result) rather than immediately returning
Allowed: true — update the logic around the HIVE-2391 check so
validateClusterPoolPlatform(newObject, ...) is invoked before allowing the
change.
---
Outside diff comments:
In `@pkg/creds/vsphere/vsphere.go`:
- Around line 18-20: Update the ConfigureCreds docstring to reflect the current
behavior: instead of saying it "configures VSphere credential environment
variables," state that ConfigureCreds loads secrets and populates
metadata.VSphere.VCenters (and any related metadata fields) with the vCenter
connection information; mention which environment variables it reads
(CLUSTERDEPLOYMENT_NAMESPACE, CREDS_SECRET_NAME, CERTS_SECRET_NAME) and that it
writes into metadata.VSphere.VCenters rather than setting env vars so readers
can locate the changed behavior in the ConfigureCreds implementation.
---
Minor comments:
In `@config/crds/hive.openshift.io_clusterdeployments.yaml`:
- Around line 1684-1692: The schema for ingressVIPs wrongly applies "format: ip"
to the array node; move the format annotation to the array items like done for
apiVIPs so each string item has format: ip, i.e., remove or relocate the
top-level format under ingressVIPs and add format: ip under ingressVIPs.items
(the items.type: string node) to validate each IP entry correctly.
- Around line 1329-1337: The array-level `+kubebuilder:validation:Format=ip`
must be applied to the element type so the generated OpenAPI puts format inside
`items`; fix by defining a new named element type (e.g., type APIVIP string)
with the kubebuilder annotation `+kubebuilder:validation:Format=ip` on that
type, then change the field signature from `APIVIPs []string` to `APIVIPs
[]APIVIP` (keep existing min/max item annotations on the slice); update any
imports/usage accordingly so controller-gen emits `items.format: ip`.
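The named-element-type change could look roughly like this (type and field names illustrative); the markers are comments consumed by controller-gen, and existing `[]string` call sites would convert explicitly:

```go
package main

import "fmt"

// APIVIP is a single VIP address. Because the Format marker sits on the
// element type, controller-gen emits it under items in the generated schema.
// +kubebuilder:validation:Format=ip
type APIVIP string

type VSpherePlatform struct {
	// +kubebuilder:validation:MaxItems=2
	// +optional
	APIVIPs []APIVIP `json:"apiVIPs,omitempty"`
}

func main() {
	p := VSpherePlatform{APIVIPs: []APIVIP{"192.0.2.10"}}
	// Call sites that previously took []string convert per element.
	fmt.Println(string(p.APIVIPs[0])) // 192.0.2.10
}
```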
In `@config/crds/hive.openshift.io_clusterpools.yaml`:
- Around line 1345-1348: Update the deprecation descriptions to reference the
actual schema fields exposed: change the Network deprecation target from
FailureDomains.Topology.Network to failureDomains[].topology.networks, and
change the vcenter username deprecation target from VCenters.Username to
vcenters[].user; locate the description blocks for the Network field (currently
mentioning FailureDomains.Topology.Network) and the vcenter username field (the
block that mentions VCenters.Username) and replace the incorrect dotted paths
with the exact field names shown above so kubectl explain points to existing
fields.
In `@docs/hiveutil.md`:
- Around line 108-115: The table in docs/hiveutil.md incorrectly lists
--vsphere-api-vip and --vsphere-ingress-vip as required even though the JSON
flow (VSPHERE_INSTALLER_PLATFORM_SPEC_JSON / --vsphere-platform-spec-json) can
contain apiVIPs and ingressVIPs; update the table to either remove the two VIP
rows or add a clarifying note that --vsphere-api-vip and --vsphere-ingress-vip
are only required when VSPHERE_INSTALLER_PLATFORM_SPEC_JSON /
--vsphere-platform-spec-json does not include apiVIPs/ingressVIPs, referencing
the exact flags (--vsphere-platform-spec-json, --vsphere-api-vip,
--vsphere-ingress-vip) so readers understand the conditional requirement.
In `@hack/app-sre/saas-template.yaml`:
- Around line 2948-2953: Update the deprecated field path in the description for
the network field: change the migration target from
"FailureDomains.Topology.Network" to the actual schema path
"FailureDomains.Topology.networks" so kubectl explain points to the correct
field; locate the description text for the network field (symbol: network) and
replace the incorrect path, and make the same change for the other occurrence
referenced (the second network description block).
In `@pkg/install/generate.go`:
- Around line 892-903: The code builds joinedVCenters from
req.Spec.Platform.VSphere.VCenters / DeprecatedVCenter and always appends
"--vsphere-vcenters", joinedVCenters to args; handle the case where both are
empty by either returning a validation error or omitting the flag: check
req.Spec.Platform.VSphere.VCenters and
req.Spec.Platform.VSphere.DeprecatedVCenter and if both are empty do one of two
things (preferred: return an error from the generator indicating missing vCenter
info; alternative: only append "--vsphere-vcenters" to args when joinedVCenters
!= ""), updating the logic around joinedVCenters and the args slice (referencing
joinedVCenters, req.Spec.Platform.VSphere.VCenters, DeprecatedVCenter, and args)
so we never pass an empty "--vsphere-vcenters" value.
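The preferred option (erroring out rather than emitting an empty flag) can be sketched as a small helper; the function name and signature are hypothetical, standing in for the inline logic in generate.go:

```go
package main

import (
	"fmt"
	"strings"
)

// appendVCentersFlag joins the plural VCenters (falling back to the deprecated
// singular) and refuses to emit the flag with an empty value.
func appendVCentersFlag(args []string, vcenters []string, deprecated string) ([]string, error) {
	joined := strings.Join(vcenters, ",")
	if joined == "" {
		joined = deprecated
	}
	if joined == "" {
		return nil, fmt.Errorf("no vCenter configured on the ClusterDeprovision spec")
	}
	return append(args, "--vsphere-vcenters", joined), nil
}

func main() {
	args, err := appendVCentersFlag(nil, []string{"vc1", "vc2"}, "")
	fmt.Println(err == nil, args)

	_, err = appendVCentersFlag(nil, nil, "")
	fmt.Println(err != nil) // true: an empty value never reaches the CLI
}
```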
---
Nitpick comments:
In `@apis/hive/v1/clusterdeprovision_types.go`:
- Around line 135-140: The DeprecatedVCenter struct field currently uses
`json:"vCenter"` and will always be serialized even when empty; update its JSON
tag to include `omitempty` (i.e., change the tag for DeprecatedVCenter on the
struct to `json:"vCenter,omitempty"`) so empty deprecated values are omitted
from JSON output while leaving VCenters unchanged; locate the DeprecatedVCenter
field in the same struct definition (symbol: DeprecatedVCenter) and update only
the tag to preserve backward compatibility.
In `@apis/hive/v1/vsphere/platform.go`:
- Around line 14-16: Update the deprecation comment to reference the correct
function name/location: replace the incorrect reference to
Platform.ConvertDeprecatedFields with vsphereutils.ConvertDeprecatedFields (the
actual function in pkg/controller/utils/vsphereutils) in the comment above the
deprecated field; apply the same fix to any other deprecated-field comments that
mention Platform.ConvertDeprecatedFields so they consistently point to
vsphereutils.ConvertDeprecatedFields.
In `@contrib/pkg/createcluster/create.go`:
- Around line 772-831: There are two identical loops setting
platform.VCenters[i].Username/Password from vsphereUsername and vspherePassword
(iterating platform.VCenters) which is duplicated; remove the redundant loop and
keep a single credential-defaulting loop for platform.VCenters (use the existing
loop that runs when the JSON platform spec is processed or consolidate into one
after the platform population logic), ensuring the code still sets
Username/Password for each entry in platform.VCenters and that legacy fields
(platform.Deprecated*) remain untouched.
- Around line 833-838: The existing fallback that sets platform.APIVIPs =
[]string{o.VSphereAPIVIP} and platform.IngressVIPs =
[]string{o.VSphereIngressVIP} can produce an insufficient single-VIP
configuration for multi-VIP (zonal) setups; update the code where
platform.APIVIPs/IngressVIPs are defaulted to first legacy VIP to first check
the configured failure domain count and either (a) log a clear warning (using
the repo logger) when only one VIP is being injected but multiple failure
domains exist, or (b) fail validation by returning an error indicating the user
must supply one VIP per failure domain; reference platform.APIVIPs,
platform.IngressVIPs, o.VSphereAPIVIP and o.VSphereIngressVIP in your change and
perform the count comparison against the cluster failure domains before
accepting the single-VIP fallback.
In `@pkg/clusterresource/builder_test.go`:
- Around line 145-175: Add assertions in the test that verify the generated
ClusterDeployment/ClusterSpec Platform.VSphere.Infrastructure was populated from
the fixture: assert cd.Spec.Platform.VSphere.Infrastructure.VCenters contains an
entry with Server "test", Port 123 and Datacenters ["test"]; assert
Infrastructure.FailureDomains contains a domain with Name "test", Region "test",
Zone "test" and Topology.Datacenter "test" (optionally check other Topology
fields like ComputeCluster); and assert Infrastructure.APIVIPs and IngressVIPs
equal ["192.168.0.2"] and ["192.168.0.3"] respectively. Locate the assertions
near the existing Secret-ref checks that currently validate the
VSphereCloudBuilder.Infrastructure path (use the test's cd or cluster result
variable) and add strict equality checks for these fields.
In `@pkg/clusterresource/vsphere.go`:
- Around line 35-44: NewVSphereCloudBuilderFromSecret currently returns a
VSphereCloudBuilder with Username, Password and CACert but leaves Infrastructure
nil; change the constructor to accept an Infrastructure parameter (e.g. add
infrastructure *configv1.Infrastructure to NewVSphereCloudBuilderFromSecret’s
signature) and set the returned VSphereCloudBuilder.Infrastructure =
infrastructure (or otherwise populate Infrastructure appropriately) so callers
of GetCloudPlatform and addInstallConfigPlatform receive a fully-initialized
VSphereCloudBuilder.
In `@pkg/controller/utils/vsphereutils/vsphereutils_test.go`:
- Around line 12-84: Add a new subtest in TestConvertDeprecatedFields that
pre-populates hivevsphere.Platform.Infrastructure with a non-nil
installervsphere.Platform value and asserts ConvertDeprecatedFields does not
modify it: construct a platform with Infrastructure set (instead of only
deprecated fields), set Expected to the same installervsphere.Platform instance,
call ConvertDeprecatedFields(platform), require.NoError, and compare
platform.Infrastructure to Expected to ensure the function is a no-op when
platform.Infrastructure is already present (reference symbols:
TestConvertDeprecatedFields, ConvertDeprecatedFields,
hivevsphere.Platform.Infrastructure, installervsphere.Platform).
In
`@pkg/validating-webhooks/hive/v1/clusterdeployment_validating_admission_hook_test.go`:
- Around line 1901-1926: Add explicit expectedAllowed: false to the three test
case entries named "vsphere platform cannot be de-zonal-ified", "vsphere
platform immutable if not being zonal-ified (deprecated shape)", and "vsphere
platform immutable if not being zonal-ified (zonal shape)" in the test table in
clusterdeployment_validating_admission_hook_test.go; locate each case (they use
oldObject/newObject with validVSphereClusterDeployment() or
deprecatedVSphereClusterDeployment() and operation admissionv1beta1.Update) and
add the field expectedAllowed: false to make the intended failure expectation
explicit and improve readability.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 945ad764-0420-42da-ad48-2747a8dae741
⛔ Files ignored due to path filters (17)
- `apis/go.sum` is excluded by `!**/*.sum`
- `apis/hive/v1/vsphere/zz_generated.deepcopy.go` is excluded by `!**/zz_generated*`
- `apis/hive/v1/zz_generated.deepcopy.go` is excluded by `!**/zz_generated*`
- `apis/vendor/github.com/openshift/installer/LICENSE` is excluded by `!**/vendor/**`
- `apis/vendor/github.com/openshift/installer/NOTICE` is excluded by `!**/vendor/**`
- `apis/vendor/github.com/openshift/installer/pkg/types/vsphere/OWNERS` is excluded by `!**/vendor/**`
- `apis/vendor/github.com/openshift/installer/pkg/types/vsphere/doc.go` is excluded by `!**/vendor/**`
- `apis/vendor/github.com/openshift/installer/pkg/types/vsphere/machinepool.go` is excluded by `!**/vendor/**`
- `apis/vendor/github.com/openshift/installer/pkg/types/vsphere/metadata.go` is excluded by `!**/vendor/**`
- `apis/vendor/github.com/openshift/installer/pkg/types/vsphere/platform.go` is excluded by `!**/vendor/**`
- `apis/vendor/github.com/openshift/installer/pkg/types/vsphere/zz_generated.deepcopy.go` is excluded by `!**/vendor/**`, `!**/zz_generated*`
- `apis/vendor/modules.txt` is excluded by `!**/vendor/**`
- `vendor/github.com/openshift/hive/apis/hive/v1/clusterdeprovision_types.go` is excluded by `!**/vendor/**`, `!vendor/**`
- `vendor/github.com/openshift/hive/apis/hive/v1/vsphere/machinepools.go` is excluded by `!**/vendor/**`, `!vendor/**`
- `vendor/github.com/openshift/hive/apis/hive/v1/vsphere/platform.go` is excluded by `!**/vendor/**`, `!vendor/**`
- `vendor/github.com/openshift/hive/apis/hive/v1/vsphere/zz_generated.deepcopy.go` is excluded by `!**/vendor/**`, `!vendor/**`, `!**/zz_generated*`
- `vendor/github.com/openshift/hive/apis/hive/v1/zz_generated.deepcopy.go` is excluded by `!**/vendor/**`, `!vendor/**`, `!**/zz_generated*`
📒 Files selected for processing (33)
- `apis/go.mod`
- `apis/hive/v1/clusterdeprovision_types.go`
- `apis/hive/v1/vsphere/machinepools.go`
- `apis/hive/v1/vsphere/platform.go`
- `config/crds/hive.openshift.io_clusterdeployments.yaml`
- `config/crds/hive.openshift.io_clusterdeprovisions.yaml`
- `config/crds/hive.openshift.io_clusterpools.yaml`
- `config/crds/hive.openshift.io_machinepools.yaml`
- `contrib/pkg/createcluster/create.go`
- `contrib/pkg/deprovision/vsphere.go`
- `docs/hiveutil.md`
- `docs/using-hive.md`
- `go.mod`
- `hack/app-sre/saas-template.yaml`
- `hack/e2e-common.sh`
- `pkg/clusterresource/builder_test.go`
- `pkg/clusterresource/vsphere.go`
- `pkg/controller/clusterdeployment/clusterdeployment_controller.go`
- `pkg/controller/clusterdeployment/installconfigvalidation.go`
- `pkg/controller/clusterdeployment/installconfigvalidation_test.go`
- `pkg/controller/clusterpool/clusterpool_controller.go`
- `pkg/controller/machinepool/vsphereactuator.go`
- `pkg/controller/machinepool/vsphereactuator_test.go`
- `pkg/controller/utils/credentials.go`
- `pkg/controller/utils/vsphereutils/vsphere.go`
- `pkg/controller/utils/vsphereutils/vsphereutils_test.go`
- `pkg/creds/vsphere/vsphere.go`
- `pkg/install/generate.go`
- `pkg/installmanager/installmanager.go`
- `pkg/validating-webhooks/hive/v1/clusterdeployment_validating_admission_hook.go`
- `pkg/validating-webhooks/hive/v1/clusterdeployment_validating_admission_hook_test.go`
- `pkg/validating-webhooks/hive/v1/clusterpool_validating_admission_hook.go`
- `pkg/validating-webhooks/hive/v1/machinepool_validating_admission_hook_test.go`
```diff
         vCenter:
-          description: VCenter is the vSphere vCenter hostname.
+          description: |-
+            DeprecatedVCenter is the vSphere vCenter hostname.
+            Deprecated: use VCenters instead.
           type: string
+        vCenters:
+          description: VCenters are potentially multiple vCenter hostnames. Prefer this field over VCenter.
+          items:
+            type: string
+          type: array
       required:
       - certificatesSecretRef
       - credentialsSecretRef
       - vCenter
+      - vCenters
```
Don't make vCenters required while vCenter is still a supported compatibility field.
The type/controller still carry vCenter for backward compatibility, but this schema now rejects any ClusterDeprovision that only has the legacy field. That can break updates or status writes for older in-flight deprovisions after the CRD rolls out, and an empty vCenters array would still satisfy the new requirement. Please keep vCenters optional here and validate “legacy vCenter or non-empty vCenters” at a higher layer.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@config/crds/hive.openshift.io_clusterdeprovisions.yaml` around lines 355 -
368, The schema currently marks vCenters as required which blocks legacy objects
that only set vCenter; make vCenters optional in the CRD so backward-compatible
ClusterDeprovision objects using the legacy vCenter field are accepted, and move
the validation that requires either a legacy vCenter or a non-empty vCenters
array out of the CRD and into higher-level validation logic
(controller/validator) that checks "vCenter OR non-empty vCenters" at runtime;
update/remove vCenters from the required list and ensure the controller enforces
the combined validation.
Good find, will fix.
@jianping-shu this should flush out in your upgrade testing as well. It would be a little tricky to reproduce though: you would need to have a ClusterDeprovision present on the cluster through the upgrade. This probably means making a deprov get "stuck" before performing the upgrade, and then making it "unstuck" afterward so it hits one of these four status condition updates: 1, 2, 3, 4. Since the vCenters field is marked required but the legacy ClusterDeprovision doesn't have it, the status update should fail and emit a log message accordingly.
...that's assuming you get around to testing this before I fix it :)
Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!
```yaml
apiVIPs:
  description: |-
    APIVIPs contains the VIP(s) for the api endpoint. In dual stack clusters
    it contains an IPv4 and IPv6 address, otherwise only one VIP
  format: ip
  items:
    type: string
  maxItems: 2
  type: array
```
🧩 Analysis chain
🌐 Web query:
In Kubernetes CRD OpenAPI v3 schemas, does setting `format` on an `array` validate each array element, or must the format be defined under `items`? Please cite the relevant Kubernetes/OpenAPI documentation.
💡 Result:
In Kubernetes CRD OpenAPI v3 schemas, format does not validate each element when placed on the array schema itself. Array element validation (including format) must be defined under items.
- Kubernetes structural schema rules require that for an array node, the schema for each item is specified via `items` (that's where the item type/constraints live). [1]
- OpenAPI/JSON Schema keywords and `format` values apply to the instance at that schema location, and many formats only apply to specific JSON types (e.g., `date-time` applies to strings; other types are treated as automatically valid). So putting `format: date-time` on a schema whose instance is an array won't validate the elements. [2]
- OpenAPI documentation for arrays states that `items` is required and that the value of `items` is a schema that describes the type and format of array items. [3]
So for “array of date-time strings”, use:
```yaml
type: array
items:
  type: string
  format: date-time
```

Sources:
[1] Kubernetes CRD structural schema rules (arrays use items): (kubernetes.io)
[2] OpenAPI 3.1.2 (format applicability by JSON type; non-applicable types are valid): (spec.openapis.org)
[3] OpenAPI arrays (items required; items describes type/format of array items): (swagger.io)
Citations:
- 1: https://kubernetes.io/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions/
- 2: https://spec.openapis.org/oas/v3.1.2.html
- 3: https://swagger.io/docs/specification/v3_0/data-models/data-types/
Move format: ip from array to items level for proper per-element validation.
The apiVIPs and ingressVIPs fields attach format: ip to the array schema itself, but array elements are still unconstrained strings. Per OpenAPI/Kubernetes CRD specification, format constraints on the array node do not validate individual elements; the format must be defined under items to constrain each entry. This leaves both fields effectively unvalidated at CRD admission time. Update the source API type markers so controller-gen emits the constraint at the correct schema level:
```yaml
items:
  type: string
  format: ip
```

Affects: spec.platform.vsphere.infrastructure.apiVIPs (lines 955–963) and spec.platform.vsphere.infrastructure.ingressVIPs (lines 1310–1318).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@config/crds/hive.openshift.io_clusterpools.yaml` around lines 955 - 963, The
OpenAPI `format: ip` is applied to the array node for apiVIPs and ingressVIPs so
individual entries aren’t validated; update the CRD source markers for
spec.platform.vsphere.infrastructure.apiVIPs and
spec.platform.vsphere.infrastructure.ingressVIPs so that `format: ip` is moved
from the array level into the `items` schema (i.e., ensure items declares type:
string and format: ip) and remove the top-level format from the array
declaration so controller-gen emits per-element IP validation.
```yaml
gateway:
  description: |-
    gateway is an IPv4 or IPv6 address which represents the subnet gateway,
    for example, 192.168.1.1.
  format: ipv6
  type: string
ipAddrs:
  description: |-
    ipAddrs is a list of one or more IPv4 and/or IPv6 addresses and CIDR to assign to
    this device, for example, 192.168.1.100/24. IP addresses provided via ipAddrs are
    intended to allow explicit assignment of a machine's IP address.
  example: 2001:DB8:0000:0000:244:17FF:FEB6:D37D/64
  format: ipv6
  items:
    type: string
  type: array
nameservers:
  description: |-
    nameservers is a list of IPv4 and/or IPv6 addresses used as DNS nameservers, for example,
    8.8.8.8. a nameserver is not provided by a fulfilled IPAddressClaim. If DHCP is not the
    source of IP addresses for this network device, nameservers should include a valid nameserver.
  example: 8.8.8.8
  format: ipv6
  items:
    type: string
  type: array
```
🧩 Analysis chain
🌐 Web query:
For Kubernetes CRD/OpenAPI schemas, what is the correct way to represent fields that accept IPv4 or IPv6 addresses and IPv4/IPv6 CIDR strings? Does format: ipv6 allow IPv4 values or CIDR notation?
💡 Result:
- `format: ipv6` is for a single IPv6 address only (per RFC 4291); it does not include IPv4 values and does not allow CIDR notation (no `/64`, etc.). [1]
- In Kubernetes CRD OpenAPI v3 schemas, the API server validates these formats as: `ipv4`/`ipv6` via Go `net.ParseIP`, and `cidr` via Go `net.ParseCIDR`. [2]
- So CIDR strings should use `format: cidr`, not `ipv4`/`ipv6`. [2]
Recommended CRD schema patterns
1) Field accepts “IPv4 or IPv6 address” (no CIDR):
```yaml
type: string
anyOf:
- format: ipv4
- format: ipv6
```

(Or, if you're using Kubernetes' declarative validation markers, use the Kubernetes-specific format `k8s-ip` for "IPv4 or IPv6".) [3]
2) Field accepts “IPv4 or IPv6 CIDR”:
```yaml
type: string
format: cidr
```

If you need tighter rules (e.g., require IPv4-only CIDR vs IPv6-only CIDR), use an additional validation rule (CEL). Kubernetes CEL includes CIDR parsing helpers (for example, cidr('192.168.0.0/16'), cidr('::1/128')). [4]
Answer to your direct question: No—format: ipv6 does not allow IPv4 values, and it does not allow CIDR notation; use format: cidr for CIDRs. [1] [2]
Sources: [1] [2] [3] [4]
Fix host-networking field formats to accept documented value types.
The gateway, ipAddrs, and nameservers fields use format: ipv6, which validates only single IPv6 addresses. However:
- `gateway` is documented to accept "IPv4 or IPv6 address" with example `192.168.1.1` (IPv4)
- `nameservers` is documented to accept "IPv4 and/or IPv6 addresses" with example `8.8.8.8` (IPv4)
- `ipAddrs` is documented to accept "IPv4 and/or IPv6 addresses and CIDR" with examples including IPv4 CIDR (`192.168.1.100/24`)
Use anyOf: [format: ipv4, format: ipv6] for gateway and nameservers, and format: cidr for ipAddrs to match the documented behavior and accept valid static networking configurations.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@config/crds/hive.openshift.io_clusterpools.yaml` around lines 1263 - 1288,
The schema for the host-networking fields is too restrictive: update the gateway
and nameservers properties to accept both IPv4 and IPv6 by replacing their
single "format: ipv6" with an anyOf combining "format: ipv4" and "format: ipv6"
for the string items (refer to the gateway and nameservers property
definitions), and change ipAddrs from "format: ipv6" to a CIDR-aware format by
using "format: cidr" for the array items (refer to the ipAddrs property) so the
schema matches the documented examples and accepts IPv4, IPv6 and CIDR values.
```diff
 	o.metadata = &types.ClusterMetadata{
 		InfraID: args[0],
 		ClusterPlatformMetadata: types.ClusterPlatformMetadata{
 			VSphere: &typesvsphere.Metadata{
 				// This entire slice will be replaced via Unmarshal() if new-style creds Secret is in play.
 				// Otherwise it will be populated with copies of the single username/password from the
 				// old-style creds Secret.
 				VCenters: make([]typesvsphere.VCenters, len(o.vCenters)),
 			},
 		},
 	}
 	for i, vCenter := range o.vCenters {
 		o.metadata.VSphere.VCenters[i] = typesvsphere.VCenters{
 			VCenter: vCenter,
 		}
 	}

 	client, err := utils.GetClient("hiveutil-deprovision-vsphere")
 	if err != nil {
 		return errors.Wrap(err, "failed to get client")
 	}
-	vspherecreds.ConfigureCreds(client, nil)
+	vspherecreds.ConfigureCreds(client, o.metadata)

 	return nil
 }
```
Fail fast when the legacy credentials path still has no vCenter list.
ConfigureCreds() only copies username/password onto the metadata.VSphere.VCenters entries that already exist. If the Secret is still old-style and --vsphere-vcenters was omitted, this slice stays empty and the destroyer fails much later with a much less actionable error. Please validate that at least one vCenter is present immediately after ConfigureCreds().
Suggested guard
vspherecreds.ConfigureCreds(client, o.metadata)
+
+ if o.metadata.VSphere == nil || len(o.metadata.VSphere.VCenters) == 0 {
+ return errors.New("must provide at least one vCenter via --vsphere-vcenters or the credentials secret")
+ }
return nil
}🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@contrib/pkg/deprovision/vsphere.go` around lines 50 - 72, After calling
vspherecreds.ConfigureCreds(client, o.metadata) ensure the legacy-path guard:
check o.metadata.ClusterPlatformMetadata.VSphere (or o.metadata.VSphere) and
that o.metadata.VSphere.VCenters has length > 0; if not, return a clear error
(e.g., wrap with "no vCenter configured: provide --vsphere-vcenters or use
new-style creds Secret") so the command fails fast when the legacy credentials
path produced an empty VCenters slice; place this validation immediately after
ConfigureCreds() and reference the
o.vCenters/ConfigureCreds/metadata.VSphere.VCenters symbols when making the
change.
@coderabbitai Please elaborate on

> the destroyer fails much later with a much less actionable error.

What does this failure look like?
🧩 Analysis chain
Let me look at both the vsphere destroyer and ConfigureCreds to trace the failure path.
🏁 Script executed:
#!/bin/bash
# Find the vsphere creds ConfigureCreds function
rg -n "ConfigureCreds" --type go -A 30
# Also look for the vsphere destroyer New function if vendored
fd -t f "vsphere.go" vendor/github.com/openshift/installer/pkg/destroy/vsphere/ 2>/dev/null | head -5
fd -t f "*.go" vendor/github.com/openshift/installer/pkg/destroy/vsphere/ 2>/dev/null | head -10
Length of output: 27966
🏁 Script executed:
#!/bin/bash
# Find ConfigureCreds implementation
ast-grep --pattern 'func ConfigureCreds($$$) {
$$$
}'
# Find vsphere destroyer New function in vendored code
fd -t f "*.go" --search-path vendor/github.com/openshift/installer/pkg/destroy 2>/dev/null | xargs grep -l "func New" 2>/dev/null | head -5
Length of output: 4692
🏁 Script executed:
#!/bin/bash
# Look for the vsphere destroy New function in vendor
find vendor/github.com/openshift/installer/pkg/destroy/vsphere -name "*.go" 2>/dev/null | head -10
# Try to find the vsphere New implementation
rg -n "func New" vendor/github.com/openshift/installer/pkg/destroy/vsphere/ 2>/dev/null || echo "Not vendored"
# Search for where VCenters is iterated/used in destroy logic
rg -n "VCenters" vendor/github.com/openshift/installer/ 2>/dev/null | grep -i "destroy\|new\|session\|connect\|login" | head -20
Length of output: 12
🏁 Script executed:
#!/bin/bash
# Search across vendor for vsphere session/connect using VCenters
rg -n "VCenters" --type go -C 3 | grep -A3 -B3 "session\|connect\|login\|NewManager\|vclient\|govmomi" | head -60
# Also check what vsphere.New does in installer (if vendored somewhere)
fd "vsphere.go" vendor/ 2>/dev/null | grep destroy | head -5
fd "*.go" vendor/ 2>/dev/null | xargs grep -l "func New.*ClusterMetadata" 2>/dev/null | grep vsphere | head -5
Length of output: 0
The installer's vsphere destroyer isn't vendored. Let me look it up directly.
@2uasimojo — here's the exact failure path, traced from the source.
The installer's vsphere.New() (pkg/destroy/vsphere/vsphere.go) handles metadata in two branches:
- If `len(metadata.VSphere.VCenters) > 0`, it iterates the slice and calls `NewClient(vsphere.VCenter, vsphere.Username, vsphere.Password, logger)` for each entry.
- Otherwise, it falls back to the old single-vCenter path and calls `NewClient(metadata.VSphere.VCenter, metadata.VSphere.Username, metadata.VSphere.Password, logger)`.
So here's what happens in the empty-VCenters scenario you asked about:
- `--vsphere-vcenters` is omitted → `o.vCenters` is `[]string{}` → `metadata.VSphere.VCenters` is an empty slice.
- `ConfigureCreds()` runs in the legacy branch: it copies username/password into `metadata.VSphere.VCenters[i]` entries — but since the slice is empty, the loop is a no-op. `VCenters` stays empty.
- In `vsphere.New()`, the `len(...VCenters) > 0` check is false, so it falls through to the legacy single-vCenter path and calls `NewClient("", "", "", logger)` — an empty hostname.
- `NewClient` attempts to dial a govmomi/vim25 SOAP connection to the empty vCenter URL, which produces something like `Post "https:///sdk": dial tcp: missing address` or an HTTP/TLS error with an empty host — returned from `vsphere.New()` and fatally logged as `"Runtime error"`.
The error message is about a network-level connection failure to an empty host, not "no vCenter configured" or "credentials missing." Someone debugging this would first suspect a cert/TLS issue or a bad network configuration, rather than immediately realising that --vsphere-vcenters was forgotten. That's the less-actionable error the original comment was referring to.
So the guard I suggested would short-circuit with a clear message before handing off to the installer destroyer, rather than letting it surface as a cryptic dial error from inside vsphere.New().
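To make the shape of that guard concrete, here is a minimal, self-contained sketch. The struct definitions are hypothetical stand-ins for the installer's metadata types, and `requireVCenters` is an illustrative name, not an existing Hive function:

```go
package main

import (
	"errors"
	"fmt"
)

// Simplified stand-ins for the installer's metadata types; the field names
// mirror the snippets above, but these structs are hypothetical.
type VCenter struct {
	VCenter, Username, Password string
}

type VSphereMetadata struct {
	VCenters []VCenter
}

type ClusterMetadata struct {
	VSphere *VSphereMetadata
}

// requireVCenters is the proposed fail-fast guard: return an actionable
// error up front instead of letting an empty slice surface later as a
// cryptic dial error deep inside the destroyer.
func requireVCenters(m *ClusterMetadata) error {
	if m == nil || m.VSphere == nil || len(m.VSphere.VCenters) == 0 {
		return errors.New("must provide at least one vCenter via --vsphere-vcenters or the credentials secret")
	}
	return nil
}

func main() {
	empty := &ClusterMetadata{VSphere: &VSphereMetadata{}}
	fmt.Println(requireVCenters(empty)) // actionable error, not a dial failure

	ok := &ClusterMetadata{VSphere: &VSphereMetadata{VCenters: []VCenter{{VCenter: "vcs.example.com"}}}}
	fmt.Println(requireVCenters(ok))
}
```

In the real code the check would sit immediately after `ConfigureCreds()`, before the metadata is handed to `vsphere.New()`.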
🌐 Web query:
openshift installer pkg/destroy/vsphere New function VCenters empty ClusterMetadata
💡 Result:
In pkg/destroy/vsphere, the destroyer’s New() currently assumes metadata.VSphere is non-nil and then chooses between:
- new-style multi-vCenter metadata: `metadata.VSphere.VCenters` (array)
- legacy single vCenter fields: `metadata.VSphere.VCenter`/`Username`/`Password` when the array is empty (raw.githubusercontent.com)
But ClusterMetadata defines VSphere as a pointer (*vsphere.Metadata), so it can be nil (e.g., non-vSphere clusters / incomplete metadata.json) (pkg.go.dev). That’s where an “empty ClusterMetadata” / “VCenters empty” case can blow up.
A small helper function (often what people add as “New function VCenters”) should normalize these cases and let New() fail cleanly:
// VCenters returns all vCenter connection entries implied by ClusterMetadata,
// supporting both legacy single-vCenter and new vcenters[] formats.
func VCenters(metadata *installertypes.ClusterMetadata) ([]vsphere.VCenter, error) {
if metadata == nil || metadata.VSphere == nil {
return nil, fmt.Errorf("no vsphere platform configured in metadata")
}
if len(metadata.VSphere.VCenters) > 0 {
return metadata.VSphere.VCenters, nil
}
if metadata.VSphere.VCenter == "" {
return nil, fmt.Errorf("vsphere vCenter not set in metadata")
}
return []vsphere.VCenter{{
VCenter: metadata.VSphere.VCenter,
Username: metadata.VSphere.Username,
Password: metadata.VSphere.Password,
}}, nil
}
```

Then `New()` can call `VCenters(metadata)` and iterate the returned list (instead of directly touching `metadata.VSphere.VCenters`). (raw.githubusercontent.com)
Citations:
- 1: https://raw.githubusercontent.com/openshift/installer/v1.4.17/pkg/destroy/vsphere/vsphere.go
- 2: https://pkg.go.dev/github.com/openshift/installer/pkg/types?utm_source=openai
- 3: https://raw.githubusercontent.com/openshift/installer/v1.4.17/pkg/destroy/vsphere/vsphere.go
🏁 Script executed:
#!/bin/bash
# Fetch the actual vsphere destroyer source to see the exact New() implementation
curl -s "https://raw.githubusercontent.com/openshift/installer/v1.4.17/pkg/destroy/vsphere/vsphere.go" | head -150
Length of output: 4407
Now let me check what NewClient does with an empty hostname.
🏁 Script executed:
#!/bin/bash
curl -s "https://raw.githubusercontent.com/openshift/installer/v1.4.17/pkg/destroy/vsphere/vsphere.go" | grep -A 30 "func NewClient"
Length of output: 0
🏁 Script executed:
#!/bin/bash
# NewClient is likely in another file in the same package
curl -s "https://api.github.com/repos/openshift/installer/contents/pkg/destroy/vsphere?ref=v1.4.17" | python3 -c "import json,sys; [print(f['name']) for f in json.load(sys.stdin)]"
Length of output: 67
🏁 Script executed:
#!/bin/bash
curl -s "https://raw.githubusercontent.com/openshift/installer/v1.4.17/pkg/destroy/vsphere/client.go" | grep -A 40 "func NewClient"
Length of output: 1166
🏁 Script executed:
#!/bin/bash
curl -s "https://api.github.com/repos/openshift/installer/contents/pkg/types/vsphere?ref=v1.4.17" | python3 -c "import json,sys; [print(f['name']) for f in json.load(sys.stdin) if isinstance(f, dict)]" 2>/dev/null
# Also search for CreateVSphereClients
curl -s "https://api.github.com/search/code?q=CreateVSphereClients+repo:openshift/installer" | python3 -c "import json,sys; d=json.load(sys.stdin); [print(i['path']) for i in d.get('items',[])]"
Length of output: 83
🏁 Script executed:
#!/bin/bash
# Search for CreateVSphereClients in the installer package
curl -s "https://raw.githubusercontent.com/openshift/installer/v1.4.17/pkg/destroy/vsphere/client.go" | head -30
# Also check what package vsphere is imported from in client.go
curl -s "https://raw.githubusercontent.com/openshift/installer/v1.4.17/pkg/destroy/vsphere/client.go" | grep -E "import|vsphere\." | head -20
Length of output: 1006
🏁 Script executed:
#!/bin/bash
curl -s "https://raw.githubusercontent.com/openshift/installer/v1.4.17/pkg/asset/installconfig/vsphere/client.go" 2>/dev/null | grep -A 40 "func CreateVSphereClients" || \
curl -s "https://api.github.com/repos/openshift/installer/contents/pkg/asset/installconfig/vsphere?ref=v1.4.17" | python3 -c "import json,sys; [print(f['name']) for f in json.load(sys.stdin) if isinstance(f,dict)]"
Length of output: 1252
@2uasimojo, here's the exact failure, traced through the full call stack.
The scenario: old-style creds Secret (no vcenters key), --vsphere-vcenters flag omitted.
The call chain:
1. `o.vCenters` is `[]string{}` → `metadata.VSphere.VCenters` is an empty slice.
2. `ConfigureCreds()` enters the legacy branch and loops over `metadata.VSphere.VCenters[i]` to stamp in username/password — but the loop body never executes because the slice is empty.
3. `vsphere.New()` in the installer checks `len(metadata.VSphere.VCenters) > 0` → false, so it falls through to the legacy single-vCenter path:

```go
client, err := NewClient(metadata.VSphere.VCenter, metadata.VSphere.Username, metadata.VSphere.Password) // → NewClient("", "", "")
```

4. `NewClient("", "", "")` calls `CreateVSphereClients(ctx, "", "", "")` (pkg/asset/installconfig/vsphere/client.go):

```go
u, err := soap.ParseURL("")                // "" → "https://" → URL{Host: ""}
// ...
c, err := govmomi.NewClient(ctx, u, false) // tries POST https:///sdk
```

5. govmomi attempts a TCP connection to an empty hostname and returns an error like `Post "https:///sdk": dial tcp: missing address`.
6. This propagates back to `Run()` and is fatally logged as `FATAL msg="Runtime error" error="Post \"https:///sdk\": dial tcp: missing address"`.
Why it's less actionable: the error looks like a network/connectivity problem (bad URL, missing TLS cert, DNS failure), not a "you forgot to pass --vsphere-vcenters" problem. Someone debugging this in a deprovision job would first check network policies, certs, or the vCenter hostname — not the CLI invocation.
The guard I suggested fires before vsphere.New() is even called, with the message:
"must provide at least one vCenter via --vsphere-vcenters or the credentials secret"
That said, it's worth noting that this path only triggers if both (a) the creds Secret has no vcenters key and (b) --vsphere-vcenters was omitted. If you consider that combination a misconfiguration that's already handled upstream (e.g., the deprovision job always passes the flag or always uses new-style Secrets), the guard is a nice-to-have rather than a critical fix.
| } else { | ||
| // Old style creds Secret with flat username/password: copy the creds across all vcenters | ||
| username := strings.TrimSpace(string(credsSecret.Data[constants.UsernameSecretKey])) | ||
| password := strings.TrimSpace(string(credsSecret.Data[constants.PasswordSecretKey])) | ||
| // Caller pre-populated this with just the VCenter names | ||
| for i := range metadata.VSphere.VCenters { | ||
| metadata.VSphere.VCenters[i].Username = username | ||
| metadata.VSphere.VCenters[i].Password = password | ||
| } |
Potential nil pointer dereference in old-style credentials path.
The post-zonal path (lines 28-30) defensively checks metadata.VSphere == nil before accessing VCenters, but this else branch lacks the same protection. If metadata.VSphere is nil when using old-style credentials, accessing metadata.VSphere.VCenters on line 39 will cause a panic.
🐛 Proposed fix to add consistent nil check
} else {
// Old style creds Secret with flat username/password: copy the creds across all vcenters
username := strings.TrimSpace(string(credsSecret.Data[constants.UsernameSecretKey]))
password := strings.TrimSpace(string(credsSecret.Data[constants.PasswordSecretKey]))
+ if metadata.VSphere == nil {
+ log.Fatal("old-style credentials require metadata.VSphere to be pre-populated with VCenter names")
+ }
// Caller pre-populated this with just the VCenter names
for i := range metadata.VSphere.VCenters {
metadata.VSphere.VCenters[i].Username = username
metadata.VSphere.VCenters[i].Password = password
}
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| } else { | |
| // Old style creds Secret with flat username/password: copy the creds across all vcenters | |
| username := strings.TrimSpace(string(credsSecret.Data[constants.UsernameSecretKey])) | |
| password := strings.TrimSpace(string(credsSecret.Data[constants.PasswordSecretKey])) | |
| // Caller pre-populated this with just the VCenter names | |
| for i := range metadata.VSphere.VCenters { | |
| metadata.VSphere.VCenters[i].Username = username | |
| metadata.VSphere.VCenters[i].Password = password | |
| } | |
| } else { | |
| // Old style creds Secret with flat username/password: copy the creds across all vcenters | |
| username := strings.TrimSpace(string(credsSecret.Data[constants.UsernameSecretKey])) | |
| password := strings.TrimSpace(string(credsSecret.Data[constants.PasswordSecretKey])) | |
| if metadata.VSphere == nil { | |
| log.Fatal("old-style credentials require metadata.VSphere to be pre-populated with VCenter names") | |
| } | |
| // Caller pre-populated this with just the VCenter names | |
| for i := range metadata.VSphere.VCenters { | |
| metadata.VSphere.VCenters[i].Username = username | |
| metadata.VSphere.VCenters[i].Password = password | |
| } | |
| } |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@pkg/creds/vsphere/vsphere.go` around lines 34 - 42, The old-style credentials
branch assigns username/password into metadata.VSphere.VCenters without checking
metadata.VSphere for nil, risking a panic; update the code handling the
old-style creds (the block that reads
credsSecret.Data[constants.UsernameSecretKey] and constants.PasswordSecretKey
and loops over metadata.VSphere.VCenters) to first verify metadata.VSphere is
non-nil (and if it is nil either return a clear error or initialize
metadata.VSphere with an appropriate VCenters slice, matching the post-zonal
defensiveness), then proceed to set each VCenter's Username and Password.
|
|
||
| vcenters := make([]installertypesvsphere.VCenters, 0, len(cd.Spec.Platform.VSphere.Infrastructure.VCenters)) | ||
| for _, vcenter := range cd.Spec.Platform.VSphere.Infrastructure.VCenters { | ||
| vcenters = append(vcenters, installertypesvsphere.VCenters{ | ||
| VCenter: vcenter.Server, | ||
| Username: vcenter.Username, | ||
| Password: vcenter.Password, | ||
| }) | ||
| } | ||
| metadata := &installertypes.ClusterMetadata{ | ||
| InfraID: infraID, | ||
| ClusterPlatformMetadata: installertypes.ClusterPlatformMetadata{ | ||
| VSphere: &installertypesvsphere.Metadata{ | ||
| VCenter: cd.Spec.Platform.VSphere.VCenter, | ||
| Username: vSphereUsername, | ||
| Password: vSpherePassword, | ||
| VCenters: vcenters, |
Potential nil pointer dereference on Infrastructure field.
The code accesses cd.Spec.Platform.VSphere.Infrastructure.VCenters without verifying that Infrastructure is non-nil. If a ClusterDeployment uses deprecated fields and hasn't been migrated yet, this will panic.
🛡️ Proposed fix to add nil check
vSpherePassword := os.Getenv(constants.VSpherePasswordEnvVar)
if vSpherePassword == "" {
return fmt.Errorf("no %s env var set, cannot proceed", constants.VSpherePasswordEnvVar)
}
+ if cd.Spec.Platform.VSphere.Infrastructure == nil {
+ return errors.New("VSphere ClusterDeployment missing Infrastructure configuration")
+ }
+
vcenters := make([]installertypesvsphere.VCenters, 0, len(cd.Spec.Platform.VSphere.Infrastructure.VCenters))
for _, vcenter := range cd.Spec.Platform.VSphere.Infrastructure.VCenters {📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| vcenters := make([]installertypesvsphere.VCenters, 0, len(cd.Spec.Platform.VSphere.Infrastructure.VCenters)) | |
| for _, vcenter := range cd.Spec.Platform.VSphere.Infrastructure.VCenters { | |
| vcenters = append(vcenters, installertypesvsphere.VCenters{ | |
| VCenter: vcenter.Server, | |
| Username: vcenter.Username, | |
| Password: vcenter.Password, | |
| }) | |
| } | |
| metadata := &installertypes.ClusterMetadata{ | |
| InfraID: infraID, | |
| ClusterPlatformMetadata: installertypes.ClusterPlatformMetadata{ | |
| VSphere: &installertypesvsphere.Metadata{ | |
| VCenter: cd.Spec.Platform.VSphere.VCenter, | |
| Username: vSphereUsername, | |
| Password: vSpherePassword, | |
| VCenters: vcenters, | |
| if cd.Spec.Platform.VSphere.Infrastructure == nil { | |
| return errors.New("VSphere ClusterDeployment missing Infrastructure configuration") | |
| } | |
| vcenters := make([]installertypesvsphere.VCenters, 0, len(cd.Spec.Platform.VSphere.Infrastructure.VCenters)) | |
| for _, vcenter := range cd.Spec.Platform.VSphere.Infrastructure.VCenters { | |
| vcenters = append(vcenters, installertypesvsphere.VCenters{ | |
| VCenter: vcenter.Server, | |
| Username: vcenter.Username, | |
| Password: vcenter.Password, | |
| }) | |
| } | |
| metadata := &installertypes.ClusterMetadata{ | |
| InfraID: infraID, | |
| ClusterPlatformMetadata: installertypes.ClusterPlatformMetadata{ | |
| VSphere: &installertypesvsphere.Metadata{ | |
| VCenters: vcenters, |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@pkg/installmanager/installmanager.go` around lines 739 - 752, The code
iterates over cd.Spec.Platform.VSphere.Infrastructure.VCenters without checking
that Infrastructure is non-nil which can panic; update the block that builds
vcenters (where vcenters is made and populated and where
installertypes.ClusterMetadata / installertypesvsphere.Metadata are constructed)
to first check if cd.Spec.Platform.VSphere.Infrastructure != nil and only range
over Infrastructure.VCenters when non-nil, otherwise leave vcenters as an empty
slice (or set VCenters to nil/empty) so creation of
installertypes.ClusterMetadata and installertypesvsphere.Metadata is safe.
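A minimal sketch of the nil-tolerant pattern follows; the struct shapes are hypothetical simplifications of the ClusterDeployment spec, and `collectVCenters` is an illustrative helper name:

```go
package main

import "fmt"

// Hypothetical, simplified stand-ins for the ClusterDeployment vSphere spec.
type VCenterSpec struct{ Server, Username, Password string }
type Infrastructure struct{ VCenters []VCenterSpec }
type VSpherePlatform struct{ Infrastructure *Infrastructure }

// collectVCenters ranges over Infrastructure.VCenters only when
// Infrastructure is non-nil, so an unmigrated (legacy-shape) spec yields an
// empty list rather than a nil-pointer panic.
func collectVCenters(p *VSpherePlatform) []VCenterSpec {
	if p == nil || p.Infrastructure == nil {
		return nil
	}
	out := make([]VCenterSpec, 0, len(p.Infrastructure.VCenters))
	out = append(out, p.Infrastructure.VCenters...)
	return out
}

func main() {
	fmt.Println(len(collectVCenters(&VSpherePlatform{}))) // 0, no panic
	withInfra := &VSpherePlatform{Infrastructure: &Infrastructure{
		VCenters: []VCenterSpec{{Server: "vcs.example.com"}},
	}}
	fmt.Println(len(collectVCenters(withInfra))) // 1
}
```

Whether an empty slice or an explicit error is the right outcome depends on whether the caller can still build usable metadata from the legacy fields.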
| // We need to have at least one VCenter; but we have to allow both the legacy | ||
| // (pre-zonal) and new shapes. We'll upconvert the former, but only after the CR | ||
| // has already been accepted and stored in etcd once. |
| if vsphere.DeprecatedVCenter == "" && (vsphere.Infrastructure == nil || len(vsphere.Infrastructure.VCenters) == 0) { | ||
| allErrs = append(allErrs, field.Required(vspherePath.Child("vSphere").Child("vcenters").Index(0), "must specify at least one vSphere vCenter")) | ||
| } |
Keep validating the legacy topology fields when Infrastructure is absent.
This now accepts old-shape vSphere specs with only vCenter populated. The controller migration path still has to synthesize Infrastructure from the deprecated fields, so objects missing values like datacenter or defaultDatastore can get admitted and only fail later in reconcile. Please keep the legacy required-field checks for the pre-zonal shape, and use the new Infrastructure check only when that shape is actually provided.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@pkg/validating-webhooks/hive/v1/clusterdeployment_validating_admission_hook.go`
around lines 535 - 540, The validation currently conflates legacy (pre-zonal)
and new shaped vSphere specs; change the logic so that when
vsphere.Infrastructure is nil or empty you validate the legacy topology fields
(e.g. require vsphere.DeprecatedVCenter and its legacy fields), and only when
vsphere.Infrastructure is provided/has entries enforce the new shape (require
vsphere.Infrastructure.VCenters). Update the conditional around
vsphere.DeprecatedVCenter / vsphere.Infrastructure to branch: if Infrastructure
== nil or len(VCenters)==0 -> run legacy-required checks (using
vsphere.DeprecatedVCenter and legacy field paths via vspherePath), else ->
require vsphere.Infrastructure.VCenters; keep references to
vsphere.DeprecatedVCenter, vsphere.Infrastructure,
vsphere.Infrastructure.VCenters and vspherePath to locate and adjust the checks.
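The branching the prompt describes can be sketched as follows. The types and the `validateVSphere` helper are hypothetical simplifications; the real webhook uses apimachinery's `field.ErrorList` rather than plain strings:

```go
package main

import "fmt"

// Hypothetical, simplified shapes for the webhook's vSphere platform spec.
type Infrastructure struct{ VCenters []string }
type VSphereSpec struct {
	DeprecatedVCenter    string
	DeprecatedDatacenter string
	Infrastructure       *Infrastructure
}

// validateVSphere branches on which shape was provided: when Infrastructure
// is absent, the legacy required fields are validated; when it is present,
// the new shape must carry at least one vCenter.
func validateVSphere(s *VSphereSpec) []string {
	var errs []string
	if s.Infrastructure == nil {
		// Legacy (pre-zonal) shape: keep the old required-field checks.
		if s.DeprecatedVCenter == "" {
			errs = append(errs, "vCenter is required in the legacy shape")
		}
		if s.DeprecatedDatacenter == "" {
			errs = append(errs, "datacenter is required in the legacy shape")
		}
	} else if len(s.Infrastructure.VCenters) == 0 {
		errs = append(errs, "must specify at least one vSphere vCenter")
	}
	return errs
}

func main() {
	legacyOK := &VSphereSpec{DeprecatedVCenter: "vcs.example.com", DeprecatedDatacenter: "dc1"}
	fmt.Println(len(validateVSphere(legacyOK))) // 0
	fmt.Println(len(validateVSphere(&VSphereSpec{}))) // 2
}
```

This keeps old-shape objects from being admitted with missing required fields while still permitting either shape.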
| if oldObject.Spec.Platform.VSphere != nil && cd.Spec.Platform.VSphere != nil { | ||
| // HIVE-2391: Moving from a non-zonal to a zonal shape is permitted. | ||
| // NOTE: Existing deprecated fields may be left populated, but will be ignored. | ||
| if oldObject.Spec.Platform.VSphere.Infrastructure == nil && cd.Spec.Platform.VSphere.Infrastructure != nil { | ||
| contextLogger.Debug("Allowing vsphere zonal conversion") | ||
| // copy over the value to spoof the immutability checker | ||
| oldObject.Spec.Platform.VSphere = cd.Spec.Platform.VSphere | ||
| } |
Don't mask unrelated immutable edits during the zonal migration.
Replacing oldObject.Spec.Platform.VSphere with the incoming struct makes hasChangedImmutableField() blind to credential/certificate secret changes and any other immutable vSphere edits bundled into the same update. Please normalize the old object into the new shape first, but keep comparing the rest of the vSphere fields against their original values.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@pkg/validating-webhooks/hive/v1/clusterdeployment_validating_admission_hook.go`
around lines 671 - 678, The current change overwrites
oldObject.Spec.Platform.VSphere with the incoming struct which masks unrelated
immutable edits; instead, create a normalized copy of the old vSphere struct
(e.g., normalizedOldVSphere := oldObject.Spec.Platform.VSphere.DeepCopy() or
equivalent), set only its Infrastructure field to
cd.Spec.Platform.VSphere.Infrastructure, and pass that normalized copy into the
immutability check (hasChangedImmutableField or whatever call uses the old
value) rather than assigning cd.Spec.Platform.VSphere back onto
oldObject.Spec.Platform.VSphere; this preserves original credential/certificate
and other immutable fields for comparison while allowing the zonal
Infrastructure migration.
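The normalize-then-compare approach can be sketched like this. The spec shape, the hand-rolled `deepCopy`, and `changedImmutableVSphere` are hypothetical simplifications (Hive's generated `DeepCopy()` would be used in practice):

```go
package main

import (
	"fmt"
	"reflect"
)

// Hypothetical, simplified vSphere platform shape for illustrating the
// normalize-then-compare approach suggested above.
type Infrastructure struct{ VCenters []string }
type VSphereSpec struct {
	CredentialsSecret string
	Infrastructure    *Infrastructure
}

func (s *VSphereSpec) deepCopy() *VSphereSpec {
	out := *s
	if s.Infrastructure != nil {
		infra := *s.Infrastructure
		infra.VCenters = append([]string(nil), s.Infrastructure.VCenters...)
		out.Infrastructure = &infra
	}
	return &out
}

// changedImmutableVSphere normalizes the old object into the zonal shape
// (Infrastructure only) before comparing, so unrelated immutable edits,
// like a credentials secret swap, are still caught.
func changedImmutableVSphere(old, updated *VSphereSpec) bool {
	normalized := old.deepCopy()
	normalized.Infrastructure = updated.Infrastructure // allow the zonal migration
	return !reflect.DeepEqual(normalized, updated)
}

func main() {
	old := &VSphereSpec{CredentialsSecret: "creds-a"}
	migrated := &VSphereSpec{CredentialsSecret: "creds-a", Infrastructure: &Infrastructure{VCenters: []string{"vcs.example.com"}}}
	fmt.Println(changedImmutableVSphere(old, migrated)) // false: pure migration allowed

	sneaky := &VSphereSpec{CredentialsSecret: "creds-b", Infrastructure: &Infrastructure{VCenters: []string{"vcs.example.com"}}}
	fmt.Println(changedImmutableVSphere(old, sneaky)) // true: secret change still detected
}
```

The key difference from the current code is that only the Infrastructure field is spoofed on a copy; every other field is still compared against its original value.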
| // HIVE-2391 | ||
| if oldObject.Spec.Platform.VSphere != nil && newObject.Spec.Platform.VSphere != nil { | ||
| // Moving from a non-zonal to a zonal shape is permitted. | ||
| // This check is faster than checking all the fields individually | ||
| if oldObject.Spec.Platform.VSphere.Infrastructure == nil && newObject.Spec.Platform.VSphere.Infrastructure != nil { | ||
| contextLogger.Debug("Passed validation: HIVE-2391") | ||
| return &admissionv1beta1.AdmissionResponse{ | ||
| Allowed: true, | ||
| } |
Don't short-circuit update validation here.
This returns Allowed: true before validateClusterPoolPlatform(...) runs, so any legacy→zonal vSphere update also skips the normal validation of the new object. That means an invalid spec.platform.vsphere can be admitted as long as Infrastructure changed from nil to non-nil.
Suggested change
- // HIVE-2391
- if oldObject.Spec.Platform.VSphere != nil && newObject.Spec.Platform.VSphere != nil {
- // Moving from a non-zonal to a zonal shape is permitted.
- // This check is faster than checking all the fields individually
- if oldObject.Spec.Platform.VSphere.Infrastructure == nil && newObject.Spec.Platform.VSphere.Infrastructure != nil {
- contextLogger.Debug("Passed validation: HIVE-2391")
- return &admissionv1beta1.AdmissionResponse{
- Allowed: true,
- }
- }
- }
+ allowVSphereZonalMigration := oldObject.Spec.Platform.VSphere != nil &&
+ newObject.Spec.Platform.VSphere != nil &&
+ oldObject.Spec.Platform.VSphere.Infrastructure == nil &&
+ newObject.Spec.Platform.VSphere.Infrastructure != nil
+ if allowVSphereZonalMigration {
+ contextLogger.Debug("Allowing legacy vSphere -> zonal vSphere transition")
+ }
allErrs := field.ErrorList{}
specPath := field.NewPath("spec")
allErrs = append(allErrs, validateClusterPoolPlatform(specPath, newObject)...)📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| // HIVE-2391 | |
| if oldObject.Spec.Platform.VSphere != nil && newObject.Spec.Platform.VSphere != nil { | |
| // Moving from a non-zonal to a zonal shape is permitted. | |
| // This check is faster than checking all the fields individually | |
| if oldObject.Spec.Platform.VSphere.Infrastructure == nil && newObject.Spec.Platform.VSphere.Infrastructure != nil { | |
| contextLogger.Debug("Passed validation: HIVE-2391") | |
| return &admissionv1beta1.AdmissionResponse{ | |
| Allowed: true, | |
| } | |
| allowVSphereZonalMigration := oldObject.Spec.Platform.VSphere != nil && | |
| newObject.Spec.Platform.VSphere != nil && | |
| oldObject.Spec.Platform.VSphere.Infrastructure == nil && | |
| newObject.Spec.Platform.VSphere.Infrastructure != nil | |
| if allowVSphereZonalMigration { | |
| contextLogger.Debug("Allowing legacy vSphere -> zonal vSphere transition") | |
| } | |
| allErrs := field.ErrorList{} | |
| specPath := field.NewPath("spec") | |
| allErrs = append(allErrs, validateClusterPoolPlatform(specPath, newObject)...) |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@pkg/validating-webhooks/hive/v1/clusterpool_validating_admission_hook.go`
around lines 237 - 245, The early return inside the HIVE-2391 block
short-circuits update validation and skips validateClusterPoolPlatform; instead,
after detecting the legacy→zonal transition
(oldObject.Spec.Platform.VSphere.Infrastructure == nil &&
newObject.Spec.Platform.VSphere.Infrastructure != nil) keep the debug log but
call validateClusterPoolPlatform for the new object and return its
AdmissionResponse (or combine its result) rather than immediately returning
Allowed: true — update the logic around the HIVE-2391 check so
validateClusterPoolPlatform(newObject, ...) is invoked before allowing the
change.
|
@2uasimojo: all tests passed! Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
Co-Authored-By: @dlom
Summary by CodeRabbit
Release Notes
New Features
Deprecated
Documentation