Require valid aux driver version #21594

nirs · 2025-09-19T18:51:42Z

This change makes driver version checking stricter. Previously we failed only if the driver was not found or running "driver version" failed. For all other errors we logged a warning and used the currently installed driver. With this change all errors are fatal, and you must have a valid driver to use the kvm or hyperkit driver.

Checking the driver version is now more correct and strict. We separate the driver version stdout and stderr, parse the driver output using a YAML parser, and have more detailed logging and error messages.

The tests for extracting the driver version from "driver version" output were replaced with a test that runs the driver version command and parses the version.

This change makes it easy to validate the driver commit hash for addressing #21582.

Based on #21597 for testing to avoid the vfkit test failures.

k8s-ci-robot · 2025-09-19T18:51:44Z

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

k8s-ci-robot · 2025-09-19T18:51:48Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: nirs
Once this PR has been reviewed and has the lgtm label, please assign prezha for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Rename the test after the function it tests, and change the failure message to match the function name.

Make struct fields and helper function names more readable.

We complained that "driver --version" failed but the real command is "driver version".

Previously we failed only if we could not find the driver in the PATH, or running "driver version" failed, assuming that an old driver may work. Hoping that an invalid driver that does not return a version will work is not a good error handling strategy. Now we will fail loudly, helping the user to fix the installation, or failing tests that run with an invalid driver.

Previously we had special treatment for driver not found and driver version failed. Since we now treat all errors as fatal, we don't need the special errors. Previously we logged the same errors at least twice: once in validateDriver() and then later in the callers. This creates more noise in the log and makes it harder to debug issues. All errors are now wrapped with more context using the modern way (%w) instead of the legacy errors.Wrap(). The context was improved to describe the issue better. Handling of the special errors was also not idiomatic and not thread safe; we modified the global Err* variables at the time of the error. These special variables are now removed.

Rename arguments and temporary variable to make the code more clear.

Add a driverVersion helper and auxdriver.Version type for parsing the driver version yaml output. The helper reads the output of "driver version", parses the yaml, and validates that both version and commit are set. This will make it possible to validate the driver commit hash during the tests to ensure we test the driver built from the current code. We log or include in the error message both the driver version and commit hash, to make it easier to debug issues related to using the wrong driver. This change fixes a possible issue if the "driver version" command logs errors or warnings. Previously this could break the code parsing the version since we combined stdout and stderr. Now we parse only stdout, and extract stderr from the command ExitError on errors.
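The helper could look roughly like the sketch below. This is not the PR's code: the Version type layout, the parseVersion name, and the flat "version:" / "commit:" output format are assumptions, and a small stdlib line parser stands in for the real YAML parser so the example stays self-contained.

```go
package main

import (
	"bufio"
	"bytes"
	"errors"
	"fmt"
	"os/exec"
	"strings"
)

// Version holds the two fields the description mentions. The layout is an
// assumption, not the PR's actual auxdriver.Version.
type Version struct {
	Version string
	Commit  string
}

// parseVersion reads flat "key: value" lines and requires both fields.
// A real implementation would use a YAML parser; this is a stand-in.
func parseVersion(out []byte) (Version, error) {
	var v Version
	sc := bufio.NewScanner(bytes.NewReader(out))
	for sc.Scan() {
		key, value, ok := strings.Cut(sc.Text(), ":")
		if !ok {
			continue
		}
		switch strings.TrimSpace(key) {
		case "version":
			v.Version = strings.TrimSpace(value)
		case "commit":
			v.Commit = strings.TrimSpace(value)
		}
	}
	if v.Version == "" || v.Commit == "" {
		return Version{}, fmt.Errorf("invalid driver version output: %q", out)
	}
	return v, nil
}

// driverVersion runs "<path> version". Output() returns only stdout, so
// warnings written to stderr cannot corrupt the text we parse.
func driverVersion(path string) (Version, error) {
	out, err := exec.Command(path, "version").Output()
	if err != nil {
		var exitErr *exec.ExitError
		if errors.As(err, &exitErr) {
			// Output() stores the command's stderr in ExitError.Stderr.
			return Version{}, fmt.Errorf("%q version failed: %w: %s", path, err, exitErr.Stderr)
		}
		return Version{}, fmt.Errorf("failed to run %q version: %w", path, err)
	}
	return parseVersion(out)
}

func main() {
	v, err := parseVersion([]byte("version: v1.37.0\ncommit: 1af8bdc072232de4b1fec3b6cc0e8337e118bc83\n"))
	fmt.Printf("%+v %v\n", v, err)
}
```

Keeping parsing separate from running the command makes the validation easy to test without a real driver binary.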

nirs · 2025-09-19T21:51:02Z

/ok-to-test

k8s-ci-robot · 2025-09-19T21:51:41Z

@nirs: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name	Commit	Details	Required	Rerun command
pull-minikube-integration	`532dacb`	link	true	`/test pull-minikube-integration`

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

nirs · 2025-09-19T22:03:36Z

Testing with functional tests

Started functional tests in one shell

env MINIKUBE_HOME=/tmp TEST_ARGS="'-minikube-start-args=--driver=kvm'" make functional

Looking at the logs in another shell:

$ MINIKUBE_HOME=/tmp minikube logs
...
I0920 00:57:34.743726  141622 start.go:304] selected driver: kvm2
I0920 00:57:34.743738  141622 start.go:918] validating driver "kvm2" against <nil>
I0920 00:57:34.743765  141622 start.go:929] status for kvm2: {Installed:true Healthy:true Running:true NeedsImprovement:false Error:<nil> Reason: Fix: Doc: Version:}
I0920 00:57:34.746915  141622 install.go:51] acquiring lock: {Name:mk900956b073697a4aa6c80a27c6bb0742a99a53 Clock:{} Delay:500ms Timeout:10m0s Cancel:<nil>}
I0920 00:57:34.747205  141622 install.go:123] Validating docker-machine-driver-kvm2, PATH=/tmp/.minikube/bin:/home/nsoffer/go/pkg/mod/golang.org/[email protected]/bin:/home/nsoffer/bin:/home/nsoffer/.krew/bin:/home/nsoffer/sdk/go1.23.2/bin:/home/nsoffer/go/bin:/home/nsoffer/.local/bin:/home/nsoffer/bin:/usr/local/bin:/usr/bin
W0920 00:57:34.747320  141622 install.go:61] docker-machine-driver-kvm2: failed to find driver "docker-machine-driver-kvm2": exec: "docker-machine-driver-kvm2": executable file not found in $PATH
I0920 00:57:34.747443  141622 out.go:179] * Downloading driver docker-machine-driver-kvm2:
I0920 00:57:34.747571  141622 download.go:108] Downloading: https://github.com/kubernetes/minikube/releases/download/v1.37.0/docker-machine-driver-kvm2-amd64?checksum=file:https://github.com/kubernetes/minikube/releases/download/v1.37.0/docker-machine-driver-kvm2-amd64.sha256 -> /tmp/.minikube/bin/docker-machine-driver-kvm2
I0920 00:57:38.135430  141622 install.go:123] Validating docker-machine-driver-kvm2, PATH=/tmp/.minikube/bin:/home/nsoffer/go/pkg/mod/golang.org/[email protected]/bin:/home/nsoffer/bin:/home/nsoffer/.krew/bin:/home/nsoffer/sdk/go1.23.2/bin:/home/nsoffer/go/bin:/home/nsoffer/.local/bin:/home/nsoffer/bin:/usr/local/bin:/usr/bin
I0920 00:57:38.166004  141622 install.go:134] /tmp/.minikube/bin/docker-machine-driver-kvm2 version is {Version:v1.37.0 Commit:1af8bdc072232de4b1fec3b6cc0e8337e118bc83}

We used:

out/minikube start -p functional-945141 --memory=4096 --apiserver-port=8441 --wait=all --driver=kvm --auto-update-drivers=false

But the driver was downloaded during the test!

$ tree /tmp/.minikube/bin/
/tmp/.minikube/bin/
└── docker-machine-driver-kvm2

I would expect that with --auto-update-drivers=false the driver will not be downloaded and the tests will fail very early.

This seems to be the issue - we install the driver if it does not exist even if autoUpdate is false:

	if !exists || (err != nil && autoUpdate) {
		klog.Warningf("%s: %v", executable, err)
		path = filepath.Join(directory, executable)
		if err := download.Driver(executable, path, v); err != nil {
			return err
		}
	}

The driver must not be installed when autoUpdate is false.
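One possible reading of that rule, as a sketch only: make autoUpdate gate every download. The helper shouldDownload and its valid parameter are hypothetical and do not exist in minikube; with this rule the caller would have to return an error when no valid driver is present and autoUpdate is false.

```go
package main

import "fmt"

// shouldDownload is a hypothetical helper. exists reports whether a driver
// executable was found; valid reports whether it passed validation.
func shouldDownload(exists, valid, autoUpdate bool) bool {
	if !autoUpdate {
		// Never install anything when auto update is disabled.
		return false
	}
	return !exists || !valid
}

func main() {
	fmt.Println(shouldDownload(false, false, false)) // missing driver, autoUpdate off
	fmt.Println(shouldDownload(false, false, true))  // missing driver, autoUpdate on
	fmt.Println(shouldDownload(true, true, true))    // valid driver already installed
}
```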

medyagh · 2025-09-19T22:05:28Z

cmd/minikube/cmd/start.go

-			exit.Error(reason.DrvAuxNotHealthy, "Aux driver"+driverName, err)
-		} //if failed to update but not a fatal error, log it and continue (old version might still work)
-		out.WarningT("Unable to update {{.driver}} driver: {{.error}}", out.V{"driver": driverName, "error": err})
+		exit.Error(reason.DrvAuxNotHealthy, fmt.Sprintf("Auxiliary driver %q", driverName), err)


we barely ever make a new kvm aux driver, many releases can just work with the old one,
plus many embedded users (like cloud code users for example) download minikube aux drivers once, and some of them need root (for hyperkit) and they prefer to give this root permission once to that binary and not be forced to get a new aux version on every single minikube update.

it is too much to exit when we can run. warning is fine. no need to disrupt ppl when not needed.

The kvm driver is built from minikube source. The docker-machine-driver-kvm2 is just a small wrapper around minikube code. It does not make sense to run an old driver from a previous release with a new minikube since we don't know if it will work or not. Hoping that an old driver will work is not a valid release engineering strategy.

On systems with a package manager (dnf, apt) this driver should be installed by the package manager in the standard location in the same way minikube is installed. This will ensure that we always run with the right driver tested by the CI. There is no need to install a driver dynamically when software is managed by a package manager.

When minikube is installed by downloading the minikube executable, installing the driver on the first use is a nice feature, but we want to make sure the driver is the driver released with minikube.

If the driver version command does not produce the expected output, something is very wrong and it is better to fail early and loudly, helping the user to fix the issue with minimal debugging.

If you have an old driver and minikube fails to download the new driver, you can download it manually in the same way you downloaded minikube itself. If we think that downloading the driver is not reliable enough, we can provide a tarball with minikube and the driver to avoid the unreliable automatic install. If we think the automatic install is reliable enough, there is no reason to continue if minikube cannot install the right driver.

Hyperkit is deprecated and should be removed in minikube 1.38 (#21601) so we can ignore it.

You could consider converting the kvm driver from an external driver into an internal driver, then when hyperkit is gone there wouldn't be any more "aux" drivers and you wouldn't need the extra build... The virsh command is already required by the driver, and most communication with libvirt is done through XML anyway...

That is, replace the libvirt-go calls with the matching exec.Command - like with all other minikube drivers

Using virsh instead of the libvirt api is not great since it is not designed for machines. But given the complexity and trouble caused by auxiliary drivers it sounds like a good plan.

Linux: Convert the external kvm driver to internal #21618

minikube-pr-bot · 2025-09-19T22:43:07Z

kvm2 driver with docker runtime

┌────────────────┬──────────┬────────────────────────┐
│    COMMAND     │ MINIKUBE │ MINIKUBE  ( PR 21594 ) │
├────────────────┼──────────┼────────────────────────┤
│ minikube start │ 44.1s    │ 44.4s                  │
│ enable ingress │ 16.0s    │ 16.7s                  │
└────────────────┴──────────┴────────────────────────┘

Times for minikube start: 45.8s 41.1s 45.0s 46.1s 42.6s
Times for minikube (PR 21594) start: 44.1s 42.7s 45.1s 44.7s 45.2s

Times for minikube ingress: 16.3s 15.8s 16.3s 15.8s 15.8s
Times for minikube (PR 21594) ingress: 19.8s 16.3s 15.8s 15.8s 15.8s

docker driver with docker runtime

┌────────────────┬──────────┬────────────────────────┐
│    COMMAND     │ MINIKUBE │ MINIKUBE  ( PR 21594 ) │
├────────────────┼──────────┼────────────────────────┤
│ minikube start │ 22.6s    │ 21.9s                  │
│ enable ingress │ 12.4s    │ 12.0s                  │
└────────────────┴──────────┴────────────────────────┘

Times for minikube start: 25.9s 22.2s 21.9s 20.7s 22.6s
Times for minikube (PR 21594) start: 22.4s 21.8s 20.8s 22.4s 22.1s

Times for minikube ingress: 13.6s 13.6s 10.6s 13.6s 10.6s
Times for minikube (PR 21594) ingress: 12.6s 12.6s 10.6s 10.6s 13.6s

docker driver with containerd runtime

┌────────────────┬──────────┬────────────────────────┐
│    COMMAND     │ MINIKUBE │ MINIKUBE  ( PR 21594 ) │
├────────────────┼──────────┼────────────────────────┤
│ minikube start │ 20.7s    │ 20.6s                  │
│ enable ingress │ 26.5s    │ 29.7s                  │
└────────────────┴──────────┴────────────────────────┘

Times for minikube start: 20.1s 20.7s 22.7s 19.7s 20.3s
Times for minikube (PR 21594) start: 20.4s 20.5s 22.1s 20.4s 19.7s

Times for minikube ingress: 23.1s 39.1s 23.1s 23.1s 24.1s
Times for minikube (PR 21594) ingress: 39.1s 23.1s 24.1s 39.1s 23.1s

minikube-pr-bot · 2025-09-19T23:55:41Z

Here are the top 10 failed tests in each environment with the lowest flake rate.

Environment	Test Name	Flake Rate
Docker_Linux_crio_arm64 (7 failed)	TestAddons/serial/GCPAuth/FakeCredentials(gopogh)	0.00% (chart)
Docker_Linux_crio_arm64 (7 failed)	TestFunctional/parallel/ServiceCmdConnect(gopogh)	0.00% (chart)
Docker_Linux_crio_arm64 (7 failed)	TestFunctional/parallel/ServiceCmd/DeployApp(gopogh)	0.00% (chart)
Docker_Linux_crio_arm64 (7 failed)	TestFunctional/parallel/ServiceCmd/HTTPS(gopogh)	0.00% (chart)
Docker_Linux_crio_arm64 (7 failed)	TestFunctional/parallel/ServiceCmd/Format(gopogh)	0.00% (chart)
Docker_Linux_crio_arm64 (7 failed)	TestFunctional/parallel/ServiceCmd/URL(gopogh)	0.00% (chart)

Besides the following environments also have failed tests:

Docker_Linux: 18 failed (gopogh)
Docker_Linux_crio: 16 failed (gopogh)
Docker_Linux_containerd: 14 failed (gopogh)
KVM_Linux_crio: 12 failed (gopogh)

To see the flake rates of all tests by environment, click here.

nirs · 2025-09-20T14:19:33Z

/cc @prezha
/cc @afbjorklund

nirs · 2025-09-23T18:25:47Z

I think this is the wrong direction - fixing the aux driver is hard, and we don't have a reason to use an aux driver for libvirt. I'll work instead on #21618.

nirs · 2025-09-23T21:23:57Z

Replaced by #21625

k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Sep 19, 2025

k8s-ci-robot requested review from ComradeProgrammer and medyagh September 19, 2025 18:51

k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Sep 19, 2025

nirs force-pushed the auxdriver-version branch 4 times, most recently from 74d6c94 to 0d8a2c2 Compare September 19, 2025 20:09

nirs added 7 commits September 20, 2025 00:49

auxdriver: Unify test name and message

10f7750

Rename the test after the function it tests, and change the failure message to match the function name.

auxdriver: Improve names

04f7209

Make struct fields and helper function names more readable.

auxdriver: Fix error message

58c26c4

We complained that "driver --version" failed but the real command is "driver version".

auxdriver: Improve names for readability

b53d659

Rename arguments and temporary variable to make the code more clear.

nirs force-pushed the auxdriver-version branch from 0d8a2c2 to 532dacb Compare September 19, 2025 21:49

k8s-ci-robot added the ok-to-test Indicates a non-member PR verified by an org member that is safe to test. label Sep 19, 2025

medyagh changed the title ~~Require valid driver version~~ Require valid aux driver version Sep 19, 2025

medyagh reviewed Sep 19, 2025

View reviewed changes

k8s-ci-robot requested review from afbjorklund and prezha September 20, 2025 14:19

nirs closed this Sep 23, 2025
