-
Notifications
You must be signed in to change notification settings - Fork 430
Refactor driver discovery #825
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
8ad7493 to
a561a05
Compare
a561a05 to
841c7d1
Compare
| const: | ||
| - {action: accept, from: "^NVSANDBOXUTILS_"} | ||
| - {action: accept, from: "^nvSandboxUtils"} | ||
| - {action: replace, from: "^NVSANDBOXUTILS_255_MASK_", to: "MASK255_" } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this was required.
Signed-off-by: Evan Lezar <[email protected]>
Signed-off-by: Evan Lezar <[email protected]>
This change aligns the driver file discovery with device discovery and allows other sources such as nvsandboxutils to be added. Signed-off-by: Evan Lezar <[email protected]>
841c7d1 to
be25223
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR refactors the driver discovery workflow to centralize version detection and entity lookup in a new dgpu package and integrate libnvsandboxutils for sandbox utils discovery. Key changes include:
- Replaced bespoke CUDA and NVML version lookups with a unified
getDriverVersionfallback chain. - Introduced
internal/platform-support/dgpuwith option-basedNewDriverDiscoverer. - Updated all
nvcdiconsumers to use the newdgpu.NewDriverDiscovererand removed old discovery helpers.
Reviewed Changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| pkg/nvcdi/wrapper.go | No functional change; remains the CDI spec wrapper. |
| pkg/nvcdi/management.go | Swapped CUDA version lookup for getDriverVersion and new dgpu discoverer. |
| pkg/nvcdi/lib.go | Refactored driver-version helpers into nvcdilib. |
| pkg/nvcdi/lib-nvml.go | Updated common edits to call new getDriverVersion and pass version to NVML discoverer. |
| pkg/nvcdi/common-nvml.go | Signature change to include version in the common NVML discoverer. |
| internal/platform-support/dgpu/options.go | Added new option setters (WithDriver, WithLdconfigPath, WithVersion). |
| internal/platform-support/dgpu/driver.go | Implemented NewDriverDiscoverer using option-driven construction. |
| internal/platform-support/dgpu/driver-nvsandboxutils.go | Stub for nvsandboxutils discovery. |
| internal/platform-support/dgpu/driver-nvml.go | Moved NVML-based driver discovery into options-based methods. |
| internal/nvsandboxutils/gen/.../nvsandboxutils.yml | Allowed MASK255_ constant renaming for sandbox utils. |
Comments suppressed due to low confidence (4)
internal/platform-support/dgpu/options.go:89
- There is a typo in
WithNvsandboxuitilsLib(should beWithNvsandboxutilsLib). Correcting this will avoid confusion and maintain consistency.
func WithNvsandboxuitilsLib(nvsandboxutilslib nvsandboxutils.Interface) Option {
internal/platform-support/dgpu/driver.go:26
- This new
NewDriverDiscovererlogic is complex and currently untested—adding unit tests to cover option parsing and fallback behavior will help catch future regressions.
// NewDriverDiscoverer creates a discoverer for the libraries and binaries associated with a driver installation.
internal/platform-support/dgpu/driver.go:28
- The call to
new(opts...)is invalid Go syntax and won’t initialize youroptionsstruct. Consider creating a pointer tooptions(e.g.o := &options{}) and then applying each opt function.
o := new(opts...)
internal/platform-support/dgpu/driver-nvml.go:19
- The code references
lookup.NewFileLocator,lookup.WithLogger, etc., but thelookuppackage is not imported. Add"github.com/NVIDIA/nvidia-container-toolkit/internal/lookup"(andlookup/rootif needed).
import (
|
|
||
| nvsandboxutilsDiscoverer, err := o.newNvsandboxutilsDriverDiscoverer() | ||
| if err != nil { | ||
| // TODO: Log a warning |
Copilot
AI
Jul 7, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nitpick] Leftover TODO comments for logging failure of sub-discoverers. Consider implementing real logging or removing the TODO if not needed.
| // TODO: Log a warning | |
| o.logger.Warn(fmt.Sprintf("Failed to create nvsandboxutils driver discoverer: %v", err)) |
|
This PR is stale because it has been open 90 days with no activity. This PR will be closed in 30 days unless new comments are made or the stale label is removed. To skip these checks, apply the "lifecycle/frozen" label. |
This change refactors driver discovery to prepare for the discovery of additional entities using libnvsandboxutils.