virtio: make VirtioDevice::enable return Result#2947
virtio: make VirtioDevice::enable return Result#2947jstarks wants to merge 3 commits intomicrosoft:mainfrom
Conversation
Change the VirtioDevice::enable trait method to return anyhow::Result<()> so that device activation errors are propagated to the transport layer instead of panicking. Previously, several device implementations used .unwrap() or .expect() on fallible operations inside enable(), which could crash the VMM on malformed guest input (violating the trust boundary). Now: - The trait method returns Result, allowing devices to use ? for error propagation. - Both PCI and MMIO transports handle the error by logging it (rate limited) and leaving DRIVER_OK unset, so the device remains inert. - virtio_blk: simplified from verbose match blocks to ? operators. - virtio_net: replaced if-let-Err + unwrap pattern with ? operators. - virtio_p9: replaced .expect() with .context()?. - virtio_pmem: replaced .unwrap() on shared_memory_region and map() result with proper error propagation. - virtiofs, tests: trivially updated (always return Ok(())).
|
This PR modifies files containing For more on why we check whole files, instead of just diffs, check out the Rustonomicon |
There was a problem hiding this comment.
Pull request overview
This PR changes the VirtioDevice::enable lifecycle hook to return anyhow::Result<()>, allowing virtio device activation failures to propagate to the MMIO/PCI transport layer instead of panicking across the guest trust boundary.
Changes:
- Update
VirtioDevice::enabletrait signature to returnanyhow::Result<()>and document expected transport behavior on failure. - Convert several virtio device implementations to return structured errors (replacing
unwrap/expect/ignored results during activation). - Update MMIO/PCI transports to set
DRIVER_OKonly whenenable()succeeds, and to log errors on failure.
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| vm/devices/virtio/virtio/src/common.rs | Change VirtioDevice::enable to return anyhow::Result<()> and document failure semantics. |
| vm/devices/virtio/virtio/src/transport/pci.rs | Gate DRIVER_OK on successful enable() and log enable failures. |
| vm/devices/virtio/virtio/src/transport/mmio.rs | Gate DRIVER_OK on successful enable() and log enable failures. |
| vm/devices/virtio/virtio/src/tests.rs | Update TestDevice::enable to return Result to match the trait. |
| vm/devices/virtio/virtio_blk/src/lib.rs | Propagate queue event/queue creation errors from enable() via Result. |
| vm/devices/virtio/virtio_net/src/lib.rs | Propagate queue event/queue creation errors from enable() via Result. |
| vm/devices/virtio/virtio_p9/src/lib.rs | Replace expect with anyhow::Context and return Result from enable(). |
| vm/devices/virtio/virtio_pmem/src/lib.rs | Require shared memory region and propagate mapping failures via Result. |
| vm/devices/virtio/virtiofs/src/virtio.rs | Update enable() signature and return Ok(()). |
You can also share your feedback on Copilot code review. Take the survey.
| match self.device.enable(Resources { | ||
| features: self.driver_feature.clone(), | ||
| queues, | ||
| shared_memory_region: self.shared_memory_region.clone(), | ||
| shared_memory_size: self.shared_memory_size, | ||
| }); | ||
|
|
||
| self.device_status.set_driver_ok(true); | ||
| }) { | ||
| Ok(()) => { | ||
| self.device_status.set_driver_ok(true); | ||
| } | ||
| Err(err) => { | ||
| // FUTURE: consider setting DEVICE_NEEDS_RESET and | ||
| // delivering a config change interrupt so the guest | ||
| // can detect the failure proactively instead of | ||
| // waiting for IO timeouts. | ||
| tracelimit::error_ratelimited!( | ||
| error = &*err as &dyn std::error::Error, | ||
| "virtio device enable failed" | ||
| ); | ||
| } | ||
| } |
There was a problem hiding this comment.
In the DRIVER_OK transition, doorbells are registered before calling device.enable(). If enable() returns Err, driver_ok is left unset, so the normal reset/disable paths that clear doorbells (which are gated on driver_ok/disabling) won’t run. This can leave stale/duplicate doorbell registrations and grow self.doorbells on repeated DRIVER_OK writes. Consider registering doorbells only after enable() succeeds, or clearing doorbells (and any other partial transport state) in the Err path.
| match self.device.enable(Resources { | ||
| features: self.driver_feature.clone(), | ||
| queues, | ||
| shared_memory_region: None, | ||
| shared_memory_size: 0, | ||
| }); | ||
|
|
||
| self.device_status.set_driver_ok(true); | ||
| }) { | ||
| Ok(()) => { | ||
| self.device_status.set_driver_ok(true); | ||
| } | ||
| Err(err) => { | ||
| // FUTURE: consider setting DEVICE_NEEDS_RESET and | ||
| // delivering a config change interrupt so the guest | ||
| // can detect the failure proactively instead of | ||
| // waiting for IO timeouts. | ||
| tracelimit::error_ratelimited!( | ||
| error = &*err as &dyn std::error::Error, | ||
| "virtio device enable failed" | ||
| ); | ||
| } | ||
| } |
There was a problem hiding this comment.
In the DRIVER_OK transition, doorbells are registered before calling device.enable(). If enable() returns Err, driver_ok is left unset, so guest reset/status=0 won’t clear doorbells (that path only clears when driver_ok was set). This can leak/duplicate doorbell registrations and cause self.doorbells to grow on repeated DRIVER_OK writes. Consider registering doorbells only after enable() succeeds, or clearing doorbells (and any other partial transport state) in the Err path.
| @@ -843,6 +843,7 @@ impl VirtioDevice for TestDevice { | |||
| )) | |||
| }) | |||
| .collect(); | |||
| Ok(()) | |||
There was a problem hiding this comment.
There are tests covering the successful DRIVER_OK path, but none exercising the new failure path where VirtioDevice::enable() returns Err. Adding a test device whose enable() returns an error and asserting that the transport does not set DRIVER_OK (and remains inert) would prevent regressions in this error propagation behavior for both MMIO and PCI transports.
- Clear doorbells in Err arm of both MMIO and PCI transports to prevent stale/duplicate registrations on repeated DRIVER_OK writes when enable() fails. - Add FailingTestDevice and tests verifying DRIVER_OK is not set when enable() returns an error, for both MMIO and PCI transports. - Fix CI: apply rustfmt, remove unused tracing dep from virtio_blk.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 10 out of 11 changed files in this pull request and generated no new comments.
You can also share your feedback on Copilot code review. Take the survey.
Change the VirtioDevice::enable trait method to return anyhow::Result<()> so that device activation errors are propagated to the transport layer instead of panicking.
Previously, several device implementations used .unwrap() or .expect() on fallible operations inside enable(), which could crash the VMM on malformed guest input (violating the trust boundary).