Guide: document the storage stack, expand device pages, and add rustdoc #2941
Changes from all commits
2405dab
a4e9284
040dc29
e8e0524
c1eaa06
0a253b2
9a1d7ee
48a3ff7
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,13 @@ | ||
| # Device architecture | ||
|
|
||
| This section covers the internal architecture of device emulators and | ||
| their backends — the shared machinery that both OpenVMM and OpenHCL | ||
| use to connect guest-visible storage, networking, and other devices to | ||
| their backing implementations. | ||
|
|
||
| ## Pages | ||
|
|
||
| - [Storage pipeline](./devices/storage.md) — how guest I/O flows from | ||
| a storage frontend (NVMe, SCSI, IDE) through the | ||
| [`DiskIo`](https://openvmm.dev/rustdoc/linux/disk_backend/trait.DiskIo.html) | ||
| abstraction to a concrete backing store. |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,53 @@ | ||
| # Storage backends | ||
|
|
||
| Storage backends implement the | ||
| [`DiskIo`](https://openvmm.dev/rustdoc/linux/disk_backend/trait.DiskIo.html) | ||
| trait, the shared abstraction that all storage frontends use to read | ||
| and write data. A frontend holds a | ||
| [`Disk`](https://openvmm.dev/rustdoc/linux/disk_backend/struct.Disk.html) | ||
| handle and doesn't know what kind of backend is behind it — the same | ||
| frontend code works with a local file, a Linux block device, a remote | ||
| blob, or a layered composition of multiple backends. | ||
|
|
||
| ## Backend catalog | ||
|
|
||
| | Backend | Crate | Wraps | Platform | Key characteristic | | ||
| |---------|-------|-------|----------|--------------------| | ||
| | FileDisk | [`disk_file`](https://openvmm.dev/rustdoc/linux/disk_file/index.html) | Host file | Cross-platform | Simplest backend. Blocking I/O via `unblock()`. | | ||
| | Vhd1Disk | [`disk_vhd1`](https://openvmm.dev/rustdoc/linux/disk_vhd1/index.html) | VHD1 fixed file | Cross-platform | Parses VHD footer for geometry. | | ||
| | VhdmpDisk | `disk_vhdmp` | Windows vhdmp driver | Windows | Dynamic and differencing VHD/VHDX. | | ||
| | BlobDisk | [`disk_blob`](https://openvmm.dev/rustdoc/linux/disk_blob/index.html) | HTTP / Azure Blob | Cross-platform | Read-only. HTTP range requests. | | ||
| | BlockDeviceDisk | [`disk_blockdevice`](https://openvmm.dev/rustdoc/linux/disk_blockdevice/index.html) | Linux block device | Linux | io_uring, resize via uevent, PR passthrough. | | ||
| | NvmeDisk | [`disk_nvme`](https://openvmm.dev/rustdoc/linux/disk_nvme/index.html) | Physical NVMe (VFIO) | Linux/Windows | User-mode NVMe driver. Resize via AEN. | | ||
| | StripedDisk | [`disk_striped`](https://openvmm.dev/rustdoc/linux/disk_striped/index.html) | Multiple Disks | Cross-platform | Stripes data across underlying disks. | | ||
|
|
||
| ## Decorators | ||
|
|
||
| Decorators wrap another | ||
| [`Disk`](https://openvmm.dev/rustdoc/linux/disk_backend/struct.Disk.html) | ||
| and transform I/O in transit. Features compose by stacking decorators | ||
| without modifying the backends underneath. | ||
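A minimal sketch of the stacking idea follows. It is not the real API: `SectorIo` is a hypothetical, synchronous stand-in for `DiskIo`, and `MemBackend`, `XorDecorator`, and `CountingDecorator` are illustrative names. XOR is only a placeholder for a real transform such as the XTS-AES-256 used by CryptDisk.

```rust
/// Hypothetical, simplified stand-in for the real `DiskIo` trait:
/// one byte per "sector", synchronous calls.
trait SectorIo {
    fn read(&mut self, sector: usize) -> u8;
    fn write(&mut self, sector: usize, value: u8);
}

/// Toy backend that keeps its sectors in memory.
struct MemBackend {
    data: Vec<u8>,
}

impl SectorIo for MemBackend {
    fn read(&mut self, sector: usize) -> u8 {
        self.data[sector]
    }
    fn write(&mut self, sector: usize, value: u8) {
        self.data[sector] = value;
    }
}

/// Toy transform decorator. XOR is only a placeholder for a real cipher.
struct XorDecorator<T: SectorIo> {
    inner: T,
    key: u8,
}

impl<T: SectorIo> SectorIo for XorDecorator<T> {
    fn read(&mut self, sector: usize) -> u8 {
        // Undo the transform on the way out.
        self.inner.read(sector) ^ self.key
    }
    fn write(&mut self, sector: usize, value: u8) {
        // Apply the transform on the way in.
        self.inner.write(sector, value ^ self.key);
    }
}

/// Second decorator: counts operations, then forwards them unchanged.
struct CountingDecorator<T: SectorIo> {
    inner: T,
    ops: usize,
}

impl<T: SectorIo> SectorIo for CountingDecorator<T> {
    fn read(&mut self, sector: usize) -> u8 {
        self.ops += 1;
        self.inner.read(sector)
    }
    fn write(&mut self, sector: usize, value: u8) {
        self.ops += 1;
        self.inner.write(sector, value);
    }
}
```

Wrapping `MemBackend` in `XorDecorator` and then in `CountingDecorator` yields a value that still implements `SectorIo`, so a caller cannot tell how many decorators sit between it and the backend.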
|
|
||
| | Decorator | Crate | Transform | | ||
| |-----------|-------|-----------| | ||
| | CryptDisk | [`disk_crypt`](https://openvmm.dev/rustdoc/linux/disk_crypt/index.html) | XTS-AES-256 encryption. Encrypts on write, decrypts on read. | | ||
| | DelayDisk | [`disk_delay`](https://openvmm.dev/rustdoc/linux/disk_delay/index.html) | Adds configurable latency to each I/O operation. | | ||
| | DiskWithReservations | [`disk_prwrap`](https://openvmm.dev/rustdoc/linux/disk_prwrap/index.html) | In-memory SCSI persistent reservation emulation. | | ||
|
|
||
| ## Layered disks | ||
|
|
||
| A [`LayeredDisk`](https://openvmm.dev/rustdoc/linux/disk_layered/index.html) | ||
| composes multiple layers into a single `DiskIo` implementation. Each | ||
| layer tracks which sectors it has; reads fall through from top to | ||
| bottom until a layer has the requested data. This powers the | ||
| `memdiff:` and `mem:` CLI options. | ||
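The read fall-through described above can be sketched as follows. This is an illustration only: `Layer` and `read_sector` are hypothetical names, sectors are shrunk to 4 bytes for brevity, and returning `None` for a sector that no layer holds is a choice made for this sketch, not a statement about how `LayeredDisk` behaves.

```rust
use std::collections::HashMap;

/// Hypothetical layer: holds only the sectors that are present in it.
struct Layer {
    sectors: HashMap<u64, [u8; 4]>,
}

/// Reads fall through from the top layer (index 0) to the bottom layer.
/// The first layer that has the sector supplies the data.
fn read_sector(layers: &[Layer], sector: u64) -> Option<[u8; 4]> {
    layers
        .iter()
        .find_map(|layer| layer.sectors.get(&sector).copied())
}
```

With an in-memory top layer over a read-only bottom layer, writes can be captured in the top layer while untouched sectors continue to come from the bottom one.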
|
|
||
| Two layer implementations exist today: | ||
|
|
||
| - **RamDiskLayer** ([`disklayer_ram`](https://openvmm.dev/rustdoc/linux/disklayer_ram/index.html)) — ephemeral, in-memory. | ||
| - **SqliteDiskLayer** ([`disklayer_sqlite`](https://openvmm.dev/rustdoc/linux/disklayer_sqlite/index.html)) — persistent, file-backed (dev/test only). | ||
|
|
||
| The [storage pipeline](../architecture/devices/storage.md) page covers | ||
| the full architecture: how frontends, backends, decorators, and the | ||
| layered disk model connect, plus cross-cutting concerns like online | ||
| disk resize and virtual optical media. | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,46 @@ | ||
| # StorVSP | ||
|
|
||
| StorVSP is the VMBus SCSI controller emulator. It presents a virtual | ||
| SCSI adapter to the guest over a VMBus channel and translates SCSI | ||
| requests into calls against the shared disk backend abstraction. | ||
|
|
||
| ## Overview | ||
|
|
||
| StorVSP implements the Hyper-V synthetic SCSI protocol — a | ||
| VMBus-based transport that carries SCSI CDBs (Command Descriptor | ||
| Blocks) between the guest's `storvsc` driver and the host. This | ||
| isn't a standard SCSI transport (like iSCSI or SAS); it's a | ||
| Hyper-V-specific wire format defined in | ||
| [`storvsp_protocol`](https://openvmm.dev/rustdoc/linux/storvsp_protocol/index.html). | ||
| The guest side (`storvsc`) is in the Linux kernel and Windows inbox | ||
| drivers. | ||
|
|
||
| Each SCSI path (channel / target / LUN) maps to an | ||
| [`AsyncScsiDisk`](https://openvmm.dev/rustdoc/linux/scsi_core/trait.AsyncScsiDisk.html) | ||
| implementation — typically | ||
| [`SimpleScsiDisk`](https://openvmm.dev/rustdoc/linux/scsidisk/struct.SimpleScsiDisk.html) | ||
| for hard drives or | ||
| [`SimpleScsiDvd`](https://openvmm.dev/rustdoc/linux/scsidisk/scsidvd/struct.SimpleScsiDvd.html) | ||
| for optical media. Those implementations parse the SCSI CDB and | ||
| translate it into | ||
| [`DiskIo`](https://openvmm.dev/rustdoc/linux/disk_backend/trait.DiskIo.html) | ||
| calls (read, write, flush, unmap). | ||
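As a rough illustration of the CDB-to-backend translation, the sketch below decodes a few 10-byte SCSI commands into a hypothetical `DiskOp` enum. The opcodes and field offsets follow the standard SCSI block command layout (READ(10) is 0x28, WRITE(10) is 0x2A, SYNCHRONIZE CACHE(10) is 0x35, UNMAP is 0x42). `DiskOp` and `translate` are illustrative names, not part of `scsidisk`, and the real implementation handles many more commands and error cases.

```rust
/// Hypothetical summary of the backend call a CDB maps to.
#[derive(Debug, PartialEq, Eq)]
enum DiskOp {
    Read { lba: u64, blocks: u32 },
    Write { lba: u64, blocks: u32 },
    Flush,
    /// UNMAP ranges travel in the data-out buffer, which this sketch ignores.
    Unmap,
}

/// Decode a small subset of SCSI CDBs. Returns `None` for anything else.
fn translate(cdb: &[u8]) -> Option<DiskOp> {
    let opcode = *cdb.first()?;
    match opcode {
        // READ(10) and WRITE(10): big-endian LBA in bytes 2..=5,
        // big-endian transfer length (in blocks) in bytes 7..=8.
        0x28 | 0x2A => {
            if cdb.len() < 10 {
                return None;
            }
            let lba = u64::from(u32::from_be_bytes([cdb[2], cdb[3], cdb[4], cdb[5]]));
            let blocks = u32::from(u16::from_be_bytes([cdb[7], cdb[8]]));
            if opcode == 0x28 {
                Some(DiskOp::Read { lba, blocks })
            } else {
                Some(DiskOp::Write { lba, blocks })
            }
        }
        0x35 => Some(DiskOp::Flush), // SYNCHRONIZE CACHE(10)
        0x42 => Some(DiskOp::Unmap), // UNMAP
        _ => None,
    }
}
```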
|
|
||
| ## Key characteristics | ||
|
|
||
| - **Transport.** VMBus ring buffers with GPADL-backed memory. | ||
| - **Protocol.** Hyper-V SCSI (SRB-based), with version negotiation | ||
| (Win6 through Blue). | ||
| - **Sub-channels.** StorVSP supports multiple VMBus sub-channels | ||
| for parallel I/O, one worker per channel. | ||
| - **Hot-add / hot-remove.** SCSI devices can be attached and | ||
| detached at runtime via `ScsiControllerRequest`. | ||
| - **Performance.** Poll-mode optimization — when pending I/O count | ||
| exceeds `poll_mode_queue_depth`, switches from interrupt-driven | ||
| to busy-poll for new requests, reducing guest exit frequency. | ||
| - **Crate.** [`storvsp`](https://openvmm.dev/rustdoc/linux/storvsp/index.html) | ||
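The poll-mode rule in the list above reduces to a threshold check. The sketch below assumes a strict "greater than" comparison, following the word "exceeds"; `WaitMode` and `wait_mode` are hypothetical names, and the real worker loop in `storvsp` is more involved.

```rust
/// Hypothetical description of how a worker waits for new requests.
#[derive(Debug, PartialEq, Eq)]
enum WaitMode {
    /// Wait for the guest to signal the channel.
    Interrupt,
    /// Keep checking the ring without waiting for a signal.
    BusyPoll,
}

/// Pick the wait mode from the number of I/Os currently in flight.
fn wait_mode(pending_io: u32, poll_mode_queue_depth: u32) -> WaitMode {
    if pending_io > poll_mode_queue_depth {
        WaitMode::BusyPoll
    } else {
        WaitMode::Interrupt
    }
}
```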
|
|
||
| The [storage pipeline](../../architecture/devices/storage.md) page | ||
| covers the full frontend-to-backend architecture, including the SCSI | ||
| adapter layer and how `SimpleScsiDisk` translates CDB opcodes to | ||
| `DiskIo` calls. |
| Original file line number | Diff line number | Diff line change | ||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| @@ -1,8 +1,8 @@ | ||||||||||||||||
| # NVMe Emulator | ||||||||||||||||
| # NVMe emulator | ||||||||||||||||
|
|
||||||||||||||||
| OpenVMM emulates an NVMe controller, among other devices. The OpenVMM NVMe emulator comes in two flavors: | ||||||||||||||||
|
|
||||||||||||||||
| - An NVMe emulator that can be used to serve IO workloads (in practice, OpenVMM uses it only for test scenarios today) | ||||||||||||||||
| - An NVMe emulator used to test OpenHCL (`nvme_test`), which allows test authors to inject faults and inspect the state of NVMe devices used by the guest, and | ||||||||||||||||
| - An NVMe emulator used to test OpenHCL ([`nvme_test`](https://openvmm.dev/rustdoc/linux/nvme_test/index.html)), which allows test authors to inject faults and inspect the state of NVMe devices used by the guest. | ||||||||||||||||
|
|
||||||||||||||||
| This guide provides a brief overview of the architecture shared by the NVMe emulators. | ||||||||||||||||
| This guide provides a brief overview of the architecture shared by the NVMe emulators. For how NVMe fits into the broader storage pipeline — including how namespaces map to [`DiskIo`](https://openvmm.dev/rustdoc/linux/disk_backend/trait.DiskIo.html) backends, online disk resize via AEN, and the layered disk model — see the [storage pipeline](../../architecture/devices/storage.md) page. | ||||||||||||||||
|
||||||||||||||||
| This guide provides a brief overview of the architecture shared by the NVMe emulators. For how NVMe fits into the broader storage pipeline — including how namespaces map to [`DiskIo`](https://openvmm.dev/rustdoc/linux/disk_backend/trait.DiskIo.html) backends, online disk resize via AEN, and the layered disk model — see the [storage pipeline](../../architecture/devices/storage.md) page. | |
| This guide provides a brief overview of the architecture shared by the NVMe | |
| emulators. For how NVMe fits into the broader storage pipeline — including how | |
| namespaces map to | |
| [`DiskIo`](https://openvmm.dev/rustdoc/linux/disk_backend/trait.DiskIo.html) | |
| backends, online disk resize via AEN, and the layered disk model — see the | |
| [storage pipeline](../../architecture/devices/storage.md) page. |
| Original file line number | Diff line number | Diff line change | ||||||
|---|---|---|---|---|---|---|---|---|
| @@ -0,0 +1,81 @@ | ||||||||
| # Floppy | ||||||||
|
|
||||||||
| The floppy controller emulates an | ||||||||
| [Intel 82077AA](https://en.wikipedia.org/wiki/Intel_82077AA) CHMOS | ||||||||
| single-chip floppy disk controller. It connects to the storage stack | ||||||||
| through | ||||||||
| [`Disk`](https://openvmm.dev/rustdoc/linux/disk_backend/struct.Disk.html) | ||||||||
| — the same backend abstraction used by NVMe and SCSI. Data transfers | ||||||||
| use ISA DMA channel 2; interrupts use IRQ 6. | ||||||||
|
|
||||||||
| Two variants exist: | ||||||||
|
|
||||||||
| - [`FloppyDiskController`](https://openvmm.dev/rustdoc/linux/floppy/struct.FloppyDiskController.html) | ||||||||
| — full emulator with disk I/O. | ||||||||
| - [`StubFloppyDiskController`](https://openvmm.dev/rustdoc/linux/floppy_pcat_stub/struct.StubFloppyDiskController.html) | ||||||||
| — reports "no drives" for PCAT BIOS compatibility when no floppy is | ||||||||
| configured. | ||||||||
|
|
||||||||
| ## Supported media | ||||||||
|
|
||||||||
| The controller auto-detects the floppy format from the disk image byte | ||||||||
| size. See | ||||||||
| [Wikipedia's list of floppy disk formats](https://en.wikipedia.org/wiki/List_of_floppy_disk_formats) | ||||||||
| for background on these formats. | ||||||||
|
|
||||||||
| | Format | Capacity | Sectors/track | Notes | | ||||||||
| |--------|----------|---------------|-------| | ||||||||
| | Low density (SS) | 360 KB | 9 | Single-sided (one head) | | ||||||||
| | Low density | 720 KB | 9 | | | ||||||||
| | Medium density | 1.2 MB | 15 | | | ||||||||
| | High density | 1.44 MB | 18 | Most common format | | ||||||||
| | [DMF](https://en.wikipedia.org/wiki/Distribution_Media_Format) | 1.68 MB | 21 | Microsoft Distribution Media Format | | ||||||||
| | XDF | 1.84 MB | 23 | Extended density (fixed 23 SPT variant) | ||||||||
|
|
||||||||
| All formats use 512-byte sectors, 80 cylinders, CHS addressing. The | ||||||||
| controller rejects images that don't match a known format size. | ||||||||
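Size-based detection can be sketched as a lookup over the geometries in the table. The byte sizes here are derived from the stated geometry (80 cylinders, 512-byte sectors) rather than copied from the emulator source, and two heads are assumed for every format except the single-sided one. `Geometry`, `KNOWN_FORMATS`, `image_size`, and `detect_format` are hypothetical names.

```rust
const SECTOR_SIZE: u64 = 512;
const CYLINDERS: u64 = 80;

/// Per-format geometry; cylinder count and sector size are fixed above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Geometry {
    heads: u64,
    sectors_per_track: u64,
}

/// One entry per row of the supported-media table.
const KNOWN_FORMATS: [Geometry; 6] = [
    Geometry { heads: 1, sectors_per_track: 9 },  // low density, single-sided
    Geometry { heads: 2, sectors_per_track: 9 },  // low density
    Geometry { heads: 2, sectors_per_track: 15 }, // medium density
    Geometry { heads: 2, sectors_per_track: 18 }, // high density
    Geometry { heads: 2, sectors_per_track: 21 }, // DMF
    Geometry { heads: 2, sectors_per_track: 23 }, // XDF (fixed 23 SPT)
];

/// Image size in bytes implied by a geometry.
fn image_size(g: Geometry) -> u64 {
    CYLINDERS * g.heads * g.sectors_per_track * SECTOR_SIZE
}

/// Return the matching geometry, or `None` to reject the image.
fn detect_format(image_len: u64) -> Option<Geometry> {
    KNOWN_FORMATS
        .iter()
        .copied()
        .find(|g| image_size(*g) == image_len)
}
```

Because every known size is distinct, an exact match on length is enough to pick the format; any other length is rejected.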
|
|
||||||||
| ## I/O port layout | ||||||||
|
|
||||||||
| Register offsets from base (typically 0x3F0): | ||||||||
|
|
||||||||
| | Offset | Read | Write | Purpose | | ||||||||
| |--------|------|-------|---------| | ||||||||
| | +0 | STATUS_A | — | Fixed 0xFF (not emulated) | | ||||||||
| | +1 | STATUS_B | — | Fixed 0xFC (no tape drives) | | ||||||||
| | +2 | DOR | DOR | Motor control, drive select, DMA gate, reset | | ||||||||
| | +4 | MSR | DSR | Main status (busy, direction, RQM) / data rate select | | ||||||||
| | +5 | DATA | DATA | Command/parameter/result FIFO (16-byte) | | ||||||||
| | +7 | DIR | CCR | Disk change signal / config control | | ||||||||
|
|
||||||||
| The controller claims port 0x3F7 for DIR/CCR separately from the | ||||||||
| 6-byte base region, because 0x3F6 is shared with the IDE controller's | ||||||||
| alternate status register. | ||||||||
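The register table can be read as a decode function from port and direction to register. The sketch below assumes the typical base of 0x3F0 and returns `None` for anything the table does not list, including port 0x3F6 and writes to the read-only status registers. How the real emulator treats those accesses is not covered here, and `Register` and `decode` are hypothetical names.

```rust
const BASE: u16 = 0x3F0; // typical base address, per the text above

#[derive(Debug, PartialEq, Eq)]
enum Register {
    StatusA,
    StatusB,
    Dor,
    Msr,
    Dsr,
    Data,
    Dir,
    Ccr,
}

/// Map an I/O port access to a register, following the table above.
fn decode(port: u16, is_write: bool) -> Option<Register> {
    let offset = port.checked_sub(BASE)?;
    match (offset, is_write) {
        (0, false) => Some(Register::StatusA),
        (1, false) => Some(Register::StatusB),
        (2, _) => Some(Register::Dor),
        (4, false) => Some(Register::Msr),
        (4, true) => Some(Register::Dsr),
        (5, _) => Some(Register::Data),
        (7, false) => Some(Register::Dir),
        (7, true) => Some(Register::Ccr),
        // Offset 6 (0x3F6) is left to the IDE controller.
        _ => None,
    }
}
```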
|
|
||||||||
| ## Limitations and deviations | ||||||||
|
|
||||||||
| The real 82077AA supports four drives; OpenVMM supports one. The | ||||||||
| emulator implements a pragmatic subset of the command set — enough for | ||||||||
| MS-DOS, Windows, and Linux floppy drivers to detect the controller, | ||||||||
| identify media, and perform read/write/format operations. Commands that | ||||||||
| interact with physical media timing (perpendicular recording mode, | ||||||||
| power management) are accepted but largely treated as no-ops. | ||||||||
|
|
||||||||
| Key differences from real hardware: | ||||||||
|
|
||||||||
| - No multi-drive support (real hardware supports drives 0–3). | ||||||||
| - Physical media timing (step rate, head load/unload from SPECIFY) is | ||||||||
| accepted but doesn't affect I/O timing. | ||||||||
| - CHS-to-LBA translation is straightforward — the controller doesn't | ||||||||
| emulate track-level interleave or skew. | ||||||||
| - STATUS_A and STATUS_B registers return fixed values rather than reflecting physical drive state. | ||||||||
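For the CHS-to-LBA bullet in the list above, the conventional formula with no interleave or skew is `LBA = (C * heads + H) * SPT + (S - 1)`, where sectors are numbered from 1. The sketch below applies that formula; `chs_to_lba` is a hypothetical helper, and the bounds checks are an assumption of this sketch rather than a description of the emulator's error handling.

```rust
/// Convert a cylinder/head/sector address to a linear block address.
/// Returns `None` when the head or sector is out of range for the geometry.
fn chs_to_lba(
    cylinder: u64,
    head: u64,
    sector: u64,
    heads: u64,
    sectors_per_track: u64,
) -> Option<u64> {
    // Sector numbers start at 1; heads start at 0.
    if sector == 0 || sector > sectors_per_track || head >= heads {
        return None;
    }
    Some((cylinder * heads + head) * sectors_per_track + (sector - 1))
}
```

On a 1.44 MB image (2 heads, 18 sectors per track), the last addressable sector is cylinder 79, head 1, sector 18, which maps to LBA 2879 out of 2880 sectors.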
|
||||||||
| - STATUS_A and STATUS_B registers return fixed values rather than reflecting physical drive state. | |
| - STATUS_A and STATUS_B registers return fixed values | |
| rather than reflecting physical drive state. |
These bullets are long prose lines; please wrap to ~80 characters per the style guide to keep diffs readable (tables/code blocks are exempt).