[DRAFT] Support new IGVMAgent contract with Skip HW Unsealing signal#2921
[DRAFT] Support new IGVMAgent contract with Skip HW Unsealing signal#2921mingweishih wants to merge 13 commits intomicrosoft:mainfrom
Conversation
Signed-off-by: Ming-Wei Shih <mishih@microsoft.com>
Signed-off-by: Ming-Wei Shih <mishih@microsoft.com>
This reverts commit 7bdcc4d.
|
This PR modifies files containing For more on why we check whole files, instead of just diffs, check out the Rustonomicon |
There was a problem hiding this comment.
Pull request overview
This PR adds support for the skip_hw_unsealing signal in the new IGVMAgent contract. When the IGVM agent fails a key release request with the skip_hw_unsealing bit set in the response, OpenHCL should skip the hardware unsealing fallback path entirely instead of attempting it. This is important for security scenarios where the agent explicitly signals that hardware unsealing should not be attempted.
Changes:
- Adds the
skip_hw_unsealingsignal to the IGVM attestation protocol (IgvmCapabilityBitMapandIgvmSignalbitfields inget.rs) - Implements handling in
underhill_attestationto detect and propagate theskip_hw_unsealingsignal through the key release flow, blocking hardware unsealing fallback and logging aDEK_HARDWARE_UNSEALING_SKIPPEDevent - Adds per-VM test agent infrastructure in
test_igvm_agent_rpc_server, replacing the shared singleton with aHashMap-keyed registry, plus new integration and unit tests for the new contract
Reviewed changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
openhcl/openhcl_attestation_protocol/src/igvm_attest/get.rs |
Adds skip_hw_unsealing bit to IgvmCapabilityBitMap and IgvmSignal protocol structs |
openhcl/underhill_attestation/src/igvm_attest/mod.rs |
Advertises the new capability bit in requests; parses and propagates skip_hw_unsealing from responses; adds unit tests |
openhcl/underhill_attestation/src/secure_key_release.rs |
Logs and propagates the skip_hw_unsealing flag from key release error responses |
openhcl/underhill_attestation/src/lib.rs |
Extracts skip_hw_unsealing signal and blocks hardware unsealing fallback when set; adds unit test init_sec_secure_key_release_skip_hw_unsealing |
vm/devices/get/get_protocol/src/lib.rs |
Adds DEK_HARDWARE_UNSEALING_SKIPPED = 23 to EventLogId enum |
vm/devices/get/get_resources/src/lib.rs |
Adds KeyReleaseFailureSkipHwUnsealing and KeyReleaseFailure to IgvmAttestTestConfig enum |
vm/devices/get/test_igvm_agent_lib/src/lib.rs |
Adds RespondFailureSkipHwUnsealing action and corresponding plans for new test configs |
vm/devices/get/test_igvm_agent_rpc_server/src/rpc/igvm_agent.rs |
Replaces singleton agent with per-VM registry keyed by VM name; adds config_for_vm_name pattern matching |
vm/devices/get/test_igvm_agent_rpc_server/src/rpc/handlers.rs |
Passes VM name to process_igvm_attest for per-VM routing |
vm/devices/get/test_igvm_agent_rpc_server/src/main.rs |
Adds new KeyReleaseFailureSkipHwUnsealing and KeyReleaseFailure CLI test config variants |
vmm_tests/vmm_tests/tests/tests/multiarch/tpm.rs |
Adds four new integration tests: ak_cert_cache, ak_cert_retry, skip_hw_unseal, and use_hw_unseal |
vm/devices/get/test_igvm_agent_rpc_server/src/rpc/igvm_agent.rs
Outdated
Show resolved
Hide resolved
| tracing::info!("first boot"); | ||
|
|
||
| agent.reboot().await?; | ||
|
|
||
| tracing::info!("second boot"); | ||
|
|
||
| let host_binary_path = tpm_guest_tests_artifact.get(); | ||
| let tpm_guest_tests = | ||
| TpmGuestTests::send_tpm_guest_tests(&agent, host_binary_path, guest_binary_path, os_flavor) | ||
| .await?; |
There was a problem hiding this comment.
After calling agent.reboot().await?, the current PipetteClient instance is no longer guaranteed to be usable (it only sends the reboot request; it does not wait for reset/reconnect). The subsequent send_tpm_guest_tests and AK cert reads are likely to fail against a stale connection. Update this flow to wait for the VM reset and obtain a fresh agent (e.g., make vm/agent mutable and call agent = vm.wait_for_reset().await? after reboot) before continuing.
| /// | ||
| /// The substring must be short enough to survive Hyper-V's 100-char name | ||
| /// limit even on the longest image prefixes (~85 chars). Keep patterns | ||
| /// ≤ 15 characters. | ||
| /// |
There was a problem hiding this comment.
config_for_vm_name documents that patterns must be short (≤ 15 chars) to survive Hyper-V VM name truncation, but the current matching strategy uses long substrings that include the image name and other prefix components. Because petri::vm::make_vm_safe_name truncates long VM names to 100 chars, these long patterns are likely to be truncated away and never match, causing the wrong/default config to be applied. Prefer matching only on the (short) test function name portion (e.g., "ak_cert_retry", "ak_cert_cache", "skip_hw_unseal", "use_hw_unseal") or another guaranteed-to-survive unique suffix.
| "ubuntu_2504_server_x64_ak_cert_retry", | ||
| IgvmAttestTestConfig::AkCertRequestFailureAndRetry, | ||
| ), | ||
| ( | ||
| "windows_datacenter_core_2022_x64_ak_cert_retry", | ||
| IgvmAttestTestConfig::AkCertRequestFailureAndRetry, | ||
| ), | ||
| ( | ||
| "ubuntu_2504_server_x64_vbs_ak_cert_retry", | ||
| IgvmAttestTestConfig::AkCertRequestFailureAndRetry, | ||
| ), | ||
| ( | ||
| "windows_datacenter_core_2025_x64_prepped_vbs_ak_cert_retry", | ||
| IgvmAttestTestConfig::AkCertRequestFailureAndRetry, | ||
| ), | ||
| ( | ||
| "ubuntu_2504_server_x64_snp_ak_cert_retry", | ||
| IgvmAttestTestConfig::AkCertRequestFailureAndRetry, | ||
| ), | ||
| ( | ||
| "windows_datacenter_core_2025_x64_prepped_snp_ak_cert_retry", |
There was a problem hiding this comment.
Example of the truncation risk: these KNOWN_TEST_CONFIGS patterns are far longer than the ≤15-character guideline stated above, so they are unlikely to be present in Hyper-V’s truncated (100-char) VM names. Shorten these patterns to something that is guaranteed to appear near the end of the VM name (typically just the test fn name suffix).
| "ubuntu_2504_server_x64_ak_cert_retry", | |
| IgvmAttestTestConfig::AkCertRequestFailureAndRetry, | |
| ), | |
| ( | |
| "windows_datacenter_core_2022_x64_ak_cert_retry", | |
| IgvmAttestTestConfig::AkCertRequestFailureAndRetry, | |
| ), | |
| ( | |
| "ubuntu_2504_server_x64_vbs_ak_cert_retry", | |
| IgvmAttestTestConfig::AkCertRequestFailureAndRetry, | |
| ), | |
| ( | |
| "windows_datacenter_core_2025_x64_prepped_vbs_ak_cert_retry", | |
| IgvmAttestTestConfig::AkCertRequestFailureAndRetry, | |
| ), | |
| ( | |
| "ubuntu_2504_server_x64_snp_ak_cert_retry", | |
| IgvmAttestTestConfig::AkCertRequestFailureAndRetry, | |
| ), | |
| ( | |
| "windows_datacenter_core_2025_x64_prepped_snp_ak_cert_retry", | |
| "ak_cert_retry", | |
| IgvmAttestTestConfig::AkCertRequestFailureAndRetry, | |
| ), | |
| ( | |
| "ak_cert_retry", | |
| IgvmAttestTestConfig::AkCertRequestFailureAndRetry, | |
| ), | |
| ( | |
| "ak_cert_retry", | |
| IgvmAttestTestConfig::AkCertRequestFailureAndRetry, | |
| ), | |
| ( | |
| "ak_cert_retry", | |
| IgvmAttestTestConfig::AkCertRequestFailureAndRetry, | |
| ), | |
| ( | |
| "ak_cert_retry", | |
| IgvmAttestTestConfig::AkCertRequestFailureAndRetry, | |
| ), | |
| ( | |
| "ak_cert_retry", |
| } else if self.plan.is_some() { | ||
| // A plan is installed but has no more actions for this request type. | ||
| // Return NoResponse to avoid silently falling back to normal mode. | ||
| tracing::info!(?request.header.request_type, "Plan exhausted: NoResponse"); | ||
| (vec![], 0) |
There was a problem hiding this comment.
The new else if self.plan.is_some() branch makes the agent return NoResponse for any request type that is not explicitly present in the plan (or whose queue is empty). This breaks existing tests that install a plan for only one request type and rely on default behavior for others (e.g., plans that only override WRAPPED_KEY_REQUEST or only KEY_RELEASE_REQUEST). Adjust the logic to distinguish between "no entry for this request type" (should fall back to default behavior) vs "entry exists but queue exhausted" (could reasonably become NoResponse).
| } | ||
| } | ||
| Ok(other) => { | ||
| anyhow::bail!("Expected Reset or VM start failure, got {other:?}"); |
There was a problem hiding this comment.
In the Ok(other) branch, the test bails out without tearing down the VM. Since PetriVm does not have a Drop teardown, an unexpected halt reason could leak a VM/resource in the test environment. Consider tearing down the VM before returning the error (or restructuring to ensure teardown happens on all branches).
| anyhow::bail!("Expected Reset or VM start failure, got {other:?}"); | |
| let error = anyhow::anyhow!( | |
| "Expected Reset or VM start failure, got {other:?}" | |
| ); | |
| vm.teardown().await?; | |
| return Err(error); |
No description provided.