Skip to content

[DRAFT] Support new IGVMAgent contract with Skip HW Unsealing signal#2921

Open
mingweishih wants to merge 13 commits intomicrosoft:mainfrom
mingweishih:skip_hw_sealing
Open

[DRAFT] Support new IGVMAgent contract with Skip HW Unsealing signal#2921
mingweishih wants to merge 13 commits intomicrosoft:mainfrom
mingweishih:skip_hw_sealing

Conversation

@mingweishih
Copy link
Contributor

No description provided.

Signed-off-by: Ming-Wei Shih <mishih@microsoft.com>
Signed-off-by: Ming-Wei Shih <mishih@microsoft.com>
Signed-off-by: Ming-Wei Shih <mishih@microsoft.com>
Signed-off-by: Ming-Wei Shih <mishih@microsoft.com>
@mingweishih mingweishih requested a review from a team as a code owner March 10, 2026 06:44
Copilot AI review requested due to automatic review settings March 10, 2026 06:44
@github-actions github-actions bot added the unsafe Related to unsafe code label Mar 10, 2026
@github-actions
Copy link

⚠️ Unsafe Code Detected

This PR modifies files containing unsafe Rust code. Extra scrutiny is required during review.

For more on why we check whole files, instead of just diffs, check out the Rustonomicon

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds support for the skip_hw_unsealing signal in the new IGVMAgent contract. When the IGVM agent fails a key release request with the skip_hw_unsealing bit set in the response, OpenHCL should skip the hardware unsealing fallback path entirely instead of attempting it. This is important for security scenarios where the agent explicitly signals that hardware unsealing should not be attempted.

Changes:

  • Adds the skip_hw_unsealing signal to the IGVM attestation protocol (IgvmCapabilityBitMap and IgvmSignal bitfields in get.rs)
  • Implements handling in underhill_attestation to detect and propagate the skip_hw_unsealing signal through the key release flow, blocking hardware unsealing fallback and logging a DEK_HARDWARE_UNSEALING_SKIPPED event
  • Adds per-VM test agent infrastructure in test_igvm_agent_rpc_server, replacing the shared singleton with a HashMap-keyed registry, plus new integration and unit tests for the new contract

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
openhcl/openhcl_attestation_protocol/src/igvm_attest/get.rs Adds skip_hw_unsealing bit to IgvmCapabilityBitMap and IgvmSignal protocol structs
openhcl/underhill_attestation/src/igvm_attest/mod.rs Advertises the new capability bit in requests; parses and propagates skip_hw_unsealing from responses; adds unit tests
openhcl/underhill_attestation/src/secure_key_release.rs Logs and propagates the skip_hw_unsealing flag from key release error responses
openhcl/underhill_attestation/src/lib.rs Extracts skip_hw_unsealing signal and blocks hardware unsealing fallback when set; adds unit test init_sec_secure_key_release_skip_hw_unsealing
vm/devices/get/get_protocol/src/lib.rs Adds DEK_HARDWARE_UNSEALING_SKIPPED = 23 to EventLogId enum
vm/devices/get/get_resources/src/lib.rs Adds KeyReleaseFailureSkipHwUnsealing and KeyReleaseFailure to IgvmAttestTestConfig enum
vm/devices/get/test_igvm_agent_lib/src/lib.rs Adds RespondFailureSkipHwUnsealing action and corresponding plans for new test configs
vm/devices/get/test_igvm_agent_rpc_server/src/rpc/igvm_agent.rs Replaces singleton agent with per-VM registry keyed by VM name; adds config_for_vm_name pattern matching
vm/devices/get/test_igvm_agent_rpc_server/src/rpc/handlers.rs Passes VM name to process_igvm_attest for per-VM routing
vm/devices/get/test_igvm_agent_rpc_server/src/main.rs Adds new KeyReleaseFailureSkipHwUnsealing and KeyReleaseFailure CLI test config variants
vmm_tests/vmm_tests/tests/tests/multiarch/tpm.rs Adds four new integration tests: ak_cert_cache, ak_cert_retry, skip_hw_unseal, and use_hw_unseal

x
Signed-off-by: Ming-Wei Shih <mishih@microsoft.com>
@github-actions
Copy link

x
Signed-off-by: Ming-Wei Shih <mishih@microsoft.com>
@github-actions
Copy link

Signed-off-by: Ming-Wei Shih <mishih@microsoft.com>
@github-actions
Copy link

Signed-off-by: Ming-Wei Shih <mishih@microsoft.com>
x
Signed-off-by: Ming-Wei Shih <mishih@microsoft.com>
@github-actions
Copy link

x
Signed-off-by: Ming-Wei Shih <mishih@microsoft.com>
Signed-off-by: Ming-Wei Shih <mishih@microsoft.com>
Copilot AI review requested due to automatic review settings March 11, 2026 19:58
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 12 out of 12 changed files in this pull request and generated 5 comments.

Comment on lines +401 to +410
tracing::info!("first boot");

agent.reboot().await?;

tracing::info!("second boot");

let host_binary_path = tpm_guest_tests_artifact.get();
let tpm_guest_tests =
TpmGuestTests::send_tpm_guest_tests(&agent, host_binary_path, guest_binary_path, os_flavor)
.await?;
Copy link

Copilot AI Mar 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After calling agent.reboot().await?, the current PipetteClient instance is no longer guaranteed to be usable (it only sends the reboot request; it does not wait for reset/reconnect). The subsequent send_tpm_guest_tests and AK cert reads are likely to fail against a stale connection. Update this flow to wait for the VM reset and obtain a fresh agent (e.g., make vm/agent mutable and call agent = vm.wait_for_reset().await? after reboot) before continuing.

Copilot uses AI. Check for mistakes.
Comment on lines +59 to +63
///
/// The substring must be short enough to survive Hyper-V's 100-char name
/// limit even on the longest image prefixes (~85 chars). Keep patterns
/// ≤ 15 characters.
///
Copy link

Copilot AI Mar 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

config_for_vm_name documents that patterns must be short (≤ 15 chars) to survive Hyper-V VM name truncation, but the current matching strategy uses long substrings that include the image name and other prefix components. Because petri::vm::make_vm_safe_name truncates long VM names to 100 chars, these long patterns are likely to be truncated away and never match, causing the wrong/default config to be applied. Prefer matching only on the (short) test function name portion (e.g., "ak_cert_retry", "ak_cert_cache", "skip_hw_unseal", "use_hw_unseal") or another guaranteed-to-survive unique suffix.

Copilot uses AI. Check for mistakes.
Comment on lines +72 to +92
"ubuntu_2504_server_x64_ak_cert_retry",
IgvmAttestTestConfig::AkCertRequestFailureAndRetry,
),
(
"windows_datacenter_core_2022_x64_ak_cert_retry",
IgvmAttestTestConfig::AkCertRequestFailureAndRetry,
),
(
"ubuntu_2504_server_x64_vbs_ak_cert_retry",
IgvmAttestTestConfig::AkCertRequestFailureAndRetry,
),
(
"windows_datacenter_core_2025_x64_prepped_vbs_ak_cert_retry",
IgvmAttestTestConfig::AkCertRequestFailureAndRetry,
),
(
"ubuntu_2504_server_x64_snp_ak_cert_retry",
IgvmAttestTestConfig::AkCertRequestFailureAndRetry,
),
(
"windows_datacenter_core_2025_x64_prepped_snp_ak_cert_retry",
Copy link

Copilot AI Mar 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Example of the truncation risk: these KNOWN_TEST_CONFIGS patterns are far longer than the ≤15-character guideline stated above, so they are unlikely to be present in Hyper-V’s truncated (100-char) VM names. Shorten these patterns to something that is guaranteed to appear near the end of the VM name (typically just the test fn name suffix).

Suggested change
"ubuntu_2504_server_x64_ak_cert_retry",
IgvmAttestTestConfig::AkCertRequestFailureAndRetry,
),
(
"windows_datacenter_core_2022_x64_ak_cert_retry",
IgvmAttestTestConfig::AkCertRequestFailureAndRetry,
),
(
"ubuntu_2504_server_x64_vbs_ak_cert_retry",
IgvmAttestTestConfig::AkCertRequestFailureAndRetry,
),
(
"windows_datacenter_core_2025_x64_prepped_vbs_ak_cert_retry",
IgvmAttestTestConfig::AkCertRequestFailureAndRetry,
),
(
"ubuntu_2504_server_x64_snp_ak_cert_retry",
IgvmAttestTestConfig::AkCertRequestFailureAndRetry,
),
(
"windows_datacenter_core_2025_x64_prepped_snp_ak_cert_retry",
"ak_cert_retry",
IgvmAttestTestConfig::AkCertRequestFailureAndRetry,
),
(
"ak_cert_retry",
IgvmAttestTestConfig::AkCertRequestFailureAndRetry,
),
(
"ak_cert_retry",
IgvmAttestTestConfig::AkCertRequestFailureAndRetry,
),
(
"ak_cert_retry",
IgvmAttestTestConfig::AkCertRequestFailureAndRetry,
),
(
"ak_cert_retry",
IgvmAttestTestConfig::AkCertRequestFailureAndRetry,
),
(
"ak_cert_retry",

Copilot uses AI. Check for mistakes.
Comment on lines +458 to +462
} else if self.plan.is_some() {
// A plan is installed but has no more actions for this request type.
// Return NoResponse to avoid silently falling back to normal mode.
tracing::info!(?request.header.request_type, "Plan exhausted: NoResponse");
(vec![], 0)
Copy link

Copilot AI Mar 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new else if self.plan.is_some() branch makes the agent return NoResponse for any request type that is not explicitly present in the plan (or whose queue is empty). This breaks existing tests that install a plan for only one request type and rely on default behavior for others (e.g., plans that only override WRAPPED_KEY_REQUEST or only KEY_RELEASE_REQUEST). Adjust the logic to distinguish between "no entry for this request type" (should fall back to default behavior) vs "entry exists but queue exhausted" (could reasonably become NoResponse).

Copilot uses AI. Check for mistakes.
}
}
Ok(other) => {
anyhow::bail!("Expected Reset or VM start failure, got {other:?}");
Copy link

Copilot AI Mar 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the Ok(other) branch, the test bails out without tearing down the VM. Since PetriVm does not have a Drop teardown, an unexpected halt reason could leak a VM/resource in the test environment. Consider tearing down the VM before returning the error (or restructuring to ensure teardown happens on all branches).

Suggested change
anyhow::bail!("Expected Reset or VM start failure, got {other:?}");
let error = anyhow::anyhow!(
"Expected Reset or VM start failure, got {other:?}"
);
vm.teardown().await?;
return Err(error);

Copilot uses AI. Check for mistakes.
x
Signed-off-by: Ming-Wei Shih <mishih@microsoft.com>
@github-actions
Copy link

@github-actions
Copy link

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

unsafe Related to unsafe code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants