Updated `sample` semantics by anpaz · Pull Request #4418 · NVIDIA/cuda-quantum

anpaz · 2026-04-30T04:06:26Z

Summary

Fixes #4153 by updating the measurement result semantics of cudaq::sample / cudaq.sample: sampled bitstrings now follow the user's measurement order when the kernel contains measurements. Kernels with no measurements keep the current behavior of returning a bitstring based on allocation order.

What Changed

- Updated default sample semantics:
  - Kernels with no measurements use implicit final sampling over allocated qubits.
  - Kernels with measurements normally return __global__ bitstrings in measurement/program order.
  - Terminal mz / mx / my measurements that already follow allocation order remain allocation-order compatible.
  - explicit_measurements=False is rejected when it would change returned bitstrings.
- Added conservative MLIR measurement analysis and metadata to record when a kernel requires explicit measurement-order semantics.
- Updated Python sample / sample_async default explicit_measurements to auto mode via None.
- Kept C++ sample_options::explicit_measurements as a bool, but documented it as a deprecated compatibility option.
- Preserved named measurement registers in sample_result where available for compatibility.
- Updated docs to describe the result contract in terms of outcome semantics rather than implementation fast paths.
- Moved the QIR Base Profile measurement-order verifier into shared verifier code so both runtime and cudaq-translate paths can use it.
- Updated affected tests and added coverage for measurement-order defaults, allocation-order-compatible measurements, rejected legacy requests, no-measurement kernels, Python sync/async, and C++ target tests.

Note: explicit_measurements is now deprecated as a user-facing semantic switch. Users should rely on the default sample behavior: kernels with measurements return results in measurement order when that affects the bitstring, while kernels without measurements use implicit final allocation-order sampling. The flag remains mainly as a compatibility/backend capability mechanism: explicit_measurements=False requests legacy allocation-order behavior and is only accepted when it does not change the returned outcomes, while targets that cannot support measurement-order sampling can reject kernels that require it.
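
For readers skimming the rules above, the result contract can be modeled as a small decision function. This is a toy sketch, not cudaq code: the function name `resolve_sample_order`, its parameters, and the returned strings are hypothetical, and `request_legacy=True` stands in for passing explicit_measurements=False. The case of explicit_measurements=True is deliberately left out because the description does not spell it out.

```python
def resolve_sample_order(
    has_measurements: bool,
    order_changes_bitstring: bool,
    request_legacy: bool = False,
) -> str:
    """Toy model of the sample result contract described in this PR.

    All names here are illustrative assumptions, not part of cudaq.
    `request_legacy=True` models a caller passing explicit_measurements=False.
    """
    if not has_measurements:
        # No measurements: implicit final sampling over allocated qubits.
        return "allocation_order"
    if request_legacy:
        # Legacy request is accepted only if it cannot change the bitstrings.
        if order_changes_bitstring:
            raise ValueError(
                "explicit_measurements=False would change the returned bitstrings"
            )
        return "allocation_order"
    # Auto mode: measurement order when it affects the bitstring; otherwise
    # the allocation-order-compatible result is the same bitstring.
    return "measurement_order" if order_changes_bitstring else "allocation_order"
```

In this model, a kernel whose terminal measurements already follow allocation order resolves to "allocation_order" in both auto and legacy modes, which is why the legacy flag is harmless there and rejected everywhere else.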

Signed-off-by: Andres Paz <andresp@nvidia.com>

github-actions · 2026-04-30T16:21:18Z

CUDA Quantum Docs Bot: A preview of the documentation can be found here.

schweitzpgi

In general, the compiler makes no guarantee to maintain qubits or maintain their relative order. The correct approach is to tag measurement operations with identifiers (StringAttr) and post-process them as needed.
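
The tag-and-post-process approach suggested here might look roughly like the following. It is a minimal pure-Python sketch under stated assumptions: each shot is represented as a dict from a measurement tag to its bit, and the helper `assemble_bitstrings` and the tags `"a"` and `"b"` are hypothetical, not taken from the CUDA-Q code base.

```python
from collections import Counter
from typing import Dict, Iterable, List


def assemble_bitstrings(
    shots: Iterable[Dict[str, int]], order: List[str]
) -> Counter:
    """Build a histogram of bitstrings from per-shot tagged results.

    `order` lists the measurement tags in the order the caller wants
    them to appear in each bitstring (hypothetical helper).
    """
    counts: Counter = Counter()
    for shot in shots:
        counts["".join(str(shot[tag]) for tag in order)] += 1
    return counts


# Three illustrative shots of a kernel with two tagged measurements.
shots = [{"a": 0, "b": 1}, {"a": 0, "b": 1}, {"a": 1, "b": 0}]

# The same data can be presented in either order after the fact.
by_measurement = assemble_bitstrings(shots, ["b", "a"])
by_allocation = assemble_bitstrings(shots, ["a", "b"])
```

Because the ordering is applied after execution, it does not depend on the compiler preserving qubits or their relative order, which is the property the comment asks for.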

schweitzpgi · 2026-04-30T19:17:58Z

+/// useful after transformations such as measurement expansion, loop unrolling,
+/// and allocation combining, which can expose a more precise measurement shape
+/// than was available earlier in the pipeline.
+void addQuakeMetadataRefresh(mlir::OpPassManager &pm);


We already have several "metadata" passes. Do we need yet another one?

schweitzpgi · 2026-04-30T19:19:02Z

    pm.addNestedPass<func::FuncOp>(cudaq::opt::createQuakeAddDeallocs());
-    pm.addNestedPass<func::FuncOp>(cudaq::opt::createQuakeAddMetadata());
-    pm.addPass(cudaq::opt::createQuakePropagateMetadata());
+    cudaq::opt::addQuakeMetadataRefresh(pm);


It appears "no" is the answer to my question.

schweitzpgi · 2026-04-30T19:23:59Z

+ * the terms of the Apache License 2.0 which accompanies this distribution.    *
+ ******************************************************************************/
+
+#pragma once


This appears to be in the wrong place.

Is it a pure analysis? It appears to be. Why is it entirely implemented in a header file? Why not provide an API? It looks to be used from both the runtime and the new "refresh metadata" pass. Why?

khalatepradnya

Thanks for working on this. I see that the default sample semantics have been updated so measurement order is preserved automatically when needed, and that seems like the right user-facing direction.

One thing I am still not fully understanding: does this PR address the original performance concern in #4153? My reading is that kernels requiring explicit measurement-order semantics will still use explicitMeasurements internally, and for non-Stim local simulators that do not support buffered explicit sampling, that path still appears to execute one shot at a time. So the default result would now be semantically correct, but the explicit-measurements performance issue may still remain for targets like nvidia / other non-Stim local simulators.

Am I missing something?

github-actions · 2026-05-05T00:07:17Z

CI Summary — ❌ failed

Run #25346790486 · trigger push · ✅ 5 · ⏩ 7 · ❌ 1 · ⛔ 0

❌ Failed or cancelled

Job	Result	Link
`build_and_test`	❌ failure	view

Top-level jobs (13)

Job	Result
`binaries`	⏩ skipped
`build_and_test`	❌ failure
`config_devdeps`	✅ success
`config_source_build`	⏩ skipped
`config_wheeldeps`	✅ success
`devdeps`	✅ success
`docker_image`	⏩ skipped
`gen_code_coverage`	⏩ skipped
`metadata`	✅ success
`python_metapackages`	⏩ skipped
`python_wheels`	⏩ skipped
`source_build`	⏩ skipped
`wheeldeps`	✅ success

⏩ Skipped jobs (7) — intentionally skipped on PR builds; run on merge_group / workflow_dispatch

Job
`binaries`
`config_source_build`
`docker_image`
`gen_code_coverage`
`python_metapackages`
`python_wheels`
`source_build`

All sub-jobs (50) — every matrix leg, with links

Job	Status	Link
Build and test (amd64, clang16, openmpi) / Dev environment (Debug)	❌ failure	view
Build and test (amd64, clang16, openmpi) / Dev environment (Python)	✅ success	view
Build and test (amd64, gcc11, openmpi) / Dev environment (Debug)	❌ failure	view
Build and test (amd64, gcc11, openmpi) / Dev environment (Python)	✅ success	view
Build and test (amd64, gcc12, openmpi) / Dev environment (Debug)	❌ failure	view
Build and test (amd64, gcc12, openmpi) / Dev environment (Python)	✅ success	view
Build and test (arm64, clang16, openmpi) / Dev environment (Debug)	❌ failure	view
Build and test (arm64, clang16, openmpi) / Dev environment (Python)	✅ success	view
CI Summary	❔ in_progress	view
Configure build (devdeps)	✅ success	view
Configure build (source_build)	⏩ skipped	view
Configure build (wheeldeps)	✅ success	view
Create CUDA Quantum installer	⏩ skipped	view
Create Docker images	⏩ skipped	view
Create Python metapackages	⏩ skipped	view
Create Python wheels	⏩ skipped	view
Gen code coverage	⏩ skipped	view
Load dependencies (amd64, clang16) / Caching	✅ success	view
Load dependencies (amd64, clang16) / Finalize	✅ success	view
Load dependencies (amd64, clang16) / Metadata	✅ success	view
Load dependencies (amd64, gcc11) / Caching	✅ success	view
Load dependencies (amd64, gcc11) / Finalize	✅ success	view
Load dependencies (amd64, gcc11) / Metadata	✅ success	view
Load dependencies (amd64, gcc12) / Caching	✅ success	view
Load dependencies (amd64, gcc12) / Finalize	✅ success	view
Load dependencies (amd64, gcc12) / Metadata	✅ success	view
Load dependencies (arm64, clang16) / Caching	✅ success	view
Load dependencies (arm64, clang16) / Finalize	✅ success	view
Load dependencies (arm64, clang16) / Metadata	✅ success	view
Load dependencies (arm64, gcc11) / Caching	✅ success	view
Load dependencies (arm64, gcc11) / Finalize	✅ success	view
Load dependencies (arm64, gcc11) / Metadata	✅ success	view
Load dependencies (arm64, gcc12) / Caching	✅ success	view
Load dependencies (arm64, gcc12) / Finalize	✅ success	view
Load dependencies (arm64, gcc12) / Metadata	✅ success	view
Load source build cache	⏩ skipped	view
Load wheel dependencies (amd64, 12.6) / Caching	✅ success	view
Load wheel dependencies (amd64, 12.6) / Finalize	✅ success	view
Load wheel dependencies (amd64, 12.6) / Metadata	✅ success	view
Load wheel dependencies (amd64, 13.0) / Caching	✅ success	view
Load wheel dependencies (amd64, 13.0) / Finalize	✅ success	view
Load wheel dependencies (amd64, 13.0) / Metadata	✅ success	view
Load wheel dependencies (arm64, 12.6) / Caching	✅ success	view
Load wheel dependencies (arm64, 12.6) / Finalize	✅ success	view
Load wheel dependencies (arm64, 12.6) / Metadata	✅ success	view
Load wheel dependencies (arm64, 13.0) / Caching	✅ success	view
Load wheel dependencies (arm64, 13.0) / Finalize	✅ success	view
Load wheel dependencies (arm64, 13.0) / Metadata	✅ success	view
Prepare cache clean-up	✅ success	view
Retrieve PR info	✅ success	view

⚠️ Required checks (4/8) — 4 missing — declared in .github/required-checks.yml for push

Required check	Status	Link
Build and test (amd64, clang16, openmpi) / Dev environment (Debug)	❌ failure	view
Build and test (amd64, clang16, openmpi) / Dev environment (Python)	✅ success	view
Build and test (amd64, gcc11, openmpi) / Dev environment (Debug)	❌ failure	view
Build and test (amd64, gcc11, openmpi) / Dev environment (Python)	✅ success	view
Build and test (amd64, gcc12, openmpi) / Dev environment (Debug)	❌ failure	view
Build and test (amd64, gcc12, openmpi) / Dev environment (Python)	✅ success	view
Build and test (arm64, clang16, openmpi) / Dev environment (Debug)	❌ failure	view
Build and test (arm64, clang16, openmpi) / Dev environment (Python)	✅ success	view

anpaz added 2 commits April 29, 2026 16:43

sample semantics enforced measurement order

80b9ab8

Signed-off-by: Andres Paz <andresp@nvidia.com>

sample semantics enforced measurement order

ef46475

Signed-off-by: Andres Paz <andresp@nvidia.com>

anpaz marked this pull request as draft April 30, 2026 04:07

anpaz requested a review from khalatepradnya April 30, 2026 16:10

github-actions Bot pushed a commit that referenced this pull request Apr 30, 2026

Docs preview for PR #4418.

35b9a0f

schweitzpgi reviewed Apr 30, 2026


Clean up test, changes

5818b6a

anpaz marked this pull request as ready for review May 1, 2026 03:52

khalatepradnya reviewed May 1, 2026


khalatepradnya added the `breaking change` label (Change breaks backwards compatibility) May 1, 2026

Merge branch 'main' into issue-4153

70a1160

anpaz wants to merge 4 commits into NVIDIA:main from anpaz:issue-4153
