Updated sample semantics#4418
Conversation
Signed-off-by: Andres Paz <andresp@nvidia.com>
Signed-off-by: Andres Paz <andresp@nvidia.com>
|
CUDA Quantum Docs Bot: A preview of the documentation can be found here. |
| /// useful after transformations such as measurement expansion, loop unrolling, | ||
| /// and allocation combining, which can expose a more precise measurement shape | ||
| /// than was available earlier in the pipeline. | ||
| void addQuakeMetadataRefresh(mlir::OpPassManager &pm); |
There was a problem hiding this comment.
We already have several "metadata" passes. Do we need yet another one?
| pm.addNestedPass<func::FuncOp>(cudaq::opt::createQuakeAddDeallocs()); | ||
| pm.addNestedPass<func::FuncOp>(cudaq::opt::createQuakeAddMetadata()); | ||
| pm.addPass(cudaq::opt::createQuakePropagateMetadata()); | ||
| cudaq::opt::addQuakeMetadataRefresh(pm); |
There was a problem hiding this comment.
It appears "no" is the answer to my question.
| * the terms of the Apache License 2.0 which accompanies this distribution. * | ||
| ******************************************************************************/ | ||
|
|
||
| #pragma once |
There was a problem hiding this comment.
This appears to be in the wrong place.
Is it a pure analysis? It appears to be. Why is it entirely implemented in a header file? Why not provide an API? It looks to be used from both the runtime and the new "refresh metadata" pass. Why?
There was a problem hiding this comment.
Thanks for working on this. I see that the default sample semantics have been updated so measurement order is preserved automatically when needed, and that seems like the right user-facing direction.
One thing I am still not fully understanding: does this PR address the original performance concern in #4153? My reading is that kernels requiring explicit measurement-order semantics will still use explicitMeasurements internally, and for non-Stim local simulators that do not support buffered explicit sampling, that path still appears to execute one shot at a time. So the default result would now be semantically correct, but the explicit-measurements performance issue may still remain for targets like nvidia / other non-Stim local simulators.
Am I missing something?
CI Summary — ❌ failedRun #25346790486 · trigger ❌ Failed or cancelled
Top-level jobs (13)
⏩ Skipped jobs (7) — intentionally skipped on PR builds; run on merge_group / workflow_dispatch
All sub-jobs (50) — every matrix leg, with links
|
| Required check | Status | Link |
|---|---|---|
| Build and test (amd64, clang16, openmpi) / Dev environment (Debug) | ❌ failure | view |
| Build and test (amd64, clang16, openmpi) / Dev environment (Python) | ✅ success | view |
| Build and test (amd64, gcc11, openmpi) / Dev environment (Debug) | ❌ failure | view |
| Build and test (amd64, gcc11, openmpi) / Dev environment (Python) | ✅ success | view |
| Build and test (amd64, gcc12, openmpi) / Dev environment (Debug) | ❌ failure | view |
| Build and test (amd64, gcc12, openmpi) / Dev environment (Python) | ✅ success | view |
| Build and test (arm64, clang16, openmpi) / Dev environment (Debug) | ❌ failure | view |
| Build and test (arm64, clang16, openmpi) / Dev environment (Python) | ✅ success | view |
Summary
Fixes #4153 by updating
cudaq::sample/cudaq.samplemeasurement result semantics so sampled bitstrings follow user measurement order when measurements are present; if the kernel includes no measurements, the current behavior of returning a bitstring based on the allocation order remains.What Changed
__global__bitstrings in measurement/program order.mz/mx/mymeasurements that already follow allocation order remain allocation-order compatible.explicit_measurements=Falseis rejected when it would change returned bitstrings.sample/sample_asyncdefaultexplicit_measurementsto auto mode viaNone.sample_options::explicit_measurementsas abool, but documented it as a deprecated compatibility option.sample_resultwhere available for compatibility.cudaq-translatepaths can use it.