Additional tracing and resource estimation target improvements #4432
taalexander wants to merge 8 commits into NVIDIA:main from
Conversation
Signed-off-by: Thomas Alexander <talexander@nvidia.com>
CI Summary — ✅ passed (Run #25229541107)
✅ Required checks (8/8) — declared in
| Required check | Status |
|---|---|
| Build and test (amd64, clang16, openmpi) / Dev environment (Debug) | ✅ success |
| Build and test (amd64, clang16, openmpi) / Dev environment (Python) | ✅ success |
| Build and test (amd64, gcc11, openmpi) / Dev environment (Debug) | ✅ success |
| Build and test (amd64, gcc11, openmpi) / Dev environment (Python) | ✅ success |
| Build and test (amd64, gcc12, openmpi) / Dev environment (Debug) | ✅ success |
| Build and test (amd64, gcc12, openmpi) / Dev environment (Python) | ✅ success |
| Build and test (arm64, clang16, openmpi) / Dev environment (Debug) | ✅ success |
| Build and test (arm64, clang16, openmpi) / Dev environment (Python) | ✅ success |
```yaml
jit-low-level-pipeline: "func.func(apply-control-negations,canonicalize,cse),symbol-dce"
# Target high-level stage: materialize registered custom-op matrices and
# synthesize them to gates before target lowering.
jit-high-level-pipeline: "get-concrete-matrix,unitary-synthesis"
```
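These `jit-*-pipeline` values use MLIR's textual pass-pipeline syntax: a comma-separated list of entries, where an entry like `func.func(...)` nests its passes under every function. As a rough illustration of that structure (a plain C++ sketch, not MLIR's actual parser), top-level entries can be separated by tracking bracket depth so nested commas are not split on:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Split a textual pass pipeline into its top-level entries, ignoring
// commas nested inside (...) or {...} groups. For example,
//   "func.func(apply-control-negations,canonicalize,cse),symbol-dce"
// decomposes into exactly two top-level entries.
std::vector<std::string> splitTopLevel(const std::string &pipeline) {
  std::vector<std::string> entries;
  std::string current;
  int depth = 0;
  for (char c : pipeline) {
    if (c == '(' || c == '{')
      ++depth;
    else if (c == ')' || c == '}')
      --depth;
    if (c == ',' && depth == 0) {
      entries.push_back(current);
      current.clear();
    } else {
      current += c;
    }
  }
  if (!current.empty())
    entries.push_back(current);
  return entries;
}
```

This is only to show how the strings above are shaped; the real parsing is done by MLIR when the pipeline string is handed to the pass manager.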
Before this pipeline is run, we should always run

```cpp
pm.addNestedPass<func::FuncOp>(cudaq::opt::createQuakeAddDeallocs());
pm.addNestedPass<func::FuncOp>(cudaq::opt::createQuakeAddMetadata());
pm.addPass(cudaq::opt::createQuakePropagateMetadata());
pm.addNestedPass<func::FuncOp>(cudaq::opt::createUnwindLowering());
pm.addNestedPass<func::FuncOp>(createCanonicalizerPass());
pm.addNestedPass<func::FuncOp>(cudaq::opt::createClassicalMemToReg());
cudaq::opt::createClassicalOptimizationPipeline(pm, std::nullopt,
                                                {options.allowEarlyExit});
pm.addPass(cudaq::opt::createGlobalizeArrayValues());
pm.addNestedPass<func::FuncOp>(createCanonicalizerPass());
pm.addPass(cudaq::opt::createUnitarySynthesis());
pm.addNestedPass<func::FuncOp>(createCanonicalizerPass());
pm.addPass(cudaq::opt::createApplySpecialization(
    {.constantPropagation = options.applyConstProp}));
cudaq::opt::addAggressiveInlining(pm);
```

as well as some other passes.
So unitary-synthesis is already part of that pipeline, and GetConcreteMatrix really only matters if you're using user-defined custom ops.
```yaml
# Target mid-level stage: inline synthesized helper calls, decompose to a CX
# routing basis, prepare wire metadata, optionally route, then lower routed
# two-qubit work to CZ. With no device argument, routing is bypassed.
jit-mid-level-pipeline: "apply-op-specialization,aggressive-inlining,decomposition{basis=h,rx,ry,rz,x,x(1)},func.func(add-dealloc,combine-quantum-alloc,canonicalize,factor-quantum-alloc,memtoreg),add-wireset,func.func(assign-wire-indices),qubit-mapping{device=%DEVICE:bypass%},func.func(delay-measurements,regtomem),decomposition{basis=h,rx,ry,rz,x,z(1)}"
```
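Two bits of syntax in this string are worth noting: braces attach options to a pass (e.g. `decomposition{basis=h,rx,ry,rz,x,x(1)}`), and `%DEVICE:bypass%` appears to be a placeholder that the target machinery substitutes, falling back to `bypass` (no routing) when no device argument is supplied. A hypothetical sketch of that substitution, purely for illustration (this is not the actual CUDA-Q substitution code, and the `%KEY:default%` semantics are an assumption from the comment above):

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical helper: expand %KEY:default% placeholders in a pipeline
// string. %DEVICE:bypass% becomes the supplied device argument if one was
// given, or "bypass" otherwise. Illustration only, not CUDA-Q's code.
std::string expandPlaceholders(std::string pipeline,
                               const std::map<std::string, std::string> &args) {
  size_t pos;
  while ((pos = pipeline.find('%')) != std::string::npos) {
    size_t end = pipeline.find('%', pos + 1);
    if (end == std::string::npos)
      break; // unmatched '%': leave the rest untouched
    std::string body = pipeline.substr(pos + 1, end - pos - 1);
    size_t colon = body.find(':');
    std::string key = body.substr(0, colon);
    std::string fallback =
        colon == std::string::npos ? "" : body.substr(colon + 1);
    auto it = args.find(key);
    pipeline.replace(pos, end - pos + 1,
                     it != args.end() ? it->second : fallback);
  }
  return pipeline;
}
```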
Before this pipeline, we'll run

```cpp
cudaq::opt::createClassicalOptimizationPipeline(pm);
cudaq::opt::addDecomposition(pm, {std::string("U3ToRotations")});
pm.addNestedPass<func::FuncOp>(createCanonicalizerPass());
pm.addNestedPass<func::FuncOp>(cudaq::opt::createMultiControlDecomposition());
```

The mid-level pipeline is where we normally would put custom (per-target) gate-set mappings, routings, etc. So this is good. Do you really want delay-measurements? That's sort of a weird hack pass for the IQM target.
Here, apply op specialization and inlining should have already been done in the high-level pipeline. If we're doing them here again and they are actually transforming code in some way, it would be good to have an example and understand what's wrong.
```yaml
# Target low-level cleanup stage: remove wire-set symbols left by routing prep.
jit-low-level-pipeline: "symbol-dce"
```
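For context, symbol-dce deletes symbols (functions, wire sets, etc.) that are no longer referenced from any live symbol, and it must iterate: deleting one dead symbol can remove the last reference to another. A minimal sketch of that mark-and-sweep idea in plain C++ (not MLIR's actual SymbolDCE implementation; names like `wireset_0` are made up):

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Sketch of dead-symbol elimination: mark everything reachable from the
// externally visible roots, then report the rest as dead (deletable).
std::set<std::string>
symbolDCE(std::map<std::string, std::vector<std::string>> refs, // symbol -> referenced symbols
          const std::set<std::string> &roots) {                 // externally visible symbols
  std::set<std::string> live;
  std::vector<std::string> worklist(roots.begin(), roots.end());
  while (!worklist.empty()) {
    std::string sym = worklist.back();
    worklist.pop_back();
    if (!live.insert(sym).second)
      continue; // already marked live
    for (const auto &target : refs[sym])
      worklist.push_back(target);
  }
  std::set<std::string> dead;
  for (const auto &entry : refs)
    if (!live.count(entry.first))
      dead.insert(entry.first);
  return dead;
}
```

In the pipeline above, the wire-set symbols introduced by add-wireset become exactly this kind of unreferenced leftover once routing is done, which is why symbol-dce runs at the end.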
And before this pipeline string

```cpp
pm.addNestedPass<func::FuncOp>(createCanonicalizerPass());
pm.addNestedPass<func::FuncOp>(createCSEPass());
pm.addPass(createSymbolDCEPass());
```

So, here we're re-running symbol-dce.
Unfortunately, there is a caveat: instead of simply applying the pipeline as constructed, the string-based pipelines can be mutated by the launch code in ad hoc ways, which is a bad idea.
There is another user-defined pipeline that can be specified to run after this low-level one called post-codegen-passes. I think we can ignore that one and just use low-level for our purposes here.
Follow-ups as noted by @schweitzpgi and @boschmitt.