Create performance model for reduction operators by gabeweisz · Pull Request #545 · AMD-AGI/TraceLens

gabeweisz · 2026-03-18T19:48:32Z

Fixes #533

The performance model is currently bare-bones: it assumes that O(n) operations are strictly necessary for these reductions.

It could be tuned further, but that would depend significantly on the implementation. The theoretical minimum is something like n/2 + log(n), which is unlikely to be significantly more accurate.
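As a rough illustration of the O(n) assumption above, here is a minimal sketch of how such an estimate could be computed. It is not the code from this PR: `ReduceEstimate`, `estimate_reduce`, `lower_bound_ops`, and the `bytes_per_element` parameter are hypothetical names, and the base-2 logarithm is an assumption about what log(n) means in the description.

```python
import math
from dataclasses import dataclass


@dataclass
class ReduceEstimate:
    flops: int
    bytes_moved: int


def estimate_reduce(input_shape, output_shape, bytes_per_element=2):
    """Hypothetical O(n) estimate: one operation per input element.

    Bytes moved assumes every input element is read once and every
    output element is written once, all at the same element size.
    """
    n_in = math.prod(input_shape)
    n_out = math.prod(output_shape)  # math.prod(()) == 1 for a scalar result
    flops = n_in
    bytes_moved = (n_in + n_out) * bytes_per_element
    return ReduceEstimate(flops=flops, bytes_moved=bytes_moved)


def lower_bound_ops(n):
    """The n/2 + log(n) figure from the description (log base 2 assumed)."""
    return n / 2 + math.log2(n)


# Example: reduce a (1024, 4096) 16-bit tensor along its last dimension.
est = estimate_reduce((1024, 4096), (1024,))
print(est.flops, est.bytes_moved)
print(lower_bound_ops(1024 * 4096))
```

Both figures grow linearly in n and differ by roughly a constant factor, which is consistent with the remark that the tighter bound would not change the estimate much.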

Copilot

Pull request overview

Adds a first-pass performance model and categorization for single-GPU reduction-style ATen operators, and updates regression artifacts to reflect the new modeling in generated perf reports.

Changes:

Introduce Reduce / aten_reduce perf model classes to estimate FLOPs/bytes for reduction-like ops.
Map common aten:: reduce ops to the new perf model and categorize them under a new "Reduce" category.
Add a new checked-in MI300 perf report reference .xlsx used by perf-report regression tests.
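The mapping and categorization described in the changes above could look roughly like the sketch below. Only the names `Reduce`, `aten_reduce`, and the "Reduce" category come from this overview; the dictionary `op_to_perf_model`, the function `categorize_op`, the specific op names, and the "other" fallback are assumptions made for illustration and may not match the structures in `torch_op_mapping.py`.

```python
class Reduce:
    """Placeholder for a base perf model class for reduction ops."""


class aten_reduce(Reduce):
    """Placeholder for the ATen-specific reduction perf model."""


# Hypothetical mapping from op names to perf model classes.
# The op names are illustrative; the PR may register a different set.
op_to_perf_model = {
    "aten::sum": aten_reduce,
    "aten::mean": aten_reduce,
    "aten::amax": aten_reduce,
}


def categorize_op(op_name):
    """Return "Reduce" for ops mapped to a Reduce-derived model, else "other"."""
    model_cls = op_to_perf_model.get(op_name)
    if model_cls is not None and issubclass(model_cls, Reduce):
        return "Reduce"
    return "other"


print(categorize_op("aten::sum"))
print(categorize_op("aten::mm"))
```

Keying the category on the class hierarchy rather than on the op name means that any op later mapped to a `Reduce` subclass would pick up the category without a second list to maintain.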

Reviewed changes

Copilot reviewed 2 out of 9 changed files in this pull request and generated 3 comments.

File	Description
`TraceLens/PerfModel/perf_model.py`	Adds `Reduce` and `aten_reduce` perf model implementation (FLOPs/bytes + param parsing).
`TraceLens/PerfModel/torch_op_mapping.py`	Registers reduce ops → `aten_reduce`, adds `Reduce` category and categorization branch.
`tests/traces/mi300/Qwen_Qwen1.5-0.5B-Chat__1016005_perf_report.xlsx`	New reference perf report for regression testing.

gabeweisz added 3 commits March 18, 2026 07:09

create basic perf model for reduction ops

271ddba

add bwd pass

9f2881e

update references

553b280

gabeweisz requested a review from Copilot March 18, 2026 19:48

Copilot AI reviewed Mar 18, 2026

TraceLens/PerfModel/perf_model.py (outdated, resolved)

TraceLens/PerfModel/perf_model.py (resolved)

TraceLens/PerfModel/perf_model.py (outdated, resolved)

gabeweisz and others added 3 commits March 19, 2026 09:13

Apply suggestions from code review

f9ee6ca

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

Merge branch 'main' into feat/gw_perf_model_for_reduction_ops

fff24ff

update regression tests again

a6949cc

gabeweisz added the perf_model label ("Add performance model for calculating TFLOPS/s and TB/s") Mar 19, 2026

gabeweisz added 2 commits March 20, 2026 10:01

update compare test

f4ebb29

fix test

2517301

gabeweisz marked this pull request as ready for review March 20, 2026 20:43

gabeweisz requested a review from ajassani March 20, 2026 20:43

gabeweisz wants to merge 8 commits into main from feat/gw_perf_model_for_reduction_ops

2 participants
