Slightly improve 4-bit mat mul performance by better engaging the ALU pipes through splitting the uint to float conversion operation. #15447

trivedivivek · 2025-10-29T18:09:41Z

Summary: This diff makes a slight improvement to the performance of 4-bit matrix multiplication by better utilizing the ALU pipes. This is achieved by splitting the uint to float conversion operation.

Differential Revision: D85779855
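
The shader change itself is not shown in this conversation, so the following is only a rough sketch of one possible reading of "splitting the uint to float conversion": doing the integer unpacking of the packed 4-bit weights as one stage and the int-to-float conversion as a separate stage, rather than interleaving them per element. All names here (`unpack_nibbles`, `dot_fused`, `dot_split`), the zero point of 8, and the low-nibble-first order are assumptions for illustration and are not taken from the ExecuTorch Vulkan backend. Python cannot show the ALU pipe effect; the sketch only shows that both orderings compute the same result.

```python
def unpack_nibbles(word):
    """Extract eight unsigned 4-bit values from a 32-bit word, lowest nibble first."""
    return [(word >> (4 * i)) & 0xF for i in range(8)]


def dot_fused(word, activations, zero_point=8):
    """Unpack, convert, and accumulate one element at a time (hypothetical baseline)."""
    acc = 0.0
    for i in range(8):
        # Integer shift/mask/subtract and the float conversion happen back to back.
        w = float(((word >> (4 * i)) & 0xF) - zero_point)
        acc += w * activations[i]
    return acc


def dot_split(word, activations, zero_point=8):
    """Same math, with the conversion split out as its own stage (hypothetical)."""
    # Stage 1: integer-only work (shift, mask, subtract the assumed zero point).
    ints = [n - zero_point for n in unpack_nibbles(word)]
    # Stage 2: int-to-float conversion, kept separate from the integer work.
    floats = [float(v) for v in ints]
    # Stage 3: floating-point multiply-accumulate.
    acc = 0.0
    for w, x in zip(floats, activations):
        acc += w * x
    return acc


if __name__ == "__main__":
    packed = 0x76543210  # nibbles 0..7, lowest first
    acts = [1.0] * 8
    print(dot_fused(packed, acts), dot_split(packed, acts))
```

On a GPU, separating the stages like this may give the compiler more freedom to schedule integer and floating-point instructions independently; whether that is what this PR does would need to be confirmed against the actual diff.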

pytorch-bot · 2025-10-29T18:09:44Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/15447

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEVs

There is 1 currently active SEV. If your PR is affected, please view it below:

Long queue for ROCM runners, also B200 and XPU queueing is observed

✅ No Failures

As of commit ff1b33a with merge base 7bd34b8:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

meta-codesync · 2025-10-29T18:09:50Z

@trivedivivek has exported this pull request. If you are a Meta employee, you can view the originating Diff in D85779855.

…U pipes by splitting uint to float conversion operation. (pytorch#15447) Summary: This diff makes a slight improvement to the performance of 4-bit matrix multiplication by better utilizing the ALU pipes. This is achieved by splitting the `uint` to `float` conversion operation. Differential Revision: D85779855

trivedivivek requested a review from SS-JIA as a code owner October 29, 2025 18:09

meta-cla bot added the CLA Signed label (managed by the Facebook bot; authors need to sign the CLA before a PR can be reviewed) Oct 29, 2025

meta-codesync bot added the fb-exported and meta-exported labels Oct 29, 2025

trivedivivek added the release notes: vulkan label (changes to the Vulkan backend delegate) Oct 29, 2025

trivedivivek force-pushed the export-D85779855 branch from 9db3ed8 to 35be9f2 October 30, 2025 03:04

trivedivivek force-pushed the export-D85779855 branch from 35be9f2 to 88efba0 October 30, 2025 14:18

trivedivivek added 2 commits October 30, 2025 07:18

Fix regression with 8 bit quant mat mul. (pytorch#15445)

6486ef2

Summary: This diff fixes a regression in the 8-bit quantized matrix multiplication operation by reducing the number of columns processed from 2 to 1. Differential Revision: D85767668

trivedivivek force-pushed the export-D85779855 branch from 88efba0 to ff1b33a October 30, 2025 14:19

SS-JIA approved these changes Oct 30, 2025
