
Conversation

@ZJY0516 (Contributor) commented Oct 18, 2025

Purpose

Based on #24604, this modifies the activation fusion pass to perform op matching without needing to enable the custom op.
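
In rough terms, the pass now has to recognize both of these equivalent graph forms (a minimal sketch; the custom-op call requires vLLM's compiled `_C` extension and a CUDA device):

```python
import torch
import torch.nn.functional as F

def silu_and_mul_native(x: torch.Tensor) -> torch.Tensor:
    # Native PyTorch form: the subgraph torch.compile sees when the
    # custom op is not enabled.
    gate, up = x.chunk(2, dim=-1)
    return F.silu(gate) * up

def silu_and_mul_custom(x: torch.Tensor) -> torch.Tensor:
    # Custom-op form: the single node the pass matched previously.
    # Requires vLLM's compiled _C extension.
    d = x.shape[-1] // 2
    out = torch.empty(x.shape[:-1] + (d,), dtype=x.dtype, device=x.device)
    torch.ops._C.silu_and_mul(out, x)
    return out
```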

Test Plan

pytest -s tests/compile/test_silu_mul_quant_fusion.py

Test Result


Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

@gemini-code-assist (bot) left a comment


Code Review

This pull request enables the silu_mul_fp8_quant fusion pass to work even when the silu_and_mul custom operator is not enabled, by matching against the native PyTorch implementation. This is achieved by introducing a MatcherSiluAndMul utility that can trace either the custom op or the native implementation. The changes are well-structured and the tests have been updated to cover both scenarios. My review found a minor issue in the test suite where TestSiluMulNvfp4QuantModel is not correctly handled by the new test parameterization, which would cause test failures. I've provided a suggestion to fix this by adding appropriate skip conditions.
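
The suggested guard might look something like this (a hypothetical sketch; `nvfp4_supported` stands in for whatever capability check the test suite actually uses):

```python
import pytest

def nvfp4_supported() -> bool:
    # Placeholder: the real check would query the current platform,
    # e.g. CUDA compute capability, for FP4 support.
    return False

nvfp4_case = pytest.param(
    "TestSiluMulNvfp4QuantModel",
    marks=pytest.mark.skipif(
        not nvfp4_supported(),
        reason="NVFP4 quantization is not supported on this platform",
    ),
)
```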

@ProExpertProg (Collaborator) left a comment


Looks great! For tests, could you only generate relevant tests, and then skip based on support? (Right now it's a little bit mixed up.)
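
One way to read this suggestion (a hedged sketch; the model/flag combinations below are illustrative, not the actual supported matrix):

```python
import pytest

# Build the parameter list from only the combinations each model is
# meant to cover; platform support is handled separately via skipif.
SUPPORTED_CASES = {
    "TestSiluMulFp8QuantModel": [True, False],  # custom op enabled / native
    "TestSiluMulNvfp4QuantModel": [False],      # illustrative only
}
CASES = [
    pytest.param(model, enabled, id=f"{model}-custom_op={enabled}")
    for model, flags in SUPPORTED_CASES.items()
    for enabled in flags
]

@pytest.mark.parametrize("model, enable_custom_op", CASES)
def test_silu_mul_quant_fusion(model, enable_custom_op):
    ...  # body as in tests/compile/test_silu_mul_quant_fusion.py
```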

@ZJY0516 requested a review from ProExpertProg on October 19, 2025
@ProExpertProg (Collaborator) left a comment


Great work! Could you post some E2E perf and accuracy numbers? And would you be interested in adding dynamic quant support as a follow-up?

@ProExpertProg added the `ready` label (ONLY add when PR is ready to merge/full CI is needed) on Oct 20, 2025
@ZJY0516 (Contributor, Author) commented Oct 20, 2025

> Great work! Could you post some E2E perf and accuracy numbers?

Do you know which models use silu_mul and fp8 quant?

> And would you be interested in adding dynamic quant support as a follow-up?

Sure.

@ProExpertProg (Collaborator) commented

> Do you know which models use silu_mul and fp8 quant?

silu_mul is used by basically all models, and fp8 quant is used by the FP8-quantized models. For example, you can use redhatai/meta-llama3.1-8b-instruct-fp8 or redhatai/meta-llama3.1-70B-instruct-fp8.
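
For reference, a minimal end-to-end run with the fusion pass enabled might look like this (a hedged sketch; the config field names follow vLLM's `CompilationConfig`/`PassConfig` at the time of writing and may differ across versions):

```python
from vllm import LLM
from vllm.config import CompilationConfig, PassConfig

# Load an FP8 checkpoint with the fusion pass turned on so the
# SiluAndMul + FP8-quant pattern can fuse during compilation.
llm = LLM(
    model="redhatai/meta-llama3.1-8b-instruct-fp8",
    compilation_config=CompilationConfig(
        pass_config=PassConfig(enable_fusion=True),
    ),
)
out = llm.generate("Hello, my name is")
print(out[0].outputs[0].text)
```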
