
Commit c7b06b1
[https://nvbugs/5488576][fix] Propagate disable_finalize_fusion config flag in WIDEEP MoE backend (cherry-pick #8141) (#8566)
Signed-off-by: Sergey Klevtsov <[email protected]>
Co-authored-by: Sergey Klevtsov <[email protected]>
Parent: e86d6db

File tree: 1 file changed (+3 −1 lines)

tensorrt_llm/_torch/modules/fused_moe/fused_moe_wide_ep.py
Lines changed: 3 additions & 1 deletion

@@ -222,6 +222,8 @@ def __init__(
                 f"Not available alltoall method type: {self.alltoall_method_type!r}"
             )

+        self.use_fused_finalize = not model_config.moe_disable_finalize_fusion
+
         self._weights_created = False
         if not model_config.skip_create_weights_in_init:
             self.create_weights()
@@ -689,7 +691,7 @@ def forward_chunk(
             input_sf=x_sf,
             swizzled_input_sf=False,
             min_latency_mode=False,
-            use_fused_finalize=True,
+            use_fused_finalize=self.use_fused_finalize,
             tuner_num_tokens=tuner_num_tokens,
             tuner_top_k=tuner_top_k,
         )
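The shape of the fix above is a common one: a config flag that was previously ignored (the call site hard-coded `use_fused_finalize=True`) is captured once in `__init__` and then threaded through to the kernel call. A minimal sketch of that pattern, using simplified stand-ins for `ModelConfig` and the WIDEEP MoE module (the class body and return value here are illustrative, not the real TensorRT-LLM implementation):

```python
from dataclasses import dataclass


@dataclass
class ModelConfig:
    # Mirrors the real config option; by default, finalize fusion stays enabled.
    moe_disable_finalize_fusion: bool = False


class WideEPMoE:
    """Toy stand-in for the WIDEEP MoE module."""

    def __init__(self, model_config: ModelConfig):
        # The fix: derive the flag from the config at construction time,
        # instead of hard-coding use_fused_finalize=True at the call site.
        self.use_fused_finalize = not model_config.moe_disable_finalize_fusion

    def forward_chunk(self):
        # Hypothetical kernel call; we return its kwargs so the flag
        # propagation can be inspected.
        return {"use_fused_finalize": self.use_fused_finalize}


moe = WideEPMoE(ModelConfig(moe_disable_finalize_fusion=True))
print(moe.forward_chunk())  # {'use_fused_finalize': False}
```

Capturing the flag in `__init__` rather than reading the config inside `forward_chunk` keeps the per-token hot path free of config lookups and makes the effective setting visible on the module instance.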
