[NOT for land] cleaner workaround for moe + compile + ac graph break issue #1875

danielvegamyhre · 2025-10-14T21:05:41Z

Compile the dense layers and run the MoE layer in eager mode.
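As a rough illustration of the idea (not the code in this PR), the selection step can be sketched as: walk a block's sublayers, pass everything except the MoE sublayer through a compile function, and leave the MoE sublayer untouched so it runs in eager. The `Block` class, the sublayer names (`attention`, `moe`, `feed_forward`), `compile_dense_only`, and `fake_compile` are all hypothetical stand-ins; in a PyTorch setting the compile function would presumably be `torch.compile` applied to `nn.Module` children.

```python
from typing import Any, Callable, Dict, List, Sequence, Tuple


class Block:
    """Hypothetical stand-in for a transformer block holding named sublayers."""

    def __init__(self, sublayers: Dict[str, Callable[[Any], Any]]) -> None:
        self.sublayers = dict(sublayers)

    def __call__(self, x: Any) -> Any:
        # Apply the sublayers in insertion order.
        for layer in self.sublayers.values():
            x = layer(x)
        return x


def compile_dense_only(
    block: Block,
    compile_fn: Callable[[Callable[[Any], Any]], Callable[[Any], Any]],
    eager_names: Sequence[str] = ("moe",),
) -> Tuple[List[str], List[str]]:
    """Wrap every sublayer with compile_fn except those named in eager_names."""
    compiled: List[str] = []
    eager: List[str] = []
    for name in list(block.sublayers):
        if name in eager_names:
            # Assumption: the MoE sublayer is the one that must stay in eager.
            eager.append(name)
        else:
            block.sublayers[name] = compile_fn(block.sublayers[name])
            compiled.append(name)
    return compiled, eager


def fake_compile(fn: Callable[[Any], Any]) -> Callable[[Any], Any]:
    """Placeholder for a real compiler entry point; it only tags the wrapper."""

    def wrapped(x: Any) -> Any:
        return fn(x)

    wrapped.is_compiled = True
    return wrapped


# Toy sublayers; the names are illustrative, not taken from the model code.
block = Block(
    {
        "attention": lambda x: x + 1,
        "moe": lambda x: x * 2,
        "feed_forward": lambda x: x - 3,
    }
)
compiled, eager = compile_dense_only(block, fake_compile)
```

The sketch compiles per sublayer rather than per block, so the sublayer left in eager never enters a compiled region; whether that matches the granularity used in this PR is an assumption.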

Commands

mxfp8 grouped mm:

LOG_RANK=4 NGPU=8 CONFIG_FILE="/home/${USER}/torchtitan/torchtitan/models/llama4/train_configs/llama4_17bx16e.toml" ./run_train.sh \
--metrics.log_freq=10 --training.steps=1000 \
--parallelism.data_parallel_shard_degree=4 \
--parallelism.expert_parallel_degree=4 \
--parallelism.tensor_parallel_degree=1 \
--parallelism.expert_tensor_parallel_degree=1 \
--training.seq_len=8192 \
--training.local_batch_size=12 \
--model.print_after_conversion \
--activation_checkpoint.mode="full" \
--parallelism.pipeline_parallel_degree 2 \
--parallelism.pipeline_parallel_schedule "Interleaved1F1B" \
--parallelism.pipeline_parallel_layers_per_stage 1 \
--model.converters="quantize.grouped_mm.mx,quantize.linear.mx" \
--quantize.grouped_mm.mx.fqns="experts" \
--quantize.linear.mx.filter_fqns="output,moe,wk,wv" \
--compile.enable

bf16 baseline:

LOG_RANK=4 NGPU=8 CONFIG_FILE="/home/${USER}/torchtitan/torchtitan/models/llama4/train_configs/llama4_17bx16e.toml" ./run_train.sh \
--metrics.log_freq=10 --training.steps=1000 \
--parallelism.data_parallel_shard_degree=4 \
--parallelism.expert_parallel_degree=4 \
--parallelism.tensor_parallel_degree=1 \
--parallelism.expert_tensor_parallel_degree=1 \
--training.seq_len=8192 \
--training.local_batch_size=12 \
--model.print_after_conversion \
--activation_checkpoint.mode="full" \
--parallelism.pipeline_parallel_degree 2 \
--parallelism.pipeline_parallel_schedule "Interleaved1F1B" \
--parallelism.pipeline_parallel_layers_per_stage 1 \
--compile.enable

danielvegamyhre requested review from fegin, tianyu-l, wconstab and wwwjn as code owners October 14, 2025 21:05

meta-cla bot added the CLA Signed label Oct 14, 2025

cleaner workaround for moe + compile + ac graph break issue

90844fc

danielvegamyhre force-pushed the cleanhack branch from 07096f4 to 90844fc Compare October 14, 2025 21:15

run moe in eager

c0d3966
