Minor fixes in simple_fsdp experiments (pytorch#1853)
pytorch#1850 removed the `name` field from `TrainSpec`. The simple_fsdp experiments must be updated to match; otherwise they fail to run.

pytorch#1776 added a `use_flex_attn` argument to `apply_non_moe_tp()`, which was missing in the simple_fsdp experiments.
```
NGPU=8 CONFIG_FILE=./torchtitan/models/llama3/train_configs/debug_model.toml ./run_train.sh --model.name simple_fsdp.llama3 --compile.enable
```
```
NGPU=8 CONFIG_FILE=./torchtitan/models/deepseek_v3/train_configs/debug_model.toml ./run_train.sh --model.name simple_fsdp.deepseek_v3 --compile.enable
```