kshitij12345
Collaborator

@kshitij12345 kshitij12345 commented Oct 2, 2025

Changes:

  • Refactors the MoE implementation from test_networks.py into a separate file.
  • Adds a parallelization plan for the MoE using DTensor and checks that it runs correctly in eager mode.
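
For readers unfamiliar with tensor parallelism, the idea behind such a plan can be sketched in plain Python. This is not the code from this PR, which uses DTensor. It assumes a common scheme in which an expert MLP's first weight is split by columns and its second by rows, with the partial outputs summed, and it checks that the sharded result matches the unsharded one. All function names and values below are hypothetical.

```python
# Illustrative sketch only: a plain-Python stand-in for a tensor-parallel
# expert MLP. The column/row split is an assumed scheme, not taken from the PR.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def relu(m):
    """Element-wise max(0, v)."""
    return [[max(0.0, v) for v in row] for row in m]

def add(a, b):
    """Element-wise sum of two equally shaped matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def expert_mlp(x, w1, w2):
    """Unsharded reference: relu(x @ w1) @ w2."""
    return matmul(relu(matmul(x, w1)), w2)

def split_columns(w, parts):
    """Split a matrix into `parts` equal column blocks."""
    step = len(w[0]) // parts
    return [[row[i * step:(i + 1) * step] for row in w] for i in range(parts)]

def split_rows(w, parts):
    """Split a matrix into `parts` equal row blocks."""
    step = len(w) // parts
    return [w[i * step:(i + 1) * step] for i in range(parts)]

def tensor_parallel_mlp(x, w1, w2, world_size):
    """Each simulated rank holds a column shard of w1 and a row shard of w2."""
    w1_shards = split_columns(w1, world_size)
    w2_shards = split_rows(w2, world_size)
    partials = [matmul(relu(matmul(x, a)), b) for a, b in zip(w1_shards, w2_shards)]
    out = partials[0]
    for p in partials[1:]:
        out = add(out, p)  # stands in for the all-reduce across ranks
    return out

x = [[1.0, -2.0], [0.5, 3.0]]
w1 = [[1.0, -1.0, 2.0, 0.0], [0.0, 1.0, -1.0, 2.0]]
w2 = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 2.0]]

reference = expert_mlp(x, w1, w2)
sharded = tensor_parallel_mlp(x, w1, w2, world_size=2)
print(reference == sharded)
```

Because the activation is element-wise, applying it to each column shard separately gives the same hidden values as applying it to the full matrix, which is why only one reduction is needed at the end.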

Time taken by the new test: ~4s

Running 1 items in this shard

thunder/tests/distributed/test_moe.py .    [100%]

==================== 1 passed, 2 warnings in 4.10s ====================

Almost the same as #2478, except for some refactoring and that the branch lives on the lightning-thunder repo instead of a fork, to satisfy the CI.

@github-actions github-actions bot added the ci label Oct 2, 2025
@github-actions github-actions bot removed the ci label Oct 2, 2025
@kshitij12345 kshitij12345 changed the title from "[WIP] MoE TensorParallel with Eager" to "MoE TensorParallel with Eager" Oct 2, 2025
@kshitij12345 kshitij12345 marked this pull request as ready for review October 2, 2025 13:28
Collaborator

@crcrpar crcrpar left a comment

looks good to me

@kshitij12345 kshitij12345 added the DTensor label (Issues about DTensor support in Thunder) Oct 3, 2025
Collaborator

@t-vi t-vi left a comment

@kshitij12345 kshitij12345 enabled auto-merge (squash) October 3, 2025 14:53
3 participants