
fix(examples): seperate torch distributed training code to two scripts#78

Merged
jingxu9x merged 1 commit into main from fix/seperate_torch_distributed_trainig_example on Aug 11, 2025

Conversation

@jingxu9x (Contributor)

@jingxu9x jingxu9x commented Aug 7, 2025

No description provided.

@jingxu9x jingxu9x requested a review from Copilot August 7, 2025 09:55

Copilot AI left a comment


Pull Request Overview

This PR separates the PyTorch distributed training example into two distinct scripts, one per launch mechanism. The MPI-based setup is extracted into a dedicated script, while the original script is simplified to rely on PyTorch's native distributed launcher (torchrun).

  • Creates a new MPI-specific distributed training script (main_with_mpi.py)
  • Simplifies the original script to use PyTorch's native distributed environment variables
  • Removes manual distributed setup code from the main script in favor of automatic initialization
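The simplified script can rely on the environment variables that torchrun exports for every worker. A minimal sketch of that pattern (the function name and the `print` at the bottom are illustrative, not taken from the PR; in the actual `main.py` these values would feed `torch.distributed.init_process_group`):

```python
import os

def read_native_dist_env():
    """Read the standard variables set by PyTorch's native launcher (torchrun).

    torchrun exports RANK, WORLD_SIZE, and LOCAL_RANK for each worker, so
    the script needs no manual rank/size computation.
    """
    return {
        "rank": int(os.environ["RANK"]),
        "world_size": int(os.environ["WORLD_SIZE"]),
        "local_rank": int(os.environ["LOCAL_RANK"]),
    }

if __name__ == "__main__":
    cfg = read_native_dist_env()
    print(f"rank {cfg['rank']} of {cfg['world_size']} (local {cfg['local_rank']})")
```

Launched as, e.g., `torchrun --nproc_per_node=4 main.py`, each process sees its own `RANK`/`LOCAL_RANK`, which is what makes the manual setup code removable.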

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.

File | Description
advanced/pytorch-example/main_with_mpi.py | New script implementing MPI-based distributed training with OMPI environment variables
advanced/pytorch-example/main.py | Simplified to use PyTorch's native distributed launcher with standard environment variables
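The MPI variant reads the `OMPI_COMM_WORLD_*` variables that Open MPI's `mpirun` exports. A hypothetical sketch of the translation it might perform (the mapping helper and the `MASTER_ADDR`/`MASTER_PORT` defaults are assumptions for illustration, not code from the PR):

```python
# Map Open MPI's per-process variables to the names torch.distributed expects.
OMPI_TO_TORCH = {
    "OMPI_COMM_WORLD_RANK": "RANK",
    "OMPI_COMM_WORLD_SIZE": "WORLD_SIZE",
    "OMPI_COMM_WORLD_LOCAL_RANK": "LOCAL_RANK",
}

def map_ompi_env(env):
    """Derive torch-style distributed variables from an OMPI environment.

    `env` is a mapping such as os.environ. MASTER_ADDR/MASTER_PORT fall back
    to assumed single-node defaults when mpirun does not provide them.
    """
    mapped = {
        torch_name: env[ompi_name]
        for ompi_name, torch_name in OMPI_TO_TORCH.items()
    }
    mapped["MASTER_ADDR"] = env.get("MASTER_ADDR", "127.0.0.1")
    mapped["MASTER_PORT"] = env.get("MASTER_PORT", "29500")
    return mapped
```

Keeping this translation in a separate `main_with_mpi.py`, rather than branching inside one script, is what lets the torchrun version stay free of backend-specific setup.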

@jingxu9x jingxu9x merged commit c8e8610 into main Aug 11, 2025
1 check failed
@jingxu9x jingxu9x deleted the fix/seperate_torch_distributed_trainig_example branch August 11, 2025 23:30

3 participants