
Conversation

utkarshsharma1 (Collaborator)

This PR introduces critical fixes to the `psum` & `psum_scatter` microbenchmark to ensure its correctness and accuracy.

  1. Correct Input Sharding (`in_specs`) for both collective operations (see the sketch below)
    Problem: the input tensor was fully replicated (`in_specs=P(None, None)`)
    Fix: the input is now correctly sharded with `in_specs=P(None, "ici")`
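
A minimal sketch of what the fixed wiring might look like under `shard_map`, assuming a 1-D mesh whose axis is named `"ici"` (per the description above); the function names, shapes, and dtype are illustrative, not the PR's actual code:

```python
import jax
import jax.numpy as jnp
from jax.sharding import Mesh, PartitionSpec as P
from jax.experimental.shard_map import shard_map

# Hypothetical setup: a 1-D mesh over all local devices, axis named "ici".
mesh = Mesh(jax.devices(), axis_names=("ici",))

# The collective under test: an all-reduce over the "ici" axis.
def psum_op(x):
    return jax.lax.psum(x, axis_name="ici")

# The fix: shard the input's second dimension across "ici"
# (in_specs=P(None, "ici")) instead of replicating it (P(None, None)).
psum_bench = shard_map(
    psum_op,
    mesh=mesh,
    in_specs=P(None, "ici"),
    out_specs=P(None, None),  # psum leaves the result replicated
)

x = jnp.ones((1024, 1024), dtype=jnp.bfloat16)  # illustrative size
y = jax.jit(psum_bench)(x)
```

For the `psum_scatter` variant, the per-shard output is scattered rather than replicated, so `out_specs` would presumably carry the `"ici"` axis on the scattered dimension instead of `P(None, None)`.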

@chishuen chishuen requested review from chishuen and removed request for chishuen August 28, 2025 05:26
@chishuen (Collaborator)

I think `psum` and `psum_scatter` do not require sharding. In fact, it might be better to keep the matrix unsharded; otherwise, we might also need to tweak the implementation of AllGather so that the message size (the x-axis) aligns (see the sketch below).

This might be helpful: https://jax-ml.github.io/scaling-book/training/#fully-sharded-data-parallelism-fsdp
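
For contrast, a sketch of the unsharded variant suggested above, under the same assumed mesh setup: with `in_specs=P(None, None)` every device holds the full matrix, so the per-device data entering the collective stays comparable to the AllGather benchmark's x-axis. Names and shapes are illustrative.

```python
import jax
import jax.numpy as jnp
from jax.sharding import Mesh, PartitionSpec as P
from jax.experimental.shard_map import shard_map

mesh = Mesh(jax.devices(), axis_names=("ici",))

# Unsharded variant: the input is fully replicated, so each device's
# psum contribution is the entire matrix.
replicated_psum = shard_map(
    lambda x: jax.lax.psum(x, axis_name="ici"),
    mesh=mesh,
    in_specs=P(None, None),   # keep the matrix unsharded
    out_specs=P(None, None),
)

x = jnp.ones((1024, 1024), dtype=jnp.bfloat16)
y = jax.jit(replicated_psum)(x)
```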
