Add row based sharding support for FeaturedProcessedEBC #3406

iamzainhuda · 2025-10-01T18:43:35Z

Summary:
X-link: #3281

This diff adds support for row-based sharding types (TWRW, RW, GRID) to feature processors. Previously, feature processors did not support row-based sharding because they are data parallel: splitting the input across row-based shards caused the wrong feature processor weights to be accessed. In column-based and data-parallel sharding, the input is duplicated across ranks, so each rank accesses the correct weight.

Previously, the indices/buckets were calculated after the input split/distribution. To make this compatible with row-based sharding, we now calculate them before the input split/distribution, which couples the train pipeline and feature processors. For each feature, we preprocess the input and place the calculated indices in KJT.weights. This propagates the indices through the input distribution so that the final feature processing step indexes into the correct weight.
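To illustrate why the ordering matters, here is a minimal pure-Python sketch, not the code in this diff. It assumes a position-weighted processor (the weight depends on an id's position within its bag) and a simplified row-wise split that routes ids by `id % world_size`; the helper names `positions` and `row_wise_split` are hypothetical, and the real bucketization in TorchRec may differ.

```python
from typing import Dict, List, Tuple


def positions(bag: List[int]) -> List[int]:
    """Position of each id within the bag it is currently in (index into the weights)."""
    return list(range(len(bag)))


def row_wise_split(
    bag: List[int], carried: List[int], world_size: int
) -> List[Tuple[List[int], List[int]]]:
    """Toy row-wise split: send each id to rank `id % world_size` (an assumption),
    carrying one payload value per id, the way KJT.weights travels with the ids."""
    shards: List[Tuple[List[int], List[int]]] = [([], []) for _ in range(world_size)]
    for idx, payload in zip(bag, carried):
        ids, payloads = shards[idx % world_size]
        ids.append(idx)
        payloads.append(payload)
    return shards


position_weights = [1.0, 0.5, 0.25, 0.125]  # one weight per position in the bag
bag = [7, 2, 9, 4]
world_size = 2

# Old order: split first, then compute positions on each rank's local slice.
post_split: Dict[int, float] = {}
for ids, _ in row_wise_split(bag, bag, world_size):
    for idx, pos in zip(ids, positions(ids)):
        post_split[idx] = position_weights[pos]

# New order: compute positions on the full bag, then carry them through the split.
pre_split: Dict[int, float] = {}
for ids, carried in row_wise_split(bag, positions(bag), world_size):
    for idx, pos in zip(ids, carried):
        pre_split[idx] = position_weights[pos]

# Ground truth: each id paired with the weight for its position in the full bag.
expected = dict(zip(bag, position_weights))
```

In this toy setup `pre_split` matches `expected`, while `post_split` does not, because each rank only sees its local slice and restarts positions from zero.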

This applies in both pipelined and non-pipelined situations: the input modification is done either at the pipelined forward call or in the input dist of the FPEBC, depending on the pipelining flag set through rewrite_model in the train pipeline.
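The dispatch described above can be sketched roughly as follows. This is a toy model under stated assumptions, not the actual TorchRec classes: `ToyFPEBC`, `ToyPipelinedForward`, `is_pipelined`, and `preprocess` are all hypothetical names chosen for illustration. The point is only that the preprocessing runs exactly once, at whichever site the flag selects.

```python
from typing import List, Optional, Tuple


class ToyFPEBC:
    """Hypothetical stand-in for a feature-processed EBC."""

    def __init__(self) -> None:
        self.is_pipelined = False  # hypothetical flag; flipped by the pipeline wrapper
        self.preprocess_calls = 0

    def preprocess(self, bag: List[int]) -> List[int]:
        """Compute per-id indices on the full, unsplit input."""
        self.preprocess_calls += 1
        return list(range(len(bag)))

    def input_dist(
        self, bag: List[int], indices: Optional[List[int]] = None
    ) -> List[Tuple[int, int]]:
        """Non-pipelined path preprocesses here; pipelined path expects indices."""
        if not self.is_pipelined:
            indices = self.preprocess(bag)
        assert indices is not None, "pipelined forward must supply indices"
        return list(zip(bag, indices))


class ToyPipelinedForward:
    """Hypothetical stand-in for the forward installed when the model is rewritten."""

    def __init__(self, module: ToyFPEBC) -> None:
        module.is_pipelined = True  # mimics the flag being set during model rewrite
        self.module = module

    def __call__(self, bag: List[int]) -> List[Tuple[int, int]]:
        indices = self.module.preprocess(bag)  # preprocessing moves to this call site
        return self.module.input_dist(bag, indices)


eager = ToyFPEBC()
eager_out = eager.input_dist([7, 2])

piped = ToyFPEBC()
piped_out = ToyPipelinedForward(piped)([7, 2])
```

Both paths produce the same id/index pairs, and each module records a single preprocessing call.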

Differential Revision: D82248545

facebook-github-bot · 2025-10-01T18:43:59Z

@iamzainhuda has exported this pull request. If you are a Meta employee, you can view the originating Diff in D82248545.

meta-codesync · 2025-10-22T21:16:07Z

@iamzainhuda has exported this pull request. If you are a Meta employee, you can view the originating Diff in D82248545.

meta-cla bot added the CLA Signed label Oct 1, 2025

facebook-github-bot added fb-exported meta-exported labels Oct 1, 2025

iamzainhuda force-pushed the export-D82248545 branch from 57a557c to 14ca280 Compare October 16, 2025 20:40

iamzainhuda force-pushed the export-D82248545 branch from 14ca280 to 9f3b29f Compare October 22, 2025 21:16

iamzainhuda force-pushed the export-D82248545 branch from 9f3b29f to f77dcf3 Compare October 23, 2025 18:53
