@Raahul46

Summary:
X-link: https://github.com/facebookresearch/FBGEMM/pull/2065

Context:
Currently, RocksDB stores data in row-wise format. To enable optimizer offloading for the kernel, we will append the optimizer state to its corresponding row.

When optimizer offloading is enabled, initialization needs to randomly initialize the weights, while the optimizer values need to be initialized to zero.

In this diff:

We add two new arguments:

  1. enable_optimizer_offloading: This flag toggles whether the last optimizer_D columns of each row are initialized to zero.
  2. optimizer_D: The number of columns in the table that need to be initialized to zero. This set of columns represents the optimizer values (with or without padding).

Scenarios:

  1. enable_optimizer_offloading is False:
    max_D = dimension of weights only (w_D)
    optimizer_D = 0

  2. enable_optimizer_offloading is True:
    max_D = dimension of weights (w_D) + dimension of optimizer values (o_D)
    optimizer_D = dimension of optimizer values (o_D)
    The last o_D columns are initialized to zero.
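
The two scenarios above can be illustrated with a minimal pure-Python sketch of a single row. This is not the code in this diff: `init_row` is a hypothetical helper, the uniform range of [-0.01, 0.01] is an assumption (the summary only says the weights are initialized randomly), and placing the optimizer columns at the end of the row follows the description of appending the optimizer state to its row.

```python
import random


def init_row(max_D, optimizer_D=0, enable_optimizer_offloading=False, rng=None):
    """Hypothetical helper: build one row of width max_D.

    The first (max_D - optimizer_D) columns hold randomly initialized
    weights; the last optimizer_D columns hold zeroed optimizer state.
    """
    if not 0 <= optimizer_D <= max_D:
        raise ValueError("optimizer_D must be between 0 and max_D")
    rng = rng or random.Random()
    # Without offloading, the whole row is treated as weights.
    zero_cols = optimizer_D if enable_optimizer_offloading else 0
    weight_cols = max_D - zero_cols
    # Assumption: uniform values in [-0.01, 0.01]; the real range may differ.
    weights = [rng.uniform(-0.01, 0.01) for _ in range(weight_cols)]
    return weights + [0.0] * zero_cols


# Scenario 1: offloading disabled, max_D = w_D, optimizer_D = 0.
row_weights_only = init_row(max_D=6)

# Scenario 2: offloading enabled, max_D = w_D + o_D with w_D = 6, o_D = 2.
row_with_optimizer = init_row(max_D=8, optimizer_D=2,
                              enable_optimizer_offloading=True)
```

In the second call the row has eight columns, of which the last two are zero. Keeping `optimizer_D` separate from `max_D` lets the caller account for any padding in the optimizer columns without changing how the weight columns are sized.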

Differential Revision: D85157732

@netlify

netlify bot commented Oct 25, 2025

Deploy Preview for pytorch-fbgemm-docs ready!

Name Link
🔨 Latest commit b7c013a
🔍 Latest deploy log https://app.netlify.com/projects/pytorch-fbgemm-docs/deploys/68fc4f660335b800087dfca9
😎 Deploy Preview https://deploy-preview-5055--pytorch-fbgemm-docs.netlify.app

@meta-cla meta-cla bot added the cla signed label Oct 25, 2025
@meta-codesync

meta-codesync bot commented Oct 25, 2025

@Raahul46 has exported this pull request. If you are a Meta employee, you can view the originating Diff in D85157732.
