feat: DDP gradient bucketing #92
Merged
+1,050
−15
kilinchange
requested changes
Nov 20, 2025
19c9727 to
ed1a608
Contributor
Author
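The original stream wait logic was wrong: the compute stream waited on done_event immediately after each bucket's allreduce call, so communication and computation effectively did not overlap at all. The wait is now deferred until allreduce has been launched for every bucket. This requires calling the wait operation provided by Work, and Work now offers two variants, WaitBlocking and WaitNonBlocking. The former is a CPU-side cudaEventSynchronize, which matches what torch provides; the latter is a cudaStreamWaitEvent, which only inserts a wait point into the stream and does not block execution on the CPU side.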
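To make the ordering change concrete, here is a minimal CPU-only sketch that records the order in which operations would be enqueued. It is not the code from this PR and makes no CUDA calls. Only the names Work, WaitBlocking, and WaitNonBlocking come from the comment above; `Trace`, `LaunchAllReduce`, `EagerWait`, `DeferredWait`, and the trace strings are hypothetical and exist only for illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Mock trace of enqueued operations; no GPU work is performed.
using Trace = std::vector<std::string>;

// Hypothetical stand-in for the Work handle returned by an allreduce launch.
class Work {
public:
    Work(int bucket, Trace* trace) : bucket_(bucket), trace_(trace) {}

    // In the PR this is described as a host-side cudaEventSynchronize:
    // the CPU thread blocks until the collective has finished.
    void WaitBlocking() {
        trace_->push_back("host_sync(" + std::to_string(bucket_) + ")");
    }

    // In the PR this is described as cudaStreamWaitEvent: it only inserts
    // a dependency into the stream and returns to the CPU immediately.
    void WaitNonBlocking() {
        trace_->push_back("stream_wait(" + std::to_string(bucket_) + ")");
    }

private:
    int bucket_;
    Trace* trace_;
};

// Hypothetical helper: records the backward compute that fills a bucket,
// then the allreduce launch for that bucket.
Work LaunchAllReduce(int bucket, Trace* trace) {
    trace->push_back("backward(" + std::to_string(bucket) + ")");
    trace->push_back("allreduce(" + std::to_string(bucket) + ")");
    return Work(bucket, trace);
}

// Old ordering: the wait follows each launch, so later backward compute
// is queued behind the wait and cannot overlap with communication.
Trace EagerWait(int num_buckets) {
    Trace trace;
    for (int b = 0; b < num_buckets; ++b) {
        Work work = LaunchAllReduce(b, &trace);
        work.WaitNonBlocking();
    }
    return trace;
}

// New ordering: launch every bucket first, then insert all waits.
Trace DeferredWait(int num_buckets) {
    Trace trace;
    std::vector<Work> pending;
    for (int b = 0; b < num_buckets; ++b) {
        pending.push_back(LaunchAllReduce(b, &trace));
    }
    for (Work& work : pending) {
        work.WaitNonBlocking();
    }
    return trace;
}
```

In the eager trace, `backward(1)` is queued after `stream_wait(0)`, which models the loss of overlap described above. In the deferred trace, every `stream_wait` comes after the last `allreduce`, so the remaining backward compute is never queued behind a communication wait.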
Collaborator
kilinchange
requested changes
Nov 27, 2025
kilinchange
approved these changes
Nov 27, 2025

No description provided.