[Example] One shot all reduce #245
stack-info: PR: #245, branch: joydddd/stack/12
    @helion.jit(
        config=helion.Config(
Are we able to autotune this yet?
No, unfortunately.
What are the blockers?
We'll need support for collaborative autotuning across multiple torchrun-launched processes.
I have event-based benchmarking infra ready in #393 (autotuner/benchmarker), which reports timing results on process 0.
We need to:
- Make sure all processes benchmark the same configs in the same order. (Is there any randomization in the autotuning process?)
- Use the event-based benchmarker inside the autotuner when running in a torchrun env. (easy)
- Communicate results from process 0 to all processes, OR have process 0 make the decision and communicate the optimal config to all processes. (Through caching?)
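For illustration, here is a rough sketch of the first and third points: a seeded shuffle so every process visits configs in the same order, and process 0 picking the winner and broadcasting it. This is not Helion's autotuner API. The helper names, the `block_size` key in the usage note, and the `broadcast` callable are all hypothetical. The only real library call is `torch.distributed.broadcast_object_list`, which assumes the process group has already been initialized.

```python
import random
from typing import Any, Callable, Dict, List, Optional

Config = Dict[str, Any]  # hypothetical stand-in for a kernel config


def shared_config_order(configs: List[Config], seed: int = 0) -> List[Config]:
    """Shuffle with a fixed seed so every process gets the same order."""
    ordered = list(configs)
    random.Random(seed).shuffle(ordered)
    return ordered


def pick_best(configs: List[Config], timings_ms: List[float]) -> Config:
    """Return the config with the smallest measured time."""
    best_index = min(range(len(configs)), key=lambda i: timings_ms[i])
    return configs[best_index]


def torch_broadcast_from_rank0(obj: Optional[Config]) -> Config:
    """Broadcast a picklable object from rank 0 to all ranks.

    Assumes torch.distributed is already initialized (for example under
    torchrun after init_process_group). Imported lazily so the rest of
    this sketch runs without torch installed.
    """
    import torch.distributed as dist

    box = [obj]
    dist.broadcast_object_list(box, src=0)
    return box[0]


def agree_on_config(
    rank: int,
    configs: List[Config],
    timings_ms: Optional[List[float]],
    broadcast: Callable[[Optional[Config]], Config],
) -> Config:
    """Process 0 decides; every process returns the same config."""
    choice = pick_best(configs, timings_ms) if rank == 0 else None
    return broadcast(choice)
```

Under torchrun, each process would call something like `agree_on_config(rank, shared_config_order(configs), timings_ms, torch_broadcast_from_rank0)`, with `timings_ms` populated only on process 0. Here `configs` might be a list such as `[{"block_size": 64}, {"block_size": 128}]`. Writing the chosen config to a shared cache instead of broadcasting is the other option raised above and is not shown here.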
Stacked PRs:
[Example] One shot all reduce