-
Notifications
You must be signed in to change notification settings - Fork 69
Enable online writing to WandB in MAST jobs #628
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
src/forge/controller/launcher.py
Outdated
| "VLLM_USE_TRITON_FLASH_ATTN": "0", | ||
| "WANDB_MODE": "offline", | ||
| "WANDB_MODE": "online", | ||
| "WANDB_BASE_URL": "https://meta.wandb.io", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i wonder if you think moving these MAST controller stuff to a separate src/forge/controller/fb/ make sense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah I think we can move MastLauncher to a separate file, I can do that after this Pr
|
What's the steps to set up the team keychain for wandb? |
| if wandb_api_key is not None: | ||
| os.environ["WANDB_API_KEY"] = wandb_api_key | ||
| os.environ["WANDB_BASE_URL"] = WANDB_HOST | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should terminate the launcher if wandb_api_key is None.
Otherwise, it just delays the crash in MAST
Description
In this PR we use the secrets API to extract the WandB API key and add it to the environment.
Test Plan
From logs: