-
Notifications
You must be signed in to change notification settings - Fork 12
Feature: Support downloading model weights on-the-fly from HuggingFace (#166) #167
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Feature: Support downloading model weights on-the-fly from HuggingFace (#166) #167
Conversation
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #167 +/- ##
==========================================
- Coverage 90.83% 90.80% -0.04%
==========================================
Files 14 14
Lines 1342 1359 +17
==========================================
+ Hits 1219 1234 +15
- Misses 123 125 +2
🚀 New features to boost your workflow:
|
XkunW
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @rohan-uiuc, thanks for opening this, I left a few comments. Another thing worth considering is adding a check in the API to see if a model needs to be downloaded, and if that's the case, only allow the download if the HF cache directory env var is set, so that users wouldn't accidentally download a model to their home directory and use up all the quota
| ], | ||
| "imports": "source {src_dir}/find_port.sh", | ||
| "bind_path": f"export {CONTAINER_MODULE_NAME.upper()}_BINDPATH=${CONTAINER_MODULE_NAME.upper()}_BINDPATH,/dev,/tmp,{{model_weights_path}}{{additional_binds}}", | ||
| "bind_path": f"export {CONTAINER_MODULE_NAME.upper()}_BINDPATH=${CONTAINER_MODULE_NAME.upper()}_BINDPATH,$(echo /dev/infiniband* | sed -e 's/ /,/g'),/dev,/tmp{{model_weights_path}}{{additional_binds}}", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like an error from merging the code? The /dev directory is already binded so the extra handling for /dev/infiniband is no longer needed
| """ | ||
| launcher_script = ["\n"] | ||
|
|
||
| vllm_args_copy = self.params["vllm_args"].copy() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if this is necessary, as the model name should be parsed with launch command not part of --vllm-args
| if self.model_weights_exists | ||
| else self.params["model_name"] | ||
| ) | ||
| self.model_bind_option = ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like this member variable is never used anywhere?
PR Type
Feature
Short Description
Implements support for on-the-fly model weight downloads from HuggingFace when local model weights directory doesn't exist. This allows users to launch models without manually downloading and mounting weight directories.
The code now checks if the model weights directory exists before attempting to bind mount it. If the directory doesn't exist, it skips the bind mount and uses the model identifier from
--modelin vllm_args (or falls back to model_name). Users must pass the full HuggingFace model identifier (e.g.,Qwen/Qwen2.5-7B-Instruct) via--modelin vllm_args for automatic downloads to work.Fixes #166
Tests Added
test_generate_server_setup_singularity_no_weights: Verifies server setup doesn't include model weights path when directory doesn't existtest_generate_launch_cmd_singularity_no_local_weights: Verifies launch command uses HF model identifier when local weights are missingtest_generate_model_launch_script_singularity_no_weights: Verifies batch mode correctly handles missing model weights--modelis specified in vllm_args