Conversation
geoalgo
left a comment
LGTM, but there are some conflicts to resolve.
- task: include_base_44_ukrainian
  subset: Ukrainian

dclm-core-22:
Could you add the dataset field to either the task group or the tasks? Otherwise dataset pre-downloading (before job submission) will fail or be unavailable, and the jobs will then fail unless you have internet access on the compute nodes. You can check global-mmlu-eu or generic-multilingual for an example of how this looks.
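A hypothetical sketch of what this could look like (only `task`/`subset` appear in the diff above; the `dataset` field name and placement are assumptions, so copy the exact schema from global-mmlu-eu or generic-multilingual):

```yaml
dclm-core-22:
  dataset: org/dataset-name   # hypothetical HF dataset id; enables pre-downloading before job submission
  tasks:
    - task: include_base_44_ukrainian
      subset: Ukrainian
```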
build-push:
  needs: setup-lambda
  strategy:
We moved to SkyPilot + Lambda Labs for the container build workflow, so the actual workflows are now split between .github/workflows and .github/sky.
Thanks for the contribution and for making this work for Jupiter @harshraj172! Could you update your branch with the latest changes from main? We changed a few things about the GitHub Actions workflows (left you a comment about that) and testing.
SINGULARITY_ARGS: "--nv --contain --env PYTHONNOUSERSITE=1"
EVAL_CONTAINER_IMAGE: "lm-eval-jupiter.sif"
LIGHTEVAL_CONTAINER_IMAGE: "lighteval-jupiter.sif"
SINGULARITY_ARGS: "--nv --contain --env PYTHONNOUSERSITE=1 --env SSL_CERT_FILE=/etc/ssl/certs/ca-certificates.crt"
You'll probably also have to integrate this into the container downloading logic in oellm/utils.py::_ensure_singularity_image; otherwise it'll only download the lm-eval container.
23c87ea to 35a1f83
Some changes in this PR. @timurcarstensen @geoalgo @JeniaJitsev
Thank you @harshraj172!
Can you rebase so that we can merge to mainline?
My main comment is about not pinning a specific version just for Jupiter; otherwise it looks good to go once updated with mainline.
core.jpbl-s02-02.582022
Are those intended? If not, let's remove them.
containers/jupiter.def
uv pip install --system --break-system-packages "lm-eval==0.4.9.2" \
    "transformers>=4.43.2,<5.0.0" "datasets<4.0.0" wandb sentencepiece tiktoken accelerate
This part is tricky: we do not want to pin versions per container, as we would get different results across clusters. If you need a specific version for a fixed benchmark, the best option is a custom environment, which can be set up by following https://github.com/OpenEuroLLM/oellm-cli/blob/main/docs/VENV.md. Happy to help if there is any issue with this; I tested it and it worked on LUMI, for instance.
- uv pip install --system --break-system-packages "lm-eval==0.4.9.2" \
-     "transformers>=4.43.2,<5.0.0" "datasets<4.0.0" wandb sentencepiece tiktoken accelerate
+ uv pip install --system --break-system-packages lm-eval \
+     "transformers<=4.53.0" "datasets<4.0.0" wandb sentencepiece tiktoken accelerate
Yes, this is tricky. If we don’t pin the versions, agieval_lsat_ar will break. Do you think we should mention somewhere in docs/VENV.md that people should use a venv if they want to run all 21 tasks (otherwise it will only run 20)?
Thanks for rebasing. My only remaining point is about not pinning the version in the Jupiter container.
What you suggest is a good idea. We could also add another requirements file with those versions, named something like requirements-venv-dclm.txt (in addition to https://github.com/OpenEuroLLM/oellm-cli/blob/main/requirements-venv.txt), and add an entry in the README.md (or VENV.md) explaining how to evaluate this.
Thanks for the review. I resolved the comments.
core.jpbl-s02-02.582022
Those binaries are probably not intended; can you remove them?
uv pip install --system --break-system-packages lm-eval \
    wandb sentencepiece tiktoken accelerate
Isn't it missing datasets and some other dependencies?
uv pip install --system --break-system-packages \
lm-eval \
transformers \
"datasets<4.0.0" \
wandb \
sentencepiece \
tiktoken \
accelerate \
nltk
If so, let's merge this PR to add DCLM support and update the Jupiter container in a follow-up, since the PR has been open for a long time.
@geoalgo I found that lm-eval installs these already. Also, nltk isn't required anymore, as it was only used with lighteval earlier.
To add lighteval for DCLM on JUPITER we had to add a separate jupiter-lighteval.def, since including the installations in the same .def file for Jupiter was creating dependency conflicts: Jupiter uses ARM64, and many of lighteval's runtime dependencies (spacy, underthesea, pyvi, etc.) lack pre-built aarch64 wheels.