Transformers version 4.55 upgrade, Update PyTorch to 2.7.0+cpu, Torchvision to 0.22.0+cpu, and Python Requirement to >=3.9 #542
Conversation
# Apply the attention mask
attn_weights = torch.where(attention_mask, mask_value, attn_weights)

attn_weights = attn_weights / self.scale_attn
Why has it been moved from line 51?
It was made equivalent to the new TF code; they moved it down. Its placement, whether at line 50 or 58, won't affect performance. Should I move it back to line 50?
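For context, a minimal sketch (hypothetical shapes and values, not the repo's actual code) of why the placement is performance-neutral: masked logits end up hugely negative whether the division by self.scale_attn happens before or after masking, so the post-softmax attention weights agree.

import torch

torch.manual_seed(0)
scale_attn = 8.0
attn_weights = torch.randn(2, 4, 4)
attention_mask = torch.rand(2, 4, 4) > 0.5  # True where attention is blocked
mask_value = torch.finfo(attn_weights.dtype).min

# Variant A: mask first, then scale (the new placement)
w_a = torch.where(attention_mask, mask_value, attn_weights) / scale_attn
# Variant B: scale first, then mask (the old placement)
w_b = torch.where(attention_mask, mask_value, attn_weights / scale_attn)

# Masked entries differ (mask_value vs. mask_value / scale_attn), but both
# are so negative that softmax drives them to ~0 either way.
assert torch.allclose(w_a.softmax(dim=-1), w_b.softmax(dim=-1), atol=1e-6)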
EXTERNAL_MODELS = {
-    "hpcai-tech/grok-1",
+    "hpcai-tech/grok-1": {
nit: Do we need this?
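The diff appears to change EXTERNAL_MODELS from a plain set of model names into a dict keyed by name, presumably so per-model metadata can sit next to each entry. A hedged sketch of the difference (the value field below is hypothetical, for illustration only, not the repo's schema):

# Before: a set, good only for membership tests
EXTERNAL_MODELS = {
    "hpcai-tech/grok-1",
}

# After: a dict keyed by model name; each value can carry per-model
# settings (the field below is an assumption, not the repo's schema)
EXTERNAL_MODELS = {
    "hpcai-tech/grok-1": {
        "trust_remote_code": True,
    },
}

# Membership checks read the same either way:
assert "hpcai-tech/grok-1" in EXTERNAL_MODELS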
LGTM
# use local attention mask for ROPE layers
if self.use_chunked_attention:
    attention_mask = chunk_causal_mask
Why is this removed?
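For reference, a minimal sketch of what a chunked (local) causal mask means in the removed lines (the construction below is an assumption for illustration; chunk_causal_mask in the repo may be built differently):

import torch

def make_chunked_causal_mask(seq_len: int, chunk_size: int) -> torch.Tensor:
    """Boolean mask, True where attention is allowed: position i may
    attend to j only if j <= i and both indices fall in the same chunk."""
    idx = torch.arange(seq_len)
    causal = idx[None, :] <= idx[:, None]
    same_chunk = (idx[None, :] // chunk_size) == (idx[:, None] // chunk_size)
    return causal & same_chunk

# RoPE layers would use this local mask; global layers keep the full causal mask.
mask = make_chunked_causal_mask(seq_len=8, chunk_size=4)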
logger.warning(
    "Current version output doesn't match with HF output due to a bug in TF v_4.55. Switch to branch release/v_1.20 for TF match."
)
Put a link to this PR in the warning: https://github.com/huggingface/transformers/pull/39501/files#diff-e668ec07f78afdb2cb805d939e47453757f0b9437436cb860fcb7cb2431c9cf5R140
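A possible wording with the requested link folded into the message (branch name taken from the snippet above; trimming the URL to the bare PR link is a judgment call):

logger.warning(
    "Current version output doesn't match the HF output due to a bug in "
    "transformers v4.55 (see https://github.com/huggingface/transformers/pull/39501). "
    "Switch to branch release/v_1.20 for a TF match."
)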
LGTM, thanks 👍
…vision to 0.22.0+cpu, and Python Requirement to >=3.9 (quic#542)

Update Transformers to 4.55.0
Update PyTorch to 2.7.0+cpu, Torchvision to 0.22.0+cpu, and Python Requirement to >=3.9
Updated modeling files and Cache Utils for transformers 4.55.0

Updated models:
1. codegen
2. falcon
3. gemma
4. gemma2
5. gptj
6. gpt2
7. granite
8. granite_moe
9. grok1
10. llama
11. llama_swiftkv
12. mistral
13. mixtral_moe
14. mpt
15. phi
16. phi3
17. qwen2
18. starcoder2
19. gpt_bigcode
20. internvl
21. llava
22. llava_next
23. whisper
24. gemma3
25. llama4
26. mllama

---------

Update Qeff Documentation to indicate vLLM Support in Validated Models Page

Signed-off-by: Asmita Goswami <[email protected]>
Signed-off-by: Mamta Singh <[email protected]>
Co-authored-by: Mamta Singh <[email protected]>
Co-authored-by: Asmita Goswami <[email protected]>
Signed-off-by: Varun Gupta <[email protected]>
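A quick way to confirm an environment matches the pins named in this PR (a sketch; the repo's actual requirements layout is an assumption):

import sys

import torch
import torchvision
import transformers

assert sys.version_info >= (3, 9), "Python >= 3.9 is required"
print(transformers.__version__)  # expected: 4.55.0
print(torch.__version__)         # expected: 2.7.0+cpu
print(torchvision.__version__)   # expected: 0.22.0+cpu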