
Conversation


@quic-mamta quic-mamta commented Aug 19, 2025

- Update Transformers to 4.55.0
- Update PyTorch to 2.7.0+cpu
- Update Torchvision to 0.22.0+cpu
- Update the Python requirement to >=3.9

Updated modeling files and Cache Utils for transformers 4.55.0

Updated models:

  1. codegen
  2. falcon
  3. gemma
  4. gemma2
  5. gptj
  6. gpt2
  7. granite
  8. granite_moe
  9. grok1
  10. llama
  11. llama_swiftkv
  12. mistral
  13. mixtral_moe
  14. mpt
  15. phi
  16. phi3
  17. qwen2
  18. starcoder2
  19. gpt_bigcode
  20. internvl
  21. llava
  22. llava_next
  23. whisper
  24. gemma3
  25. llama4
  26. mllama

@quic-mamta quic-mamta changed the title Tf version 4.55 upgrade Transformers version 4.55 upgrade Aug 19, 2025
@quic-mamta quic-mamta marked this pull request as draft August 19, 2025 19:42
@asmigosw asmigosw force-pushed the TF_version_4.55_upgrade branch from d36c124 to a514d36 on September 2, 2025 08:29
@quic-mamta quic-mamta force-pushed the TF_version_4.55_upgrade branch 2 times, most recently from e15d548 to 3643fee on September 23, 2025 08:45
@quic-mamta quic-mamta marked this pull request as ready for review September 24, 2025 05:27
@quic-mamta quic-mamta requested a review from vbaddi September 24, 2025 05:27
@quic-mamta quic-mamta marked this pull request as draft September 24, 2025 19:31
@quic-mamta quic-mamta force-pushed the TF_version_4.55_upgrade branch from 69ec2a4 to 6ad267b on September 25, 2025 07:46
@quic-mamta quic-mamta changed the title Transformers version 4.55 upgrade Transformers version 4.55 upgrade, Update PyTorch to 2.7.0+cpu, Torchvision to 0.22.0+cpu, and Python Requirement to >=3.9 Sep 25, 2025
@quic-mamta quic-mamta force-pushed the TF_version_4.55_upgrade branch 3 times, most recently from dd8b38e to 940dfcf on September 26, 2025 11:44
@quic-mamta quic-mamta marked this pull request as ready for review September 26, 2025 11:44
@quic-mamta quic-mamta force-pushed the TF_version_4.55_upgrade branch 2 times, most recently from 4f44dd4 to d8cf0a1 on September 28, 2025 13:10
# Apply the attention mask
attn_weights = torch.where(attention_mask, mask_value, attn_weights)

attn_weights = attn_weights / self.scale_attn

Why has this been moved from line 51?


It was made equivalent to the new TF code; they moved it down. Its placement, whether at line 50 or 58, doesn't affect performance. Should I move it back to line 50?
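As context for the thread above: whether `attn_weights` is divided by the scale before or after the mask is applied does not change the post-softmax result, since masked positions hold a hugely negative value either way. A minimal sketch (shapes, values, and the two placements are illustrative, not the repo's exact code):

```python
import torch

torch.manual_seed(0)
attn_weights = torch.randn(1, 4, 4)
# Boolean mask: True marks positions to hide (upper triangle = future tokens).
attention_mask = torch.triu(torch.ones(4, 4, dtype=torch.bool), diagonal=1)
mask_value = torch.finfo(attn_weights.dtype).min
scale_attn = 8.0

# Placement 1: mask first, then scale.
a = torch.where(attention_mask, mask_value, attn_weights) / scale_attn
# Placement 2: scale first, then mask.
b = torch.where(attention_mask, mask_value, attn_weights / scale_attn)

# Masked logits are hugely negative in both variants, so softmax agrees
# and masked positions come out as exactly zero probability.
pa = torch.softmax(a, dim=-1)
pb = torch.softmax(b, dim=-1)
print(torch.allclose(pa, pb, atol=1e-6))  # True
```

Unmasked entries are scaled identically in both placements, and the masked logits underflow to zero under `exp` either way, which is why the ordering is a style choice rather than a numerical one.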


  EXTERNAL_MODELS = {
-     "hpcai-tech/grok-1",
+     "hpcai-tech/grok-1": {

nit: Do we need this?
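For context, the diff above turns a bare set entry into a dict entry, so per-model metadata can hang off the model ID. A minimal sketch of the two shapes (the nested config body is a hypothetical placeholder, not the repo's actual schema):

```python
# Before: a plain set of model IDs.
EXTERNAL_MODELS_AS_SET = {
    "hpcai-tech/grok-1",
}

# After: a dict keyed by model ID, leaving room for per-model config.
EXTERNAL_MODELS_AS_DICT = {
    "hpcai-tech/grok-1": {},  # per-model config would live here
}

# Membership checks read the same either way:
print("hpcai-tech/grok-1" in EXTERNAL_MODELS_AS_SET)   # True
print("hpcai-tech/grok-1" in EXTERNAL_MODELS_AS_DICT)  # True
```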

@quic-mamta quic-mamta force-pushed the TF_version_4.55_upgrade branch 3 times, most recently from 8217cb5 to 7bf2298 on October 8, 2025 10:49
mamtsing and others added 3 commits October 10, 2025 08:30
Signed-off-by: Asmita Goswami <[email protected]>
Signed-off-by: Asmita Goswami <[email protected]>
Signed-off-by: Asmita Goswami <[email protected]>
asmigosw and others added 3 commits October 10, 2025 08:31
Signed-off-by: Asmita Goswami <[email protected]>
Signed-off-by: Asmita Goswami <[email protected]>
Signed-off-by: Mamta Singh <[email protected]>
@asmigosw asmigosw force-pushed the TF_version_4.55_upgrade branch from c9fd8ea to 178650f on October 10, 2025 08:32
@quic-hemagnih quic-hemagnih left a comment

LGTM

Signed-off-by: Asmita Goswami <[email protected]>
Comment on lines -555 to -558
# use local attention mask for ROPE layers
if self.use_chunked_attention:
attention_mask = chunk_causal_mask


Why is this removed?
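For context, the removed branch swapped in a chunked (local) causal mask for RoPE layers. A chunked causal mask of that kind can be sketched as follows (function name and shapes are illustrative, not the repo's API):

```python
import torch

def chunked_causal_mask(seq_len: int, chunk_size: int) -> torch.Tensor:
    # True = position may be attended to. Each token sees earlier tokens
    # only within its own fixed-size chunk (local attention).
    idx = torch.arange(seq_len)
    causal = idx[:, None] >= idx[None, :]
    same_chunk = (idx[:, None] // chunk_size) == (idx[None, :] // chunk_size)
    return causal & same_chunk

mask = chunked_causal_mask(seq_len=8, chunk_size=4)
# A layer flagged for chunked attention would use `mask` in place of the
# full causal mask, mirroring the removed `if self.use_chunked_attention:`
# branch above.
```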

Comment on lines 722 to 724
logger.warning(
"Current version output doesn't match with HF output due to a bug in TF v_4.55. Switch to branch release/v_1.20 for TF match."
)

Signed-off-by: Asmita Goswami <[email protected]>
Signed-off-by: Asmita Goswami <[email protected]>
@vbaddi vbaddi left a comment

LGTM, thanks 👍

@quic-rishinr quic-rishinr merged commit a9e404a into quic:main Oct 14, 2025
5 checks passed
quic-vargupt pushed a commit to quic-vargupt/efficient-transformers that referenced this pull request Oct 17, 2025
…vision to 0.22.0+cpu, and Python Requirement to >=3.9 (quic#542)

---------
Update Qeff Documentation to indicate vLLM Support in Validated Models Page

Signed-off-by: Asmita Goswami <[email protected]>
Signed-off-by: Mamta Singh <[email protected]>
Co-authored-by: Mamta Singh <[email protected]>
Co-authored-by: Asmita Goswami <[email protected]>
Signed-off-by: Varun Gupta <[email protected]>
7 participants