
fix: fix awq tracing fx problem #551

Open
begumcig wants to merge 1 commit into main from fix/llmcompressor-test

Conversation

@begumcig
Member

Description

The nightly test for LLM Compressor was failing with:
torch.fx.proxy.TraceError: symbolically traced variables cannot be used as inputs to control flow

Newer versions of transformers introduced masking utilities that rely on dynamic control flow, which breaks torch.fx tracing. Our LLaMA test model hits a code path that is not guarded with is_tracing(), causing quantization to fail.
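The failure mode is easy to reproduce in isolation: torch.fx replaces tensors with symbolic Proxy objects during tracing, and branching on a Proxy's value raises exactly the TraceError quoted above. A minimal sketch (not the transformers code path itself):

```python
import torch


class Gate(torch.nn.Module):
    def forward(self, x):
        # Data-dependent control flow: during symbolic tracing, x is a
        # Proxy with no concrete value, so evaluating the `if` condition
        # raises torch.fx.proxy.TraceError.
        if x.sum() > 0:
            return x + 1
        return x - 1


try:
    torch.fx.symbolic_trace(Gate())
except torch.fx.proxy.TraceError as exc:
    print(f"TraceError: {exc}")
```

Guarding such branches with torch.fx's tracing check (as transformers does on the guarded code paths) is what keeps symbolic tracing working.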

To disable FX tracing in Oneshot, we introduced a new hyperparameter that selects the calibration pipeline for the algorithm. This avoids the tracing issue, but loading the saved model then failed because Hugging Face's lazy initialization creates meta tensors: the compression logic attempted to move these storage-less tensors to a device, which is invalid.
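The meta-tensor failure can also be reproduced standalone; this is an illustrative sketch, not the actual compression code:

```python
import torch

# Lazy initialization creates "meta" tensors: they carry shape and dtype
# but allocate no storage, so copying one to a real device is invalid.
t = torch.empty(4, 4, device="meta")
try:
    t.to("cpu")
except NotImplementedError as exc:
    print(f"cannot move meta tensor: {exc}")

# A meta tensor must be materialized (fresh storage) rather than moved:
real = torch.empty_like(t, device="cpu")
print(real.shape)
```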

There is also an upstream fix related to this in compressed-tensors:
vllm-project/compressed-tensors#376

After testing multiple combinations, the following versions are stable, so we pin them:

compressed-tensors==0.13.0
llm-compressor==0.9.0.2
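To reproduce the pinned environment locally, the equivalent install command would be (adjust to your package manager):

```shell
pip install "compressed-tensors==0.13.0" "llm-compressor==0.9.0.2"
```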

Related Issue

Fixes #(issue number)

Type of Change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

LLMCompressor test passes now.

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

Additional Notes

@begumcig begumcig requested a review from gsprochette February 24, 2026 13:13

@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Comment @cursor review or bugbot run to trigger another review on this PR

@begumcig begumcig force-pushed the fix/llmcompressor-test branch from b4e8a0f to 0d90c81 Compare February 24, 2026 13:19
@cursor

cursor bot commented Feb 24, 2026

Bugbot Autofix prepared fixes for 1 of the 1 bugs found in the latest run.

  • ✅ Fixed: New calibration_pipeline hyperparameter is never passed to oneshot
    • Added code to read calibration_pipeline from smash_config and pass it as the pipeline parameter to the oneshot() function call.

Or push these changes by commenting:

@cursor push 3e7f39ba6a
Preview (3e7f39ba6a)
diff --git a/pyproject.toml b/pyproject.toml
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -128,7 +128,7 @@
     "whisper-s2t==1.3.1",
     "hqq==0.2.7.post1",
     "torchao>=0.12.0,<0.16.0", # 0.16.0 breaks diffusers 0.36.0, torch+torch: https://github.com/pytorch/ao/issues/2919#issue-3375688762
-    "llmcompressor",
+    "llmcompressor>0.8",
     "gliner; python_version >= '3.10'",
     "piq",
     "opencv-python",
@@ -138,6 +138,7 @@
     "imageio-ffmpeg",
     "jaxtyping",
     "peft>=0.17.1",
+    "compressed-tensors >= 0.13.0"
 ]
 
 [project.optional-dependencies]

diff --git a/src/pruna/algorithms/llm_compressor.py b/src/pruna/algorithms/llm_compressor.py
--- a/src/pruna/algorithms/llm_compressor.py
+++ b/src/pruna/algorithms/llm_compressor.py
@@ -70,6 +70,12 @@
                 default_value="W4A16",
                 meta=dict(desc="Quantization scheme to use. Use symmetric quantization to avoid decompression issues."),
             ),
+            CategoricalHyperparameter(
+                "calibration_pipeline",
+                choices=["independent", "basic", "datafree", "sequential", "layer_sequential"],
+                default_value="independent",
+                meta=dict(desc="Pipeline to use for calibration."),
+            ),
             TargetModules(
                 "target_modules",
                 default_value=None,
@@ -145,6 +151,8 @@
             defaults = self.get_model_dependent_hyperparameter_defaults(model, smash_config)
             target_modules = cast(TARGET_MODULES_TYPE, defaults["target_modules"])
 
+        calibration_pipeline = smash_config["calibration_pipeline"]
+
         def quantize_language_model(
             attr_name: str | None, language_model: torch.nn.Module, subpaths: list[str]
         ) -> torch.nn.Module:
@@ -173,7 +181,9 @@
                     targets=["Linear"],
                 )
             ]
-            return imported["oneshot"](model=language_model, recipe=recipe, dataset=dataset, processor=processor)
+            return imported["oneshot"](
+                model=language_model, recipe=recipe, dataset=dataset, processor=processor, pipeline=calibration_pipeline
+            )
 
         model = map_targeted_nn_roots(quantize_language_model, model, target_modules)
         return model

diff --git a/tests/algorithms/testers/awq.py b/tests/algorithms/testers/awq.py
--- a/tests/algorithms/testers/awq.py
+++ b/tests/algorithms/testers/awq.py
@@ -14,3 +14,6 @@
     allow_pickle_files = False
     algorithm_class = LLMCompressor
     metrics = ["perplexity"]
+    hyperparameters = {
+        "awq_calibration_pipeline": "basic",
+    }

@begumcig begumcig force-pushed the fix/llmcompressor-test branch from 0d90c81 to 2ca45df Compare February 24, 2026 13:34
