Description
Now that we've dropped Python 3.9 support (#1910), we can modernize our codebase to use Python 3.10+ features for better readability and maintainability. This is a great opportunity for community contributions!
Background
Python 3.9 and 3.10 introduced several features that make code more readable and Pythonic:
- PEP 604 (Python 3.10): Union types using the | operator instead of Union[]
- PEP 585 (Python 3.9): Built-in collection types for generics (e.g., list instead of List)
- PEP 634 (Python 3.10): Structural pattern matching with match/case
Our codebase has 500+ instances across 70+ files that can benefit from these modern idioms.
Modernization Categories
1. Type Hints with | Operator
Replace Union[] with the more readable | syntax.
Example:
# Before (Python 3.9 style)
from typing import Any, Dict, List, Optional, Union

from transformers import PreTrainedModel


def oneshot(
    model: Union[str, PreTrainedModel],
    recipe: Optional[Union[str, List[str]]] = None,
    dataset: Optional[Dict[str, Any]] = None,
) -> PreTrainedModel:
    ...

# After (Python 3.10+ style)
# Postponed evaluation keeps the TYPE_CHECKING-only import safe at runtime
from __future__ import annotations

from typing import TYPE_CHECKING, Any, Optional

if TYPE_CHECKING:
    from transformers import PreTrainedModel


def oneshot(
    model: str | PreTrainedModel,
    recipe: Optional[str | list[str]] = None,
    dataset: Optional[dict[str, Any]] = None,
) -> PreTrainedModel:
    ...

Benefits:
- More concise and readable
- Aligns with modern Python style guides
- Fewer typing imports to maintain
Files affected: ~70 files, ~500 instances
2. Built-in Generic Types
Use built-in types (list, dict, tuple, set) instead of importing from typing.
Example:
# Before
from typing import Dict, List, Optional, Tuple


def process_batch(
    items: List[str],
    config: Optional[Dict[str, int]] = None,
) -> Tuple[List[str], int]:
    ...

# After
from typing import Optional


def process_batch(
    items: list[str],
    config: Optional[dict[str, int]] = None,
) -> tuple[list[str], int]:
    ...

Benefits:
- Cleaner imports
- Uses the built-in types directly instead of the typing aliases that PEP 585 deprecates
- Better IDE support
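For illustration, a small self-contained sketch of built-in generics in use. `count_lengths` is a hypothetical function, not part of the codebase; the `get_origin`/`get_args` calls only show that built-in generics can be introspected the same way as the `typing` aliases.

```python
from typing import get_args, get_origin


def count_lengths(items: list[str]) -> dict[str, int]:
    """Map each string to its length (hypothetical helper, for illustration only)."""
    return {item: len(item) for item in items}


lengths = count_lengths(["ab", "cde"])

# Built-in generics expose their origin and arguments like typing.Dict did
origin = get_origin(dict[str, int])
args = get_args(dict[str, int])
```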
3. Structural Pattern Matching
Replace if/elif isinstance chains with match/case for type-based dispatch.
Example 1: Type dispatch
# Before
def log_value(log_tag: str, log_value: Any, epoch: float):
    if isinstance(log_value, dict):
        logger_manager.log_scalars(tag=log_tag, values=log_value, step=epoch)
    elif isinstance(log_value, (int, float)):
        logger_manager.log_scalar(tag=log_tag, value=log_value, step=epoch)
    else:
        logger_manager.log_string(tag=log_tag, string=log_value, step=epoch)

# After
def log_value(log_tag: str, log_value: Any, epoch: float):
    match log_value:
        case dict():
            logger_manager.log_scalars(tag=log_tag, values=log_value, step=epoch)
        case int() | float():
            logger_manager.log_scalar(tag=log_tag, value=log_value, step=epoch)
        case _:
            logger_manager.log_string(tag=log_tag, string=log_value, step=epoch)

Example 2: Recursive type handling
# Before
def onload_value(value: Any, device: torch.device) -> Any:
    if isinstance(value, torch.Tensor):
        return value.to(device=device)
    if isinstance(value, list):
        return [onload_value(v, device) for v in value]
    if isinstance(value, tuple):
        return tuple(onload_value(v, device) for v in value)
    if isinstance(value, dict):
        return {k: onload_value(v, device) for k, v in value.items()}
    return value

# After
def onload_value(value: Any, device: torch.device) -> Any:
    match value:
        case torch.Tensor():
            return value.to(device=device)
        case list():
            return [onload_value(v, device) for v in value]
        case tuple():
            return tuple(onload_value(v, device) for v in value)
        case dict():
            return {k: onload_value(v, device) for k, v in value.items()}
        case _:
            return value

Example 3: Configuration handling
# Before
if splits is None:
    splits = {"all": None}
elif isinstance(splits, str):
    splits = {get_split_name(splits): splits}
elif isinstance(splits, list):
    splits = {get_split_name(s): s for s in splits}

# After
match splits:
    case None:
        splits = {"all": None}
    case str():
        splits = {get_split_name(splits): splits}
    case list():
        splits = {get_split_name(s): s for s in splits}

Benefits:
- More explicit type handling
- Better readability
- Easier to extend with new types
- Type checkers can provide better analysis
Files affected: ~11 files with isinstance chains
How to Contribute
This is a great opportunity for first-time contributors! Each file can be updated independently.
Getting Started
- Pick a scope: Choose a single file or small module to modernize
- Check existing PRs: Look at example PRs to see the pattern (links will be added)
- Make changes: Update type hints and/or add pattern matching
- Test thoroughly: Run make quality and relevant tests
- Submit PR: Reference this issue in your PR
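When picking a scope, it can help to see how many legacy typing imports a file contains. The following is a rough sketch using only the standard library. `find_legacy_typing_imports` and `LEGACY_NAMES` are hypothetical names introduced here, and the set of names is an assumption rather than project policy.

```python
import ast

# Assumed set of typing names that have built-in or | replacements
LEGACY_NAMES = {"Union", "List", "Dict", "Tuple", "Set"}


def find_legacy_typing_imports(source: str) -> list[tuple[int, str]]:
    """Return (line number, name) pairs for legacy names imported from typing."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ImportFrom) and node.module == "typing":
            for alias in node.names:
                if alias.name in LEGACY_NAMES:
                    hits.append((node.lineno, alias.name))
    return sorted(hits)


sample = "from typing import Any, Dict, List, Union\n\nx: Union[int, str] = 1\n"
hits = find_legacy_typing_imports(sample)
```

This only inspects import statements, so it gives a lower bound on the work in a file rather than an exact count of annotations to change.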
Example Contribution Sizes
Small PR (Good First Issue) 🟢
- 1-2 files
- 10-20 type hint changes
- ~30 minutes of work
Medium PR 🟡
- 1 module (3-5 files)
- Add pattern matching to 1-2 functions
- ~1-2 hours of work
Large PR 🔴
- Complete module modernization
- Multiple pattern matching refactors
- ~3-4 hours of work
Suggested Files to Start With
Easy (Type Hints Only):
- src/llmcompressor/args/*.py: Dataclass arguments
- src/llmcompressor/core/helpers.py: Helper functions
- src/llmcompressor/recipe/metadata.py: Recipe metadata
Medium (Type Hints + Pattern Matching):
- src/llmcompressor/core/helpers.py: _log_model_loggable_items function
- src/llmcompressor/datasets/utils.py: Dataset splits handling
- src/llmcompressor/modifiers/*/base.py: Individual modifier files
Advanced (Complex Pattern Matching):
- src/llmcompressor/pipelines/cache.py: Cache system with recursive types
- src/llmcompressor/modifiers/quantization/gptq/gptq_quantize.py: GPTQ logic
- src/llmcompressor/modifiers/pruning/sparsegpt/sgpt_base.py: SparseGPT logic
Requirements for PRs
✅ Must have:
- All make quality checks pass (ruff formatting and linting)
- Relevant tests pass (pytest tests/{module} -v)
- No functional changes (type hints/style only)
- Clean commit messages
- Reference this issue (e.g., "Part of #XXX")
✅ Nice to have:
- Updated docstrings if they reference old type syntax
- Multiple files in same module (for consistency)
- Comments explaining complex pattern matches
Testing Guidelines
# Code quality (required)
make quality
# Run tests for your changed module
pytest tests/llmcompressor/core -v # For core/* changes
pytest tests/llmcompressor/modifiers -v # For modifiers/* changes
pytest tests/llmcompressor/transformers -v # For transformers/* changes
# Quick smoke test (recommended)
pytest tests -m smoke
# Full test suite (for core/entrypoints changes)
make test

Progress Tracking
We'll update this section as PRs are merged. Track overall progress here:
Overall Progress
- Type hints: 0/513 instances modernized (0%)
- Pattern matching: 0/11 files modernized (0%)
By Module
- core/ (80 instances across 5 files)
- modifiers/ (81 instances across 15 files)
- transformers/ (42 instances across 8 files)
- entrypoints/ (33 instances across 3 files)
- metrics/ (137 instances in logger.py)
- args/ (27 instances across 4 files)
- Other modules...
Resources
Python Enhancement Proposals:
- PEP 604 - Union Types as X | Y
- PEP 634 - Structural Pattern Matching
- PEP 636 - Pattern Matching Tutorial
- PEP 585 - Type Hinting Generics
Guides:
Example PRs
We'll create 2-3 example PRs to demonstrate the patterns:
- Example 1: Type hints in a core module (simple)
- Example 2: Pattern matching in a helper function (medium)
- Example 3: Comprehensive modernization of a modifier (advanced)
Links will be added here once created.
Related Issues
- #1910 - Drop Python 3.9 support (merged)
Questions?
Feel free to ask questions in this issue! We're happy to help guide contributions.
Common questions:
- "Which file should I start with?" → Pick any file from the "Easy" list above
- "Can I mix type hints and pattern matching?" → Yes, but keep PRs focused on one module
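- "How do I handle forward references?" → Use TYPE_CHECKING blocks (see examples above), combined with from __future__ import annotations or quoted annotations so the names are not evaluated at runtime
- "What about breaking changes?" → None - these are syntax-only changes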
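To make the forward-reference answer concrete, here is a minimal sketch that uses a quoted annotation. `Decimal` is only a stand-in for a heavy or circular import, and `to_text` is a hypothetical function. The whole union must be quoted: writing `str | "Decimal"` would fail, because `|` is not defined between a type and a string.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Seen only by type checkers; never imported at runtime
    from decimal import Decimal


def to_text(amount: "str | Decimal") -> str:
    """Convert the amount to text (hypothetical helper)."""
    return str(amount)


text = to_text("1.5")
# The annotation is stored as a plain string and is not evaluated here
stored = to_text.__annotations__["amount"]
```

Placing `from __future__ import annotations` at the top of the module has the same effect for every annotation in the file, without the quotes.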
Acknowledgments
Contributors who help modernize the codebase will be acknowledged in:
- PR reviews and merges
- Release notes
- This issue (we'll maintain a contributors list)
Thank you for helping make LLM Compressor more modern and maintainable! 🚀