⚡️ Speed up function _highlight_value by 46%
#413
📄 46% (0.46x) speedup for `_highlight_value` in `pandas/io/formats/style.py`
⏱️ Runtime: 33.7 milliseconds → 23.1 milliseconds (best of 68 runs)
📝 Explanation and details
The optimized code achieves a 46% speedup (33.7 ms → 23.1 ms) through targeted changes to the pandas functions on the `_highlight_value` call path:
Key Optimizations:
- `where` Method Micro-optimization: Bound the module-level globals (`PYPY`, `REF_COUNT`) to local names to avoid repeated global lookups. While minor, this reduces overhead in the reference counting check path.
- `notna` Function Enhancement: Added an optimized path for NumPy arrays using `np.logical_not(res)` instead of the generic `~res` operator. NumPy's `logical_not` is more efficient for boolean arrays as it avoids some of the overhead associated with pandas' generic bitwise negation.
- `_highlight_value` Function - Major Performance Win: This is where the biggest gains occur (55% time reduction in the critical path). The optimization replaces the pandas `.where()` call with a direct call to NumPy's `where()`.
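The before/after snippets are not reproduced in this description, so the following is only an illustrative sketch of the kind of change described for `_highlight_value`, not the actual diff. The function names are hypothetical, and it assumes the condition mask has its missing entries filled with `False` before `np.where` selects the CSS string.

```python
import numpy as np
import pandas as pd


def highlight_with_pandas_where(data: pd.Series, props: str) -> np.ndarray:
    """Hypothetical 'before' shape: the mask is cleaned with pandas' .where()."""
    cond = data == data.max(skipna=True)
    # Fill any missing entries in the mask with False using the pandas method.
    cond = cond.where(pd.notna(cond), False)
    return np.where(cond, props, "")


def highlight_with_numpy_where(data: pd.Series, props: str) -> np.ndarray:
    """Hypothetical 'after' shape: the mask is handled as a plain NumPy array."""
    # Assumption: comparing against a scalar already yields False at NaN
    # positions, so the boolean array can be passed to np.where directly.
    cond = (data == data.max(skipna=True)).to_numpy()
    return np.where(cond, props, "")


def negate_mask(mask: np.ndarray) -> np.ndarray:
    """For boolean NumPy arrays, np.logical_not gives the same result as ~mask."""
    return np.logical_not(mask)
```

Under these assumptions both variants return the same array of CSS strings for numeric input containing NaN, so a change of this kind would alter how the mask is built rather than what is highlighted.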
Why This Works:
- The pandas `.where()` method creates intermediate pandas objects and involves complex indexing logic
- The optimized version calls NumPy's `where()` directly, which is much faster for element-wise conditional operations

Test Results Show Consistent Gains:
- `test_series_min_basic`: 304μs → 182μs
- `test_dataframe_min_basic`: 596μs → 446μs

The optimization is particularly effective for styling operations that frequently call `_highlight_value`, making conditional formatting significantly faster while preserving all existing behavior and edge case handling.

✅ Correctness verification report:
🌀 Generated Regression Tests and Runtime
To edit these changes, run `git checkout codeflash/optimize-_highlight_value-mja24afb` and push.