@codeflash-ai codeflash-ai bot commented Dec 17, 2025

📄 83% (1.83x) speedup for is_continuous_axis in keras/src/utils/python_utils.py

⏱️ Runtime : 688 microseconds → 376 microseconds (best of 172 runs)

📝 Explanation and details

The optimized code achieves an 83% speedup by eliminating redundant computation and adding an early exit.

Key optimization: The original code redundantly checked the same condition twice with separate positive_order_flag and negative_order_flag loops. Both loops were testing axis[i + 1] - axis[i] != 1, making one of them completely unnecessary since the function only needs to determine if consecutive elements differ by exactly 1.

What changed:

  • Removed the duplicate second loop that checked identical logic
  • Eliminated unnecessary flag variables and final OR operation
  • Added early exit: returns False immediately when the first non-consecutive pair is found
  • Simplified the logic to a single loop that returns True only if all consecutive differences equal 1
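The change described above can be sketched as follows. This is a hypothetical reconstruction based on the description; the exact Keras source may differ in detail:

```python
def is_continuous_axis_original(axis):
    # Original shape: two loops testing the identical condition,
    # combined with an OR at the end.
    if isinstance(axis, int):
        return True
    positive_order_flag = True
    for i in range(len(axis) - 1):
        if axis[i + 1] - axis[i] != 1:
            positive_order_flag = False
            break
    negative_order_flag = True
    for i in range(len(axis) - 1):
        if axis[i + 1] - axis[i] != 1:
            negative_order_flag = False
            break
    return positive_order_flag or negative_order_flag


def is_continuous_axis(axis):
    # Optimized shape: single pass that returns False on the
    # first non-consecutive pair, True otherwise.
    if isinstance(axis, int):
        return True
    for i in range(len(axis) - 1):
        if axis[i + 1] - axis[i] != 1:
            return False
    return True
```

Both versions accept either a single int (always continuous) or a sequence of ints; only the optimized one stops scanning at the first gap.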

Why it's faster:

  1. 50% fewer loop iterations: Only one loop instead of two identical ones
  2. Early termination: Stops processing as soon as a gap is detected, rather than checking all elements twice
  3. Reduced variable assignments: No flag variables or boolean operations needed
  4. Better cache locality: Single pass through the data instead of two
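The early-termination effect can be observed with a quick timeit comparison. This self-contained sketch re-defines the single-loop version locally (an assumption based on the description above); absolute timings will vary by machine, but a gap at the front of the list should finish far sooner than a full scan:

```python
import timeit

def is_continuous_axis(axis):
    # Single-loop version with early exit (sketch).
    if isinstance(axis, int):
        return True
    for i in range(len(axis) - 1):
        if axis[i + 1] - axis[i] != 1:
            return False
    return True

continuous = list(range(1000))              # full scan required
early_gap = [0, 2] + list(range(3, 1001))   # exits on the first pair

t_full = timeit.timeit(lambda: is_continuous_axis(continuous), number=1000)
t_gap = timeit.timeit(lambda: is_continuous_axis(early_gap), number=1000)
print(f"full scan: {t_full:.4f}s, early exit: {t_gap:.4f}s")
```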

Performance impact: Based on the function references, is_continuous_axis is called in critical neural network operations like RMS normalization and layer normalization in PyTorch backend optimizations. The test results show particularly strong gains (90%+ speedup) for large continuous arrays and arrays with early discontinuities, which are common in tensor dimension checking scenarios. This optimization will significantly benefit deep learning workloads that frequently validate tensor axis continuity during model execution.

Correctness verification report:

Test Status
⚙️ Existing Unit Tests 🔘 None Found
🌀 Generated Regression Tests 75 Passed
⏪ Replay Tests 🔘 None Found
🔎 Concolic Coverage Tests 🔘 None Found
📊 Tests Coverage 100.0%
🌀 Generated Regression Tests and Runtime
import pytest
from keras.src.utils.python_utils import is_continuous_axis

# unit tests

# =========================
# Basic Test Cases
# =========================

def test_single_int_axis():
    # Single integer should always be continuous
    codeflash_output = is_continuous_axis(0) # 501ns -> 457ns (9.63% faster)
    codeflash_output = is_continuous_axis(-1) # 213ns -> 241ns (11.6% slower)
    codeflash_output = is_continuous_axis(5) # 152ns -> 147ns (3.40% faster)

def test_single_element_list():
    # Single-element list should be continuous
    codeflash_output = is_continuous_axis([0]) # 598ns -> 556ns (7.55% faster)
    codeflash_output = is_continuous_axis([-1]) # 218ns -> 217ns (0.461% faster)
    codeflash_output = is_continuous_axis([10]) # 188ns -> 192ns (2.08% slower)

def test_two_element_continuous_axis():
    # Two elements, continuous, positive direction
    codeflash_output = is_continuous_axis([0, 1]) # 2.38μs -> 1.80μs (31.8% faster)
    codeflash_output = is_continuous_axis([1, 2]) # 835ns -> 610ns (36.9% faster)
    codeflash_output = is_continuous_axis([-2, -1]) # 633ns -> 400ns (58.2% faster)

def test_two_element_non_continuous_axis():
    # Two elements, not continuous
    codeflash_output = is_continuous_axis([0, 2]) # 1.89μs -> 1.53μs (23.3% faster)
    codeflash_output = is_continuous_axis([1, 3]) # 871ns -> 601ns (44.9% faster)
    codeflash_output = is_continuous_axis([-2, 0]) # 630ns -> 419ns (50.4% faster)

def test_multiple_element_continuous_axis():
    # Multiple elements, continuous, positive direction
    codeflash_output = is_continuous_axis([0, 1, 2, 3]) # 2.21μs -> 1.66μs (33.0% faster)
    codeflash_output = is_continuous_axis([-3, -2, -1, 0, 1]) # 1.20μs -> 820ns (45.9% faster)

def test_multiple_element_non_continuous_axis():
    # Multiple elements, not continuous
    codeflash_output = is_continuous_axis([0, 2, 3]) # 1.78μs -> 1.36μs (31.2% faster)
    codeflash_output = is_continuous_axis([1, 2, 4]) # 936ns -> 778ns (20.3% faster)
    codeflash_output = is_continuous_axis([-3, -2, 0, 1]) # 791ns -> 530ns (49.2% faster)

# =========================
# Edge Test Cases
# =========================

def test_axis_with_duplicates():
    # Duplicates should not be considered continuous
    codeflash_output = is_continuous_axis([0, 0, 1]) # 2.37μs -> 1.93μs (22.6% faster)
    codeflash_output = is_continuous_axis([1, 2, 2, 3]) # 1.03μs -> 765ns (34.8% faster)

def test_axis_with_negative_and_positive():
    # Negative to positive, continuous
    codeflash_output = is_continuous_axis([-2, -1, 0, 1, 2]) # 2.44μs -> 1.85μs (31.6% faster)
    # Negative to positive, not continuous
    codeflash_output = is_continuous_axis([-2, 0, 1]) # 960ns -> 680ns (41.2% faster)

def test_axis_reverse_order():
    # Reverse order, but not continuous
    codeflash_output = is_continuous_axis([3, 2, 1, 0]) # 1.81μs -> 1.35μs (34.1% faster)
    codeflash_output = is_continuous_axis([2, 1, 0, -1]) # 698ns -> 484ns (44.2% faster)

def test_axis_with_large_gaps():
    # Large gaps, not continuous
    codeflash_output = is_continuous_axis([0, 5, 10]) # 1.76μs -> 1.34μs (32.0% faster)
    codeflash_output = is_continuous_axis([-10, -5, 0]) # 787ns -> 534ns (47.4% faster)

def test_axis_with_non_int_type():
    # Non-integer input should raise TypeError or fail
    with pytest.raises(TypeError):
        is_continuous_axis('abc')
    with pytest.raises(TypeError):
        is_continuous_axis([1.0, 2.0, 3.0])

def test_axis_with_tuple_input():
    # Accept tuple input if it behaves like a list
    codeflash_output = is_continuous_axis((0, 1, 2)) # 2.70μs -> 2.14μs (25.9% faster)
    codeflash_output = is_continuous_axis((0, 2, 3)) # 996ns -> 702ns (41.9% faster)

def test_axis_with_large_negative_values():
    # Large negative values, but continuous
    codeflash_output = is_continuous_axis([-100, -99, -98]) # 2.09μs -> 1.54μs (35.3% faster)

# =========================
# Large Scale Test Cases
# =========================

def test_large_continuous_axis():
    # Large continuous axis (0 to 999)
    axis = list(range(1000))
    codeflash_output = is_continuous_axis(axis) # 100μs -> 51.4μs (96.3% faster)

def test_large_non_continuous_axis():
    # Large non-continuous axis (missing one element)
    axis = list(range(500)) + list(range(501, 1000))
    codeflash_output = is_continuous_axis(axis) # 49.5μs -> 25.7μs (92.9% faster)

def test_large_continuous_axis_negative():
    # Large continuous axis with negatives
    axis = list(range(-500, 500))
    codeflash_output = is_continuous_axis(axis) # 100μs -> 51.2μs (97.0% faster)

def test_large_axis_with_duplicates():
    # Large axis with a duplicate in the middle
    axis = list(range(500)) + [499] + list(range(500, 999))
    codeflash_output = is_continuous_axis(axis) # 49.9μs -> 25.5μs (95.5% faster)

def test_large_axis_reverse_order():
    # Reverse order should not be continuous
    axis = list(range(999, -1, -1))
    codeflash_output = is_continuous_axis(axis) # 1.94μs -> 1.44μs (34.7% faster)

# =========================
# Miscellaneous / Additional Edge Cases
# =========================

def test_axis_with_non_sequence_type():
    # Passing a completely invalid type
    with pytest.raises(TypeError):
        is_continuous_axis(None) # 1.36μs -> 1.41μs (3.27% slower)
    with pytest.raises(TypeError):
        is_continuous_axis({1,2,3}) # 2.09μs -> 2.04μs (2.80% faster)

def test_axis_with_length_one_tuple():
    # Tuple of length 1
    codeflash_output = is_continuous_axis((7,)) # 765ns -> 764ns (0.131% faster)

def test_axis_with_length_one_numpy_array_like():
    # Simulate array-like object with __len__ and __getitem__
    class ArrayLike:
        def __len__(self): return 1
        def __getitem__(self, i): return 42
    codeflash_output = is_continuous_axis(ArrayLike()) # 1.03μs -> 1.09μs (5.25% slower)

def test_axis_with_non_integer_elements():
    # List with a string among integers
    with pytest.raises(TypeError):
        is_continuous_axis([1, 'a', 2]) # 2.94μs -> 2.93μs (0.342% faster)

def test_axis_with_boolean_elements():
    # Boolean values are ints in Python, but test for clarity
    codeflash_output = is_continuous_axis([False, True]) # 2.08μs -> 1.65μs (26.2% faster)
    codeflash_output = is_continuous_axis([True, False]) # 1.06μs -> 789ns (35.0% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
import pytest  # used for our unit tests
from keras.src.utils.python_utils import is_continuous_axis

# unit tests

# ------------------- Basic Test Cases -------------------

def test_single_int_axis():
    # Single integer axis should always be continuous
    codeflash_output = is_continuous_axis(0) # 387ns -> 398ns (2.76% slower)
    codeflash_output = is_continuous_axis(-1) # 220ns -> 218ns (0.917% faster)
    codeflash_output = is_continuous_axis(5) # 153ns -> 158ns (3.16% slower)

def test_single_element_list():
    # Single element list should be continuous
    codeflash_output = is_continuous_axis([0]) # 587ns -> 564ns (4.08% faster)
    codeflash_output = is_continuous_axis([-1]) # 274ns -> 268ns (2.24% faster)
    codeflash_output = is_continuous_axis([10]) # 199ns -> 208ns (4.33% slower)

def test_two_element_continuous():
    # Two elements, continuous increasing
    codeflash_output = is_continuous_axis([0, 1]) # 2.20μs -> 1.74μs (26.6% faster)
    codeflash_output = is_continuous_axis([2, 3]) # 838ns -> 660ns (27.0% faster)
    # Two elements, continuous decreasing
    codeflash_output = is_continuous_axis([1, 0]) # 846ns -> 563ns (50.3% faster)

def test_three_element_continuous_increasing():
    # Three elements, continuous increasing
    codeflash_output = is_continuous_axis([1, 2, 3]) # 1.98μs -> 1.52μs (30.0% faster)
    codeflash_output = is_continuous_axis([-2, -1, 0]) # 952ns -> 689ns (38.2% faster)

def test_three_element_continuous_decreasing():
    # Three elements, continuous decreasing
    codeflash_output = is_continuous_axis([3, 2, 1]) # 1.83μs -> 1.35μs (35.2% faster)

def test_non_continuous_axis():
    # Non-continuous axis
    codeflash_output = is_continuous_axis([0, 2]) # 1.98μs -> 1.51μs (31.4% faster)
    codeflash_output = is_continuous_axis([1, 3, 4]) # 905ns -> 639ns (41.6% faster)
    codeflash_output = is_continuous_axis([0, 1, 3]) # 858ns -> 584ns (46.9% faster)

def test_mixed_sign_axis():
    # Axis with positive and negative numbers, continuous
    codeflash_output = is_continuous_axis([-1, 0, 1]) # 1.98μs -> 1.47μs (34.4% faster)
    # Non-continuous
    codeflash_output = is_continuous_axis([-1, 1, 2]) # 910ns -> 641ns (42.0% faster)

# ------------------- Edge Test Cases -------------------

def test_axis_with_duplicates():
    # Duplicates are not continuous
    codeflash_output = is_continuous_axis([0, 0]) # 2.44μs -> 1.97μs (23.9% faster)
    codeflash_output = is_continuous_axis([1, 1, 2]) # 857ns -> 610ns (40.5% faster)

def test_axis_with_large_gaps():
    # Large gaps are not continuous
    codeflash_output = is_continuous_axis([0, 5, 10]) # 1.80μs -> 1.36μs (33.0% faster)

def test_axis_with_negative_indices():
    # Continuous negative indices
    codeflash_output = is_continuous_axis([-3, -2, -1]) # 2.12μs -> 1.54μs (37.4% faster)
    # Non-continuous negative indices
    codeflash_output = is_continuous_axis([-3, -1, 0]) # 929ns -> 639ns (45.4% faster)

def test_axis_with_zero():
    # Zero should work as any other integer
    codeflash_output = is_continuous_axis([0, 1, 2]) # 2.00μs -> 1.54μs (29.9% faster)
    codeflash_output = is_continuous_axis([2, 1, 0]) # 912ns -> 645ns (41.4% faster)

def test_axis_with_unsorted_elements():
    # Unsorted but continuous elements should be False
    codeflash_output = is_continuous_axis([2, 0, 1]) # 1.78μs -> 1.31μs (36.1% faster)

# ------------------- Large Scale Test Cases -------------------

def test_large_continuous_axis():
    # Large continuous axis (increasing)
    axis = list(range(1000))
    codeflash_output = is_continuous_axis(axis) # 101μs -> 51.2μs (97.4% faster)

def test_large_non_continuous_axis():
    # Large axis with a gap in the middle
    axis = list(range(500)) + [501] + list(range(502, 1000))
    codeflash_output = is_continuous_axis(axis) # 49.7μs -> 25.7μs (93.3% faster)

def test_large_axis_with_duplicates():
    # Large axis with a duplicate in the middle
    axis = list(range(500)) + [499] + list(range(500, 999))
    codeflash_output = is_continuous_axis(axis) # 49.6μs -> 25.6μs (93.8% faster)

def test_large_axis_negative_indices():
    # Large negative continuous axis
    axis = list(range(-1000, 0))
    codeflash_output = is_continuous_axis(axis) # 100μs -> 51.0μs (97.1% faster)

def test_large_axis_reverse_order():
    # Large axis, strictly decreasing order (should be False)
    axis = list(range(999, -1, -1))
    codeflash_output = is_continuous_axis(axis) # 1.85μs -> 1.48μs (25.0% faster)

def test_large_axis_with_one_element():
    # Large value, but only one element
    axis = [999]
    codeflash_output = is_continuous_axis(axis) # 583ns -> 595ns (2.02% slower)

# ------------------- Additional Robustness Cases -------------------

def test_axis_with_none():
    # Should raise TypeError for None in axis
    with pytest.raises(TypeError):
        is_continuous_axis([None, 1]) # 3.51μs -> 3.33μs (5.59% faster)

def test_axis_as_string():
    # Passing a string should raise TypeError
    with pytest.raises(TypeError):
        is_continuous_axis("012") # 2.77μs -> 2.66μs (4.25% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

To edit these changes, run git checkout codeflash/optimize-is_continuous_axis-mja5qf5c and push.


@codeflash-ai codeflash-ai bot requested a review from mashraf-222 December 17, 2025 15:19
@codeflash-ai codeflash-ai bot added ⚡️ codeflash Optimization PR opened by Codeflash AI 🎯 Quality: High Optimization Quality according to Codeflash labels Dec 17, 2025