@codeflash-ai codeflash-ai bot commented Dec 17, 2025

📄 19% (0.19x) speedup for OpenVINOTrainer.make_test_function in keras/src/backend/openvino/trainer.py

⏱️ Runtime : 22.1 microseconds → 18.6 microseconds (best of 48 runs)

📝 Explanation and details

The optimization achieves an 18% speedup by eliminating unnecessary function call overhead and streamlining the control flow in make_test_function().

Key optimizations:

  1. Removed nested function calls: The original code created two separate inner functions (one_test_step and multi_test_steps) and then assigned one to test_step based on condition. The optimized version uses a single conditional to directly define the appropriate test_step function, eliminating the intermediate function creation overhead.

  2. Inlined simple logic: For the single-step case, the logic data = data[0]; return self.test_step(data) is kept inline rather than wrapped in a separate one_test_step function, removing one layer of function call indirection.

  3. Simplified multi-step handling: The multi-step case now directly iterates over data without the extra one_test_step wrapper function, reducing function call overhead per iteration during test execution.
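The three changes above can be illustrated with a minimal, self-contained sketch contrasting the two shapes. This is a hypothetical paraphrase based on the description, not the actual Keras source; `SketchTrainer` and its `test_step` are stand-ins for the real trainer.

```python
class SketchTrainer:
    def __init__(self, steps_per_execution=1):
        self.steps_per_execution = steps_per_execution
        self.test_function = None

    def test_step(self, data):
        # Stand-in for the real per-batch evaluation step.
        return {"seen": data}

    def make_test_function_original(self, force=False):
        # Original shape: two inner functions, one assigned afterwards.
        if self.test_function is not None and not force:
            return self.test_function

        def one_test_step(data):
            data = data[0]
            return self.test_step(data)

        def multi_test_steps(data):
            for single_step_data in data:
                logs = one_test_step([single_step_data])  # extra call layer
            return logs

        if self.steps_per_execution > 1:
            test_step = multi_test_steps
        else:
            test_step = one_test_step
        self.test_function = test_step
        return test_step

    def make_test_function_optimized(self, force=False):
        # Optimized shape: one conditional defines the right function directly.
        if self.test_function is not None and not force:
            return self.test_function

        if self.steps_per_execution > 1:
            def test_step(data):
                # Iterate directly; no one_test_step wrapper per step.
                for single_step_data in data:
                    logs = self.test_step(single_step_data)
                return logs
        else:
            def test_step(data):
                # Single-step logic inlined: one fewer call layer.
                data = data[0]
                return self.test_step(data)

        self.test_function = test_step
        return test_step
```

Both variants return the same results; only the number of functions created and the call depth per step differ.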

Performance impact:
The line profiler shows the optimized version spends less time defining functions (23% vs 27.3% + 11.7% in the original), with the conditional check becoming the dominant operation at 23% vs 11.3% originally. The test results demonstrate consistent 10-27% improvements across various scenarios, with the largest gains in basic function creation cases.

Test case performance:
The optimization is particularly effective for:

  • Function creation scenarios (20-27% faster)
  • Force recreation cases (19-27% faster)
  • Basic callable tests (16-25% faster)

This optimization is valuable since make_test_function() is likely called during model training setup, and the reduced overhead during function creation and potential test step execution can accumulate over multiple training runs.
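Best-of-N micro-timings like the "best of 48 runs" figure above can be reproduced with the standard library. The `bench` helper below is an assumed sketch, not part of Codeflash's tooling:

```python
import timeit

def bench(fn, repeat=48, number=1000):
    # Run `fn` `number` times per trial, `repeat` trials; keep the best
    # trial to reduce scheduler noise, and report microseconds per call.
    best_trial = min(timeit.repeat(fn, repeat=repeat, number=number))
    return (best_trial / number) * 1e6

# Example: bench(lambda: trainer.make_test_function(force=True))
```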

Correctness verification report:

| Test | Status |
| --- | --- |
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | 33 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |
🌀 Generated Regression Tests and Runtime
import inspect

# imports
import pytest
from keras.src.backend.openvino.trainer import OpenVINOTrainer

class Trainer:
    def __init__(self):
        self._lock = False
        self._run_eagerly = False
        self._jit_compile = None
        self.compiled = False
        self.loss = None
        self.steps_per_execution = 1
        self._initial_epoch = None
        self._compute_loss_has_training_arg = (
            "training" in inspect.signature(self.compute_loss).parameters
        )
        self._compile_loss = None
        self._compile_metrics = None
        self._loss_tracker = None

    # Dummy compute_loss for signature check
    def compute_loss(self, y_true=None, y_pred=None, training=None):
        return 0.0

# unit tests

# --- BASIC TEST CASES ---

def test_test_function_is_created_and_returns_callable():
    # Test that make_test_function returns a callable
    trainer = OpenVINOTrainer()
    codeflash_output = trainer.make_test_function(); fn = codeflash_output # 1.08μs -> 865ns (24.6% faster)
    assert callable(fn)

def test_test_function_returns_same_function_when_already_set():
    # Test that calling make_test_function twice returns the same function
    trainer = OpenVINOTrainer()
    codeflash_output = trainer.make_test_function(); fn1 = codeflash_output # 1.06μs -> 871ns (22.2% faster)
    codeflash_output = trainer.make_test_function(); fn2 = codeflash_output # 421ns -> 425ns (0.941% slower)
    assert fn1 is fn2

def test_force_recreates_test_function():
    # Test that force=True recreates the test function
    trainer = OpenVINOTrainer()
    codeflash_output = trainer.make_test_function(); fn1 = codeflash_output # 1.14μs -> 897ns (26.8% faster)
    codeflash_output = trainer.make_test_function(force=True); fn2 = codeflash_output # 1.13μs -> 948ns (19.6% faster)
    assert fn1 is not fn2

def test_test_function_with_non_list_input():
    # Should raise TypeError if input is not subscriptable
    trainer = OpenVINOTrainer()
    trainer.steps_per_execution = 1
    codeflash_output = trainer.make_test_function(); fn = codeflash_output # 1.38μs -> 1.08μs (27.2% faster)
    with pytest.raises(TypeError):
        fn(None)

def test_multi_test_steps_with_non_iterable():
    # Should raise TypeError if data is not iterable in multi_test_steps
    trainer = OpenVINOTrainer()
    trainer.steps_per_execution = 2
    codeflash_output = trainer.make_test_function(); fn = codeflash_output # 1.31μs -> 1.15μs (13.6% faster)
    with pytest.raises(TypeError):
        fn(None)

# --- LARGE SCALE TEST CASES ---

def test_make_test_function_does_not_affect_predict_function():
    # Ensure make_test_function does not set predict_function
    trainer = OpenVINOTrainer()
    trainer.predict_function = "original"
    trainer.make_test_function() # 1.40μs -> 1.17μs (20.3% faster)
    assert trainer.predict_function == "original"

# --- DETERMINISM TEST CASE ---
import inspect

# imports
import pytest
from keras.src.backend.openvino.trainer import OpenVINOTrainer

# Minimal base Trainer implementation
class Trainer:
    def __init__(self):
        self._lock = False
        self._run_eagerly = False
        self._jit_compile = None
        self.compiled = False
        self.loss = None
        self.steps_per_execution = 1
        self._initial_epoch = None
        self._compute_loss_has_training_arg = (
            "training" in inspect.signature(self.compute_loss).parameters
        )
        self._compile_loss = None
        self._compile_metrics = None
        self._loss_tracker = None

    def compute_loss(self, *args, **kwargs):
        return 0.0

    def test_step(self, data):
        # Dummy test_step for testing
        return {"result": data}

# unit tests

# ==== BASIC TEST CASES ====

def test_make_test_function_returns_callable():
    """Test that make_test_function returns a callable."""
    trainer = OpenVINOTrainer()
    codeflash_output = trainer.make_test_function(); fn = codeflash_output # 1.37μs -> 1.13μs (21.4% faster)
    assert callable(fn)

def test_make_test_function_assigns_to_attribute():
    """Test that make_test_function assigns the function to self.test_function."""
    trainer = OpenVINOTrainer()
    codeflash_output = trainer.make_test_function(); fn = codeflash_output # 1.21μs -> 964ns (25.1% faster)
    assert trainer.test_function is fn

def test_make_test_function_returns_same_fn_without_force():
    """Test that make_test_function returns the same function if called twice without force."""
    trainer = OpenVINOTrainer()
    codeflash_output = trainer.make_test_function(); fn1 = codeflash_output # 1.14μs -> 978ns (16.4% faster)
    codeflash_output = trainer.make_test_function(); fn2 = codeflash_output # 432ns -> 392ns (10.2% faster)
    assert fn1 is fn2

def test_make_test_function_recreates_with_force():
    """Test that make_test_function creates a new function if force=True."""
    trainer = OpenVINOTrainer()
    codeflash_output = trainer.make_test_function(); fn1 = codeflash_output # 1.13μs -> 890ns (27.1% faster)
    codeflash_output = trainer.make_test_function(force=True); fn2 = codeflash_output # 1.20μs -> 1.01μs (19.2% faster)
    assert fn1 is not fn2

def test_make_test_function_with_non_iterable_data():
    """Test that one_test_step raises if data is not a list."""
    trainer = OpenVINOTrainer()
    codeflash_output = trainer.make_test_function(); fn = codeflash_output # 1.34μs -> 1.08μs (23.9% faster)
    with pytest.raises(TypeError):
        fn(123)  # Not a list, should raise

def test_make_test_function_steps_per_execution_change():
    """Test that changing steps_per_execution after function creation does not affect old function."""
    trainer = OpenVINOTrainer()
    trainer.steps_per_execution = 1
    codeflash_output = trainer.make_test_function(); fn1 = codeflash_output # 1.39μs -> 1.19μs (16.2% faster)
    trainer.steps_per_execution = 2
    codeflash_output = trainer.make_test_function(force=True); fn2 = codeflash_output # 1.24μs -> 1.18μs (4.90% faster)
    # fn1 should still behave as the single-step variant: it unwraps the
    # list and calls self.test_step once
    called = {}
    def test_step(data):
        called['data'] = data
        return "ok"
    trainer.test_step = test_step
    assert fn1(["hello"]) == "ok"
    assert called['data'] == "hello"

# ==== COVERAGE: test_function is not None but force=True ====

def test_make_test_function_force_true_always_recreates():
    """Test that force=True always recreates the function even if test_function is not None."""
    trainer = OpenVINOTrainer()
    codeflash_output = trainer.make_test_function(); fn1 = codeflash_output # 1.12μs -> 928ns (21.1% faster)
    codeflash_output = trainer.make_test_function(force=True); fn2 = codeflash_output # 1.06μs -> 970ns (9.79% faster)
    codeflash_output = trainer.make_test_function(force=True); fn3 = codeflash_output # 503ns -> 488ns (3.07% faster)
    assert fn2 is not fn1
    assert fn3 is not fn2
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

To edit these changes, run `git checkout codeflash/optimize-OpenVINOTrainer.make_test_function-mjalr05z` and push.


@codeflash-ai codeflash-ai bot requested a review from mashraf-222 December 17, 2025 22:47
@codeflash-ai codeflash-ai bot added ⚡️ codeflash Optimization PR opened by Codeflash AI 🎯 Quality: Medium Optimization Quality according to Codeflash labels Dec 17, 2025