@codeflash-ai codeflash-ai bot commented Dec 17, 2025

📄 120% (1.20x) speedup for _shared_object_disabled in keras/src/legacy/saving/serialization.py

⏱️ Runtime: 1.32 milliseconds → 602 microseconds (best of 174 runs)
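
The percentage can be reproduced from the two runtimes. The snippet below is only a sketch of that arithmetic; the helper name `speedup_percent` is made up for illustration and is not part of Codeflash or Keras.

```python
def speedup_percent(original_ns: float, optimized_ns: float) -> float:
    """Return how much faster the optimized run is, as a percentage."""
    return (original_ns / optimized_ns - 1.0) * 100.0


original_ns = 1.32e6  # 1.32 milliseconds
optimized_ns = 602e3  # 602 microseconds

# Ratio of old runtime to new runtime.
ratio = original_ns / optimized_ns
percent = speedup_percent(original_ns, optimized_ns)
print(f"{ratio:.2f}x as fast, {percent:.0f}% faster")  # 2.19x as fast, 119% faster
```

With the rounded runtimes shown here the result is 119%, which agrees with the figure in the explanation and is within rounding of the 120% in the headline.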

📝 Explanation and details

The optimization replaces getattr(SHARED_OBJECT_DISABLED, "disabled", False) with SHARED_OBJECT_DISABLED.__dict__.get("disabled", False), achieving a 119% speedup by reading the thread-local's per-thread dictionary directly instead of going through getattr()'s lookup-with-default path.

Key Performance Improvements:

  • Direct dictionary access: after a single lookup of __dict__, .get() reads the underlying dictionary storage of the threading.local object directly, avoiding the rest of Python's attribute resolution for the name "disabled"
  • Reduced function call overhead: getattr() with a default involves additional C-level checks and, when the attribute is missing, has to handle the failed lookup before returning the default, while dictionary .get() simply returns the fallback
  • No class-level resolution: normal attribute lookup also consults the type (its MRO and any descriptors defined there), whereas direct dictionary access skips these checks
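
The comparison described above can be tried locally with a small micro-benchmark. This is a minimal sketch, not the Codeflash harness: the names `state`, `via_getattr`, `via_dict` and `compare` are illustrative, and the absolute timings will vary by machine and Python version.

```python
import threading
import timeit

# Stand-in for the module-level thread-local; the name `state` is illustrative.
state = threading.local()


def via_getattr():
    # Original form: attribute lookup with a default.
    return getattr(state, "disabled", False)


def via_dict():
    # Optimized form: read the current thread's instance dict directly.
    return state.__dict__.get("disabled", False)


def compare(label, number=200_000):
    # Both forms must agree before it makes sense to time them.
    assert via_getattr() == via_dict()
    t_getattr = timeit.timeit(via_getattr, number=number)
    t_dict = timeit.timeit(via_dict, number=number)
    print(f"{label}: getattr {t_getattr:.3f}s, __dict__.get {t_dict:.3f}s")


compare("attribute unset")
state.disabled = True
compare("attribute set")
```

Timing the unset and set cases separately is worthwhile, because the two forms may not differ by the same amount in each case.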

Hot Path Impact:
Based on the function references, _shared_object_disabled() is called in context manager __enter__ methods for both loading and saving scopes. These are likely entered frequently during model serialization/deserialization, so the saving of roughly half a microsecond per call (1.19μs → 625ns in the default-value test) accumulates in Keras workflows that save or load models.

Test Case Performance:
The optimization shows speedups of roughly 58-150% across the generated tests, with most results between 90% and 150%:

  • Default value lookups: 58-101% faster (most common case)
  • Set attribute lookups: 97-132% faster
  • Edge cases (deletions, type variations): 96-150% faster
  • Large-scale operations: 116-129% faster on bulk operations

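For a plain threading.local() instance whose flag is stored as an instance attribute, as here, the optimization maintains identical behavior and thread-safety guarantees, because __dict__ on a thread-local resolves to the current thread's own dictionary. It provides substantial performance gains for this frequently-called utility function in Keras's serialization pipeline.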
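
The behavioral claim can be checked with a short experiment. The sketch below uses illustrative names (`flag`, `read_flag`, `LocalWithDefault`) that do not appear in Keras. It shows that the `__dict__` form still gives each thread its own value, and it also shows one case where the two forms would differ: a class-level default on a `threading.local` subclass, which is an assumption for illustration and not something the PR's code uses.

```python
import threading

flag = threading.local()


def read_flag():
    # Same lookup form as the optimized function.
    return flag.__dict__.get("disabled", False)


seen_in_worker = []


def worker():
    # A new thread starts with its own empty dict, so the default is returned.
    seen_in_worker.append(read_flag())
    flag.disabled = "worker"
    seen_in_worker.append(read_flag())


flag.disabled = "main"
t = threading.Thread(target=worker)
t.start()
t.join()

print(seen_in_worker)  # [False, 'worker']
print(read_flag())     # main


# Caveat: a class-level default is visible to getattr() but not to __dict__.
class LocalWithDefault(threading.local):
    disabled = True


sub = LocalWithDefault()
print(getattr(sub, "disabled", False))      # True
print(sub.__dict__.get("disabled", False))  # False
```

Because the module creates a plain `threading.local()` and only ever sets `disabled` on the instance, the caveat does not apply to this change.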

Correctness verification report:

| Test | Status |
|------|--------|
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | ✅ 3035 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |

🌀 Generated Regression Tests and Runtime
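import threading

# imports
import pytest

# Import the module's own thread-local so that the assignments below affect
# the object that _shared_object_disabled() actually reads. A threading.local()
# created in this file would be a different, unrelated object.
from keras.src.legacy.saving.serialization import (
    SHARED_OBJECT_DISABLED,
    _shared_object_disabled,
)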

# unit tests

# --------- Basic Test Cases ---------

def test_default_value_is_false():
    """Test that the default value is False when not set."""
    codeflash_output = _shared_object_disabled() # 1.19μs -> 625ns (90.9% faster)

def test_set_disabled_true():
    """Test that setting SHARED_OBJECT_DISABLED.disabled to True returns True."""
    SHARED_OBJECT_DISABLED.disabled = True
    codeflash_output = _shared_object_disabled() # 987ns -> 499ns (97.8% faster)

def test_set_disabled_false():
    """Test that explicitly setting SHARED_OBJECT_DISABLED.disabled to False returns False."""
    SHARED_OBJECT_DISABLED.disabled = False
    codeflash_output = _shared_object_disabled() # 1.18μs -> 535ns (120% faster)

def test_set_disabled_none():
    """Test that setting SHARED_OBJECT_DISABLED.disabled to None returns None."""
    SHARED_OBJECT_DISABLED.disabled = None
    codeflash_output = _shared_object_disabled() # 1.11μs -> 519ns (114% faster)

def test_set_disabled_integer():
    """Test that setting SHARED_OBJECT_DISABLED.disabled to an integer returns that integer."""
    SHARED_OBJECT_DISABLED.disabled = 123
    codeflash_output = _shared_object_disabled() # 1.12μs -> 484ns (132% faster)

def test_set_disabled_string():
    """Test that setting SHARED_OBJECT_DISABLED.disabled to a string returns that string."""
    SHARED_OBJECT_DISABLED.disabled = "yes"
    codeflash_output = _shared_object_disabled() # 1.16μs -> 539ns (115% faster)

# --------- Edge Test Cases ---------

def test_delete_disabled_attribute():
    """Test that deleting the .disabled attribute reverts to default False."""
    SHARED_OBJECT_DISABLED.disabled = True
    del SHARED_OBJECT_DISABLED.disabled
    codeflash_output = _shared_object_disabled() # 1.08μs -> 542ns (100% faster)

def test_disabled_attribute_set_to_empty_list():
    """Test that setting .disabled to an empty list returns an empty list."""
    SHARED_OBJECT_DISABLED.disabled = []
    codeflash_output = _shared_object_disabled() # 1.08μs -> 541ns (100% faster)

def test_disabled_attribute_set_to_object():
    """Test that setting .disabled to a custom object returns that object."""
    class Dummy:
        pass
    obj = Dummy()
    SHARED_OBJECT_DISABLED.disabled = obj
    codeflash_output = _shared_object_disabled() # 1.16μs -> 591ns (96.4% faster)

def test_disabled_attribute_set_to_falsey_value():
    """Test that setting .disabled to a falsey value (0) returns 0."""
    SHARED_OBJECT_DISABLED.disabled = 0
    codeflash_output = _shared_object_disabled() # 1.11μs -> 554ns (101% faster)

def test_disabled_attribute_set_to_truthy_value():
    """Test that setting .disabled to a truthy value (non-empty tuple) returns that value."""
    SHARED_OBJECT_DISABLED.disabled = (1, 2, 3)
    codeflash_output = _shared_object_disabled() # 1.04μs -> 525ns (98.1% faster)

def test_disabled_attribute_set_to_boolean_expression():
    """Test that setting .disabled to a boolean expression result works."""
    SHARED_OBJECT_DISABLED.disabled = (2 + 2 == 4)
    codeflash_output = _shared_object_disabled() # 1.11μs -> 540ns (106% faster)

def test_disabled_attribute_set_to_float():
    """Test that setting .disabled to a float returns that float."""
    SHARED_OBJECT_DISABLED.disabled = 3.14159
    codeflash_output = _shared_object_disabled() # 1.07μs -> 524ns (105% faster)

def test_disabled_attribute_set_to_bytes():
    """Test that setting .disabled to bytes returns those bytes."""
    SHARED_OBJECT_DISABLED.disabled = b"bytes"
    codeflash_output = _shared_object_disabled() # 1.09μs -> 509ns (115% faster)

# --------- Thread Safety and Large Scale Test Cases ---------

def test_threadlocal_is_thread_safe():
    """Test that different threads have independent .disabled values."""
    results = {}

    def worker(thread_id, value_to_set):
        SHARED_OBJECT_DISABLED.disabled = value_to_set
        results[thread_id] = _shared_object_disabled()

    # Main thread: no value set, should be False
    if hasattr(SHARED_OBJECT_DISABLED, "disabled"):
        del SHARED_OBJECT_DISABLED.disabled
    codeflash_output = _shared_object_disabled()

    threads = []
    for i in range(5):
        t = threading.Thread(target=worker, args=(i, i))
        threads.append(t)
        t.start()
    for t in threads:
        t.join()

    # Each thread should have its own value
    for i in range(5):
        pass

    # Main thread should still be default (False)
    codeflash_output = _shared_object_disabled()

def test_massive_attribute_assignment():
    """Test repeatedly assigning and deleting the .disabled attribute (stress test)."""
    for i in range(1000):
        SHARED_OBJECT_DISABLED.disabled = i
        codeflash_output = _shared_object_disabled() # 429μs -> 199μs (116% faster)
        del SHARED_OBJECT_DISABLED.disabled
        codeflash_output = _shared_object_disabled()

import threading

# imports
import pytest  # used for our unit tests

# function to test
# (from keras/src/legacy/saving/serialization.py)
# Import the module's own thread-local rather than creating a new one here,
# so that the tests modify the object the function reads.
from keras.src.legacy.saving.serialization import (
    SHARED_OBJECT_DISABLED,
    _shared_object_disabled,
)

# unit tests

def test_default_value_false():
    """Basic: By default, shared object disabled should be False (unset)."""
    # Remove attribute if it exists to simulate fresh threadlocal state
    if hasattr(SHARED_OBJECT_DISABLED, "disabled"):
        del SHARED_OBJECT_DISABLED.disabled
    codeflash_output = _shared_object_disabled() # 1.33μs -> 842ns (58.1% faster)

def test_set_disabled_true():
    """Basic: Setting disabled=True should be reflected by the function."""
    SHARED_OBJECT_DISABLED.disabled = True
    codeflash_output = _shared_object_disabled() # 1.23μs -> 624ns (97.3% faster)

def test_set_disabled_false():
    """Basic: Setting disabled=False should be reflected by the function."""
    SHARED_OBJECT_DISABLED.disabled = False
    codeflash_output = _shared_object_disabled() # 1.18μs -> 582ns (102% faster)

def test_set_disabled_non_bool():
    """Edge: Setting disabled to a non-bool value should return that value."""
    SHARED_OBJECT_DISABLED.disabled = "foobar"
    codeflash_output = _shared_object_disabled() # 1.20μs -> 557ns (116% faster)
    SHARED_OBJECT_DISABLED.disabled = 123
    codeflash_output = _shared_object_disabled() # 548ns -> 219ns (150% faster)
    SHARED_OBJECT_DISABLED.disabled = None
    codeflash_output = _shared_object_disabled() # 473ns -> 197ns (140% faster)

def test_remove_attribute_resets_to_default():
    """Edge: Deleting the attribute should reset function to default False."""
    SHARED_OBJECT_DISABLED.disabled = True
    codeflash_output = _shared_object_disabled() # 1.11μs -> 513ns (117% faster)
    del SHARED_OBJECT_DISABLED.disabled
    codeflash_output = _shared_object_disabled() # 542ns -> 236ns (130% faster)

def test_large_scale_set_and_reset():
    """Large Scale: Rapidly set and delete the attribute many times."""
    for i in range(500):  # 500 iterations, under 1000
        SHARED_OBJECT_DISABLED.disabled = i
        codeflash_output = _shared_object_disabled() # 218μs -> 95.4μs (129% faster)
        del SHARED_OBJECT_DISABLED.disabled
        codeflash_output = _shared_object_disabled()

def test_attribute_shadowing():
    """Edge: Setting an attribute with a different name does not affect the result."""
    SHARED_OBJECT_DISABLED.not_disabled = True
    # Only 'disabled' attribute should be checked
    if hasattr(SHARED_OBJECT_DISABLED, "disabled"):
        del SHARED_OBJECT_DISABLED.disabled
    codeflash_output = _shared_object_disabled() # 1.03μs -> 641ns (60.1% faster)
    SHARED_OBJECT_DISABLED.disabled = True
    codeflash_output = _shared_object_disabled() # 471ns -> 226ns (108% faster)
    # Clean up
    del SHARED_OBJECT_DISABLED.disabled
    del SHARED_OBJECT_DISABLED.not_disabled

def test_unusual_types():
    """Edge: Setting disabled to an unusual type (e.g., list, dict, object)."""
    SHARED_OBJECT_DISABLED.disabled = [1, 2, 3]
    codeflash_output = _shared_object_disabled() # 932ns -> 468ns (99.1% faster)
    SHARED_OBJECT_DISABLED.disabled = {"a": 1}
    codeflash_output = _shared_object_disabled() # 526ns -> 250ns (110% faster)
    class Dummy: pass
    d = Dummy()
    SHARED_OBJECT_DISABLED.disabled = d
    codeflash_output = _shared_object_disabled() # 523ns -> 221ns (137% faster)

def test_repeated_set_and_get():
    """Large Scale: Repeatedly set and get the value to check for state leaks."""
    for v in (True, False, "x", 42, None):
        SHARED_OBJECT_DISABLED.disabled = v
        codeflash_output = _shared_object_disabled() # 2.98μs -> 1.33μs (124% faster)

def test_no_side_effects_between_tests():
    """Edge: After clearing the attribute, the default is returned regardless of earlier tests."""
    # pytest does not reset module-level state between tests, so clear it explicitly
    if hasattr(SHARED_OBJECT_DISABLED, "disabled"):
        del SHARED_OBJECT_DISABLED.disabled
    codeflash_output = _shared_object_disabled() # 1.00μs -> 499ns (101% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

To edit these changes git checkout codeflash/optimize-_shared_object_disabled-mjacyxig and push.

@codeflash-ai codeflash-ai bot requested a review from mashraf-222 December 17, 2025 18:42
@codeflash-ai codeflash-ai bot added the "⚡️ codeflash" (Optimization PR opened by Codeflash AI) and "🎯 Quality: High" (Optimization Quality according to Codeflash) labels Dec 17, 2025