@codeflash-ai codeflash-ai bot commented Dec 17, 2025

📄 31% (0.31x) speedup for dict_equiv in xarray/core/utils.py

⏱️ Runtime : 8.61 milliseconds → 6.56 milliseconds (best of 19 runs)

📝 Explanation and details

The optimization implements an early length check to avoid the expensive second iteration in dictionary equivalence testing.

Key optimization: Instead of checking all(k in first for k in second) after the main loop, the code now performs if len(first) != len(second): return False at the beginning. This single O(1) length comparison rejects dictionaries of different sizes immediately, and it makes the O(n) second pass over the second dictionary's keys unnecessary in every other case.

Why this is faster: The original implementation performed two passes whenever the first one succeeded: it checked the value for every key in the first dict, then verified that every key in the second dict also exists in the first. The optimized version relies on the fact that two dictionaries are equivalent if and only if they have the same length and every key-value pair in the first dictionary matches one in the second. Because dictionary keys are unique, if every key of the first dict is present in the second and the two sizes are equal, the key sets must be identical, so the second pass adds no information.
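The equivalence argument can be sketched in a few lines. The two functions below are reconstructed from the description above and are not copied from xarray/core/utils.py; the names dict_equiv_two_pass and dict_equiv_len_check, and the use of operator.eq as the default compat, are assumptions made for illustration.

```python
import operator
from itertools import product


def dict_equiv_two_pass(first, second, compat=operator.eq):
    """Original shape (assumed): value pass over first, then key pass over second."""
    for k in first:
        if k not in second or not compat(first[k], second[k]):
            return False
    return all(k in first for k in second)


def dict_equiv_len_check(first, second, compat=operator.eq):
    """Optimized shape (assumed): O(1) length check, then one pass over first."""
    if len(first) != len(second):
        return False
    for k in first:
        if k not in second or not compat(first[k], second[k]):
            return False
    return True


# Build every dict over keys a, b, c where each key is absent or maps to 0 or 1.
keys = ("a", "b", "c")
small_dicts = [
    {k: v for k, v in zip(keys, combo) if v is not None}
    for combo in product((None, 0, 1), repeat=len(keys))
]

# Check that both versions agree on every ordered pair of those dicts.
agree = all(
    dict_equiv_two_pass(x, y) == dict_equiv_len_check(x, y)
    for x in small_dicts
    for y in small_dicts
)
```

The exhaustive check covers only tiny dictionaries with plain equality, so it illustrates the argument rather than replacing the test suite reported below.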

Performance gains by test case type:

  • Missing/extra key scenarios: Massive speedups (200-140,000% faster) because length mismatch is detected immediately without any key iteration
  • Equivalent dictionaries: Moderate improvements (6-37% faster) by eliminating the second pass entirely
  • Value mismatches: Small performance cost (3-21% slower) due to the added length check, but this is negligible compared to the value comparison overhead
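A rough way to reproduce the three categories locally is a small timeit harness like the one below. It is a sketch under assumptions: the two functions are stand-ins reconstructed from the explanation above, plain == replaces xarray's default compat, and the helper name best_time is hypothetical, so absolute numbers will differ from the ones reported here.

```python
import timeit


def dict_equiv_two_pass(first, second):
    """Assumed original shape: value pass, then key pass."""
    for k in first:
        if k not in second or first[k] != second[k]:
            return False
    return all(k in first for k in second)


def dict_equiv_len_check(first, second):
    """Assumed optimized shape: length check, then a single pass."""
    if len(first) != len(second):
        return False
    for k in first:
        if k not in second or first[k] != second[k]:
            return False
    return True


base = {i: i * i for i in range(1000)}
cases = {
    "missing key": (base, {i: i * i for i in range(999)}),
    "equivalent": (base, dict(base)),
    "value mismatch": (base, {**base, 500: -1}),
}


def best_time(func, args, repeat=5, number=100):
    """Best per-call time in seconds over several repeats."""
    runs = timeit.repeat(lambda: func(*args), repeat=repeat, number=number)
    return min(runs) / number


timings = {
    name: (best_time(dict_equiv_two_pass, args), best_time(dict_equiv_len_check, args))
    for name, args in cases.items()
}

for name, (before, after) in timings.items():
    print(f"{name:15s} {before * 1e6:9.2f}us -> {after * 1e6:9.2f}us")
```

Taking the minimum over repeats follows the "best of N runs" convention used in the runtime summary; it reduces noise from other processes but does not remove it.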

Impact on xarray workloads: The function reference shows dict_equiv being used in dataset concatenation to compare global attributes. The change should help most when the concatenated datasets carry different numbers of attributes, since the length check rejects those pairs without any key-by-key comparison. When the attribute dictionaries have the same size the gain is smaller, and the replay test below measured a slight slowdown (-3.67%).
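To make the attribute case concrete, the sketch below counts how often the compat function runs. Everything in it is illustrative: dict_equiv_len_check is a stand-in reconstructed from the explanation above, CountingCompat is a hypothetical helper, and the attribute dictionaries are made-up examples rather than data from xarray's test suite.

```python
def dict_equiv_len_check(first, second, compat):
    """Assumed optimized shape: length check, then a single pass over first."""
    if len(first) != len(second):
        return False
    for k in first:
        if k not in second or not compat(first[k], second[k]):
            return False
    return True


class CountingCompat:
    """Equality check that records how many times it is invoked."""

    def __init__(self):
        self.calls = 0

    def __call__(self, a, b):
        self.calls += 1
        return a == b


# Hypothetical global attributes of two datasets being concatenated.
attrs_a = {"title": "run A", "units": "K", "source": "model"}
attrs_b = {"title": "run A", "units": "K"}  # one attribute missing

# Different sizes: rejected by the length check before compat is ever called.
compat = CountingCompat()
different_sizes = dict_equiv_len_check(attrs_a, attrs_b, compat)
calls_for_different_sizes = compat.calls

# Same sizes and contents: compat runs once per key, with no second pass.
compat = CountingCompat()
same_sizes = dict_equiv_len_check(attrs_a, dict(attrs_a), compat)
calls_for_same_sizes = compat.calls
```

The benefit grows with the cost of compat: when attribute values are arrays, skipping the comparison entirely matters more than it does for short strings.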

Correctness verification report:

| Test | Status |
| --- | --- |
| ⚙️ Existing Unit Tests | 41 Passed |
| 🌀 Generated Regression Tests | 79 Passed |
| ⏪ Replay Tests | 255 Passed |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |

⚙️ Existing Unit Tests and Runtime
| Test File::Test Function | Original ⏱️ | Optimized ⏱️ | Speedup |
| --- | --- | --- | --- |
| test_utils.py::TestDictionaries.test_dict_equiv | 151μs | 165μs | -8.56%⚠️ |

🌀 Generated Regression Tests and Runtime
import math
from typing import Any

# imports
import pytest  # used for our unit tests
from xarray.core.utils import dict_equiv


def equivalent(a: Any, b: Any) -> bool:
    """Reference compat: equality, with NaN treated as equal to NaN.

    The tests below never pass this function to dict_equiv.
    """
    if isinstance(a, float) and isinstance(b, float):
        if math.isnan(a) and math.isnan(b):
            return True
    return a == b

# unit tests

# -------------------------
# 1. BASIC TEST CASES
# -------------------------


def test_empty_dicts_equivalent():
    # Both dicts empty
    codeflash_output = dict_equiv({}, {})  # 1.56μs -> 891ns (74.6% faster)


def test_single_item_equivalent():
    # Both dicts have one identical key-value pair
    codeflash_output = dict_equiv({"a": 1}, {"a": 1})  # 3.29μs -> 2.73μs (20.4% faster)


def test_single_item_not_equivalent():
    # Both dicts have one key, different values
    codeflash_output = not dict_equiv(
        {"a": 1}, {"a": 2}
    )  # 9.18μs -> 10.7μs (14.6% slower)


def test_multiple_items_equivalent():
    # Multiple key-value pairs, same order
    d1 = {"a": 1, "b": 2, "c": 3}
    d2 = {"a": 1, "b": 2, "c": 3}
    codeflash_output = dict_equiv(d1, d2)  # 4.14μs -> 3.38μs (22.4% faster)


def test_multiple_items_equivalent_different_order():
    # Multiple key-value pairs, different order
    d1 = {"a": 1, "b": 2, "c": 3}
    d2 = {"c": 3, "b": 2, "a": 1}
    codeflash_output = dict_equiv(d1, d2)  # 4.17μs -> 3.30μs (26.2% faster)


def test_missing_key():
    # One dict missing a key
    d1 = {"a": 1, "b": 2}
    d2 = {"a": 1}
    codeflash_output = not dict_equiv(d1, d2)  # 2.45μs -> 769ns (218% faster)
    codeflash_output = not dict_equiv(d2, d1)  # 2.31μs -> 305ns (658% faster)


def test_extra_key():
    # One dict has an extra key
    d1 = {"a": 1}
    d2 = {"a": 1, "b": 2}
    codeflash_output = not dict_equiv(d1, d2)  # 3.54μs -> 735ns (381% faster)
    codeflash_output = not dict_equiv(d2, d1)  # 1.02μs -> 308ns (231% faster)


def test_different_types_as_values():
    # Dicts with different types as values
    d1 = {"a": 1, "b": "foo"}
    d2 = {"a": 1, "b": "foo"}
    codeflash_output = dict_equiv(d1, d2)  # 3.85μs -> 3.06μs (25.8% faster)


def test_different_types_as_values_not_equiv():
    # Dicts with different types as values (not equivalent)
    d1 = {"a": 1, "b": "foo"}
    d2 = {"a": 1, "b": "bar"}
    codeflash_output = not dict_equiv(d1, d2)  # 6.75μs -> 7.79μs (13.4% slower)


# -------------------------
# 2. EDGE TEST CASES
# -------------------------


def test_nested_dicts_equivalent():
    # Nested dictionaries, identical structure and values
    d1 = {"a": {"x": 1, "y": 2}, "b": 3}
    d2 = {"a": {"x": 1, "y": 2}, "b": 3}
    codeflash_output = dict_equiv(d1, d2)  # 4.91μs -> 4.35μs (12.9% faster)


def test_nested_dicts_not_equivalent():
    # Nested dictionaries, different structure/values
    d1 = {"a": {"x": 1, "y": 2}, "b": 3}
    d2 = {"a": {"x": 1, "y": 3}, "b": 3}
    codeflash_output = not dict_equiv(d1, d2)  # 12.9μs -> 16.4μs (21.5% slower)


def test_dicts_with_list_values_equivalent():
    # Dicts with list values: first the same list object in both dicts, then an equal-valued but distinct list
    l = [1, 2, 3]
    d1 = {"a": l}
    d2 = {"a": l}
    codeflash_output = dict_equiv(d1, d2)  # 3.37μs -> 2.52μs (33.4% faster)
    d3 = {"a": [1, 2, 3]}
    codeflash_output = not dict_equiv(d1, d3)  # 4.72μs -> 4.42μs (6.83% faster)


def test_dicts_with_tuple_values_equivalent():
    # Tuples are compared by value
    d1 = {"a": (1, 2)}
    d2 = {"a": (1, 2)}
    codeflash_output = dict_equiv(d1, d2)  # 3.12μs -> 2.44μs (28.1% faster)


def test_dicts_with_nan_values():
    # NaN is not equal to itself, but our default compat treats nan==nan as True
    import math

    d1 = {"a": float("nan")}
    d2 = {"a": float("nan")}
    codeflash_output = dict_equiv(d1, d2)  # 7.32μs -> 7.08μs (3.35% faster)


def test_dicts_with_nan_and_number():
    # NaN and a number are not equivalent
    import math

    d1 = {"a": float("nan")}
    d2 = {"a": 1.0}
    codeflash_output = not dict_equiv(d1, d2)  # 6.23μs -> 6.82μs (8.65% slower)


def test_dicts_with_none_values():
    # None values
    d1 = {"a": None}
    d2 = {"a": None}
    codeflash_output = dict_equiv(d1, d2)  # 3.17μs -> 2.54μs (25.0% faster)


def test_dicts_with_none_and_value():
    # None vs value
    d1 = {"a": None}
    d2 = {"a": 0}
    codeflash_output = not dict_equiv(d1, d2)  # 6.68μs -> 6.99μs (4.48% slower)


def test_dicts_with_custom_compat():
    # Use a custom compat function that compares lists by value
    def list_compat(a, b):
        # Plain == already compares lists element by element
        return a == b

    d1 = {"a": [1, 2, 3]}
    d2 = {"a": [1, 2, 3]}
    codeflash_output = dict_equiv(
        d1, d2, compat=list_compat
    )  # 2.62μs -> 1.88μs (39.0% faster)


def test_dicts_with_unhashable_values():
    # Values are unhashable, but keys are hashable
    d1 = {"a": [1, 2]}
    d2 = {"a": [1, 2]}
    codeflash_output = not dict_equiv(d1, d2)  # 6.56μs -> 6.07μs (8.09% faster)


def test_dicts_with_different_key_types():
    # Key types differ
    d1 = {1: "a"}
    d2 = {"1": "a"}
    codeflash_output = not dict_equiv(d1, d2)  # 892ns -> 1.12μs (20.4% slower)


def test_dicts_with_bool_and_int_keys():
    # bool and int keys: True == 1, but keys are not the same type
    d1 = {True: "yes"}
    d2 = {1: "yes"}
    codeflash_output = dict_equiv(d1, d2)  # 3.91μs -> 3.09μs (26.4% faster)


def test_dicts_with_falsy_values():
    # Falsy values (0, False, '', [])
    d1 = {"a": 0, "b": False, "c": "", "d": []}
    d2 = {"a": 0, "b": False, "c": "", "d": []}
    # The two [] values are distinct objects with equal contents
    codeflash_output = not dict_equiv(d1, d2)  # 6.53μs -> 6.00μs (8.83% faster)


def test_dicts_with_custom_mapping():
    # Custom Mapping subclass
    class MyDict(dict):
        pass

    d1 = MyDict({"a": 1, "b": 2})
    d2 = {"a": 1, "b": 2}
    codeflash_output = dict_equiv(d1, d2)  # 4.72μs -> 3.60μs (30.9% faster)


def test_dicts_with_inherited_mapping_and_extra_method():
    # Custom Mapping with extra method
    class MyMapping(dict):
        def extra(self):
            return 42

    d1 = MyMapping({"x": 1})
    d2 = {"x": 1}
    codeflash_output = dict_equiv(d1, d2)  # 4.10μs -> 2.99μs (37.2% faster)


# -------------------------
# 3. LARGE SCALE TEST CASES
# -------------------------


def test_large_dicts_equivalent():
    # Large dicts with 1000 items, all identical
    d1 = {i: i * i for i in range(1000)}
    d2 = {i: i * i for i in range(1000)}
    codeflash_output = dict_equiv(d1, d2)  # 459μs -> 430μs (6.65% faster)


def test_large_dicts_one_value_differs():
    # Large dicts, one value differs
    d1 = {i: i * i for i in range(1000)}
    d2 = {i: i * i for i in range(1000)}
    d2[500] = -1
    codeflash_output = not dict_equiv(d1, d2)  # 209μs -> 220μs (5.13% slower)


def test_large_dicts_one_key_missing():
    # Large dicts, one key missing
    d1 = {i: i * i for i in range(1000)}
    d2 = {i: i * i for i in range(999)}
    codeflash_output = not dict_equiv(d1, d2)  # 414μs -> 855ns (48357% faster)
    codeflash_output = not dict_equiv(d2, d1)  # 456μs -> 324ns (140702% faster)


def test_large_dicts_extra_key():
    # Large dicts, extra key in one
    d1 = {i: i * i for i in range(1000)}
    d2 = {i: i * i for i in range(1000)}
    d2[1001] = 1001 * 1001
    codeflash_output = not dict_equiv(d1, d2)  # 464μs -> 814ns (56923% faster)


def test_large_dicts_with_nested_dicts():
    # Large dicts with nested dicts as values
    d1 = {i: {"x": i, "y": i + 1} for i in range(100)}
    d2 = {i: {"x": i, "y": i + 1} for i in range(100)}
    codeflash_output = dict_equiv(d1, d2)  # 51.6μs -> 50.1μs (3.04% faster)


def test_large_dicts_with_list_values_custom_compat():
    # Large dicts with list values, using custom compat that compares lists by value
    def list_compat(a, b):
        # Plain == already compares lists element by element
        return a == b

    d1 = {i: [i, i + 1] for i in range(100)}
    d2 = {i: [i, i + 1] for i in range(100)}
    codeflash_output = dict_equiv(
        d1, d2, compat=list_compat
    )  # 15.9μs -> 13.2μs (20.1% faster)


def test_large_dicts_with_nan_values():
    # Large dicts with nan values
    import math

    d1 = {i: (float("nan") if i % 10 == 0 else i) for i in range(100)}
    d2 = {i: (float("nan") if i % 10 == 0 else i) for i in range(100)}
    codeflash_output = dict_equiv(d1, d2)  # 43.5μs -> 42.1μs (3.34% faster)


# -------------------------
# 4. MISC/REGRESSION TESTS
# -------------------------


def test_dict_equiv_is_symmetric():
    # dict_equiv should be symmetric
    d1 = {"a": 1, "b": 2}
    d2 = {"b": 2, "a": 1}
    codeflash_output = dict_equiv(d1, d2)  # 3.79μs -> 3.11μs (21.6% faster)
    codeflash_output = dict_equiv(d2, d1)  # 1.44μs -> 1.26μs (14.1% faster)


def test_dict_equiv_with_empty_and_nonempty():
    # One empty, one non-empty
    codeflash_output = not dict_equiv({}, {"a": 1})  # 1.82μs -> 765ns (138% faster)
    codeflash_output = not dict_equiv({"a": 1}, {})  # 430ns -> 312ns (37.8% faster)


def test_dict_equiv_with_different_types():
    # Different types as values
    d1 = {"a": 1}
    d2 = {"a": "1"}
    codeflash_output = not dict_equiv(d1, d2)  # 6.23μs -> 6.64μs (6.31% slower)


def test_dict_equiv_with_non_string_keys():
    # Non-string keys
    d1 = {("x", 1): 42}
    d2 = {("x", 1): 42}
    codeflash_output = dict_equiv(d1, d2)  # 3.42μs -> 2.61μs (31.1% faster)


def test_dict_equiv_with_object_keys():
    # Object keys (hashable)
    class Key:
        def __init__(self, v):
            self.v = v

        def __eq__(self, other):
            return isinstance(other, Key) and self.v == other.v

        def __hash__(self):
            return hash(self.v)

    k1 = Key(1)
    k2 = Key(1)
    d1 = {k1: "foo"}
    d2 = {k2: "foo"}
    codeflash_output = dict_equiv(d1, d2)  # 4.90μs -> 4.09μs (19.8% faster)


def test_dict_equiv_with_object_values():
    # Object values (with equality)
    class Value:
        def __init__(self, v):
            self.v = v

        def __eq__(self, other):
            return isinstance(other, Value) and self.v == other.v

    d1 = {"a": Value(1)}
    d2 = {"a": Value(1)}
    codeflash_output = dict_equiv(d1, d2)  # 4.80μs -> 4.38μs (9.74% faster)


def test_dict_equiv_with_incompatible_object_values():
    # Object values (not equal)
    class Value:
        def __init__(self, v):
            self.v = v

        def __eq__(self, other):
            return isinstance(other, Value) and self.v == other.v

    d1 = {"a": Value(1)}
    d2 = {"a": Value(2)}
    codeflash_output = not dict_equiv(d1, d2)  # 12.7μs -> 15.0μs (15.8% slower)


# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
import pytest
from xarray.core.utils import dict_equiv


# function to test
def equivalent(a, b):
    """Default compatibility function for dict_equiv: equality check."""
    # Handles numpy arrays if present, otherwise falls back to ==
    try:
        import numpy as np

        if isinstance(a, np.ndarray) and isinstance(b, np.ndarray):
            return (a == b).all()
    except ImportError:
        pass
    return a == b


from xarray.core.utils import dict_equiv

# unit tests

# -------------------------
# Basic Test Cases
# -------------------------


def test_empty_dicts_equivalent():
    # Two empty dicts are equivalent
    codeflash_output = dict_equiv({}, {})  # 1.63μs -> 927ns (76.1% faster)


def test_single_key_value_equal():
    # Dicts with one identical key-value pair
    codeflash_output = dict_equiv({"a": 1}, {"a": 1})  # 3.44μs -> 2.67μs (28.8% faster)


def test_single_key_value_not_equal():
    # Dicts with one differing value
    codeflash_output = not dict_equiv(
        {"a": 1}, {"a": 2}
    )  # 6.00μs -> 6.65μs (9.77% slower)


def test_multiple_keys_all_equal():
    # Dicts with multiple identical key-value pairs
    d1 = {"a": 1, "b": 2, "c": 3}
    d2 = {"a": 1, "b": 2, "c": 3}
    codeflash_output = dict_equiv(d1, d2)  # 4.25μs -> 3.35μs (26.8% faster)


def test_multiple_keys_one_value_differs():
    # Dicts with one value different
    d1 = {"a": 1, "b": 2, "c": 3}
    d2 = {"a": 1, "b": 99, "c": 3}
    codeflash_output = not dict_equiv(d1, d2)  # 6.63μs -> 7.05μs (5.87% slower)


def test_multiple_keys_one_missing():
    # One dict missing a key
    d1 = {"a": 1, "b": 2}
    d2 = {"a": 1}
    codeflash_output = not dict_equiv(d1, d2)  # 2.45μs -> 748ns (227% faster)
    codeflash_output = not dict_equiv(d2, d1)  # 2.18μs -> 297ns (635% faster)


def test_different_key_order():
    # Dicts with same keys/values but different order
    d1 = {"a": 1, "b": 2}
    d2 = {"b": 2, "a": 1}
    codeflash_output = dict_equiv(d1, d2)  # 3.80μs -> 3.13μs (21.4% faster)


# -------------------------
# Edge Test Cases
# -------------------------


def test_nested_dicts_equivalent():
    # Dicts with nested dicts as values
    d1 = {"a": {"x": 1, "y": 2}, "b": 2}
    d2 = {"b": 2, "a": {"x": 1, "y": 2}}
    codeflash_output = dict_equiv(
        d1,
        d2,
        compat=lambda x, y: (
            dict_equiv(x, y) if isinstance(x, dict) and isinstance(y, dict) else x == y
        ),
    )  # 5.24μs -> 4.71μs (11.2% faster)


def test_nested_dicts_not_equivalent():
    # Dicts with nested dicts, one value different
    d1 = {"a": {"x": 1, "y": 2}, "b": 2}
    d2 = {"a": {"x": 1, "y": 3}, "b": 2}
    codeflash_output = not dict_equiv(
        d1,
        d2,
        compat=lambda x, y: (
            dict_equiv(x, y) if isinstance(x, dict) and isinstance(y, dict) else x == y
        ),
    )  # 7.39μs -> 8.45μs (12.5% slower)


def test_dict_with_list_values_equiv():
    # Dicts with list values, using custom compat
    d1 = {"a": [1, 2], "b": [3, 4]}
    d2 = {"a": [1, 2], "b": [3, 4]}
    codeflash_output = dict_equiv(
        d1, d2, compat=lambda x, y: x == y
    )  # 2.53μs -> 1.60μs (58.1% faster)


def test_dict_with_list_values_not_equiv():
    # Dicts with list values, one list differs
    d1 = {"a": [1, 2], "b": [3, 4]}
    d2 = {"a": [1, 2], "b": [3, 99]}
    codeflash_output = not dict_equiv(
        d1, d2, compat=lambda x, y: x == y
    )  # 1.71μs -> 1.90μs (9.97% slower)


def test_dict_with_tuple_and_list_equiv():
    # Dicts with tuple and list values, should not be equivalent
    d1 = {"a": [1, 2]}
    d2 = {"a": (1, 2)}
    codeflash_output = not dict_equiv(
        d1, d2, compat=lambda x, y: x == y
    )  # 1.50μs -> 1.73μs (13.1% slower)


def test_dict_with_none_values():
    # Dicts with None as values
    d1 = {"a": None, "b": 2}
    d2 = {"a": None, "b": 2}
    codeflash_output = dict_equiv(d1, d2)  # 3.96μs -> 3.16μs (25.2% faster)


def test_dict_with_none_and_missing_key():
    # One dict has None, other is missing the key
    d1 = {"a": None}
    d2 = {}
    codeflash_output = not dict_equiv(d1, d2)  # 891ns -> 784ns (13.6% faster)
    codeflash_output = not dict_equiv(d2, d1)  # 1.55μs -> 327ns (374% faster)


def test_dict_with_unhashable_values():
    # Dicts with sets as values
    d1 = {"a": {1, 2, 3}}
    d2 = {"a": {3, 2, 1}}
    codeflash_output = dict_equiv(
        d1, d2, compat=lambda x, y: x == y
    )  # 2.74μs -> 2.00μs (37.2% faster)


def test_dict_with_different_types():
    # Dicts with same keys but different types as values
    d1 = {"a": 1}
    d2 = {"a": "1"}
    codeflash_output = not dict_equiv(d1, d2)  # 6.00μs -> 6.66μs (9.85% slower)


def test_dict_with_custom_compat():
    # Use a custom compat function that allows off-by-one
    def off_by_one(x, y):
        return abs(x - y) <= 1

    d1 = {"a": 10, "b": 20}
    d2 = {"a": 11, "b": 19}
    codeflash_output = dict_equiv(
        d1, d2, compat=off_by_one
    )  # 3.06μs -> 2.20μs (39.2% faster)


def test_dict_with_numpy_arrays():
    # Dicts with numpy arrays as values
    import numpy as np

    d1 = {"a": np.array([1, 2, 3])}
    d2 = {"a": np.array([1, 2, 3])}
    codeflash_output = dict_equiv(d1, d2)  # 68.3μs -> 79.5μs (14.1% slower)


def test_dict_with_numpy_arrays_different():
    # Dicts with numpy arrays, one element differs
    import numpy as np

    d1 = {"a": np.array([1, 2, 3])}
    d2 = {"a": np.array([1, 2, 99])}
    codeflash_output = not dict_equiv(d1, d2)  # 50.1μs -> 56.5μs (11.3% slower)


def test_dict_with_different_keys():
    # Dicts with completely different keys
    d1 = {"a": 1}
    d2 = {"b": 1}
    codeflash_output = not dict_equiv(d1, d2)  # 877ns -> 1.01μs (13.6% slower)


def test_dict_with_extra_keys():
    # One dict has extra keys
    d1 = {"a": 1, "b": 2}
    d2 = {"a": 1}
    codeflash_output = not dict_equiv(d1, d2)  # 2.55μs -> 728ns (251% faster)
    codeflash_output = not dict_equiv(d2, d1)  # 2.26μs -> 305ns (640% faster)


def test_dict_with_falsy_values():
    # Dicts with falsy values (0, False, '', [])
    d1 = {"a": 0, "b": False, "c": "", "d": []}
    d2 = {"a": 0, "b": False, "c": "", "d": []}
    codeflash_output = dict_equiv(
        d1, d2, compat=lambda x, y: x == y
    )  # 2.75μs -> 2.15μs (27.7% faster)


def test_dict_with_nan_values():
    # NaN != NaN under plain ==; this call records what the default compat does with two NaNs
    import math

    d1 = {"a": float("nan")}
    d2 = {"a": float("nan")}
    codeflash_output = not dict_equiv(d1, d2)  # 7.56μs -> 6.84μs (10.6% faster)

    # But with a custom compat that handles nan
    def nan_equal(x, y):
        # Treat two NaNs as equal; otherwise fall back to plain equality
        try:
            if math.isnan(x) and math.isnan(y):
                return True
        except TypeError:
            pass
        return x == y

    codeflash_output = dict_equiv(
        d1, d2, compat=nan_equal
    )  # 2.19μs -> 1.84μs (19.0% faster)


# -------------------------
# Large Scale Test Cases
# -------------------------


def test_large_dicts_equivalent():
    # Large dicts with 1000 elements, all equal
    size = 1000
    d1 = {i: i for i in range(size)}
    d2 = {i: i for i in range(size)}
    codeflash_output = dict_equiv(d1, d2)  # 427μs -> 396μs (7.66% faster)


def test_large_dicts_one_value_differs():
    # Large dicts, one value differs
    size = 1000
    d1 = {i: i for i in range(size)}
    d2 = {i: i for i in range(size)}
    d2[500] = -1
    codeflash_output = not dict_equiv(d1, d2)  # 182μs -> 176μs (3.24% faster)


def test_large_dicts_one_missing_key():
    # Large dicts, one missing key
    size = 1000
    d1 = {i: i for i in range(size)}
    d2 = {i: i for i in range(size) if i != 999}
    codeflash_output = not dict_equiv(d1, d2)  # 396μs -> 768ns (51561% faster)
    codeflash_output = not dict_equiv(d2, d1)  # 399μs -> 336ns (118915% faster)


def test_large_dicts_with_nested_dicts():
    # Large dicts with nested dicts as values
    size = 100
    d1 = {i: {"a": i, "b": i + 1} for i in range(size)}
    d2 = {i: {"a": i, "b": i + 1} for i in range(size)}
    codeflash_output = dict_equiv(
        d1,
        d2,
        compat=lambda x, y: (
            dict_equiv(x, y) if isinstance(x, dict) and isinstance(y, dict) else x == y
        ),
    )  # 109μs -> 83.0μs (32.1% faster)


def test_large_dicts_with_list_values():
    # Large dicts with list values
    size = 100
    d1 = {i: [i, i + 1, i + 2] for i in range(size)}
    d2 = {i: [i, i + 1, i + 2] for i in range(size)}
    codeflash_output = dict_equiv(
        d1, d2, compat=lambda x, y: x == y
    )  # 12.6μs -> 10.0μs (25.9% faster)


def test_large_dicts_with_numpy_arrays():
    # Large dicts with numpy arrays as values
    import numpy as np

    size = 100
    d1 = {i: np.arange(i, i + 5) for i in range(size)}
    d2 = {i: np.arange(i, i + 5) for i in range(size)}
    codeflash_output = dict_equiv(d1, d2)  # 1.03ms -> 1.06ms (2.63% slower)


def test_large_dicts_with_numpy_arrays_one_differs():
    # Large dicts with numpy arrays, one value differs
    import numpy as np

    size = 100
    d1 = {i: np.arange(i, i + 5) for i in range(size)}
    d2 = {i: np.arange(i, i + 5) for i in range(size)}
    d2[50] = np.arange(100, 105)
    codeflash_output = not dict_equiv(d1, d2)  # 525μs -> 545μs (3.66% slower)


# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
⏪ Replay Tests and Runtime
| Test File::Test Function | Original ⏱️ | Optimized ⏱️ | Speedup |
| --- | --- | --- | --- |
| test_pytest_xarrayteststest_concat_py_xarrayteststest_computation_py_xarrayteststest_formatting_py_xarray__replay_test_0.py::test_xarray_core_utils_dict_equiv | 2.89ms | 3.00ms | -3.67%⚠️ |

To edit these changes, run `git checkout codeflash/optimize-dict_equiv-mj9uezxv` and push.

@codeflash-ai codeflash-ai bot requested a review from mashraf-222 December 17, 2025 10:02
@codeflash-ai codeflash-ai bot added ⚡️ codeflash Optimization PR opened by Codeflash AI 🎯 Quality: High Optimization Quality according to Codeflash labels Dec 17, 2025