⚡️ Speed up function compute_conv_output_shape by 20%
#214
📄 20% (0.20x) speedup for `compute_conv_output_shape` in `keras/src/ops/operation_utils.py`
⏱️ Runtime: 998 microseconds → 834 microseconds (best of 132 runs)
📝 Explanation and details
The optimized code achieves a roughly 20% speedup (998 → 834 microseconds) by reducing memory allocations and redundant conversions.

Key Optimizations:
- Reduced NumPy array allocations: The original code created `np.array(spatial_shape)` early and mutated it for `None` values. The optimized version handles `None` in a mutable Python list (`tmp_spatial_shape`), then creates NumPy arrays only when they are needed for vectorized math.
- More efficient array creation: Replaced `np.array()` calls with `np.fromiter()`, which is more efficient for creating arrays from iterables when the size is known and avoids creating an intermediate list.
- Eliminated redundant conversions: The original code converted the entire `output_spatial_shape` array to integers with the list comprehension `[int(i) for i in output_spatial_shape]`. The optimized version calls `.tolist()` to convert the NumPy array to a Python list once, then converts to integers in a single pass.
- Pre-computed dimensions: Cached `len(input_shape)` as `ndim` and `len(spatial_shape)` as `spatial_ndim` to avoid repeated `len()` calls.

Performance Impact:
The function is called from critical paths in Keras convolutional layers (`base_conv.py`, `base_depthwise_conv.py`, `base_separable_conv.py`) during shape computation, which happens frequently during model construction and inference. The optimizations are particularly effective for the cases exercised by these generated tests:
- `test_large_2d_conv_channels_last_none_dim`
- `test_edge_negative_output_size`

The optimizations maintain identical behavior while reducing memory allocations and computational overhead, so shape computation is cheaper for all Keras convolutional layer types.
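The pattern described under Key Optimizations can be sketched roughly as follows. This is a simplified, hypothetical stand-in and not the actual Keras implementation: the function name `sketch_conv_output_spatial_shape`, its signature, and the reduced `"valid"`/`"same"` formulas are assumptions made for illustration. It only shows the list-first `None` handling, `np.fromiter()` with a known `count`, cached lengths, and a single `.tolist()` conversion.

```python
import numpy as np


def sketch_conv_output_spatial_shape(
    spatial_shape, kernel_size, strides=1, padding="valid", dilation_rate=1
):
    """Hypothetical, simplified sketch; not the real Keras function."""
    # Cache the length once instead of calling len() repeatedly.
    spatial_ndim = len(spatial_shape)
    if isinstance(strides, int):
        strides = (strides,) * spatial_ndim
    if isinstance(dilation_rate, int):
        dilation_rate = (dilation_rate,) * spatial_ndim

    # Handle None (unknown) dimensions in a plain Python list first,
    # so no NumPy array has to be created and then mutated.
    none_dims = [i for i, d in enumerate(spatial_shape) if d is None]
    tmp_spatial_shape = [-1 if d is None else d for d in spatial_shape]

    # Create NumPy arrays only for the vectorized arithmetic; the size
    # is known, so it is passed to np.fromiter via `count`.
    shape = np.fromiter(tmp_spatial_shape, dtype=np.int64, count=spatial_ndim)
    kernel = np.fromiter(kernel_size, dtype=np.int64, count=spatial_ndim)
    stride = np.fromiter(strides, dtype=np.int64, count=spatial_ndim)
    dilation = np.fromiter(dilation_rate, dtype=np.int64, count=spatial_ndim)

    if padding == "valid":
        out = (shape - dilation * (kernel - 1) - 1) // stride + 1
    elif padding == "same":
        out = (shape - 1) // stride + 1
    else:
        raise ValueError(f"Unsupported padding in this sketch: {padding!r}")

    # One conversion to a list of Python ints instead of int() per element.
    result = out.tolist()
    for i in none_dims:
        result[i] = None  # restore the unknown dimensions
    return tuple(result)


print(sketch_conv_output_spatial_shape((32, 32), (3, 3)))
# (30, 30)
print(sketch_conv_output_spatial_shape((None, 32), (3, 3), strides=2, padding="same"))
# (None, 16)
```

Whether `np.fromiter()` actually beats `np.array()` for such short tuples depends on the NumPy version and input sizes, so the gain is best confirmed with a benchmark on the real function rather than inferred from this sketch.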
✅ Correctness verification report:
🌀 Generated Regression Tests and Runtime
To edit these changes, `git checkout codeflash/optimize-compute_conv_output_shape-mjag63o1` and push.