⚡️ Speed up function find_last_node by 14,134%
#187
Closed
📄 14,134% (141.34x) speedup for `find_last_node` in `src/algorithms/graph.py`
⏱️ Runtime: 66.0 milliseconds → 464 microseconds (best of 250 runs)
📝 Explanation and details
The optimization transforms an O(n*m) algorithm into O(n+m), where n is the number of nodes and m the number of edges, by replacing nested iteration with set-based membership testing.
Key optimization: The original code evaluates `all(e["source"] != n["id"] for e in edges)` for each node, a nested loop that checks every edge for every node. The optimized version pre-computes `sources = {e["source"] for e in edges}` once, then uses fast set membership (`n["id"] not in sources`) for each node check.

Performance impact: Runtime drops from 66.0 milliseconds to 464 microseconds (best of 250 runs).

Why this works: Python sets use hash tables for O(1) average-case membership testing, while the original `all()` with a generator requires O(edges) time per node. For graphs with many edges, this difference compounds significantly.

Test case analysis: The optimization excels particularly on large-scale test cases like `test_large_linear_chain` and `test_large_star_topology` with 1000 nodes/edges, where the quadratic behavior of the original becomes prohibitive. Basic cases with few nodes/edges see modest improvements, but the algorithmic advantage scales with input size.

This is especially valuable for graph algorithms where edge lists can be substantial, making the function suitable for production graph processing workflows.
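For readers who want to see the two approaches side by side, here is a minimal sketch. Only the `all(...)` expression, the `sources` set, and the `not in sources` check come from the description above. The surrounding function bodies (returning the first matching node via `next(..., None)`), the `_original`/`_optimized` name suffixes, and the `make_linear_chain` helper are assumptions made for illustration and may not match the actual code in `src/algorithms/graph.py`.

```python
def find_last_node_original(nodes, edges):
    """Nested scan: for each node, walk the edge list (O(n*m) worst case)."""
    # Assumed wrapper: return the first node that is never an edge source.
    return next(
        (n for n in nodes if all(e["source"] != n["id"] for e in edges)),
        None,
    )


def find_last_node_optimized(nodes, edges):
    """Collect source ids once, then do O(1) average lookups (O(n+m))."""
    sources = {e["source"] for e in edges}
    return next((n for n in nodes if n["id"] not in sources), None)


def make_linear_chain(size):
    """Hypothetical helper: 0 -> 1 -> ... -> size-1, so the last node is size-1."""
    nodes = [{"id": i} for i in range(size)]
    edges = [{"source": i, "target": i + 1} for i in range(size - 1)]
    return nodes, edges


nodes, edges = make_linear_chain(1000)
assert find_last_node_original(nodes, edges) == find_last_node_optimized(nodes, edges)
```

One caveat with the set-based version: it requires node ids to be hashable, whereas the `!=` comparison in the nested scan does not. For string or integer ids this makes no difference.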
✅ Correctness verification report:
🌀 Generated Regression Tests and Runtime
To edit these changes, run `git checkout codeflash/optimize-find_last_node-mjar20i0` and push.