⚡️ Speed up function find_last_node by 21,817%
#185
Closed
📄 21,817% (218.17x) speedup for find_last_node in src/algorithms/graph.py
⏱️ Runtime: 181 milliseconds → 826 microseconds (best of 132 runs)
📝 Explanation and details
The optimization dramatically improves performance by eliminating quadratic complexity through a fundamental algorithmic change.
Key Optimization:
The original code uses a nested loop structure: for each node, it checks against ALL edges to verify if that node is a source. This creates O(n × m) complexity where n = nodes and m = edges. The optimized version pre-computes a set of all source IDs once, then performs constant-time lookups.
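The change described above can be sketched as follows. This is a reconstruction from the description, not the code in src/algorithms/graph.py: the function names find_last_node_nested and find_last_node_with_set, the use of next(..., None), and the sample graph are assumptions made for illustration.

```python
def find_last_node_nested(nodes, edges):
    # Assumed shape of the original: for every node, scan every edge.
    # Worst case is O(n * m) comparisons.
    return next(
        (n for n in nodes if all(e["source"] != n["id"] for e in edges)),
        None,
    )


def find_last_node_with_set(nodes, edges):
    # Build the set of source IDs once: O(m).
    source_ids = {e["source"] for e in edges}
    # Each membership test is O(1) on average, so the scan over nodes is O(n).
    return next((n for n in nodes if n["id"] not in source_ids), None)


# Hypothetical sample graph a -> b -> c: "c" is the only node that is never a source.
nodes = [{"id": "a"}, {"id": "b"}, {"id": "c"}]
edges = [{"source": "a", "target": "b"}, {"source": "b", "target": "c"}]

last_nested = find_last_node_nested(nodes, edges)
last_with_set = find_last_node_with_set(nodes, edges)
```

Both calls select the same node here; the set-based version does so with one pass over the edges and one pass over the nodes.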
Specific Changes:
- source_ids = {e["source"] for e in edges} creates a hash set of all source node IDs in O(m) time
- n["id"] not in source_ids uses O(1) hash set membership testing instead of an O(m) linear search through all edges
Why This Works:
- Set membership testing (in/not in) is O(1) in the average case vs. O(m) for the all() generator
Performance Impact:
The 218x speedup (from 181ms to 826μs) demonstrates the dramatic difference between quadratic and linear algorithms. This optimization is particularly effective for:
The optimization maintains identical behavior while being significantly more scalable for real-world graph processing workloads.
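One way to gain confidence in the "identical behavior" claim is a randomized comparison of the two approaches. The sketch below is not the PR's generated regression suite; the helper names nested, with_set, and random_graph, the graph sizes, and the seed are all assumptions chosen for illustration.

```python
import random


def nested(nodes, edges):
    # Assumed nested-scan form: check each node against every edge.
    return next(
        (n for n in nodes if all(e["source"] != n["id"] for e in edges)),
        None,
    )


def with_set(nodes, edges):
    # Set-based form: precompute source IDs, then test membership.
    source_ids = {e["source"] for e in edges}
    return next((n for n in nodes if n["id"] not in source_ids), None)


def random_graph(rng, max_nodes=8, max_edges=12):
    # Hypothetical generator: integer node IDs, edges between random nodes.
    node_count = rng.randint(0, max_nodes)
    nodes = [{"id": i} for i in range(node_count)]
    edge_count = rng.randint(0, max_edges) if node_count else 0
    edges = [
        {"source": rng.randrange(node_count), "target": rng.randrange(node_count)}
        for _ in range(edge_count)
    ]
    return nodes, edges


rng = random.Random(0)  # fixed seed so the run is repeatable
mismatches = 0
for _ in range(1000):
    nodes, edges = random_graph(rng)
    if nested(nodes, edges) != with_set(nodes, edges):
        mismatches += 1
```

A nonzero mismatches count would indicate a behavioral difference. Note that the set-based form requires every e["source"] value to be hashable, which the nested comparison does not.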
✅ Correctness verification report:
🌀 Generated Regression Tests and Runtime
To edit these changes, git checkout codeflash/optimize-find_last_node-mjamo8dz and push.