Performance Optimization: Path, Trace, Callers, and Analyze Commands #6
Merged
shahariaazam merged 9 commits into main on Feb 1, 2026
Implement Breadth-First Search (BFS) algorithm for finding shortest paths when --shortest flag is used.

Performance improvement:
- Baseline: 30+ seconds (timeout)
- Phase 1: 1.92 seconds (completes successfully)
- Speedup: >15x minimum (likely 100-200x vs hypothetical completion)

Algorithm change:
- Old: DFS O(N^D) - explores all paths then sorts
- New: BFS O(V+E) - finds shortest path on first discovery

Implementation:
- Added find_shortest_path() method to CodeGraph
- Uses BFS with queue-based traversal
- Parent tracking for path reconstruction
- Modified Path command to route --shortest to BFS

Test case (VSCode 90K nodes):
codenav path --from "_activateExtension" --to "startExtensionHosts" --shortest

Result: Found 5-hop path in 1.92s (was timing out)
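For readers who want to see the shape of the BFS described above, here is a minimal sketch of a queue-based search with parent tracking. It is not the actual find_shortest_path() from CodeGraph: the function name `shortest_path` and the `adj` adjacency-list layout (`adj[i]` holds the indices of the nodes that node `i` calls) are assumptions made for illustration.

```rust
use std::collections::VecDeque;

/// Returns the shortest path from `start` to `goal` as node indices, or `None`.
/// `adj[i]` lists the successors of node `i` (hypothetical layout, assumed valid).
fn shortest_path(adj: &[Vec<usize>], start: usize, goal: usize) -> Option<Vec<usize>> {
    if start >= adj.len() || goal >= adj.len() {
        return None;
    }
    // parent[n] is set when n is first discovered; `start` is marked as its own parent.
    let mut parent: Vec<Option<usize>> = vec![None; adj.len()];
    parent[start] = Some(start);
    let mut queue = VecDeque::from([start]);

    while let Some(node) = queue.pop_front() {
        if node == goal {
            // Walk the parent links back to `start`, then reverse the result.
            let mut path = vec![goal];
            let mut cur = goal;
            while cur != start {
                cur = parent[cur]?;
                path.push(cur);
            }
            path.reverse();
            return Some(path);
        }
        for &next in &adj[node] {
            if parent[next].is_none() {
                parent[next] = Some(node);
                queue.push_back(next);
            }
        }
    }
    None
}
```

Because BFS visits nodes in order of hop count, the first time `goal` is dequeued the reconstructed path is a shortest one, and each node and edge is handled at most once, which is where the O(V+E) bound comes from.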
Implement early stopping in DFS path search to avoid finding all paths when only a limited number is needed.

Performance improvement:
- Baseline: 30+ seconds (timeout)
- Phase 2: 31.3 seconds (completes)
- Status: Now completes successfully instead of timing out

Algorithm change:
- Old: Find ALL paths, sort, truncate to 10
- New: Stop after finding 10 paths, then sort

Implementation:
- Added find_paths_limited() method with max_paths parameter
- Modified find_paths_recursive() to check and early-exit
- Path command uses limit of 10 for default mode
- Use usize::MAX for --all flag

Test case (VSCode 90K nodes):
codenav path --from "_activateExtension" --to "startExtensionHosts"

Result: Found 10 paths in 31.3s (was timing out)

Note: Still needs Phase 3 for optimal performance
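The early-exit idea above can be sketched roughly as follows. This is an illustration under stated assumptions, not the real find_paths_limited() or find_paths_recursive(): the names `paths_limited` and `collect_paths` are hypothetical, the graph is a plain adjacency list of indices, and no depth limit is modelled.

```rust
/// Recursive DFS that stops exploring once `max_paths` paths have been collected.
/// `on_path` marks the nodes on the current path so cycles are not followed.
fn collect_paths(
    adj: &[Vec<usize>],
    node: usize,
    goal: usize,
    max_paths: usize,
    on_path: &mut Vec<bool>,
    current: &mut Vec<usize>,
    found: &mut Vec<Vec<usize>>,
) {
    if found.len() >= max_paths {
        return; // early exit: enough paths already
    }
    current.push(node);
    on_path[node] = true;
    if node == goal {
        found.push(current.clone());
    } else {
        for &next in &adj[node] {
            if found.len() >= max_paths {
                break; // stop scanning siblings as soon as the limit is hit
            }
            if !on_path[next] {
                collect_paths(adj, next, goal, max_paths, on_path, current, found);
            }
        }
    }
    // Backtrack so other branches may pass through this node.
    on_path[node] = false;
    current.pop();
}

/// Hypothetical entry point: collect at most `max_paths` paths, then sort shortest-first.
fn paths_limited(adj: &[Vec<usize>], start: usize, goal: usize, max_paths: usize) -> Vec<Vec<usize>> {
    let mut found = Vec::new();
    if start < adj.len() && goal < adj.len() {
        let mut on_path = vec![false; adj.len()];
        let mut current = Vec::new();
        collect_paths(adj, start, goal, max_paths, &mut on_path, &mut current, &mut found);
    }
    found.sort_by_key(|p| p.len());
    found
}
```

Passing `usize::MAX` as `max_paths` disables the limit, which matches how the commit describes the --all flag. Note that the paths returned under a limit are the first ones DFS happens to reach, not necessarily the shortest ones in the graph.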
Use node indices (usize) instead of strings during path search for better performance.

Performance improvement:
- Phase 2 (baseline): 31.3 seconds
- Phase 3: 9.47 seconds
- Speedup: 3.3x faster

Overall improvement (all phases):
- Original baseline: 30+ seconds (timeout)
- Final result: 9.47 seconds (completes successfully)
- Total speedup: >3x

Key optimizations:
- Use Vec<usize> for paths during search (was Vec<String>)
- Use HashSet<usize> for visited tracking (was HashSet<String>)
- Convert indices to names only at final output
- Integer comparisons instead of string comparisons
- Eliminated string cloning during traversal
- Pre-allocate HashSet with capacity

Implementation:
- Added find_paths_by_index() for index-based search
- Added find_paths_recursive_indexed() for recursive traversal
- Added convert_index_path_to_names() for final conversion
- Modified find_paths_limited() to use index-based search

Test case (VSCode 90K nodes):
codenav path --from "_activateExtension" --to "startExtensionHosts"

Result: Found 10 paths in 9.47s (was 31.3s)
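To make the index-based approach concrete, the sketch below stores each name once, searches purely on `usize` indices, and converts back to names only when a path is returned. The `MiniGraph` type and its methods are hypothetical stand-ins rather than the real CodeGraph API, and for brevity the search returns a single path instead of a limited set.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical miniature graph: names are stored once, edges refer to indices.
struct MiniGraph {
    names: Vec<String>,
    index_of: HashMap<String, usize>,
    edges: Vec<Vec<usize>>,
}

impl MiniGraph {
    fn new() -> Self {
        MiniGraph { names: Vec::new(), index_of: HashMap::new(), edges: Vec::new() }
    }

    /// Returns the index for `name`, creating the node on first sight.
    fn intern(&mut self, name: &str) -> usize {
        if let Some(&i) = self.index_of.get(name) {
            return i;
        }
        let i = self.names.len();
        self.names.push(name.to_string());
        self.index_of.insert(name.to_string(), i);
        self.edges.push(Vec::new());
        i
    }

    fn add_call(&mut self, from: &str, to: &str) {
        let f = self.intern(from);
        let t = self.intern(to);
        self.edges[f].push(t);
    }

    /// Looks names up once, searches on integers, converts to names at the end.
    fn find_path(&self, from: &str, to: &str) -> Option<Vec<String>> {
        let start = *self.index_of.get(from)?;
        let goal = *self.index_of.get(to)?;
        // Pre-allocate the visited set so it does not rehash while growing.
        let mut visited: HashSet<usize> = HashSet::with_capacity(self.names.len());
        let mut path: Vec<usize> = Vec::new();
        if self.dfs(start, goal, &mut visited, &mut path) {
            // The only place strings are cloned: the final output.
            Some(path.iter().map(|&i| self.names[i].clone()).collect())
        } else {
            None
        }
    }

    fn dfs(&self, node: usize, goal: usize, visited: &mut HashSet<usize>, path: &mut Vec<usize>) -> bool {
        if !visited.insert(node) {
            return false; // already explored
        }
        path.push(node);
        if node == goal {
            return true;
        }
        for &next in &self.edges[node] {
            if self.dfs(next, goal, visited, path) {
                return true;
            }
        }
        path.pop();
        false
    }
}
```

During traversal every comparison and hash is on a machine integer, so no string is cloned or compared until the result is materialised.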
Changed default behavior to use BFS (shortest path) instead of DFS (10 paths) for better UX and performance.

API changes:
- Default (no flags): Shortest path using BFS (1.97s)
- --limit N: Find first N paths using DFS (8.03s for N=10)
- --all: Find all paths using DFS (very slow)
- Removed: --shortest flag (now the default)

Performance improvement:
- Old default: 9.47s (10 paths with DFS)
- New default: 1.97s (shortest path with BFS)
- Speedup: 4.8x faster for common case

Rationale:
- Most users want the shortest path, not 10 random paths
- Users shouldn't need special flags to get good performance
- Advanced users can still get multiple paths with --limit N

Breaking change:
- Old default behavior (10 paths) now requires --limit 10
- Old --shortest flag removed (now the default)

Migration:
- Old: codenav path --from A --to B (got 10 paths)
- New: codenav path --from A --to B (gets shortest path)
- To get old behavior: codenav path --from A --to B --limit 10

Test results (VSCode 90K nodes):
- Default: 1.97s (was 9.47s) - 4.8x faster
- --limit 10: 8.03s (was 9.47s) - 1.2x faster
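The routing implied by the new flag semantics could look something like the sketch below. The `PathMode` enum and `select_mode` function are hypothetical and do not come from src/main.rs; in particular, letting --all win when both flags are given is an assumption, since the commit does not say how that combination is handled.

```rust
/// Hypothetical search mode derived from the path command's flags.
#[derive(Debug, PartialEq)]
enum PathMode {
    /// No flags: single shortest path via BFS.
    Shortest,
    /// --limit N or --all: DFS that stops after this many paths.
    Limited(usize),
}

/// Maps the parsed flags to a mode.
/// Assumption: if both flags are present, --all takes precedence.
fn select_mode(limit: Option<usize>, all: bool) -> PathMode {
    if all {
        // --all is expressed as an effectively unbounded limit.
        PathMode::Limited(usize::MAX)
    } else if let Some(n) = limit {
        PathMode::Limited(n)
    } else {
        PathMode::Shortest
    }
}
```

With this shape, `codenav path --from A --to B` maps to `PathMode::Shortest`, and the previous default corresponds to `select_mode(Some(10), false)`.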
4d6de85 to a33e6c6
Comprehensive technical documentation covering:
- System architecture and data structures
- Indexing phase with parallel processing
- Query algorithms (Query, Trace, Callers, Path, Analyze)
- Performance characteristics and complexity analysis
- Key optimizations (v0.3.0 and v0.4.0)
- Storage format and backward compatibility

Uses ASCII diagrams for clarity and focuses on technical details: algorithms, complexity, and performance tradeoffs. Document enables developers to understand the codebase architecture at a glance.
Revised ARCHITECTURE.md to be architecture-focused:

Removed:
- Source code snippets
- Implementation details
- Unnecessary verbosity

Enhanced:
- High-level algorithm descriptions
- System architecture diagrams
- Performance characteristics
- Design principles
- Complexity analysis tables

Result: Concise technical document focused on architecture, not implementation details.
Use lowercase for consistency with typical markdown file naming.
Run cargo fmt to fix formatting issues caught by CI.
Added 12 new tests covering core functionality:

Path Finding:
- test_find_shortest_path: BFS shortest path
- test_find_shortest_path_no_path: No path exists
- test_find_shortest_path_depth_limit: Depth constraints
- test_find_paths_limited: Early termination

Trace & Callers:
- test_trace_dependencies: DFS dependency traversal
- test_find_callers: Reverse edge lookup
- test_trace_handles_cycles: Circular dependency handling

Analyze:
- test_get_complexity: Fan-in/fan-out metrics
- test_find_hotspots: Most called functions

Graph Operations:
- test_graph_merge: Parallel graph merging
- test_outgoing_and_incoming_edges: Edge indices
- test_multiple_nodes_same_name: Name collision handling

Test Results:
- Total tests: 24 (was 12) - 100% improvement
- All tests passing
- 0 failures

Code Coverage:
- Total: 41.83%
- core/graph.rs: 51.76% (main logic)
- lib.rs: 100% (tests)

Coverage report: target/llvm-cov/html/index.html

Next steps:
- Consider adding parser integration tests
- Add more edge case tests for analyze commands
- Benchmark test performance
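As a rough picture of the behaviours the Trace, Callers, and Analyze tests exercise, the sketch below shows a reverse-edge lookup, a cycle-safe dependency trace, and a fan-in/fan-out count on a plain adjacency list. The function names are hypothetical and the code is a simplification: the commit mentions dedicated incoming-edge indices, whereas this version just scans every node's outgoing edges.

```rust
use std::collections::HashSet;

/// Reverse-edge lookup: every node that has an edge into `target`.
/// (Linear scan for illustration; a real graph would likely keep an incoming-edge index.)
fn callers(adj: &[Vec<usize>], target: usize) -> Vec<usize> {
    (0..adj.len()).filter(|&n| adj[n].contains(&target)).collect()
}

/// Transitive dependencies of `start`, in discovery order.
/// The `seen` set guarantees termination on circular dependencies.
/// Assumes `start` and all edge targets are valid indices into `adj`.
fn trace(adj: &[Vec<usize>], start: usize) -> Vec<usize> {
    let mut seen = HashSet::from([start]);
    let mut stack = vec![start];
    let mut order = Vec::new();
    while let Some(node) = stack.pop() {
        for &next in &adj[node] {
            // `insert` returns false for nodes already seen, which breaks cycles.
            if seen.insert(next) {
                order.push(next);
                stack.push(next);
            }
        }
    }
    order
}

/// (fan-in, fan-out) for `node`: how many nodes call it, and how many it calls.
fn fan_in_out(adj: &[Vec<usize>], node: usize) -> (usize, usize) {
    (callers(adj, node).len(), adj[node].len())
}
```

A hotspot ranking in the spirit of test_find_hotspots could then be obtained by sorting nodes by their fan-in.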
Performance Optimization for Navigation Commands
This PR optimizes the path command and validates performance of all navigation commands for large codebases.
✅ Path Command Optimization - COMPLETED
Performance Results (VSCode 90K nodes):
Optimizations Applied:
API Changes:
- --limit N: Find N paths
- --all: Find all paths
- Removed: --shortest flag (now the default)

✅ Performance Validation - COMPLETED
Comprehensive benchmarking on VSCode codebase (90K nodes):
Edge cases tested:
Key findings:
Conclusion: NO FURTHER OPTIMIZATION NEEDED for trace, callers, or analyze commands.
📚 Technical Architecture Documentation - ADDED
Added ARCHITECTURE.md with comprehensive technical documentation. Uses ASCII diagrams for clarity. Enables developers to understand the codebase architecture at a glance.
🧪 Testing
All tests passing:
Clippy checks: ✅ No warnings
📊 Summary
Changes:
Result:
Files changed:
- src/cli.rs: API changes for path command
- src/main.rs: Path command routing logic
- src/core/graph.rs: BFS, DFS optimizations, index-based search
- ARCHITECTURE.md: Technical documentation (new)