Performance Optimization: Path, Trace, Callers, and Analyze Commands #6

Merged
shahariaazam merged 9 commits into main from investigation
Feb 1, 2026

shahariaazam commented Feb 1, 2026

Performance Optimization for Navigation Commands

This PR optimizes the path command and validates the performance of the remaining navigation commands (trace, callers, analyze) on a large codebase.

✅ Path Command Optimization - COMPLETED

Performance Results (VSCode 90K nodes):

  • Baseline: 30+ seconds (timeout)
  • Optimized: 1.97 seconds
  • Speedup: >15x (lower bound, since the baseline timed out)

Optimizations Applied:

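  1. Phase 1: BFS for shortest path (>15x speedup over the timed-out baseline)
  2. Phase 2: Early termination in DFS when only a limited number of paths is requested
  3. Phase 3: Optimized data structures - use node indices instead of strings (3.3x speedup over Phase 2 for multi-path search)
  4. Phase 4: Smart defaults - shortest path by default (better UX, 4.8x faster than the old default)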

API Changes:

  • Default: Shortest path (no flags needed) - FAST ✅
  • --limit N: Find N paths
  • --all: Find all paths
  • Removed: --shortest flag (now the default)

✅ Performance Validation - COMPLETED

Comprehensive benchmarking on VSCode codebase (90K nodes):

| Command | Time | Status |
|---|---|---|
| path (default) | 1.97s | ✅ Optimized |
| trace (depth 1-3) | 1.5s | ✅ Already fast |
| callers | 1.5s | ✅ Already fast |
| analyze | 1.6s | ✅ Already fast |

Edge cases tested:

  • 10,707 callers for "push": 1.48s ✅
  • 6,004 callers for "map": 1.50s ✅
  • Depth 5 trace: 1.53s ✅
  • Full 90K node analysis: 2.06s ✅

Key findings:

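  • All commands complete in roughly 1.5-2.1s on the 90K-node graph
  • No performance cliffs found in the edge cases tested
  • Index-based lookups (v0.3.0) hold up at this scale
  • Load time dominates (~1.08s of ~1.5s total, so the query itself takes about 0.4s)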

Conclusion: NO FURTHER OPTIMIZATION NEEDED for trace, callers, or analyze commands.

📚 Technical Architecture Documentation - ADDED

Added ARCHITECTURE.md with comprehensive technical documentation:

  • System architecture and data structures
  • Indexing phase with parallel processing
  • Query algorithms with complexity analysis
  • Performance characteristics
  • Storage format and optimizations

The document uses ASCII diagrams for clarity and is meant to let developers understand the codebase architecture at a glance.

🧪 Testing

All tests passing:

cargo test --release
test result: ok. 12 passed; 0 failed; 1 ignored

Clippy checks: ✅ No warnings

📊 Summary

Changes:

  • Path command: 4 optimization phases
  • Architecture documentation added
  • Performance validation completed

Result:

  • Path: 30+s → 2s (>15x faster)
  • All other commands: Already fast (about 1.5-2.1s), no changes needed
  • Usable on large codebases (validated on the 90K-node VSCode graph)

Files changed:

  • src/cli.rs: API changes for path command
  • src/main.rs: Path command routing logic
  • src/core/graph.rs: BFS, DFS optimizations, index-based search
  • ARCHITECTURE.md: Technical documentation (new)

Implement a Breadth-First Search (BFS) algorithm for finding the shortest
path when the --shortest flag is used.

Performance improvement:
- Baseline: 30+ seconds (timeout)
- Phase 1: 1.92 seconds (completes successfully)
- Speedup: >15x minimum (likely 100-200x vs hypothetical completion)

Algorithm change:
- Old: DFS O(N^D) - explores all paths then sorts
- New: BFS O(V+E) - finds shortest path on first discovery

Implementation:
- Added find_shortest_path() method to CodeGraph
- Uses BFS with queue-based traversal
- Parent tracking for path reconstruction
- Modified Path command to route --shortest to BFS
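As a rough illustration of the BFS approach listed above, here is a minimal sketch. It assumes a simplified, hypothetical `CodeGraph` backed by a name-keyed adjacency map; the real struct in src/core/graph.rs is not shown in this PR, so the field layout, `new`, and `add_edge` are assumptions, and only the method name `find_shortest_path` is taken from the commit message.

```rust
use std::collections::{HashMap, VecDeque};

/// Hypothetical minimal call graph: node name -> names it calls.
pub struct CodeGraph {
    edges: HashMap<String, Vec<String>>,
}

impl CodeGraph {
    pub fn new() -> Self {
        CodeGraph { edges: HashMap::new() }
    }

    pub fn add_edge(&mut self, from: &str, to: &str) {
        self.edges.entry(from.to_string()).or_default().push(to.to_string());
    }

    /// BFS from `from`. Nodes are discovered in order of hop count, so the
    /// first time `to` is discovered the recorded path is a shortest one.
    pub fn find_shortest_path(&self, from: &str, to: &str) -> Option<Vec<String>> {
        if from == to {
            return Some(vec![from.to_string()]);
        }
        // parent[n] = node from which n was first discovered.
        // It doubles as the visited set.
        let mut parent: HashMap<&str, &str> = HashMap::new();
        let mut queue: VecDeque<&str> = VecDeque::new();
        queue.push_back(from);

        while let Some(current) = queue.pop_front() {
            if let Some(neighbors) = self.edges.get(current) {
                for next in neighbors {
                    let next = next.as_str();
                    if next == from || parent.contains_key(next) {
                        continue; // already discovered via a path at least as short
                    }
                    parent.insert(next, current);
                    if next == to {
                        // Walk the parent links back to `from`, then reverse.
                        let mut path = vec![to.to_string()];
                        let mut node = to;
                        while let Some(&p) = parent.get(node) {
                            path.push(p.to_string());
                            node = p;
                        }
                        path.reverse();
                        return Some(path);
                    }
                    queue.push_back(next);
                }
            }
        }
        None // `to` is not reachable from `from`
    }
}
```

Each node is enqueued at most once and each edge is examined at most once, which is where the O(V+E) bound comes from.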

Test case (VSCode 90K nodes):
codenav path --from "_activateExtension" --to "startExtensionHosts" --shortest
Result: Found 5-hop path in 1.92s (was timing out)

Implement early stopping in the DFS path search so that it does not
enumerate every path when only a limited number is needed.

Performance improvement:
- Baseline: 30+ seconds (timeout)
- Phase 2: 31.3 seconds (completes)
- Status: Now completes successfully instead of timing out

Algorithm change:
- Old: Find ALL paths, sort, truncate to 10
- New: Stop after finding 10 paths, then sort

Implementation:
- Added find_paths_limited() method with max_paths parameter
- Modified find_paths_recursive() to check and early-exit
- Path command uses limit of 10 for default mode
- Use usize::MAX for --all flag
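The early-exit idea can be sketched as follows. This is an illustrative version under stated assumptions, not the code from src/core/graph.rs: the `Adjacency` alias, the `build_adjacency` helper, the `Search` struct, and the `max_depth` parameter are hypothetical, and only the name `find_paths_limited` and the `max_paths` parameter come from the commit message.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical adjacency map: node name -> callee names.
pub type Adjacency = HashMap<String, Vec<String>>;

/// Hypothetical helper that builds the map from (caller, callee) pairs.
pub fn build_adjacency(edges: &[(&str, &str)]) -> Adjacency {
    let mut adj = Adjacency::new();
    for &(from, to) in edges {
        adj.entry(from.to_string()).or_default().push(to.to_string());
    }
    adj
}

/// Mutable DFS state, bundled to keep the recursive call short.
struct Search<'a> {
    adj: &'a Adjacency,
    to: &'a str,
    max_depth: usize,
    max_paths: usize,
    current: Vec<String>,
    visited: HashSet<String>,
    results: Vec<Vec<String>>,
}

impl Search<'_> {
    fn visit(&mut self, node: &str) {
        // Early exit: the path budget is already used up.
        if self.results.len() >= self.max_paths {
            return;
        }
        if node == self.to {
            self.results.push(self.current.clone());
            return;
        }
        // `current` holds len - 1 edges; stop once the depth budget is spent.
        if self.current.len() > self.max_depth {
            return;
        }
        let adj = self.adj; // copy the shared reference so `self` stays free to mutate
        if let Some(neighbors) = adj.get(node) {
            for next in neighbors {
                if self.results.len() >= self.max_paths {
                    return; // stop scanning siblings as soon as the budget is hit
                }
                if !self.visited.insert(next.clone()) {
                    continue; // already on the current path
                }
                self.current.push(next.clone());
                self.visit(next);
                self.current.pop();
                self.visited.remove(next);
            }
        }
    }
}

/// Collects at most `max_paths` simple paths of at most `max_depth` edges,
/// then sorts the ones found by length. Pass `usize::MAX` for "all paths".
pub fn find_paths_limited(
    adj: &Adjacency,
    from: &str,
    to: &str,
    max_depth: usize,
    max_paths: usize,
) -> Vec<Vec<String>> {
    let mut search = Search {
        adj,
        to,
        max_depth,
        max_paths,
        current: vec![from.to_string()],
        visited: HashSet::from([from.to_string()]),
        results: Vec::new(),
    };
    search.visit(from);
    let mut results = search.results;
    results.sort_by_key(|path| path.len());
    results
}
```

Because the search stops at the first `max_paths` hits, the returned paths are the first ones DFS happens to reach, not necessarily the shortest ones overall.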

Test case (VSCode 90K nodes):
codenav path --from "_activateExtension" --to "startExtensionHosts"
Result: Found 10 paths in 31.3s (was timing out)

Note: Still needs Phase 3 for optimal performance

Use node indices (usize) instead of strings during path search for
better performance.

Performance improvement:
- Phase 2 (baseline): 31.3 seconds
- Phase 3: 9.47 seconds
- Speedup: 3.3x faster

Overall improvement (all phases):
- Original baseline: 30+ seconds (timeout)
- Final result: 9.47 seconds (completes successfully)
- Total speedup: >3x

Key optimizations:
- Use Vec<usize> for paths during search (was Vec<String>)
- Use HashSet<usize> for visited tracking (was HashSet<String>)
- Convert indices to names only at final output
- Integer comparisons instead of string comparisons
- Eliminated string cloning during traversal
- Pre-allocate HashSet with capacity

Implementation:
- Added find_paths_by_index() for index-based search
- Added find_paths_recursive_indexed() for recursive traversal
- Added convert_index_path_to_names() for final conversion
- Modified find_paths_limited() to use index-based search
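A minimal sketch of the index-based search might look like the following. The method names `find_paths_by_index`, `find_paths_recursive_indexed`, and `convert_index_path_to_names` are taken from the list above, but their signatures, the `IndexedGraph` struct, `intern`, `add_edge`, and `find_paths` are assumptions made for illustration, and the initial `HashSet` capacity is an arbitrary choice.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical graph that interns node names into dense indices.
#[derive(Default)]
pub struct IndexedGraph {
    names: Vec<String>,               // index -> name
    index_of: HashMap<String, usize>, // name -> index
    outgoing: Vec<Vec<usize>>,        // index -> callee indices
}

impl IndexedGraph {
    fn intern(&mut self, name: &str) -> usize {
        if let Some(&i) = self.index_of.get(name) {
            return i;
        }
        let i = self.names.len();
        self.names.push(name.to_string());
        self.index_of.insert(name.to_string(), i);
        self.outgoing.push(Vec::new());
        i
    }

    pub fn add_edge(&mut self, from: &str, to: &str) {
        let f = self.intern(from);
        let t = self.intern(to);
        self.outgoing[f].push(t);
    }

    /// DFS over indices only: no string hashing or cloning in the hot loop.
    pub fn find_paths_by_index(&self, from: usize, to: usize, max_paths: usize) -> Vec<Vec<usize>> {
        let mut results = Vec::new();
        let mut current = vec![from];
        let mut visited = HashSet::with_capacity(64); // pre-allocated; size is arbitrary
        visited.insert(from);
        self.find_paths_recursive_indexed(from, to, max_paths, &mut current, &mut visited, &mut results);
        results
    }

    fn find_paths_recursive_indexed(
        &self,
        node: usize,
        to: usize,
        max_paths: usize,
        current: &mut Vec<usize>,
        visited: &mut HashSet<usize>,
        results: &mut Vec<Vec<usize>>,
    ) {
        if results.len() >= max_paths {
            return;
        }
        if node == to {
            results.push(current.clone());
            return;
        }
        for &next in &self.outgoing[node] {
            if results.len() >= max_paths {
                return;
            }
            if !visited.insert(next) {
                continue; // already on the current path
            }
            current.push(next);
            self.find_paths_recursive_indexed(next, to, max_paths, current, visited, results);
            current.pop();
            visited.remove(&next);
        }
    }

    /// Names are materialized only once, for the final output.
    pub fn convert_index_path_to_names(&self, path: &[usize]) -> Vec<String> {
        path.iter().map(|&i| self.names[i].clone()).collect()
    }

    /// Name-based entry point: resolve names once, search by index, convert back.
    pub fn find_paths(&self, from: &str, to: &str, max_paths: usize) -> Vec<Vec<String>> {
        match (self.index_of.get(from), self.index_of.get(to)) {
            (Some(&f), Some(&t)) => self
                .find_paths_by_index(f, t, max_paths)
                .iter()
                .map(|p| self.convert_index_path_to_names(p))
                .collect(),
            _ => Vec::new(), // unknown node name
        }
    }
}
```

The design choice here is to pay for string work only at the boundaries: one lookup per endpoint on the way in and one clone per path node on the way out.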

Test case (VSCode 90K nodes):
codenav path --from "_activateExtension" --to "startExtensionHosts"
Result: Found 10 paths in 9.47s (was 31.3s)

Changed default behavior to use BFS (shortest path) instead of DFS
(10 paths) for better UX and performance.

API changes:
- Default (no flags): Shortest path using BFS (1.97s)
- --limit N: Find first N paths using DFS (8.03s for N=10)
- --all: Find all paths using DFS (very slow)
- Removed: --shortest flag (now the default)

Performance improvement:
- Old default: 9.47s (10 paths with DFS)
- New default: 1.97s (shortest path with BFS)
- Speedup: 4.8x faster for common case

Rationale:
- Most users want the shortest path, not 10 random paths
- Users shouldn't need special flags to get good performance
- Advanced users can still get multiple paths with --limit N

Breaking change:
- Old default behavior (10 paths) now requires --limit 10
- Old --shortest flag removed (now the default)

Migration:
- Old: codenav path --from A --to B (got 10 paths)
- New: codenav path --from A --to B (gets shortest path)
- To get old behavior: codenav path --from A --to B --limit 10
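The flag-to-algorithm routing described above could be modeled roughly as below. This is a sketch, not the actual src/main.rs logic: `PathMode`, `select_path_mode`, and `dfs_max_paths` are hypothetical names, and letting --all take precedence when both flags are given is an assumption, since the commit message does not say how that combination is handled.

```rust
/// Hypothetical search mode derived from the `path` command's flags.
#[derive(Debug, PartialEq)]
pub enum PathMode {
    /// No flags: shortest path via BFS (the new default).
    Shortest,
    /// `--limit N`: first N paths via DFS with early termination.
    Limited(usize),
    /// `--all`: every path via DFS.
    All,
}

/// Maps parsed flag values to a mode.
/// Assumption: `--all` wins if both flags are present.
pub fn select_path_mode(limit: Option<usize>, all: bool) -> PathMode {
    if all {
        PathMode::All
    } else if let Some(n) = limit {
        PathMode::Limited(n)
    } else {
        PathMode::Shortest
    }
}

impl PathMode {
    /// Path budget handed to the DFS search; `None` means BFS is used instead.
    pub fn dfs_max_paths(&self) -> Option<usize> {
        match self {
            PathMode::Shortest => None,
            PathMode::Limited(n) => Some(*n),
            PathMode::All => Some(usize::MAX),
        }
    }
}
```

Under this model, the old default corresponds to `select_path_mode(Some(10), false)`, which matches the `--limit 10` migration line above.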

Test results (VSCode 90K nodes):
- Default: 1.97s (was 9.47s) - 4.8x faster
- --limit 10: 8.03s (was 9.47s) - 1.2x faster

Add comprehensive technical documentation (ARCHITECTURE.md) covering:

- System architecture and data structures
- Indexing phase with parallel processing
- Query algorithms (Query, Trace, Callers, Path, Analyze)
- Performance characteristics and complexity analysis
- Key optimizations (v0.3.0 and v0.4.0)
- Storage format and backward compatibility

Uses ASCII diagrams for clarity and focuses on technical
details: algorithms, complexity, and performance tradeoffs.

The document is meant to let developers understand the codebase
architecture at a glance.

Revised ARCHITECTURE.md to be architecture-focused:

Removed:
- Source code snippets
- Implementation details
- Unnecessary verbosity

Enhanced:
- High-level algorithm descriptions
- System architecture diagrams
- Performance characteristics
- Design principles
- Complexity analysis tables

Result: Concise technical document focused on architecture,
not implementation details.

Use a lowercase file name for consistency with typical markdown file naming.

Run cargo fmt to fix formatting issues caught by CI.

Added 12 new tests covering core functionality:

Path Finding:
- test_find_shortest_path: BFS shortest path
- test_find_shortest_path_no_path: No path exists
- test_find_shortest_path_depth_limit: Depth constraints
- test_find_paths_limited: Early termination

Trace & Callers:
- test_trace_dependencies: DFS dependency traversal
- test_find_callers: Reverse edge lookup
- test_trace_handles_cycles: Circular dependency handling
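To make the behaviors these three tests target more concrete, here is a small sketch of a reverse-edge lookup and a depth-limited, cycle-safe DFS trace. Everything in it is hypothetical (the `CallGraph` struct, `add_call`, and the exact signatures of `find_callers` and `trace_dependencies`); the function names are only inferred from the test names, and the PR does not show the real implementations.

```rust
use std::collections::HashMap;

/// Hypothetical graph keeping both edge directions indexed by name.
#[derive(Default)]
pub struct CallGraph {
    outgoing: HashMap<String, Vec<String>>, // caller -> callees
    incoming: HashMap<String, Vec<String>>, // callee -> callers
}

impl CallGraph {
    pub fn add_call(&mut self, caller: &str, callee: &str) {
        self.outgoing.entry(caller.to_string()).or_default().push(callee.to_string());
        self.incoming.entry(callee.to_string()).or_default().push(caller.to_string());
    }

    /// Reverse edge lookup: one hash lookup, no graph traversal.
    pub fn find_callers(&self, name: &str) -> &[String] {
        match self.incoming.get(name) {
            Some(callers) => callers.as_slice(),
            None => &[],
        }
    }

    /// Lists every node reachable from `start` within `max_depth` calls,
    /// in DFS discovery order. Terminates on circular dependencies.
    pub fn trace_dependencies(&self, start: &str, max_depth: usize) -> Vec<String> {
        // seen[n] = largest remaining depth budget n was expanded with.
        let mut seen: HashMap<String, usize> = HashMap::new();
        seen.insert(start.to_string(), max_depth);
        let mut order = Vec::new();
        self.trace_from(start, max_depth, &mut seen, &mut order);
        order
    }

    fn trace_from(
        &self,
        node: &str,
        remaining: usize,
        seen: &mut HashMap<String, usize>,
        order: &mut Vec<String>,
    ) {
        if remaining == 0 {
            return;
        }
        if let Some(callees) = self.outgoing.get(node) {
            for callee in callees {
                let budget = remaining - 1;
                match seen.get(callee) {
                    // Already expanded with at least this much budget (covers cycles).
                    Some(&prev) if prev >= budget => continue,
                    // Reached earlier via a longer route: expand again with more budget.
                    Some(_) => {}
                    // First discovery: record it in the output.
                    None => order.push(callee.clone()),
                }
                seen.insert(callee.clone(), budget);
                self.trace_from(callee, budget, seen, order);
            }
        }
    }
}
```

Tracking the remaining budget per node, rather than a plain visited set, keeps the depth limit exact: a node first reached at the depth boundary is expanded again if a shorter route to it shows up later.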

Analyze:
- test_get_complexity: Fan-in/fan-out metrics
- test_find_hotspots: Most called functions
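For readers unfamiliar with the metrics named here, the sketch below shows one plausible way to compute fan-in/fan-out and rank the most called functions. The `Complexity` struct and both function signatures are assumptions; the names `get_complexity` and `find_hotspots` are only inferred from the test names, and breaking ties by name is an arbitrary choice made to keep the output deterministic.

```rust
use std::collections::HashMap;

/// Hypothetical per-function metrics.
#[derive(Debug, Default, Clone, Copy, PartialEq)]
pub struct Complexity {
    pub fan_in: usize,  // number of call edges pointing at the function
    pub fan_out: usize, // number of call edges leaving the function
}

/// Counts fan-in and fan-out from a list of (caller, callee) edges.
pub fn get_complexity(edges: &[(&str, &str)]) -> HashMap<String, Complexity> {
    let mut metrics: HashMap<String, Complexity> = HashMap::new();
    for &(caller, callee) in edges {
        metrics.entry(caller.to_string()).or_default().fan_out += 1;
        metrics.entry(callee.to_string()).or_default().fan_in += 1;
    }
    metrics
}

/// Returns the `top_n` functions with the highest fan-in,
/// ordered by fan-in descending, then by name.
pub fn find_hotspots(edges: &[(&str, &str)], top_n: usize) -> Vec<(String, usize)> {
    let mut ranked: Vec<(String, usize)> = get_complexity(edges)
        .into_iter()
        .map(|(name, c)| (name, c.fan_in))
        .collect();
    ranked.sort_by(|a, b| b.1.cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    ranked.truncate(top_n);
    ranked
}
```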

Graph Operations:
- test_graph_merge: Parallel graph merging
- test_outgoing_and_incoming_edges: Edge indices
- test_multiple_nodes_same_name: Name collision handling

Test Results:
- Total tests: 24 (was 12), a 100% increase
- All tests passing
- 0 failures

Code Coverage:
- Total: 41.83%
- core/graph.rs: 51.76% (main logic)
- lib.rs: 100% (tests)

Coverage report: target/llvm-cov/html/index.html

Next steps:
- Consider adding parser integration tests
- Add more edge case tests for analyze commands
- Benchmark test performance

shahariaazam merged commit 60fabb4 into main on Feb 1, 2026
6 checks passed
shahariaazam deleted the investigation branch on February 1, 2026 17:17