perf: optimize core trajectory and clustering bottlenecks by svdrecbd · Pull Request #773 · cole-trapnell-lab/monocle3

svdrecbd · 2026-01-26T06:59:08Z

Summary

This PR optimizes monocle3's most computationally intensive steps to improve scalability on large datasets (e.g., 20k+ cells/genes). It replaces slow R-based loops with vectorized C++ implementations for trajectory learning and marker detection, with measured speedups of roughly 40x and 490x respectively in the benchmarks below.

Additionally, it integrates BPCells support and Multi-Scale Embedding updates from the development branch, ensuring the package is ready for massive, out-of-core single-cell analysis.

Benchmark Results (M-Series Mac)
The benchmarks checked correctness (difference from the baseline output < 1e-10) and measured performance across dataset scales.

| Task | Baseline (Master) | Optimized (PR) | Speedup |
| --- | --- | --- | --- |
| Marker Detection (20k genes) | 5.92s | 0.012s | 493x |
| Trajectory Learning (20k cells) | 4.04s | 0.100s | 40x |
| Graph Construction (200k cells) | 2.08s | 0.718s | 3x |

File Change Atlas

The PR affects ~49 files, broken down as follows:

The Performance Engine (C++)

src/clustering.cpp: (Critical) Added project_point_to_graph (fast geometric projection) and calc_specificity_cpp (vectorized entropy calculation). Optimized jaccard_coeff to reduce memory churn.
src/RcppExports.cpp: Auto-generated bindings.
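
The geometric step behind a cell-to-graph projection can be illustrated with a point-to-segment projection. This is a minimal sketch, not the PR's code: `project_point_to_segment` is a hypothetical helper, and the real `project_point_to_graph` signature and its Rcpp types are not shown in this description. It assumes each graph edge is a straight segment between two node coordinates.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper (not the PR's actual signature): returns the point on
// segment [a, b] that is closest to p. All three vectors must have the same
// length (the embedding dimension).
std::vector<double> project_point_to_segment(const std::vector<double>& p,
                                             const std::vector<double>& a,
                                             const std::vector<double>& b) {
    double ab_dot_ap = 0.0;  // (b - a) . (p - a)
    double ab_dot_ab = 0.0;  // |b - a|^2
    for (std::size_t i = 0; i < a.size(); ++i) {
        const double ab = b[i] - a[i];
        ab_dot_ap += ab * (p[i] - a[i]);
        ab_dot_ab += ab * ab;
    }

    // t is the position along the segment; clamping to [0, 1] keeps the
    // projection between the endpoints. A degenerate edge (a == b) maps to a.
    double t = 0.0;
    if (ab_dot_ab > 0.0) {
        t = std::max(0.0, std::min(1.0, ab_dot_ap / ab_dot_ab));
    }

    std::vector<double> out(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) {
        out[i] = a[i] + t * (b[i] - a[i]);
    }
    return out;
}
```

Running this once per cell against its candidate edges is a plain loop over contiguous doubles, which is the kind of work that tends to be much cheaper in compiled code than in an interpreted R loop.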

The Bottleneck Fixes (R)

R/learn_graph.R: project2MST now delegates the O(N) loop to C++.
R/find_markers.R: specificity_matrix now uses the C++ implementation for Jensen-Shannon divergence.
R/cluster_cells.R: cluster_cells_make_graph now constructs graphs directly from integer vectors (avoiding data.frame overhead).
R/graph_test.R: Applied similar make_graph optimizations for consistency.
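
For readers unfamiliar with Jensen-Shannon-based specificity, here is a minimal sketch of the per-gene calculation. It assumes specificity is defined as one minus the JS distance (square root of the divergence, in bits) between a gene's normalized expression across groups and an idealized marker expressed only in the target group. The function names are hypothetical, and the PR's `calc_specificity_cpp` may differ in definition and interface.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Shannon entropy in bits; zero-probability entries contribute nothing.
double shannon_entropy_bits(const std::vector<double>& p) {
    double h = 0.0;
    for (double x : p) {
        if (x > 0.0) h -= x * std::log2(x);
    }
    return h;
}

// Hypothetical helper: specificity of one gene for group `target`, given the
// gene's non-negative mean expression in each group. `target` must be a valid
// index into `mean_expr`.
double js_specificity(const std::vector<double>& mean_expr, std::size_t target) {
    double total = 0.0;
    for (double x : mean_expr) total += x;
    if (total <= 0.0) return 0.0;  // assumption: an unexpressed gene scores 0

    const std::size_t n = mean_expr.size();
    std::vector<double> p(n), m(n);
    for (std::size_t i = 0; i < n; ++i) {
        p[i] = mean_expr[i] / total;                 // observed distribution
        const double q = (i == target) ? 1.0 : 0.0;  // idealized marker
        m[i] = 0.5 * (p[i] + q);                     // mixture of the two
    }

    // JSD(p, q) = H(m) - (H(p) + H(q)) / 2, and H(q) = 0 for a one-hot q.
    double jsd = shannon_entropy_bits(m) - 0.5 * shannon_entropy_bits(p);
    jsd = std::max(0.0, std::min(1.0, jsd));  // guard against rounding error
    return 1.0 - std::sqrt(jsd);
}
```

Under this definition a gene expressed only in the target group scores 1, and a gene expressed only in some other group scores 0.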

BPCells & Infrastructure

R/matrix.R: Updates for BPCells on-disk matrix handling.
R/reduce_dimensions.R: Updates for Multi-Scale Embeddings (ResNet50 L3+L4).
R/pca.R, R/projection.R: Support for larger, out-of-core datasets.

Testing & Glue

R/RcppExports.R: R wrappers for the new C++ functions.
tests/testthat/: Updated tests to ensure parity with the new optimized logic.
Fixed deprecated igraph::graph.data.frame calls throughout the codebase.

Validation

Passed R CMD check --as-cran (vignettes excluded because LaTeX is not installed locally).
Numerical parity verified against baseline R implementations.
No regressions in existing functionality.
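
A parity check of the kind described here can be sketched as an element-wise comparison of baseline and optimized outputs. The helper names below are hypothetical and the PR's actual test code is not shown in this description; the 1e-10 default mirrors the tolerance quoted in the benchmark section.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Largest absolute element-wise difference between two result vectors.
// Returns infinity if the lengths differ or any difference is NaN, so a
// malformed comparison can never pass a tolerance check.
double max_abs_diff(const std::vector<double>& baseline,
                    const std::vector<double>& optimized) {
    const double inf = std::numeric_limits<double>::infinity();
    if (baseline.size() != optimized.size()) return inf;

    double worst = 0.0;
    for (std::size_t i = 0; i < baseline.size(); ++i) {
        const double d = std::fabs(baseline[i] - optimized[i]);
        if (std::isnan(d)) return inf;
        worst = std::max(worst, d);
    }
    return worst;
}

// True when the optimized output matches the baseline within `tol`.
bool within_tolerance(const std::vector<double>& baseline,
                      const std::vector<double>& optimized,
                      double tol = 1e-10) {
    return max_abs_diff(baseline, optimized) < tol;
}
```

An absolute tolerance suits values on a bounded scale such as specificity scores; outputs with a wide dynamic range may call for a relative tolerance instead.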

- Ported `project2MST` cell-to-graph projection to C++ (~40x speedup)
- Vectorized `specificity_matrix` JS divergence in C++ (~500x speedup)
- Refactored graph construction to use integer-based `make_graph` (~3-9x speedup)
- Eliminated `igraph` deprecation warnings
- Validated numerical parity with baseline implementations

svdrecbd wants to merge 1 commit into cole-trapnell-lab:master from svdrecbd:performance-optimizations
