rlm-cli uses Cargo features to provide optional functionality and reduce binary size for specific use cases.
**fastembed-embeddings**

What it does: Enables semantic search using FastEmbed ONNX-based embedding models.

Dependencies:

- `fastembed` crate (ONNX Runtime binaries)
- BGE-M3 embedding model (1024 dimensions)
Use when:
- You need semantic similarity search
- Context-aware document retrieval is important
- Hybrid search (semantic + BM25) is required
Binary size impact: ~100MB (includes ONNX runtime + model weights)
Build:

```shell
# Enabled by default
cargo build --release

# Explicitly enable
cargo build --release --features fastembed-embeddings
```

Skip when:
- You only need keyword/regex search
- Binary size is critical (embedded systems, containers)
- BM25 full-text search is sufficient
Build without:

```shell
cargo build --release --no-default-features
```

**usearch-hnsw**

What it does: Enables high-performance vector search using the HNSW (Hierarchical Navigable Small World) algorithm.
Dependencies:

- `usearch` crate v2.23.x–v2.24.x from crates.io (pinned `<2.25`)
- Requires a C++ compiler (C++17 or later)
Note: Version 2.25.0+ is excluded pending validation. See Troubleshooting for details.
Use when:
- Working with large document collections (>10,000 chunks)
- Low-latency vector search is required (<10ms)
- Memory usage is acceptable (HNSW index ~4x embedding size)
Performance:
- Exact search (SQLite): O(n) - 100ms for 10K chunks
- HNSW search: O(log n) - <10ms for 10K chunks
Build:

```shell
cargo build --release --features usearch-hnsw
```

Skip when:
- Document collection is small (<1,000 chunks)
- Build environment lacks C++ toolchain
- Approximate nearest neighbor trade-offs are unacceptable
**full-search**

What it does: Combines fastembed-embeddings + usearch-hnsw for complete semantic search capabilities.
Use when:
- Production deployment with large-scale semantic search
- Maximum search performance is required
- You want the complete feature set
Build:

```shell
cargo build --release --features full-search
```

| Features | Embedding | Vector Search | BM25 | Use Case |
|---|---|---|---|---|
| (none) | ❌ | ❌ | ✅ | Keyword search only, minimal binary |
| fastembed-embeddings | ✅ | Exact (SQLite) | ✅ | Hybrid search, moderate scale |
| usearch-hnsw | ❌ | ❌ | ✅ | No embeddings, BM25 only |
| full-search | ✅ | HNSW | ✅ | Production, large scale |
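How a binary behaves under each feature combination can be sketched with Rust's `cfg!` macro. This is an illustrative sketch, not rlm-cli's actual internals; the function names are hypothetical, but the feature names match the table above.

```rust
// Hypothetical sketch of feature-gated capability detection.
// cfg!(feature = "...") evaluates to true only when the named
// Cargo feature was enabled at compile time.

fn embeddings_available() -> bool {
    cfg!(feature = "fastembed-embeddings")
}

fn hnsw_available() -> bool {
    cfg!(feature = "usearch-hnsw")
}

/// Describe the search capability of this build, mirroring the table rows.
fn describe_build() -> &'static str {
    match (embeddings_available(), hnsw_available()) {
        (true, true) => "full-search: embeddings + HNSW",
        (true, false) => "hybrid search with exact (SQLite) vectors",
        (false, _) => "BM25 keyword search only",
    }
}

fn main() {
    println!("search capability: {}", describe_build());
}
```

Because the check happens at compile time, a build without features carries no embedding or HNSW code at all, which is what keeps the minimal binary small.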
Without fastembed-embeddings, semantic search commands fall back to BM25-only:

```shell
# This command requires embeddings
rlm-cli search "query" --mode semantic
# Error: FastEmbed not available, falling back to BM25
# Suggestion: Rebuild with --features fastembed-embeddings
```

The CLI will automatically use hash-based pseudo-embeddings for compatibility, but results will be degraded.
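A hash-based pseudo-embedding can be sketched as follows. This is an assumption about the general technique, not rlm-cli's actual fallback code: each dimension is derived by hashing the text with the dimension index, so identical strings always map to identical vectors, but semantically related texts share no similarity, which is why results degrade.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Illustrative hash-based pseudo-embedding (hypothetical; the real
/// fallback in rlm-cli may differ). Deterministic per input string,
/// but carries no semantic information.
fn pseudo_embedding(text: &str, dims: usize) -> Vec<f32> {
    (0..dims)
        .map(|i| {
            let mut h = DefaultHasher::new();
            text.hash(&mut h);
            i.hash(&mut h);
            // Map the 64-bit hash into [-1.0, 1.0].
            (h.finish() as f64 / u64::MAX as f64 * 2.0 - 1.0) as f32
        })
        .collect()
}

fn main() {
    let v = pseudo_embedding("hello world", 8);
    println!("pseudo-embedding: {:?}", v);
}
```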
Without usearch-hnsw, vector search uses exact SQLite-based cosine similarity:

```shell
rlm-cli search "query" --mode hybrid --top-k 100
# Uses exact search - slower but accurate
```

Performance degrades linearly with chunk count.
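Conceptually, an exact scan computes cosine similarity against every stored embedding and keeps the top-k, which is why latency grows linearly with chunk count. A minimal sketch (function names are illustrative, not rlm-cli's API):

```rust
/// Cosine similarity between two vectors of equal length.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 { 0.0 } else { dot / (na * nb) }
}

/// Exact O(n) scan: score every chunk, sort, keep the k best.
fn top_k(query: &[f32], chunks: &[Vec<f32>], k: usize) -> Vec<(usize, f32)> {
    let mut scored: Vec<(usize, f32)> = chunks
        .iter()
        .enumerate()
        .map(|(i, c)| (i, cosine(query, c)))
        .collect();
    scored.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    scored.truncate(k);
    scored
}

fn main() {
    let chunks = vec![vec![1.0, 0.0], vec![0.0, 1.0], vec![0.7, 0.7]];
    let best = top_k(&[1.0, 0.0], &chunks, 2);
    println!("top results: {:?}", best);
}
```

HNSW avoids the full scan by walking a layered proximity graph, trading a small amount of recall for O(log n) lookups.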
```shell
# Smallest binary, keyword search only
cargo build --release --no-default-features
# Result: ~5MB binary, no embedding dependencies
```

```shell
# Default: FastEmbed embeddings + SQLite vector search
cargo build --release
# Result: ~100MB binary, hybrid search, moderate scale
```

```shell
# Full features: FastEmbed + HNSW
cargo build --release --features full-search
# Result: ~105MB binary, maximum performance
```

```dockerfile
# Dockerfile example - minimal size
FROM rust:1.88-slim AS builder
WORKDIR /app
COPY . .
RUN cargo build --release --no-default-features

FROM debian:bookworm-slim
COPY --from=builder /app/target/release/rlm-cli /usr/local/bin/
CMD ["rlm-cli"]
```

When first running with embeddings enabled:
```shell
rlm-cli load document.md --name docs
# Downloads BGE-M3 model (~1GB) to ~/.cache/fastembed/
# Progress: Downloading model... 100%
# Generating embeddings... Done (5000 chunks in 30s)
```

Model cache location: `$HOME/.cache/fastembed/`

Download size: ~1GB (one-time)
Check which features are compiled:

```shell
rlm-cli --version
# Output:
# rlm-cli 1.2.4
# Features: fastembed-embeddings, usearch-hnsw
```

Error: `error: failed to compile usearch`
Solution: Install a C++ compiler:

```shell
# Ubuntu/Debian
sudo apt-get install build-essential

# macOS
xcode-select --install

# Or disable HNSW
cargo build --release --features fastembed-embeddings
```

Error: `ONNX Runtime not found`
Solution: Use the bundled binaries (enabled by default):

```shell
# Explicitly enable bundled ONNX
cargo build --release --features fastembed-embeddings
```

Issue: Embedding generation is slow
Solutions:

- Use `--chunker parallel` for multi-threaded chunking
- Reduce chunk size: `--chunk-size 50000` (default: 100k)
- Check CPU resources (embedding uses all cores)
Issue: High memory usage during search
Solutions:

- Without HNSW: Memory = chunk_count × 1024 × 4 bytes
- With HNSW: Memory = chunk_count × 1024 × 16 bytes (includes index)
- Use `--top-k` to limit the result set: `--top-k 10`
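The two formulas above can be turned into a quick back-of-envelope estimate (4 bytes per f32 dimension; the HNSW figure folds in graph overhead). A small sketch with hypothetical helper names:

```rust
/// Raw vector storage: chunk_count x 1024 dims x 4 bytes per f32.
fn embedding_bytes(chunks: u64) -> u64 {
    chunks * 1024 * 4
}

/// With HNSW: chunk_count x 1024 x 16 bytes (vectors plus index).
fn hnsw_bytes(chunks: u64) -> u64 {
    chunks * 1024 * 16
}

fn main() {
    let chunks = 10_000;
    println!("raw vectors: {} MB", embedding_bytes(chunks) / 1_000_000);
    println!("with HNSW:   {} MB", hnsw_bytes(chunks) / 1_000_000);
}
```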
Benchmark: 50,000 chunks, BGE-M3 embeddings (1024d)
| Configuration | Search Time | Memory | Binary Size |
|---|---|---|---|
| No features | 200ms (BM25) | 50MB | 5MB |
| fastembed-embeddings | 800ms (exact) | 250MB | 100MB |
| full-search | 8ms (HNSW) | 450MB | 105MB |
Recommendation: Use fastembed-embeddings (default) for most use cases. Enable usearch-hnsw only for large-scale deployments (>10K chunks).
- Architecture - How features integrate with core systems
- CLI Reference - Commands affected by feature flags
- Plugin Integration - Using features with Claude Code