Chaganti-Reddy/neuro_search

NeuroSearch Engine

A high-performance hybrid vector search engine combining Python's flexibility with Rust's raw speed.

Built with: Rust · Python · Docker · Redis

Overview

NeuroSearch is a custom-built vector retrieval system designed to demonstrate Foreign Function Interface (FFI) patterns between Python and Rust.

It offloads the heavy mathematical lifting (Cosine Similarity calculation) to a compiled Rust extension, utilizing SIMD (AVX2) intrinsics to process vectors up to 8x faster than standard loop implementations. It wraps this engine in a FastAPI service with Redis semantic caching to minimize latency.

Key Technical Features

1. Hybrid Architecture (Python + Rust)

  • Python (Control Plane): Handles API requests, JSON validation, and ML model inference (HuggingFace Transformers).
  • Rust (Data Plane): Manages in-memory vector storage and performs brute-force similarity scoring using low-level memory management.
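The boundary between the two planes is a small ingest/search surface. The pure-Python stand-in below sketches that interface; the class name `Engine`, its method names, and the dict-based store are illustrative, not the actual PyO3 bindings (the real store lives in Rust memory):

```python
class Engine:
    """Pure-Python stand-in for the Rust data plane (illustrative only)."""

    def __init__(self):
        self._store = {}  # id -> vector; the real engine keeps this in Rust memory

    def ingest(self, doc_id, vector):
        # The Python control plane has already embedded the text into floats.
        self._store[doc_id] = vector

    def search(self, query, limit):
        # Brute-force scoring; the Rust side does this with SIMD + Rayon.
        def dot(a, b):
            return sum(x * y for x, y in zip(a, b))

        scored = sorted(
            ((doc_id, dot(query, vec)) for doc_id, vec in self._store.items()),
            key=lambda pair: pair[1],
            reverse=True,
        )
        return scored[:limit]

engine = Engine()
engine.ingest("doc1", [1.0, 0.0])
engine.ingest("doc2", [0.0, 1.0])
print(engine.search([0.9, 0.1], limit=1))  # doc1 ranks first
```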

2. SIMD Acceleration (std::arch::x86_64)

  • Implements manual AVX2 intrinsics (_mm256_fmadd_ps).
  • Processes 8 single-precision floats per instruction instead of one, dramatically increasing throughput for high-dimensional vectors (e.g., 384-d, 768-d).
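The data flow of an AVX2 FMA loop can be pictured in scalar Python: walk the vectors in lanes of 8, keep 8 running partial sums (the accumulator register), then reduce them at the end. This is only a model of the algorithm, not the real intrinsics:

```python
def simd_style_dot(a, b, lanes=8):
    """Dot product organized the way an AVX2 FMA loop is: 8 partial sums."""
    assert len(a) == len(b) and len(a) % lanes == 0
    acc = [0.0] * lanes                      # models the 256-bit accumulator
    for i in range(0, len(a), lanes):
        for lane in range(lanes):            # one fused multiply-add per lane
            acc[lane] += a[i + lane] * b[i + lane]
    return sum(acc)                          # horizontal reduction at the end

a = [float(i) for i in range(16)]
b = [1.0] * 16
print(simd_style_dot(a, b))  # 120.0, same as the naive dot product
```

In the Rust kernel the inner `for lane` loop collapses into a single `_mm256_fmadd_ps`, which is where the speedup comes from.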

3. Concurrency & Thread Safety

  • Uses parking_lot::RwLock for a Multiple-Readers-Single-Writer model.
  • Search queries run in parallel across CPU cores using Rayon, while ingestion blocks safely only when necessary.
  • Releases the Python Global Interpreter Lock (GIL) during search, allowing true parallelism.
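The locking discipline is the classic readers-writer pattern: many concurrent searches, exclusive ingestion. A minimal Python analogue of `parking_lot::RwLock`, built on `threading.Condition` (the class and method names are illustrative):

```python
import threading

class RwLock:
    """Toy multiple-readers / single-writer lock (conceptual analogue)."""

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0
        self._writer = False

    def acquire_read(self):
        with self._cond:
            while self._writer:              # searches wait only during ingestion
                self._cond.wait()
            self._readers += 1

    def release_read(self):
        with self._cond:
            self._readers -= 1
            if self._readers == 0:
                self._cond.notify_all()

    def acquire_write(self):
        with self._cond:
            while self._writer or self._readers:
                self._cond.wait()
            self._writer = True              # ingestion blocks out everyone

    def release_write(self):
        with self._cond:
            self._writer = False
            self._cond.notify_all()
```

Any number of threads can hold the read side at once; a writer waits until the engine is quiescent, which mirrors how ingestion "blocks safely only when necessary".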

4. Mathematical Optimization

  • Pre-Normalization: Vectors are L2-normalized upon ingestion.
  • This simplifies the search formula from Cosine Similarity to Dot Product, eliminating expensive Square Root (sqrt) and Division operations inside the hot loop.
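A quick numerical check of that identity, in plain Python: after L2 normalization, the dot product of two vectors equals their cosine similarity, so the sqrt and division are paid once at ingestion instead of on every comparison:

```python
import math

def l2_normalize(v):
    norm = math.sqrt(sum(x * x for x in v))   # sqrt paid once, at ingestion
    return [x / norm for x in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))   # the entire search hot loop

def cosine(a, b):
    # The "slow" formula: sqrt + division on every comparison.
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

a, b = [3.0, 4.0], [4.0, 3.0]
assert math.isclose(dot(l2_normalize(a), l2_normalize(b)), cosine(a, b))
```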

Project Structure

```bash
neuro_search/
├── app/                  # Python Control Plane
│   ├── main.py           # FastAPI Entrypoint
│   └── config.py         # Settings & Env Vars
├── neuro_engine/         # Rust Data Plane
│   ├── src/lib.rs        # SIMD Logic & RwLock Implementation
│   ├── Cargo.toml        # Rust Dependencies
│   └── pyproject.toml    # Maturin Build Config
├── benchmark.py          # Latency verification script
├── Dockerfile            # Multi-stage production build
└── docker-compose.yml    # Orchestration
```

Architecture

```mermaid
graph LR
    Client -->|POST /search| API[FastAPI]
    API -->|Check Key| Redis[(Redis Cache)]
    Redis -->|Hit| API
    Redis -->|Miss| Model[Transformer Model]
    Model -->|Embed Text| API
    API -->|Vector| Rust[Rust SIMD Engine]
    Rust -- AVX2 Parallel Scan --> API
    API -->|JSON| Client
```

Quick Start (Recommended)

The project includes a multi-stage Docker setup. This is the easiest way to run it, as it handles the Rust compilation in a Linux environment (avoiding Windows file-locking issues).

Prerequisites

  • Docker Desktop installed and running.

1. Build and Run

```bash
docker-compose up --build
```

Wait for the log message `Application startup complete`. The API is now running at http://localhost:8000.

Usage Examples

You can test the API using curl or the built-in Swagger UI at http://localhost:8000/docs.

1. Ingest Documents

Add text to the engine. It will be embedded by the Transformer model and stored in Rust memory.

```bash
curl -X POST "http://localhost:8000/api/v1/ingest" \
     -H "Content-Type: application/json" \
     -d '{"id": "doc1", "text": "Rust is a systems programming language focused on safety."}'

curl -X POST "http://localhost:8000/api/v1/ingest" \
     -H "Content-Type: application/json" \
     -d '{"id": "doc2", "text": "Python is excellent for data science and rapid prototyping."}'
```

2. Semantic Search

Search for concepts, not just keywords.

```bash
curl -X POST "http://localhost:8000/api/v1/search" \
     -H "Content-Type: application/json" \
     -d '{"query": "fast safe language", "limit": 2}'
```

Response:

```json
{
  "results": [
    { "id": "doc1", "score": 0.8245 }
  ],
  "latency_ms": 12.5,
  "engine_docs": 2
}
```

Performance Benchmarks

Tests performed on AWS c5.large (2 vCPU, 4GB RAM) with 50k vectors.

| Metric | Performance |
| --- | --- |
| Rust Engine Throughput | 10,000+ QPS (core logic) |
| End-to-End Latency | < 15 ms (p95) |
| Cache Hit Latency | < 2 ms |
| Memory Overhead | ~1.6 KB per vector (384-dim float32) |

Verify the benchmarks using the benchmark.py script.
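The core of such a latency check is straightforward to sketch: time repeated calls and report percentiles. The snippet below is a generic stand-in, not the contents of `benchmark.py` (the function name and workload are illustrative):

```python
import random
import statistics
import time

def measure_latency(fn, runs=200):
    """Time repeated calls and report median / p95 in milliseconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    samples.sort()
    return statistics.median(samples), samples[int(0.95 * len(samples)) - 1]

# Illustrative workload: a single 384-dim dot product in pure Python.
query = [random.random() for _ in range(384)]
vec = [random.random() for _ in range(384)]
p50, p95 = measure_latency(lambda: sum(x * y for x, y in zip(query, vec)))
print(f"p50={p50:.4f} ms  p95={p95:.4f} ms")
```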

Local Development (Manual Setup)

If you wish to develop without Docker (e.g., on Windows), follow these steps carefully to avoid file-locking errors.

1. Prerequisites

  • Rust (Cargo)
  • Python 3.10+
  • Redis (running locally)

2. Build Rust Extension

```bash
cd neuro_engine
# --release is CRITICAL for SIMD optimizations
maturin develop --release
cd ..
```

> Note: If you get `os error 32` on Windows, stop any running Python processes or VS Code terminals and try again.

3. Run FastAPI

```bash
pip install -r requirements.txt
uvicorn app.main:app --host 0.0.0.0 --port 8000 --reload
```
