Skip to content

rhel-lightspeed/docs2db-api

Repository files navigation

Docs2DB-API

Query a Docs2DB RAG database with modern retrieval techniques. Docs2DB-API provides a Python library for hybrid search (vector + BM25) with reranking.

What it does:

  • Queries RAG databases created by docs2db
  • Hybrid search: combines vector similarity with BM25 full-text search
  • Reciprocal Rank Fusion (RRF) for result combination
  • Cross-encoder reranking for improved result quality
  • Question refinement for query expansion
  • Universal RAG engine adaptable to multiple API frameworks

What it's for:

  • Building RAG applications and agents
  • Adding document search to LLM systems
  • Serving RAG APIs (FastAPI, LlamaStack, custom frameworks)

Installation

uv add docs2db-api

Quickstart

Step 1: Create a database with docs2db

uv tool install docs2db
docs2db pipeline /path/to/documents

This creates ragdb_dump.sql.

Step 2: Restore and query

# Start database
uv run docs2db-api db-start

# Restore dump
uv run docs2db-api db-restore ragdb_dump.sql

# Check status
uv run docs2db-api db-status

Step 3: Use in your application

import asyncio
from docs2db_api.rag.engine import UniversalRAGEngine, RAGConfig

async def main():
    # Initialize engine with defaults (auto-detects database from environment)
    engine = UniversalRAGEngine()
    await engine.start()
    
    # # Or with specific settings
    # config = RAGConfig(
    #     model_name="granite-30m-english",
    #     max_chunks=5,
    #     similarity_threshold=0.7
    # )
    # db_config = {
    #     "host": "localhost",
    #     "port": "5432",
    #     "database": "ragdb",
    #     "user": "postgres",
    #     "password": "postgres"
    # }
    # engine = UniversalRAGEngine(config=config, db_config=db_config)
    # await engine.start()
    
    # Search
    result = await engine.search_documents("How do I configure authentication?")
    for doc in result.documents:
        print(f"Score: {doc['similarity_score']:.3f}")
        print(f"Source: {doc['document_path']}")
        print(f"Text: {doc['text'][:200]}...\n")

asyncio.run(main())

LlamaStack Integration

Docs2DB-API includes a native LlamaStack tool provider for agent-based RAG. See the complete demo with setup scripts and examples:

📁 demos/llama-stack/ - LlamaStack RAG tool provider with agent demos

(this needs to be adjusted to work from pypi)

Configuration

Database Configuration

Configuration precedence (highest to lowest):

  1. CLI arguments: --host, --port, --db, --user, --password
  2. Environment variables: DOCS2DB_DB_HOST, DOCS2DB_DB_PORT, DOCS2DB_DB_DATABASE, DOCS2DB_DB_USER, DOCS2DB_DB_PASSWORD
  3. DOCS2DB_DB_URL: postgresql://user:pass@host:port/database
  4. postgres-compose.yml in current directory
  5. Defaults: localhost:5432, user=postgres, password=postgres, db=ragdb

Examples:

# Use defaults
uv run docs2db-api db-status

# Environment variables
export DOCS2DB_DB_HOST=prod.example.com
export DOCS2DB_DB_DATABASE=mydb
uv run docs2db-api db-status

# DOCS2DB_DB_URL (cloud providers)
export DOCS2DB_DB_URL="postgresql://user:pass@host:5432/db"
uv run docs2db-api db-status

# CLI arguments
uv run docs2db-api db-status --host localhost --db mydb

Note: Don't mix DOCS2DB_DB_URL with individual DOCS2DB_DB_* variables.

LLM Configuration (Query Refinement)

Configure the LLM used for query refinement:

export DOCS2DB_LLM_BASE_URL=http://localhost:11434      # OpenAI-compatible API (e.g., Ollama)
export DOCS2DB_LLM_MODEL=qwen2.5:7b-instruct            # Model name
export DOCS2DB_LLM_TIMEOUT=30.0                         # HTTP timeout (seconds)
export DOCS2DB_LLM_TEMPERATURE=0.7                      # Generation temperature
export DOCS2DB_LLM_MAX_TOKENS=500                       # Max tokens per response

RAG Configuration

RAG settings control retrieval behavior (similarity thresholds, reranking, refinement, etc.) and can be stored in the database or provided at query time.

Available Settings

  • refinement_prompt - Custom prompt for query refinement
  • enable_refinement (refinement) - Enable question refinement (true/false)
  • enable_reranking (reranking) - Enable cross-encoder reranking (true/false)
  • similarity_threshold - Similarity threshold 0.0-1.0
  • max_chunks - Maximum chunks to return
  • max_tokens_in_context - Maximum tokens in context window
  • refinement_questions_count - Number of refined questions to generate

Configuration Precedence (highest to lowest)

  1. Query parameters - Passed directly to engine.search_documents() or CLI --threshold, --limit, etc.
  2. RAGConfig object - Provided when initializing UniversalRAGEngine
  3. Database settings - Stored in database via docs2db config command (see docs2db)
  4. Code defaults - Built-in fallback values

Commands

Database Management

docs2db-api db-start               # Start PostgreSQL with Podman/Docker
docs2db-api db-stop                # Stop PostgreSQL (data preserved)
docs2db-api db-destroy             # Stop and delete all data
docs2db-api db-status              # Check connection and stats
docs2db-api db-restore <file>      # Restore database from dump
docs2db-api manifest               # Generate list of documents

Querying

# Basic search
docs2db-api query "How do I configure authentication?"

# Advanced options
docs2db-api query "deployment guide" \
  --model granite-30m-english \
  --limit 20 \
  --threshold 0.8 \
  --no-refine                     # Disable question refinement

RAG Features

Docs2DB-API implements modern retrieval techniques:

  • Contextual chunks - LLM-generated context situating each chunk within its document (Anthropic's approach)
  • Hybrid search - Combines BM25 (lexical) and vector embeddings (semantic)
  • Reciprocal Rank Fusion (RRF) - Intelligent result combination
  • Cross-encoder reranking - Improved result quality
  • Question refinement - Query expansion for better matches
  • PostgreSQL full-text search - tsvector with GIN indexing for BM25
  • pgvector similarity - Fast vector search with HNSW indexes
  • Universal RAG engine - Adaptable to multiple API frameworks

License

See LICENSE for details.

About

No description, website, or topics provided.

Resources

License

Stars

Watchers

Forks

Packages

No packages published

Contributors 2

  •  
  •