An AI application that lets users ask questions about their documents using an LLM via the OpenAI API, with comprehensive RAG triad evaluation through TruLens integration.
- Document Ingestion: Automatic parsing, chunking, metadata extraction, embedding generation and storage
- Hybrid Search: Combines sparse (keyword) and dense (vector) search for optimal retrieval
- Contextual Retrieval: Returns most relevant chunks based on user queries
- Chat & Completions: Abstracts retrieval complexity with prompt engineering
- RAG Triad Evaluation: Real-time scoring of Context Relevance, Groundedness, and Answer Relevance
- TruLens Integration: Comprehensive evaluation framework with detailed metrics
- Performance Tracking: Latency, token usage, and cost monitoring
- Visual Analytics: Interactive charts and dashboards for metrics visualization
- Dynamic Configuration: Real-time adjustment of chunk size (100-2000 tokens) and top-k (1-20)
- Answer Style Toggle: Choose between Concise, Balanced, or Explanatory responses
- Temperature Control: Adjust LLM creativity and determinism
- Evaluation Toggle: Enable/disable real-time evaluation (disable for faster responses)
- Chat History: Persistent conversation storage with search capabilities
- Real-time Evaluation: Display RAG triad scores for each response
- Document Upload: Multi-format support (PDF, DOCX, TXT, MD)
- Status Monitoring: System health and performance indicators
- LlamaIndex: RAG pipeline framework with vector stores and embeddings
- Qdrant: High-performance vector database for semantic search
- OpenAI API: LLM integration for response generation
- TruLens: Evaluation framework for RAG triad metrics
- Gradio: Modern web interface with real-time updates
- Context Relevance: How relevant retrieved chunks are to the query
- Groundedness: How well the answer is supported by retrieved context
- Answer Relevance: How relevant the answer is to the original question
- RAG Triad Score: Combined score (average of the three metrics)
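
The combined score above is a plain average of the three metrics; a minimal sketch (function name is hypothetical, not from the codebase):

```python
def rag_triad_score(context_relevance: float,
                    groundedness: float,
                    answer_relevance: float) -> float:
    """Average the three RAG triad metrics, each scored in [0, 1]."""
    return (context_relevance + groundedness + answer_relevance) / 3

# Example: strong grounding but a weakly relevant answer
print(rag_triad_score(0.9, 0.95, 0.6))  # ≈ 0.82
```
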
- Vector Search: Semantic similarity using embeddings
- Lexical Search: BM25 algorithm for keyword matching
- Score Fusion: Intelligent combination of both approaches
- Dynamic Weighting: Adaptive scoring based on query characteristics
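
Score fusion and dynamic weighting could look roughly like the sketch below; the min-max normalization and the `alpha` heuristic are assumptions, not the app's exact logic:

```python
def fuse_scores(bm25: dict[str, float], dense: dict[str, float],
                alpha: float = 0.5) -> dict[str, float]:
    """Weighted fusion of lexical (BM25) and semantic (dense) scores.

    alpha near 1.0 favors semantic similarity; near 0.0 favors exact
    keyword matches. Scores are min-max normalized per retriever so
    the two scales are comparable.
    """
    def normalize(scores: dict[str, float]) -> dict[str, float]:
        if not scores:
            return {}
        lo, hi = min(scores.values()), max(scores.values())
        span = (hi - lo) or 1.0
        return {doc: (s - lo) / span for doc, s in scores.items()}

    bm25_n, dense_n = normalize(bm25), normalize(dense)
    docs = bm25_n.keys() | dense_n.keys()
    return {d: alpha * dense_n.get(d, 0.0) + (1 - alpha) * bm25_n.get(d, 0.0)
            for d in docs}

# Dynamic weighting: e.g., raise alpha for long natural-language
# questions, lower it for short keyword-style queries.
```
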
- Python 3.13+
- OpenAI API key
- Git
- Clone the repository:

```bash
git clone https://github.com/brainboost/enhanced-rag-ui.git
cd enhanced-rag-ui
```

- Create a virtual environment:
```bash
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
```

- Install dependencies:
```bash
# Using uv (recommended)
pip install uv
uv sync

# Or using pip
pip install -r requirements.txt
```

- Configure the environment:
```bash
# Copy environment template
cp .env.example .env

# Edit with your configuration
nano .env
```

Required environment variables:

```bash
OPENAI_API_KEY=your_openai_api_key_here
```

- Create sample documents (optional):
```bash
# with uv (preferred)
uv run app.py --create-samples

# or with plain Python
python app.py --create-samples
```

- Launch the application:
```bash
# with uv (preferred)
uv run app.py

# old school
python app.py
```

The application will be available at http://localhost:7860.
- Upload Documents: Use the file upload interface to ingest your documents
- Configure Settings: Adjust chunk size, top-k, and answer style as needed
- Ask Questions: Type your questions in the chat interface
- View Evaluation: Check real-time RAG triad scores for each response
- Small chunks (100-300): Better for precise information retrieval
- Medium chunks (300-800): Good balance of context and precision
- Large chunks (800-2000): Better for complex queries requiring more context
- Low (1-3): High precision, lower recall
- Medium (4-8): Balanced precision and recall
- High (9-20): High recall, lower precision
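
Both knobs map directly onto standard LlamaIndex components; a hedged sketch assuming the usual `VectorStoreIndex` setup (not the app's exact pipeline):

```python
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex
from llama_index.core.node_parser import SentenceSplitter

# Chunk size/overlap control how documents are split before embedding.
# Requires OPENAI_API_KEY for the default OpenAI embeddings.
splitter = SentenceSplitter(chunk_size=512, chunk_overlap=50)

docs = SimpleDirectoryReader("data/documents").load_data()
index = VectorStoreIndex.from_documents(docs, transformations=[splitter])

# Top-k controls how many chunks come back per query
retriever = index.as_retriever(similarity_top_k=5)
chunks = retriever.retrieve("How does the warranty handle water damage?")
```
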
- Concise: Brief, direct answers with essential information only
- Balanced: Well-structured answers with sufficient detail
- Explanatory: Detailed answers with explanations and context
- 0.8-1.0: Excellent performance
- 0.6-0.8: Good performance
- 0.4-0.6: Moderate performance
- 0.2-0.4: Poor performance
- 0.0-0.2: Very poor performance
- Context Relevance: Quality of retrieved documents
- Groundedness: Factual accuracy based on provided context
- Answer Relevance: How well the response addresses the query
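
The bands above translate directly into a lookup; a tiny hypothetical helper:

```python
def interpret_score(score: float) -> str:
    """Map a metric score in [0, 1] to the qualitative bands above."""
    for threshold, label in [(0.8, "Excellent"), (0.6, "Good"),
                             (0.4, "Moderate"), (0.2, "Poor")]:
        if score >= threshold:
            return label
    return "Very poor"

print(interpret_score(0.71))  # "Good"
```
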
- `settings.yaml`: Main configuration file
- `.env`: Environment variables and API keys
```yaml
# Server settings
server:
  port: 7860
  host: 0.0.0.0

# LLM configuration
llm:
  mode: openai
  model: gpt-5
  temperature: 0.1
  max_tokens: 1024

# Embedding settings
embedding:
  mode: openai
  model: text-embedding-3-small
  embed_dim: 1536

# Vector store
vectorstore:
  database: qdrant

# Ingestion settings
ingestion:
  chunk_size: 512
  chunk_overlap: 50
```
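
How these values are consumed at startup isn't shown here; a minimal loading sketch assuming PyYAML (the real `utils/config_manager.py` may expose a richer interface):

```python
import yaml  # PyYAML

def load_settings(path: str = "settings.yaml") -> dict:
    """Parse the YAML configuration shown above into a dict."""
    with open(path, encoding="utf-8") as fh:
        return yaml.safe_load(fh)

settings = load_settings()
chunk_size = settings["ingestion"]["chunk_size"]  # 512
model = settings["llm"]["model"]                  # "gpt-5"
```
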
- Real-time Scores: Live RAG triad evaluation
- Performance Charts: Latency and cost trends over time
- Usage Statistics: Token consumption and API costs
- Export Capabilities: Download metrics in JSON or CSV format
- Latency Monitoring: Track response times across configurations
- Cost Tracking: Monitor OpenAI API usage and expenses
- Quality Trends: Analyze evaluation scores over time
- A/B Testing: Compare different configuration settings
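
A hedged sketch of the kind of record `utils/metrics_tracker.py` (see the project structure below) might keep per query; field names are hypothetical:

```python
from dataclasses import dataclass, field

@dataclass
class QueryRecord:
    latency_s: float        # end-to-end response time
    prompt_tokens: int      # tokens sent to the LLM
    completion_tokens: int  # tokens generated
    cost_usd: float         # estimated API cost

@dataclass
class MetricsTracker:
    records: list[QueryRecord] = field(default_factory=list)

    def log(self, record: QueryRecord) -> None:
        self.records.append(record)

    def avg_latency(self) -> float:
        """Mean latency across logged queries (0.0 if none)."""
        if not self.records:
            return 0.0
        return sum(r.latency_s for r in self.records) / len(self.records)
```
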
```bash
# Install test dependencies
uv pip install -e .[dev]

# Run all tests
pytest

# Run with coverage
pytest --cov=.

# Run a specific test file
pytest tests/test_evaluation.py
```

- Unit tests for evaluation functions
- Integration tests for RAG pipeline
- UI component testing
- Configuration validation tests
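
For flavor, a minimal pytest unit test in the spirit of `tests/test_evaluation.py` (illustrative only; the helper is the hypothetical triad average from earlier, not the project's actual API):

```python
import pytest

def rag_triad_score(cr: float, g: float, ar: float) -> float:
    return (cr + g + ar) / 3

def test_triad_score_is_plain_average():
    assert rag_triad_score(0.9, 0.6, 0.3) == pytest.approx(0.6)

def test_perfect_scores_yield_one():
    assert rag_triad_score(1.0, 1.0, 1.0) == 1.0
```
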
```bash
# Production mode
uv run app.py                      # default port 7860
uv run app.py --port 8080 --share  # custom port, public share link

# With custom configuration
uv run app.py --config production.yaml
```

```dockerfile
FROM python:3.13-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .
EXPOSE 7860
CMD ["python", "app.py", "--port", "7860"]
```

- Development: Use the `.env` file
- Production: Use environment variables or secret management
- Docker: Pass via `-e` flags or docker-compose
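
For development, the key can be loaded fail-fast from `.env`; a sketch assuming python-dotenv (the app's actual startup code may differ):

```python
import os

from dotenv import load_dotenv  # pip install python-dotenv

load_dotenv()  # reads .env if present; harmless no-op otherwise

api_key = os.getenv("OPENAI_API_KEY")
if not api_key:
    raise RuntimeError("OPENAI_API_KEY is not set; copy .env.example to .env")
```
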
- Python Version: Ensure Python 3.13+ is installed
- Dependencies: Try `uv sync --frozen` before running
- Environment: Verify all required environment variables are set
- High Latency: Reduce chunk size or top-k value
- Poor Scores: Increase temperature or adjust answer style
- Memory Usage: Use smaller chunk sizes for large documents
- Zero Scores: Check OpenAI API key and connectivity
- Inconsistent Results: Verify document ingestion completed successfully
- Missing Metrics: Ensure evaluation is enabled in configuration
```bash
# Enable debug logging
export LOG_LEVEL=DEBUG

# Run with verbose output
python app.py --config debug.yaml
```
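
One plausible way `LOG_LEVEL` could be wired into Python's logging module (a sketch; the app's actual logging setup isn't shown in this README):

```python
import logging
import os

# basicConfig accepts level names like "DEBUG" directly
logging.basicConfig(
    level=os.getenv("LOG_LEVEL", "INFO").upper(),
    format="%(asctime)s %(levelname)s %(name)s: %(message)s",
)
logging.getLogger(__name__).debug("Debug logging enabled")
```
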
```text
enhanced-rag-ui/
├── app.py                     # Main application entry point
├── requirements.txt           # Python dependencies
├── settings.yaml              # Main configuration
├── evaluation/                # Evaluation modules
│   ├── trulens_integration.py
│   └── feedback_functions.py
├── ui/                        # User interface components
│   ├── chat_interface.py
│   └── components.py
├── utils/                     # Utility modules
│   ├── config_manager.py
│   └── metrics_tracker.py
└── data/documents/            # Document storage
```
This project is licensed under the MIT License - see the LICENSE file for details.
- LlamaIndex - RAG framework
- TruLens - Evaluation framework
- Gradio - UI framework
- Qdrant - Vector database
- OpenAI - LLM API