Skip to content
/ RAG Public

An enterprise grade RAG (Retrieval Augmented Generation) system built with Spring Boot 3.x, demonstrating advanced backend engineering and modern AI integration.

Notifications You must be signed in to change notification settings

texican/RAG

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 

Repository files navigation

BYO RAG System

Build Your Own Retrieval Augmented Generation System

Java Spring Boot Spring AI Version Tests License

🎯 Project Overview

BYO RAG System is a comprehensive AI-powered knowledge platform that shows you how to build your own enterprise-grade RAG solution from the ground up. This complete implementation demonstrates the intersection of modern software architecture and artificial intelligence through a fully-realized microservices ecosystem that enables intelligent document processing and conversational AI capabilities.

What is RAG?

Retrieval Augmented Generation (RAG) combines the power of large language models with your organization's private knowledge base, enabling AI-powered question answering over your documents while maintaining complete data privacy and control.

Key Value Propositions

  • 🏒 Multi-tenant Architecture: Complete data isolation for multiple organizations
  • πŸ”’ Enterprise Security: JWT-based authentication with role-based access control
  • ⚑ High Performance: Sub-200ms query responses with vector similarity search
  • πŸ“„ Document Intelligence: Extract insights from PDF, DOCX, TXT, MD, and HTML files
  • πŸ” Hybrid Search: Combines semantic understanding with keyword precision
  • πŸš€ Production Ready: Containerized microservices with monitoring and observability
  • πŸ§ͺ Testing Infrastructure: 99% test coverage - 594/600 functional tests passing across all services

βœ… Current Status: Complete BYO RAG system with all 6 microservices implemented and tested. Docker deployment ready. View detailed status

⚑ Quick Links

First-Time Setup

Common Tasks

Troubleshooting

Deployment

πŸ“š Table of Contents

πŸ—οΈ Architecture & Design

This system implements a microservices architecture with complete multi-tenant isolation, demonstrating enterprise-grade patterns and modern cloud-native design principles.

Core Architecture Principles

  • Domain-Driven Design: Each microservice owns its domain and data
  • Event-Driven Processing: Asynchronous operations via Apache Kafka (future implementation)
  • Optimized Persistence: Consolidated database architecture with single PostgreSQL and Redis instances
  • Security-First: JWT authentication with tenant-scoped data access
  • Observability: Comprehensive monitoring and distributed tracing

Microservices Overview

πŸ” Auth Service (Port 8081)    β†’ JWT authentication & tenant management
πŸ“„ Document Service (Port 8082) β†’ File processing & text extraction
πŸ” Embedding Service (Port 8083) β†’ Vector generation & similarity search
πŸ€– RAG Core Service (Port 8084)  β†’ LLM integration & query processing
βš™οΈ  Admin Service (Port 8085)    β†’ Administrative operations & analytics

πŸ—οΈ Architecture & Design


> **Note**: The API Gateway has been bypassed in favor of direct service access. See [ADR-001: Bypass API Gateway](docs/development/ADR-001-BYPASS-API-GATEWAY.md) for rationale.

### Refined Data Architecture

**Database Architecture** (Updated Nov 2025):
- **Single PostgreSQL Database**: Uses environment-based naming
  - Development: `byo_rag_local`
  - GCP Dev: `byo_rag_dev`
  - Pattern: `byo_rag_{env}`
  - Used by: Auth Service, Document Service, Admin Service
  - Simplified deployment and maintenance
  
- **Single Redis Database**: DB 0 with key prefixes
  - Pattern: `byo_rag_{env}:{service}:{key}`
  - Used by: Embedding Service (vector storage), Core Service (caching)
  - Complete tenant isolation via key prefixes
  
- **Services Using Synchronous REST**:
  - All inter-service communication currently via REST APIs
  - Kafka integration planned for future async operations

**Key Benefits**:
- βœ… Simplified connection management
- βœ… Lower infrastructure costs (~$206/year savings)
- βœ… Smaller Docker images (~100-160MB reduction per service)
- βœ… Maintained complete multi-tenant isolation
- βœ… 99% test pass rate (594/600 functional tests)

## 🌟 Key Features

### πŸ” Enterprise Security & Multi-Tenancy
- **Complete Data Isolation**: Each tenant's data is fully segregated
- **JWT-Based Authentication**: Secure, stateless authentication
- **Role-Based Access Control**: ADMIN, USER, and READER permissions
- **Audit Logging**: Complete traceability of all operations

### πŸ“„ Intelligent Document Processing
- **Multi-Format Support**: PDF, DOCX, TXT, Markdown, HTML
- **Smart Text Extraction**: Apache Tika-powered content analysis
- **Configurable Chunking**: Optimized for different document types
- **Asynchronous Processing**: Non-blocking operations via Kafka events

### πŸ€– Advanced RAG Pipeline  
- **Multiple Embedding Models**: OpenAI, local models, custom implementations
- **Vector Similarity Search**: Redis-powered with tenant isolation
- **Hybrid Search Strategy**: Semantic + keyword search combination
- **LLM Integration**: Support for OpenAI GPT models and local Ollama
- **Streaming Responses**: Real-time answer generation
- **Enterprise Error Handling**: βœ… **Complete** - Comprehensive retry mechanisms, circuit breakers, dead letter queues, and monitoring

### πŸ“Š Administration & Analytics
- **Multi-Tenant Management**: βœ… **Complete** - Full tenant lifecycle operations
- **User Administration**: βœ… **Complete** - Database-backed user management with roles
- **Usage Analytics**: βœ… **Complete** - Comprehensive reporting and monitoring
- **Health Monitoring**: βœ… **Complete** - Deep service health checks and diagnostics
- **Docker Deployment**: βœ… **Complete** - All 6 services operational with health monitoring

## πŸš€ Quick Start Guide

### Prerequisites
- **Java 21+** (OpenJDK recommended)
- **Maven 3.8+**
- **Docker & Docker Compose**
- **Git** for version control

### 1️⃣ Setup Your Environment
```bash
# Clone the repository
git clone https://github.com/texican/RAG.git
cd RAG

# Install development tools (git hooks, etc.)
./scripts/setup/install-dev-tools.sh

# Build all services
make build-all

# Start all services
make start

# Verify all services are running
make status

⚠️ IMPORTANT: Do not use manual docker build or docker restart commands. Always use make rebuild SERVICE=name. See CONTRIBUTING.md for details.

2️⃣ Build and Run Services

Option 1: Docker Compose (Recommended)

# Start all services
docker-compose up -d

# Check system health
./scripts/utils/health-check.sh

# View all service status
make status

Option 2: Individual Maven Services

# Build all modules
mvn clean install

# Run each service in a separate terminal
cd rag-auth-service && mvn spring-boot:run        # Port 8081 - Authentication
cd rag-document-service && mvn spring-boot:run    # Port 8082 - Document Processing
cd rag-embedding-service && mvn spring-boot:run   # Port 8083 - Vector Operations
cd rag-core-service && mvn spring-boot:run        # Port 8084 - RAG Pipeline
cd rag-admin-service && mvn spring-boot:run       # Port 8085 - Admin Operations

πŸ’‘ Development Quick Commands

The project includes a Makefile for common development tasks:

# Rebuild a single service (rebuilds JAR + Docker image + restarts container)
make rebuild SERVICE=rag-auth

# Rebuild with no cache (for stubborn issues)
make rebuild-nc SERVICE=rag-auth

# View logs in real-time
make logs SERVICE=rag-auth

# Show all services status
make status

# Start/stop all services
make start
make stop

# Run tests
make test SERVICE=rag-auth

See make help for all available commands or consult docs/development/DOCKER_DEVELOPMENT.md for detailed Docker workflow guidance.

3️⃣ Verify Installation

Service Status:

Service Health Check URL Port Status
Auth Service http://localhost:8081/actuator/health 8081 βœ… Healthy
Document Service http://localhost:8082/actuator/health 8082 βœ… Healthy
Embedding Service http://localhost:8083/actuator/health 8083 βœ… Healthy
Core Service http://localhost:8084/actuator/health 8084 βœ… Healthy
Admin Service http://localhost:8085/admin/api/actuator/health 8085 βœ… Running

Infrastructure Services:

Service URL Status Notes
PostgreSQL localhost:5432 βœ… Healthy Single DB: byo_rag_{env} pattern
Redis Stack localhost:6379 βœ… Healthy Single DB 0 with key prefixes
Apache Kafka localhost:9092 πŸ”„ Future Work Event streaming planned (see Roadmap)
Ollama LLM localhost:11434 πŸ”„ Optional Local LLM inference
Grafana http://localhost:3000 (admin/admin) βœ… Working Monitoring dashboards
Prometheus http://localhost:9090 βœ… Working Metrics collection

4️⃣ Explore the APIs

Interactive API Documentation (Recommended):

# Start with public access (no credentials needed)
open http://localhost:8082/swagger-ui.html  # Document Service

# Authenticated APIs (username: user, see guide for passwords)
open http://localhost:8084/swagger-ui.html  # Core Service
open http://localhost:8083/swagger-ui.html  # Embedding Service
open http://localhost:8085/admin/api/swagger-ui.html  # Admin Service

πŸ”‘ Access Credentials: See docs/deployment/SWAGGER_UI_ACCESS_GUIDE.md for complete login credentials and troubleshooting

Test Using curl (Alternative):

# 1. Check service health
curl http://localhost:8081/actuator/health  # Auth Service
curl http://localhost:8082/actuator/health  # Document Service

# 2. Create admin user (first time only)
./scripts/db/create-admin-user.sh

# 3. Login via Auth Service (direct)
curl -X POST http://localhost:8081/auth/login \
  -H "Content-Type: application/json" \
  -d '{
    "email": "[email protected]",
    "password": "admin123"
  }'

# 4. Use the returned JWT token for authenticated requests
TOKEN="your-jwt-token-here"
curl -X GET http://localhost:8085/admin/api/tenants \
  -H "Authorization: Bearer $TOKEN"

Direct Service Testing:

# Run comprehensive system test
./scripts/tests/test-system.sh

# Check service status
./scripts/utils/service-status.sh

πŸŽ“ What's Next?

Now that your BYO RAG system is running, here's how to start using it:

1. Upload Your First Document

# Login and get JWT token
TOKEN=$(curl -s -X POST http://localhost:8081/auth/login \
  -H "Content-Type: application/json" \
  -d '{"email":"[email protected]","password":"admin123"}' | jq -r '.token')

# Upload a document
curl -X POST http://localhost:8082/documents/upload \
  -H "Authorization: Bearer $TOKEN" \
  -F "[email protected]" \
  -F "metadata={\"title\":\"My First Document\"}"

2. Query the RAG System

# Ask a question about your documents
curl -X POST http://localhost:8084/rag/query \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"query":"What is this document about?","maxResults":5}'

3. Configure LLM Models

  • OpenAI: Set OPENAI_API_KEY in your environment or application.yml
  • Ollama: Install locally and configure endpoint in rag-core-service/application.yml
  • See ollama-chat/README.md for local LLM setup

4. Explore Advanced Features

  • Multi-tenancy: Create additional tenants via Admin Service API
  • User Management: Add users with different roles (ADMIN, USER, READER)
  • Monitoring: Access Grafana dashboards at http://localhost:3000
  • Analytics: View usage statistics via Admin Service endpoints

5. Deploy to Production

  • GCP: Follow GCP Cloud Deployment guide
  • Other Clouds: Adapt Kubernetes manifests in k8s/ directory
  • Security: Review Security section for production hardening

πŸ“š Next Steps Resources:

πŸ“Š Development Status

Current Status: Production-Ready RAG System with GCP Deployment βœ…

All 6 microservices operational with comprehensive testing, API documentation, Docker deployment, and GCP cloud deployment ready.

Services Overview

Service Status Tests API Docs Docker GCP
rag-shared βœ… Complete βœ… 90/90 N/A βœ… Library N/A
rag-auth-service βœ… Complete ⚠️ 111/114 (97%) βœ… Swagger UI βœ… Production βœ… Ready
rag-document-service βœ… Complete βœ… 77/77 (100%) βœ… Swagger UI βœ… Production βœ… Ready
rag-embedding-service βœ… Complete ⚠️ 209/214 (98%) βœ… Swagger UI βœ… Production βœ… Ready
rag-admin-service βœ… Complete βœ… 77/77 (100%) βœ… Swagger UI βœ… Production βœ… Ready
rag-core-service βœ… Complete βœ… 108/108 (100%) βœ… Swagger UI βœ… Production βœ… Ready

Test Summary: 594/600 functional tests passing (99% overall pass rate)

  • ⚠️ Auth Service: 3 security configuration tests failing
  • ⚠️ Embedding Service: 5 Ollama configuration tests failing
  • See BACKLOG.md for TECH-DEBT-006 and TECH-DEBT-007 details

System Capabilities

  • βœ… Multi-tenant Architecture: Complete data isolation with JWT authentication
  • βœ… Document Processing: PDF, DOCX, TXT, MD, HTML with intelligent chunking
  • βœ… Vector Operations: Redis-powered similarity search with enterprise error handling
  • βœ… RAG Pipeline: LLM integration with streaming responses
  • βœ… Admin Operations: Tenant management, user administration, analytics
  • βœ… Testing: 594/600 functional tests passing (99% pass rate)
  • βœ… Documentation: Interactive Swagger UI for all endpoints
  • βœ… Local Deployment: Docker Compose with health monitoring
  • βœ… Cloud Deployment: GCP with GKE, Cloud SQL, Memorystore, Artifact Registry

πŸ› οΈ Developer Reference

Architecture Diagram

graph TB
    Client[Client Applications<br/>Direct Access]
    Auth[Auth Service<br/>Port 8081]
    Doc[Document Service<br/>Port 8082]
    Embed[Embedding Service<br/>Port 8083]
    Core[RAG Core Service<br/>Port 8084]
    Admin[Admin Service<br/>Port 8085]

    PG[(PostgreSQL<br/>byo_rag_local<br/>Port 5432)]
    Redis[(Redis DB 0<br/>Key Prefixes<br/>Port 6379)]
    Kafka[(Apache Kafka<br/>Port 9092<br/>Future)]
    Ollama[(Ollama LLM<br/>Port 11434)]

    Client --> Auth
    Client --> Doc
    Client --> Embed
    Client --> Core
    Client --> Admin

    Auth --> PG
    Doc --> PG
    Admin --> PG

    Embed --> Redis
    Core --> Redis

    Doc -.-> Kafka
    Embed -.-> Kafka
    Core -.-> Kafka

    Core --> Ollama
    Embed --> Ollama

    classDef working fill:#4CAF50,stroke:#2E7D32,stroke-width:2px,color:#FFFFFF;
    classDef planned fill:#FFA726,stroke:#F57C00,stroke-width:2px,stroke-dasharray: 5 5,color:#FFFFFF;
    class Client,Auth,Doc,Embed,Core,Admin,PG,Redis,Ollama working;
    class Kafka planned;
Loading

Refined Architecture Notes:

  • Consolidated Database Architecture: Single PostgreSQL and Redis instances for optimal performance
  • Service Isolation: DB 0 with service-specific key prefixes (byo_rag_{env}:{service}:*)
  • Kafka Integration: Future work - see Roadmap for planned async event processing
  • Cost Savings: ~$206/year from infrastructure optimization
  • Image Size: ~100-160MB reduction per service after removing unused dependencies

Microservices Architecture

  • Multi-tenant isolation: Complete data separation by tenant (PostgreSQL row-level, Redis key prefixes)
  • Event-driven processing: Async operations planned via Kafka (current: synchronous REST)
  • Optimized persistence: Consolidated database architecture
  • Horizontal scaling: Stateless services with shared infrastructure

Tech Stack Reference

πŸ“‹ Core Framework & Runtime
Component Version Purpose
Java 21 (LTS) Primary programming language
Spring Boot 3.2.8 Application framework
Spring AI 1.0.0-M1 AI/ML integration
Spring Cloud 2023.0.2 Microservices framework
Maven 3.8+ Build and dependency management
πŸ—„οΈ Data & Storage (Refined Architecture)
Component Version Purpose
PostgreSQL 42.7.3 Single shared database (byo_rag_{env} pattern)
Redis Stack 5.0.2 Single DB 0 with key prefixes for multi-tenant isolation
Apache Kafka 3.7.0 Event streaming and messaging (planned - see Roadmap)

Architecture Notes:

  • PostgreSQL handles: Auth, Document, Admin service data
  • Redis handles: Embedding vectors, Core service caching
  • Key prefix pattern: byo_rag_{env}:{service}:{key}
  • Cost optimization: ~$206/year savings from infrastructure consolidation
  • Vector Extensions: pgvector support planned for future advanced vector operations
  • Kafka: Infrastructure planned for future async event processing
  • See KAFKA_ERROR_HANDLING.md for planned implementation design
πŸ€– AI/ML Libraries
Component Version Purpose
LangChain4j 0.33.0 LLM integration framework
Apache Tika 2.9.2 Document processing and text extraction
OpenAI API Latest GPT models and embeddings
Ollama Latest Local LLM inference
πŸ§ͺ Testing & Quality
Component Version Purpose
JUnit 5.10.2 Unit testing framework
Testcontainers 1.19.8 Integration testing
Mockito 5.14.2 Mocking framework
WireMock 3.8.0 API mocking

Developer Workflows

Running Tests

# Run all unit tests
mvn test

# Run integration tests (requires Docker)
mvn verify -P integration-tests

# Run tests for a specific service
cd rag-auth-service && mvn test

# Skip tests during development
mvn clean install -DskipTests

Development Mode

# Hot reload enabled by default in Spring Boot DevTools
# Make changes to Java files and they'll auto-reload

# For database schema changes, use Spring Boot's DDL auto-update
# application-dev.yml: spring.jpa.hibernate.ddl-auto=update

Debugging

# Enable debug logging for a service
export LOGGING_LEVEL_COM_ENTERPRISE_RAG=DEBUG

# Debug with remote JVM debugging
mvn spring-boot:run -Dspring-boot.run.jvmArguments="-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=5005"

Working with Docker Services

# View logs for all infrastructure services
docker-compose logs -f

# Restart a specific service
docker-compose restart postgres

# Access PostgreSQL directly
docker exec -it enterprise-rag-postgres psql -U rag_user -d rag_enterprise

# Access Redis CLI
docker exec -it enterprise-rag-redis redis-cli

# View Kafka topics
docker exec -it enterprise-rag-kafka kafka-topics --bootstrap-server localhost:9092 --list

οΏ½ Security

Authentication & Authorization

  • JWT-Based Authentication: Stateless token-based security
  • Token Expiry: Configurable token lifetime (default: 24 hours)
  • Refresh Tokens: Automatic token renewal for long-lived sessions
  • Role-Based Access Control (RBAC):
    • ADMIN: Full system access, tenant management, user administration
    • USER: Document upload, RAG queries, view own data
    • READER: Read-only access to documents and query results

Multi-Tenant Security

  • Complete Data Isolation:
    • PostgreSQL: Row-level security with tenant_id filtering
    • Redis: Key prefix isolation (byo_rag_{env}:{service}:{tenant_id}:*)
    • Each tenant's data is logically and physically separated
  • Tenant Context: All API requests validated against JWT tenant claims
  • Cross-Tenant Protection: Automatic enforcement prevents data leakage

API Security

  • Endpoint Protection: All APIs secured except health checks
  • Health Endpoints: /actuator/health/** publicly accessible for K8s probes
  • Input Validation: Request validation with Spring annotations
  • SQL Injection Prevention: JPA/Hibernate with parameterized queries
  • CORS Configuration: Configurable allowed origins for frontend integration

Secret Management

  • Development: Environment variables and application.yml (never commit secrets)
  • Production (GCP): Google Secret Manager integration
    • Database credentials
    • Redis passwords
    • JWT signing keys
    • API keys (OpenAI, etc.)
  • Kubernetes: Secrets mounted as environment variables

Security Best Practices

# Generate secure JWT secret (development)
openssl rand -base64 32

# Rotate secrets regularly (production)
./scripts/gcp/04-migrate-secrets.sh --rotate

# Audit security configuration
./scripts/security/audit-config.sh

Known Security Considerations

  • ⚠️ Default credentials in development mode - change before production
  • ⚠️ HTTPS required for production deployments
  • ⚠️ Rate limiting not yet implemented (planned)
  • ⚠️ API key rotation not automated (manual process)

See SECURITY_AUDIT_REPORT.md for comprehensive security analysis.

πŸ”§ Troubleshooting

Quick Troubleshooting Index

Issue Section Quick Fix
Service won't start Service Won't Start Check port conflicts with netstat -tulpn | grep :8081
Database connection fails Database Issues Verify docker-compose ps shows postgres running
Tests failing Test Failures Run with -Dspring.profiles.active=test
Docker build issues Contributing Always use make rebuild SERVICE=name
Port already in use Service Won't Start Kill process: kill $(lsof -t -i:8081)
Authentication fails Security Verify JWT token not expired, check credentials

οΏ½πŸ”§ Troubleshooting

πŸ”§ Service Won't Start
# Check if port is already in use
netstat -tulpn | grep :8081

# View application logs
cd rag-auth-service && mvn spring-boot:run

# Check Docker services are running
docker-compose ps
πŸ—„οΈ Database Connection Issues
# Test PostgreSQL connection
docker exec -it enterprise-rag-postgres psql -U rag_user -d byo_rag_local

# Reset database (development only)
docker-compose down -v && docker-compose up -d

# Check database logs
docker-compose logs postgres

Note: Database name follows pattern byo_rag_{env} where {env} is local, dev, staging, or prod.

πŸ§ͺ Tests Failing
# Run tests with verbose output
mvn test -Dtest=YourTestClass -Dspring.profiles.active=test

# Integration tests require Docker
docker-compose up -d
mvn verify -P integration-tests

# Check test container logs
docker-compose logs testcontainers

πŸ“ˆ Performance & Monitoring

Target Metrics

  • Response Time: <200ms (excluding LLM processing)
  • Throughput: 1000+ concurrent users
  • Availability: 99.9% uptime target

Monitoring Endpoints

  • Health Checks: /actuator/health on each service
  • Metrics: /actuator/prometheus for Prometheus scraping
  • Info: /actuator/info for build and version details

Local Monitoring Setup

# Prometheus: http://localhost:9090
# Grafana: http://localhost:3000 (admin/admin)
# Kafka UI: http://localhost:9021 (if Confluent Control Center enabled)

πŸ§ͺ Testing

Test Coverage Overview

Overall: 594/600 functional tests passing (99% pass rate)

Service Tests Passing Coverage Status
rag-shared 90/90 (100%) Unit tests βœ… All passing
rag-auth-service 111/114 (97%) Unit + Integration ⚠️ 3 security config tests
rag-document-service 77/77 (100%) Unit + Integration βœ… All passing
rag-embedding-service 209/214 (98%) Unit + Integration ⚠️ 5 Ollama config tests
rag-admin-service 77/77 (100%) Unit + Integration βœ… All passing
rag-core-service 108/108 (100%) Unit + Integration βœ… All passing

Known Test Issues

  • TECH-DEBT-006: Auth service security configuration tests (3 failures)

    • Spring Security blocking actuator/auth endpoints
    • Pre-existing issue, not affecting functionality
    • See BACKLOG.md for details
  • TECH-DEBT-007: Embedding service Ollama client tests (5 failures)

    • Profile-based bean configuration in test context
    • Functional tests all pass (181/181)
    • Service works correctly at runtime
    • See BACKLOG.md for details

Running Tests

# Run all tests for all services
mvn test

# Run tests for specific service
cd rag-auth-service && mvn test

# Run excluding infrastructure tests
mvn test -Dtest='!InfrastructureValidationTest,!SecurityConfigurationTest'

# Run with coverage report
mvn test jacoco:report

# Run integration tests (requires Docker)
mvn verify -P integration-tests

Test Categories

  • Unit Tests: Fast, no external dependencies, mock all services
  • Integration Tests: Test with real PostgreSQL/Redis via Testcontainers
  • Infrastructure Tests: Validate Bean configuration and startup
  • E2E Tests: Full pipeline testing (planned - see E2E_TEST_BLOCKER_ANALYSIS.md)

Writing Tests

See TESTING_BEST_PRACTICES.md for:

  • Test naming conventions
  • Mock vs. Integration test guidelines
  • Spring Boot test annotations
  • Common testing patterns

Continuous Testing

# Watch mode for development
mvn test -Dtest=YourTest -DfailIfNoTests=false --watch

# Pre-commit hook runs tests automatically
# Installed via: ./scripts/setup/install-dev-tools.sh

🎯 Roadmap

πŸš€ All Core Services Complete! GCP Deployment Infrastructure Ready!

1. High Priority: GCP Cloud Deployment βœ… COMPLETE

  • βœ… GCP Project Setup: Project byo-rag-dev configured
  • βœ… Secret Management: Google Secret Manager with credentials
  • βœ… Container Registry: Artifact Registry with all service images
  • βœ… Cloud SQL: PostgreSQL 15 database
  • βœ… Cloud Memorystore: Redis Standard HA tier
  • βœ… GKE Cluster: Kubernetes Engine cluster (currently stopped)
  • βœ… Kubernetes Manifests: Complete K8s deployment configs with Kustomize
  • βœ… Persistent Storage: GCS buckets and PVCs configured
  • βœ… Deployment Automation: Scripts for service deployment and validation

2. Medium Priority: System Integration & Testing

  • βœ… Docker orchestration: All services running in containers
  • βœ… Authentication testing: Complete auth service unit testing
  • βœ… Embedding testing: Complete embedding service advanced testing
  • πŸ”„ End-to-end testing: Complete RAG pipeline validation
  • πŸ”„ Load testing: Performance testing under concurrent load
  • βœ… API documentation: Comprehensive Swagger UI for all services

3. Lower Priority: Production Features

  • πŸ”„ Ingress & Load Balancer: External access configuration
  • πŸ”„ CI/CD pipeline: Automated testing and deployment
  • πŸ”„ Kafka Integration: Enterprise-grade async event processing
    • Comprehensive error handling with DLQ
    • Circuit breakers and retry mechanisms
    • Event-driven document processing pipeline
    • See KAFKA_ERROR_HANDLING.md for design
  • πŸ”„ Vector Extensions: pgvector support for advanced vector operations
  • πŸ”„ Security hardening: Advanced security features and audit logging
  • πŸ”„ Performance optimization: Database indexing and query optimization
  • πŸ”„ Monitoring & Alerting: Cloud Monitoring dashboards and alerts
  • πŸ”„ Multi-model support: Additional embedding and LLM model integrations

πŸ“š Documentation

The project documentation is organized into the following categories:

πŸš€ Deployment & Infrastructure

πŸ› οΈ Development & Testing

πŸ“‹ Project Management

πŸ“± Applications & Tools

  • ollama-chat/README.md - Enhanced Ollama chat frontend with Docker integration βœ… COMPLETED
  • Service Health Checks - /actuator/health endpoints on all services
  • Monitoring Dashboards - Grafana at http://localhost:3000
  • Comprehensive Javadoc - Enterprise-grade API documentation (92.4% coverage)

☁️ GCP Cloud Deployment

The BYO RAG System is fully configured for deployment on Google Cloud Platform with enterprise-grade infrastructure.

Infrastructure Overview

Project: byo-rag-dev (Development Environment)

Component Service Status Configuration
Compute Google Kubernetes Engine βœ… Ready rag-gke-dev cluster, us-central1
Database Cloud SQL PostgreSQL 15 βœ… Running rag-postgres, single DB: byo_rag_dev
Cache Cloud Memorystore Redis βœ… Running rag-redis Standard HA, DB 0 with key prefixes
Storage Cloud Storage βœ… Ready Document and backup buckets
Registry Artifact Registry βœ… Active rag-system repository
Secrets Secret Manager βœ… Configured Database, Redis, JWT credentials
Messaging Cloud Pub/Sub πŸ”„ Planned Future async messaging system

Architecture Benefits:

  • PostgreSQL database design optimized for multi-tenancy
  • Redis instance configured for optimal memory usage
  • Key prefix pattern enables complete tenant isolation
  • Reduced network traffic and connection overhead

Container Images

All service images are published to Google Artifact Registry:

Registry URL:

us-central1-docker.pkg.dev/byo-rag-dev/rag-system

Available Images:

  • rag-auth-service:0.8.0
  • rag-document-service:0.8.0
  • rag-embedding-service:0.8.0
  • rag-core-service:0.8.0
  • rag-admin-service:0.8.0

Pull Example:

docker pull us-central1-docker.pkg.dev/byo-rag-dev/rag-system/rag-core-service:0.8.0

Tags: 0.8.0, latest, <git-sha>, 0.8.0-<git-sha>

Deployment Scripts

Complete automation for GCP deployment:

# Setup scripts (scripts/gcp/)
./00-setup-project.sh          # Initialize GCP project
./01-setup-network.sh          # Configure VPC and networking
./02-setup-service-accounts.sh # Create GCP service accounts
./04-migrate-secrets.sh        # Migrate to Secret Manager
./07-build-and-push-images.sh  # Build and push to Artifact Registry
./08-setup-cloud-sql.sh        # Provision Cloud SQL
./10-setup-memorystore.sh      # Provision Cloud Memorystore
./12-setup-gke-cluster.sh      # Create GKE cluster
./13-sync-secrets-to-k8s.sh    # Sync secrets to Kubernetes
./14-setup-storage.sh          # Configure Cloud Storage
./16-setup-ingress.sh          # Configure ingress and load balancing
./17-deploy-services.sh        # Deploy services to GKE
./18-init-database.sh          # Initialize database schema
./19-validate-deployment.sh    # Validate deployment health

Kubernetes Deployment

Kubernetes manifests with Kustomize overlays for dev/prod:

# Deploy to development
kubectl apply -k k8s/overlays/dev

# Deploy to production
kubectl apply -k k8s/overlays/prod

See k8s/README.md for detailed Kubernetes deployment documentation.

Infrastructure Costs (Estimated)

Development Environment:

  • GKE Cluster (n1-standard-2, 3 nodes): ~$150/month
  • Cloud SQL (db-custom-2-7680): ~$120/month
  • Cloud Memorystore Redis (Standard, 5GB): ~$80/month
  • Artifact Registry: ~$5/month (storage only)
  • Cloud Storage: ~$5/month (minimal usage)

Total: ~$360/month for development environment

Note: Costs can be reduced by:

  • Stopping GKE cluster when not in use
  • Using smaller Cloud SQL instances
  • Switching to Basic tier Redis for development

GCP Console Links

πŸ“„ License

This project is licensed under the MIT License - see the LICENSE file for details.


πŸ”₯ Ready to contribute? Check out our Contributing Guidelines and start building the future of enterprise RAG systems!


### Infrastructure Costs (Estimated)

**Development Environment:**
- GKE Cluster (n1-standard-2, 3 nodes): ~$150/month
- Cloud SQL (db-custom-2-7680): ~$120/month
- Cloud Memorystore Redis (Standard, 5GB): ~$80/month
- Artifact Registry: ~$5/month (storage only)
- Cloud Storage: ~$5/month (minimal usage)

**Total:** ~$360/month for development environment

**Note:** Costs can be reduced by:
- Stopping GKE cluster when not in use
- Using smaller Cloud SQL instances
- Switching to Basic tier Redis for development

### GCP Console Links

- **Project Dashboard:** https://console.cloud.google.com/home/dashboard?project=byo-rag-dev
- **Artifact Registry:** https://console.cloud.google.com/artifacts/docker/byo-rag-dev/us-central1/rag-system
- **GKE Clusters:** https://console.cloud.google.com/kubernetes/list?project=byo-rag-dev
- **Cloud SQL:** https://console.cloud.google.com/sql/instances?project=byo-rag-dev
- **Secret Manager:** https://console.cloud.google.com/security/secret-manager?project=byo-rag-dev

About

An enterprise grade RAG (Retrieval Augmented Generation) system built with Spring Boot 3.x, demonstrating advanced backend engineering and modern AI integration.

Topics

Resources

Contributing

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Contributors 2

  •  
  •