A comprehensive hands-on course for building Retrieval-Augmented Generation (RAG) agents with Large Language Models (LLMs). The project combines a microservices architecture, the LangChain framework, and NVIDIA AI endpoints to build conversational agents.
- Overview
- Features
- Architecture
- Getting Started
- Project Structure
- Notebooks Guide
- Microservices
- Running the Application
- Testing
- Configuration
- API Endpoints
- Contributing
- Troubleshooting
- License
This project provides a complete learning environment for building RAG (Retrieval-Augmented Generation) agents with Large Language Models. It demonstrates how to:
- Build microservices-based AI applications using Docker and FastAPI
- Implement RAG architectures with vector stores and embeddings
- Create conversational AI agents using LangChain and NVIDIA AI endpoints
- Deploy scalable AI systems with proper orchestration and monitoring
The course is structured as a series of Jupyter notebooks that progressively build knowledge from basic concepts to advanced implementations, culminating in a fully functional RAG chatbot system.
- Multiple Chatbot Modes:
  - Basic LLM access without context
  - Context-aware conversations with notebook content
  - Agentic behavior with reasoning capabilities
- Advanced RAG Implementation:
  - Document chunking and embedding generation
  - Vector store integration (FAISS)
  - Semantic search and retrieval
- Microservices Architecture:
  - Modular, scalable service design
  - Docker containerization
  - API-first approach with FastAPI
- Educational Content:
  - Progressive learning through Jupyter notebooks
  - Hands-on exercises and examples
  - Visual aids and diagrams
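The RAG steps listed above (chunking, embedding, similarity search) can be sketched in a few lines of plain Python. This is only an illustration under loud assumptions: the bag-of-words `embed` function is a toy stand-in for a real embedding model, the linear scan in `retrieve` stands in for a FAISS index, and every function name here is hypothetical rather than taken from the course code.

```python
import math
import re
from collections import Counter


def chunk_text(text, chunk_size=40, overlap=10):
    """Split text into overlapping character windows."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]


def embed(text):
    """Toy embedding: a bag-of-words count vector (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, chunks, k=1):
    """Return the k chunks most similar to the query (linear scan, no index)."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


# Hypothetical, already-chunked documents used only for this demonstration.
docs = [
    "FAISS stores vectors for fast similarity search.",
    "Gradio provides the chat user interface.",
    "Docker Compose starts every service.",
]
top = retrieve("Which component does similarity search over vectors?", docs, k=1)
print(top[0])
```

In a real pipeline the retrieved chunks would be placed into the LLM prompt as context. A vector store such as FAISS replaces the linear scan so that search stays fast as the corpus grows.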
The project follows a microservices architecture with the following key components:
    ┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
    │    Frontend     │    │     Chatbot     │    │   LLM Client    │
    │   (Gradio UI)   │◄──►│   (RAG Agent)   │◄──►│  (NVIDIA NIM)   │
    └─────────────────┘    └─────────────────┘    └─────────────────┘
             │                      │                      │
             │             ┌─────────────────┐             │
             │             │  Vector Store   │             │
             │             │     (FAISS)     │             │
             │             └─────────────────┘             │
             │                                             │
             └──────────────┐               ┌──────────────┘
                            │               │
                   ┌─────────────────┐      │
                   │  Docker Router  │      │
                   │ (Orchestrator)  │◄─────┘
                   └─────────────────┘
- Docker and Docker Compose installed
- Python 3.8+ (for local development)
- NVIDIA API Key (for accessing NVIDIA NIM endpoints)
- Git for cloning the repository
- Clone the repository:

      git clone <repository-url>
      cd BuildingRAGAgentswithLLMs

- Set up environment variables:

      # Copy the example environment file (Windows; on macOS/Linux use cp instead of copy)
      copy .env.example .env
      # Edit .env with your actual API keys (any text editor works)
      notepad .env

  IMPORTANT: Never commit your `.env` file! It's already in `.gitignore`.

- Build and start the services:

      cd composer
      docker-compose build
      docker-compose up -d

- Access JupyterLab: Navigate to `http://localhost:8888` in your browser
- Start with the Table of Contents: Open `99_table_of_contents.ipynb`
- Follow the Learning Path: Begin with `00_jupyterlab.ipynb` and progress through the numbered notebooks
- Try the Chatbot: Access the Gradio interface at `http://localhost:8999`
    ├── Notebooks (Learning Materials)
    │   ├── 00_jupyterlab.ipynb           # Environment introduction
    │   ├── 01_microservices.ipynb        # Microservices concepts
    │   ├── 02_llms.ipynb                 # Large Language Models
    │   ├── 03_langchain_intro.ipynb      # LangChain framework
    │   ├── 04_running_state.ipynb        # State management
    │   ├── 05_documents.ipynb            # Document processing
    │   ├── 06_embeddings.ipynb           # Vector embeddings
    │   ├── 07_vectorstores.ipynb         # Vector databases
    │   ├── 09_langserve.ipynb            # LangServe deployment
    │   └── 99_table_of_contents.ipynb    # Course navigation
    │
    ├── Microservices
    │   ├── chatbot/                      # RAG chatbot implementation
    │   ├── composer/                     # Docker orchestration
    │   ├── docker_router/                # Service coordination
    │   ├── frontend/                     # User interface
    │   └── llm_client/                   # LLM API client
    │
    ├── Resources
    │   ├── imgs/                         # Course images and diagrams
    │   └── slides/                       # Presentation materials
    │
    └── Configuration Files
        ├── server_app.py                 # Main server application
        └── requirements.txt              # Python dependencies

The course is organized as a progressive learning journey:
| Notebook | Topic | Description |
|---|---|---|
| 00 | JupyterLab | Environment setup and navigation |
| 01 | Microservices | Architecture and containerization |
| 02 | LLMs | Large Language Model fundamentals |
| 03 | LangChain | Framework introduction and usage |
| 04 | Running State | State management in AI applications |
| 05 | Documents | Document processing and chunking |
| 06 | Embeddings | Vector representations and similarity |
| 07 | Vector Stores | Database storage and retrieval |
| 09 | LangServe | Production deployment strategies |

- Chatbot
  - Location: `./chatbot/`
  - Purpose: Core RAG agent implementation
  - Features: Context-aware conversations, notebook integration
  - Port: 8999 (Gradio interface)
- Frontend
  - Location: `./frontend/`
  - Purpose: User interface and assessment platform
  - Port: 8090
- LLM Client
  - Location: `./llm_client/`
  - Purpose: API gateway to NVIDIA NIM endpoints
  - Port: 9000
- Docker Router
  - Location: `./docker_router/`
  - Purpose: Service orchestration and monitoring
  - Features: Log aggregation, health checks
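To confirm that the services listed above are reachable on their ports, a small standard-library probe can help. This is a minimal sketch, not part of the course code: the `SERVICE_PORTS` mapping simply restates the ports given in this README, and the helper names are hypothetical.

```python
import socket

# Ports as listed in this README; no port is listed for the Docker Router.
SERVICE_PORTS = {"chatbot": 8999, "frontend": 8090, "llm_client": 9000}


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def report(host="localhost"):
    """Map each service name to whether its port accepts connections."""
    return {name: is_port_open(host, port) for name, port in SERVICE_PORTS.items()}


if __name__ == "__main__":
    for name, up in report().items():
        print(f"{name}: {'reachable' if up else 'not reachable'}")
```

An open port only shows that something is listening. It does not prove the service behind it is healthy, so treat the result as a first check.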
Start all services:

    cd composer
    docker-compose up -d

View logs:

    docker-compose logs -f [service-name]

Stop services:

    docker-compose down

Chatbot only:

    cd chatbot
    pip install -r requirements.txt
    python frontend_server.py

LLM Client:

    cd llm_client
    python client_server.py

- Open JupyterLab: `http://localhost:8888`
- Run notebook cells: Use `Shift+Enter` to execute cells
- Test chatbot: Navigate to `http://localhost:8999`

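The LLM client's OpenAI-compatible endpoint can also be exercised from Python instead of curl. The sketch below uses only the standard library and mirrors the request body of the curl command in this README. The helper names are hypothetical, and the response shape is not assumed beyond being JSON. Depending on how the server is configured, a `model` field may also be required in the payload.

```python
import json
import urllib.request

# Endpoint taken from the curl example in this README.
CHAT_URL = "http://localhost:9000/v1/chat/completions"


def build_chat_request(message, url=CHAT_URL):
    """Build (but do not send) a POST request carrying one user message."""
    payload = {"messages": [{"role": "user", "content": message}]}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat(message, timeout=30):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(build_chat_request(message), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    try:
        print(send_chat("Hello!"))
    except (OSError, ValueError) as exc:  # connection errors or a non-JSON reply
        print(f"LLM client request failed: {exc}")
```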
    # Test LLM client endpoint
    curl -X POST http://localhost:9000/v1/chat/completions -H "Content-Type: application/json" -d '{"messages": [{"role": "user", "content": "Hello!"}]}'

    # Check all services status
    docker-compose ps
    # View service logs
    docker-compose logs chatbot

    # Required
    NVIDIA_API_KEY=your_nvidia_api_key
    # Optional
    NVIDIA_BASE_URL=http://llm_client:9000/v1
    NVIDIA_DEFAULT_MODE=open

For security purposes, all hardcoded API keys have been removed from the notebooks. When you run the notebooks, you'll be prompted to enter your API key securely using `getpass()`, which hides the input.
Two methods to set your API key:
- Interactive Input (Recommended for learning):

      # The notebooks will prompt you with:
      # Enter your NVIDIA API Key: [hidden input]

- Environment File (Recommended for development):

      # Create a .env file in the project root
      echo "NVIDIA_API_KEY=your_actual_api_key_here" > .env

  Then in your notebook, load it:

      from dotenv import load_dotenv
      load_dotenv()

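The two methods can be combined: read `NVIDIA_API_KEY` from the environment if it is set, and otherwise fall back to a hidden prompt. The helper below is a hypothetical sketch, not the notebooks' actual code. The `nvapi-` prefix check follows the troubleshooting note in this README, and the `env` and `prompt` parameters exist only to make the function easy to test.

```python
import os
from getpass import getpass


def get_nvidia_api_key(env=os.environ, prompt=getpass):
    """Return the key from the environment, or prompt for it with hidden input."""
    key = env.get("NVIDIA_API_KEY") or prompt("Enter your NVIDIA API Key: ")
    key = key.strip()
    if not key.startswith("nvapi-"):
        raise ValueError("NVIDIA API keys are expected to start with 'nvapi-'")
    return key


# Typical notebook usage (prompts only when the variable is not already set):
# os.environ["NVIDIA_API_KEY"] = get_nvidia_api_key()
```

If you use the `.env` approach, call `load_dotenv()` first so the variable is already in the environment when this helper runs.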
- Never commit API keys to version control
- Use environment variables or secure vaults in production
- Regularly rotate your API keys
- Limit API key permissions to necessary scopes only
- Ports: Defined in `docker-compose.yml`
- Dependencies: Managed through requirements.txt files
- Volumes: Shared storage for notebook content and assessments
- Chatbot (port 8999)
  - GET `/` - Gradio interface
  - POST `/chat` - Chat completion endpoint
- LLM Client (port 9000)
  - POST `/v1/chat/completions` - OpenAI-compatible endpoint
  - GET `/health` - Health check
- Frontend (port 8090)
  - GET `/` - Assessment interface
  - POST `/submit` - Assessment submission
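A health endpoint is most useful when polled right after startup, since containers can take a while to become ready. The sketch below is a hypothetical helper that retries a GET request until it returns HTTP 200. It assumes that `/health` is served by the LLM client on port 9000, which is inferred from the endpoint grouping above and should be verified against your deployment.

```python
import time
import urllib.request

# Assumption: GET /health is served by the LLM client on port 9000.
HEALTH_URL = "http://localhost:9000/health"


def fetch_status(url, timeout=2.0):
    """Return the HTTP status code for a GET request, or None on failure."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except OSError:  # covers connection errors and HTTP error responses
        return None


def wait_until_healthy(url=HEALTH_URL, attempts=5, delay=1.0, fetch=fetch_status):
    """Poll the health endpoint until it answers 200 or attempts run out."""
    for attempt in range(attempts):
        if fetch(url) == 200:
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

Calling `wait_until_healthy()` before running notebook cells that depend on the LLM client avoids confusing connection errors during startup.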
- Fork the repository
- Create a feature branch: `git checkout -b feature/amazing-feature`
- Make your changes and test thoroughly
- Commit your changes: `git commit -m 'Add amazing feature'`
- Push to the branch: `git push origin feature/amazing-feature`
- Open a Pull Request
Services won't start:
- Check that Docker is installed and the daemon is running: `docker --version`, then `docker info`
- Verify ports aren't in use (Windows): `netstat -an | findstr :8999`
- Check logs: `docker-compose logs`

Notebook cells fail:
- Ensure all services are running
- Check network connectivity between containers
- Verify API keys are set correctly
API Key Issues:
- Ensure your NVIDIA API key is valid and active
- Check that the key starts with 'nvapi-'
- Verify your API key hasn't expired
- Make sure you have sufficient API quota/credits
Permission errors:
- On Windows, ensure Docker has proper permissions
- Check file sharing settings in Docker Desktop
Memory issues:
- Increase Docker memory allocation
- Monitor resource usage: `docker stats`

If problems persist:
- Check logs: `docker-compose logs [service-name]`
- Restart services: `docker-compose restart`
- Rebuild containers: `docker-compose build --no-cache`
- Review documentation in the notebooks

This project is part of NVIDIA's Deep Learning Institute curriculum. Please refer to NVIDIA DLI terms and conditions for usage rights and restrictions.
Ready to build your first RAG agent? Start with the Table of Contents and begin your journey into the world of AI-powered conversational agents!
