Joelcic/RAGAgent-Local
LOCAL RAG-AGENT

A local RAG app for chatting with and asking questions about your files, built with a simple PyQt5 UI, CLI tools, and a Chroma vector store. It lets you upload CSV/XLSX/PDF/DOCX/TXT files, chunks and embeds them with Ollama, then answers questions using retrieved context via LangChain.

Models used

This project uses models served by a local Ollama instance:

  • Model: llama3.2:1b
  • Embedding model: mxbai-embed-large

Usage

To start the UI, run main.py:

  • Drag and drop, or select files.
  • Start asking questions about the files.

Note: by default the DB directory is removed on exit to keep runs clean.

(Screenshot: RAG chat)

Key Libraries

  • langchain: chaining and chunking utilities; connects to local Ollama for embeddings and the LLM, and provides the persistent Chroma vector store and retriever.
  • PyQt5: desktop UI (drag & drop ingest, chat-like Q&A view).

Prerequisites

  • Python 3.10+ (3.12 recommended).
  • Ollama installed and running (https://ollama.com). Pull the models you configure in config.yaml:
    • ollama pull mxbai-embed-large
    • ollama pull llama3.2:1b (or your chosen model)

Setup

Create a virtual environment, activate it, and install the requirements:

python -m venv .venv
.venv\Scripts\Activate        (Windows)
source .venv/bin/activate     (macOS/Linux)
pip install -r requirements.txt

Review config.yaml; the most important settings are:

  • EMBEDD_MODEL: embedding model name for Ollama.
  • MODEL: chat/LLM model name for Ollama.
  • db_location: Chroma directory (created if missing).
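A minimal config.yaml might look like the sketch below. Only EMBEDD_MODEL, MODEL, and db_location are keys named in this README; the chunking keys are assumptions for illustration, since the README only says chunking is configurable here.

```yaml
# Confirmed keys (names taken from this README):
EMBEDD_MODEL: mxbai-embed-large
MODEL: llama3.2:1b
db_location: ./chroma_db

# Hypothetical chunking keys -- check config.yaml for the real names:
chunk_size: 500
chunk_overlap: 50
```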

How It Works (Workflow)

Uploaded files are stored directly in the Chroma database and remain available to the LLM for the session; when the app exits, the database is cleared and deleted.

  • Ingest

    • vector_store.read_data(...) extracts text from supported files.
    • vector_store.chunk_str(...) splits the text into chunks.
    • vector_store.add_to_db(file_path) reads the file, chunks it, embeds it, and writes it to the Chroma database.
  • Retrieval + Generation

    • config.load_config() builds a Chroma retriever and an OllamaLLM, both wired from config.yaml.
    • rag_engine.generate_response(query) retrieves the top-k documents, formats a brief prompt, and invokes the LLM. The prompt asks the model to cite sources (file names) available in the document metadata.
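The chunking and prompt-formatting steps above can be sketched in plain Python. This is an illustrative approximation, not the project's actual chunk_str or generate_response code; the chunk size, overlap, and prompt wording are assumptions:

```python
def chunk_str(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into fixed-size chunks with overlap (illustrative;
    the real project configures these values via config.yaml)."""
    step = chunk_size - overlap
    return [text[i:i + chunk_size]
            for i in range(0, max(len(text) - overlap, 1), step)]


def format_prompt(query: str, docs: list[dict]) -> str:
    """Build a brief prompt that asks the LLM to cite the source file
    names found in each retrieved document's metadata (wording is
    hypothetical)."""
    context = "\n\n".join(
        f"[source: {d['metadata']['source']}]\n{d['page_content']}"
        for d in docs
    )
    return (
        "Answer using only the context below and cite the source file names.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )
```

In the real app the formatted prompt is passed to the OllamaLLM built by config.load_config(), and the retrieved documents come from the Chroma retriever rather than a plain list.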

Files & Responsibilities

  • config.yaml: model names, chunking, and DB settings.
  • config.py: loads YAML, returns (CONFIG, RETRIVER, MODEL) wired to Ollama + Chroma.
  • vector_store.py: read, chunk, embed, and persist to Chroma.
  • rag_engine.py: prompt + generate response using retriever and LLM.
  • main_view.py: PyQt5 UI for ingestion and Q&A (spawns child processes for isolation).
  • ingest_cli.py and rag_cli.py: simple CLI wrappers for headless ingest and Q&A.

Design Notes

  • Keeps concerns separate: ingestion (I/O + chunking) vs retrieval/generation.
  • Small, safe batches when adding to Chroma to avoid long blocking calls.
  • Models and chunking are configurable via YAML without code changes.
  • Uses subprocesses for long-running tasks from the UI to keep it responsive.
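The "small, safe batches" idea can be sketched as a tiny helper. The batch size and the Chroma call in the comment are assumptions; the project's actual batching lives in vector_store.py:

```python
from itertools import islice


def batched(items, batch_size=64):
    """Yield successive fixed-size batches from an iterable
    (illustrative helper; batch_size is an assumed default)."""
    it = iter(items)
    while batch := list(islice(it, batch_size)):
        yield batch


# In the real app each batch would be written to Chroma, e.g.:
#   for batch in batched(chunks):
#       collection.add_documents(batch)
# so no single call blocks the process for long.
```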

About

A lightweight local RAG agent: a fully local retrieval-augmented generation pipeline (ingest → chunk → embed → store (Chroma) → retrieve → generate) using Ollama via LangChain. Includes a simple PyQt5 UI and two CLIs for flexible file uploads, chat, and development.
