
HeaIn/korean-rag-server


Korean RAG Server

Python Flask LangChain Docker Nginx

Simply drop your custom PDF documents into a folder, and the system automatically builds a powerful AI backend that answers questions strictly based on your files. Highly optimized for the Korean language and fully containerized for production using an Nginx + Gunicorn multi-worker architecture.

✨ Key Features

  • 📂 Plug-and-Play Architecture: No complex DB setup required. Place your PDF files in the /data directory, and the server parses, splits, and embeds them automatically on boot.
  • 🇰🇷 Korean-Optimized Hybrid Search: Works around the limitations of English-centric embedding models by combining dense retrieval (FAISS) with sparse retrieval (BM25 over KoNLPy morphological analysis) through an EnsembleRetriever with equal weights (5:5), which improves retrieval accuracy for Korean queries.
  • 🛡️ Hallucination Mitigation & Strict Citation:
    • The prompt instructs the LLM to cite the exact source file and page number for every piece of information it uses (e.g., [Source: manual.pdf, Page 12]).
  • ⚡ Cost & Speed Optimization via Local Caching:
    • Vector stores and the preprocessed sparse-retrieval objects (pickle) are cached to local disk, so documents are not re-embedded on every start. This helps avoid OpenAI API rate-limit (429) errors and significantly reduces initialization cost. With the cache in place, server restart time drops to under 1 second.
  • 🏭 Production-Ready Infrastructure:
    • Scalable Gunicorn WSGI multi-worker setup to overcome Flask's built-in concurrency limits.
    • Nginx (Alpine) deployed as a reverse proxy for secure traffic routing within a private Docker network.
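
To make the 5:5 hybrid search concrete, here is a minimal pure-Python sketch of weighted Reciprocal Rank Fusion, the general technique for merging a dense ranking and a sparse ranking into one list. It is an illustration under stated assumptions, not this project's code: the function name `weighted_rrf`, the document IDs, and the constant `c=60` are hypothetical choices for the example.

```python
from collections import defaultdict


def weighted_rrf(ranked_lists, weights, c=60):
    """Fuse ranked lists of document IDs with weighted Reciprocal Rank Fusion.

    Each document receives weight / (c + rank) from every list it appears in.
    """
    if len(ranked_lists) != len(weights):
        raise ValueError("one weight per ranked list is required")

    scores = defaultdict(float)
    for ranking, weight in zip(ranked_lists, weights):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += weight / (c + rank)

    # Highest fused score first; ties are broken by document ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical rankings: one from a dense (FAISS) search, one from BM25.
dense_hits = ["doc_a", "doc_b", "doc_c"]
sparse_hits = ["doc_b", "doc_d", "doc_a"]

fused = weighted_rrf([dense_hits, sparse_hits], weights=[0.5, 0.5])
print(fused)
```

Documents that rank well in both lists (here `doc_b` and `doc_a`) rise above documents found by only one retriever, which is the point of combining the two.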

🚀 Getting Started

1. Prerequisites

  • Docker and Docker Compose installed on your local machine or server.
  • An active OpenAI API key.

2. Installation

# Clone the repository
git clone https://github.com/heain/korean-rag-server
cd korean-rag-server

# Set up environment variables
echo "OPENAI_API_KEY=your_api_key_here" > .env

# Prepare your RAG data
# Create a 'data' folder and place YOUR target PDF files inside.
mkdir data

3. Run the Server

On the first run, the server will automatically load your PDFs, split the text, and embed the documents, caching the indexes (faiss_index, bm25_index.pkl) locally.
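
The load-or-build caching behaviour described above can be sketched roughly as follows. This is a simplified illustration, not the server's actual code: the helper name `load_or_build` and the `build_fn` callback are hypothetical, and the real project may structure its cache differently.

```python
import pickle
from pathlib import Path


def load_or_build(cache_path, build_fn):
    """Return the cached object if it exists; otherwise build, save, and return it."""
    path = Path(cache_path)
    if path.exists():
        # Warm start: skip the expensive parsing and embedding step entirely.
        # Only unpickle files you created yourself; pickle can run arbitrary code.
        with path.open("rb") as f:
            return pickle.load(f)

    # Cold start: build the index once and persist it for the next boot.
    obj = build_fn()
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("wb") as f:
        pickle.dump(obj, f)
    return obj
```

Under this pattern, deleting the cache file forces a rebuild on the next start, which is one plausible way to refresh the index after changing the PDFs.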

docker-compose up -d --build

Once the containers are up, the API will be accessible via http://localhost:8080/api/ask.

4. Test API

Open test_api.py and replace the TO BE FILLED placeholder with your question. Then run the script:

python test_api.py
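
If you prefer to call the endpoint from your own script, a standard-library sketch might look like the following. The request shape is an assumption: this README does not document the payload, so the POST method and the `"question"` JSON field are hypothetical and should be checked against test_api.py before use.

```python
import json
import urllib.request

API_URL = "http://localhost:8080/api/ask"  # endpoint given in "Run the Server"


def build_ask_request(question):
    """Build the HTTP request without sending it."""
    # ASSUMPTION: the server accepts POST with a JSON body like {"question": "..."}.
    body = json.dumps({"question": question}, ensure_ascii=False).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={"Content-Type": "application/json; charset=utf-8"},
        method="POST",
    )


def ask(question, timeout=60):
    """Send the question and return the decoded JSON response."""
    with urllib.request.urlopen(build_ask_request(question), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Encoding the body as UTF-8 with `ensure_ascii=False` keeps Korean text readable in logs instead of escaping it to `\uXXXX` sequences.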
