A simple yet powerful Retrieval-Augmented Generation (RAG) pipeline built with Streamlit that allows you to upload PDF documents and chat with them using OpenAI's GPT models.
- 📄 Multi-PDF Support: Upload and process multiple PDF documents at once
- 🔍 Smart Chunking: Automatically splits documents into optimal chunks for processing
- 💬 Interactive Chat: Natural conversation interface with your documents
- 📖 Source Citations: See exactly which parts of your documents the AI is referencing
- 🚀 Easy Setup: Simple installation process with helpful setup script
- 💾 Session Memory: Maintains conversation history during your session
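The smart-chunking feature above can be sketched in plain Python. This is an illustrative sketch, not the app's actual implementation — the function name, the 500-character chunk size, and the 50-character overlap are all assumptions:

```python
def split_into_chunks(text, chunk_size=500, overlap=50):
    """Split text into fixed-size chunks with a small overlap.

    Overlapping chunks help retrieval: a sentence cut at a chunk
    boundary still appears whole in the neighbouring chunk.
    (Sizes here are illustrative, not the app's actual defaults.)
    """
    if chunk_size <= overlap:
        raise ValueError("chunk_size must be larger than overlap")
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
    return chunks
```

Production pipelines usually split on sentence or paragraph boundaries rather than raw character counts, but the size/overlap trade-off is the same.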
- Python 3.8 or higher
- OpenAI API key (Get one here)
1. **Clone the repository**

   ```bash
   git clone <your-repo-url>
   cd rag-pipeline
   ```
2. **Run the setup script**

   ```bash
   python setup.py
   ```

   This will:
   - Check your Python version
   - Install all dependencies
   - Create a `.env` file from the template
3. **Add your OpenAI API key**
   - Open the `.env` file
   - Replace `your-openai-api-key-here` with your actual API key
4. **Run the application**

   ```bash
   streamlit run app.py
   ```
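Step 3 works because the app reads the key from the `.env` file at startup (typically via the `python-dotenv` package). A minimal standard-library-only sketch of that mechanism — the function name is illustrative, not taken from the app:

```python
import os
from pathlib import Path

def load_env_file(path=".env"):
    """Parse simple KEY=VALUE lines from a .env file into os.environ.

    Blank lines and '#' comments are skipped; variables already set
    in the environment are not overwritten.
    """
    env_path = Path(path)
    if not env_path.exists():
        return
    for line in env_path.read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        os.environ.setdefault(key.strip(), value.strip())
```

In practice `from dotenv import load_dotenv; load_dotenv()` does the same job with more edge cases handled.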
1. **Start the App**: Run `streamlit run app.py` and open the provided URL in your browser
2. **Upload Documents**:
   - Click on "Browse files" in the sidebar
   - Select one or more PDF files
   - Click "🚀 Process Documents"
3. **Start Chatting**:
   - Once documents are processed, type your question in the chat input
   - The AI will answer based on the content of your uploaded documents
   - Click "View Sources" to see which parts of the documents were used
4. **Clear History**: Use the "Clear Chat History" button to start a fresh conversation
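Under the hood, answering a question means comparing the question's embedding against the stored chunk embeddings and handing the top-k matches to the model as context. A toy sketch of that retrieval step with plain Python lists — the real app presumably uses a vector store, and all names here are illustrative:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k_chunks(query_vec, chunk_vecs, chunks, k=3):
    """Return the k chunks whose embeddings are most similar to the query."""
    scored = sorted(
        zip(chunks, chunk_vecs),
        key=lambda pair: cosine_similarity(query_vec, pair[1]),
        reverse=True,
    )
    return [chunk for chunk, _ in scored[:k]]
```

The retrieved chunks are what "View Sources" surfaces in the chat interface.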
You can modify these settings in `app.py`:

- Model: Change `gpt-3.5-turbo` to `gpt-4` for more advanced responses
- Temperature: Adjust the `temperature` parameter (0–1) for response creativity
- Chunk Size: Modify `chunk_size` in the text splitter for different document processing
- Retrieved Documents: Change the `k` value in `search_kwargs` to retrieve more/fewer sources
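Gathered in one place, the knobs above might look roughly like this. The variable names and defaults are assumptions for illustration — check your copy of `app.py` for the actual ones:

```python
# Illustrative settings; the actual names and values in app.py may differ.
MODEL_NAME = "gpt-3.5-turbo"   # swap to "gpt-4" for more advanced responses
TEMPERATURE = 0.7              # 0 = deterministic, 1 = more creative
CHUNK_SIZE = 1000              # characters per chunk fed to the text splitter
SEARCH_KWARGS = {"k": 4}       # number of source chunks retrieved per question

def describe_settings():
    """Collect the tunable parameters in one place for easy inspection."""
    return {
        "model": MODEL_NAME,
        "temperature": TEMPERATURE,
        "chunk_size": CHUNK_SIZE,
        "k": SEARCH_KWARGS["k"],
    }
```

Raising `k` gives the model more context per answer but increases token usage and cost.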
rag-pipeline/
├── app.py # Main Streamlit application
├── setup.py # Setup helper script
├── requirements.txt # Python dependencies
├── env_example.txt # Environment variables template
├── .gitignore # Git ignore file
└── README.md # This file
- Make sure you've created a `.env` file from `env_example.txt`
- Ensure your API key is correctly added to the `.env` file
- Run `pip install -r requirements.txt` to install all dependencies
- Check that your PDFs are not password-protected
- Ensure you have enough OpenAI API credits
- Try with smaller PDF files first
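The first two checklist items lend themselves to a small automated preflight check. This is a standard-library-only sketch with an illustrative function name, not part of the app itself:

```python
from pathlib import Path

def preflight_check(env_path=".env"):
    """Return a list of setup problems; an empty list means all checks passed."""
    problems = []
    path = Path(env_path)
    if not path.exists():
        problems.append("%s not found; copy env_example.txt to .env" % env_path)
    else:
        text = path.read_text()
        if "OPENAI_API_KEY" not in text:
            problems.append("OPENAI_API_KEY missing from .env")
        elif "your-openai-api-key-here" in text:
            problems.append("OPENAI_API_KEY still set to the placeholder value")
    return problems
```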
- Better Results: Upload documents that are related to each other for more coherent conversations
- Specific Questions: Ask specific questions rather than broad ones
- Multiple Sources: Upload multiple documents to create a comprehensive knowledge base
- Cost Management: Monitor your OpenAI API usage to manage costs
- Never commit your `.env` file to version control
- Keep your OpenAI API key secure
- The app processes documents locally before sending their content to OpenAI
Potential improvements you could make:
- Support for more file formats (TXT, DOCX, etc.)
- Persistent vector storage
- User authentication
- Export chat history
- Fine-tuning options
- Local LLM support
This project is open source and available under the MIT License.