🎬 YouTube RAG Assistant

Chrome Extension to Chat with Any YouTube Video Using AI

Enter a YouTube video ID, ask any question, and get AI-powered answers grounded in the video's transcript — all from a sleek Chrome extension.

📋 Table of Contents

Overview
Key Features
Architecture
Tech Stack
Getting Started
API Reference
How It Works
Project Structure
Troubleshooting
License

🔍 Overview

YouTube RAG Assistant brings the power of Retrieval-Augmented Generation (RAG) to YouTube. Instead of watching an entire video, you can:

1. Provide a YouTube Video ID
2. Ask a question in natural language
3. Get a transcript-grounded answer generated by Google Gemini 2.5 Flash

The backend downloads the video transcript, splits it into chunks, embeds the chunks into a vector store, and runs a RAG pipeline to answer your question, all at request time.

✨ Key Features

| Feature | Description |
|---|---|
| 🎥 Transcript Q&A | Ask anything about a YouTube video and get answers from its transcript |
| 🤖 Gemini 2.5 Flash | Fast language model used to generate the answers |
| 🧠 RAG Pipeline | Retrieval-Augmented Generation grounds responses in the transcript |
| 🔍 Semantic Search | ChromaDB + HuggingFace embeddings for context retrieval |
| 💬 Chat Interface | Dark-themed popup with animated chat bubbles |
| ⚡ Real-Time | Processes transcripts and answers questions on the fly |

🏗 Architecture

┌──────────────────────────────────────┐
│        Chrome Extension (MV3)        │
│  ┌────────────┐   ┌───────────────┐ │
│  │ popup.html │   │   popup.js    │ │
│  │ (Chat UI)  │   │ (API calls)   │ │
│  └─────┬──────┘   └───────┬───────┘ │
└────────┼───────────────────┼─────────┘
         │  POST /ask        │
         ▼                   ▼
┌──────────────────────────────────────┐
│         FastAPI Backend (api.py)     │
│                                      │
│  1. YouTube Transcript Loader        │
│         │                            │
│         ▼                            │
│  2. RecursiveCharacterTextSplitter   │
│         │                            │
│         ▼                            │
│  3. ChromaDB Vector Store            │
│     (HuggingFace Embeddings)         │
│         │                            │
│         ▼                            │
│  4. LangChain RAG Chain              │
│     (Retriever → Prompt → LLM)      │
│         │                            │
│         ▼                            │
│  5. Google Gemini 2.5 Flash          │
│         │                            │
│         ▼                            │
│  6. Parsed Answer → JSON Response    │
└──────────────────────────────────────┘

🛠 Tech Stack

| Layer | Technology | Purpose |
|---|---|---|
| LLM | Google Gemini 2.5 Flash | Answer generation from transcript context |
| RAG Framework | LangChain | Pipeline orchestration: loader, splitter, retriever, chain |
| Transcript | `YoutubeLoader` (LangChain) | Automatic YouTube transcript extraction |
| Vector DB | ChromaDB | Semantic search over transcript chunks |
| Embeddings | HuggingFace `all-MiniLM-L6-v2` | Sentence-level embeddings |
| Backend | FastAPI + Uvicorn | Async REST API |
| Frontend | Chrome Extension (Manifest V3) | Chat interface popup |
| Styling | Vanilla CSS | Dark theme, indigo gradients, smooth animations |

🚀 Getting Started

Prerequisites

| Requirement | Version |
|---|---|
| Python | 3.8+ |
| Google Chrome | Latest |
| Google API Key | Any valid Gemini API key (available from Google AI Studio) |

1️⃣ Backend Setup

cd backend

# Create virtual environment
python -m venv venv

# Activate
venv\Scripts\activate          # Windows
# source venv/bin/activate     # macOS / Linux

# Install dependencies
pip install fastapi uvicorn python-dotenv langchain langchain-google-genai langchain-community chromadb sentence-transformers youtube-transcript-api

# Create .env file
echo GOOGLE_API_KEY=your_api_key_here > .env
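
A missing or placeholder key is listed under Troubleshooting, so it can help to fail fast at startup. The following is a minimal sketch, assuming the `.env` file has already been loaded into the process environment (for example by `python-dotenv`, installed above). The helper name `require_api_key` is hypothetical and is not taken from `api.py`.

```python
import os
from typing import Mapping, Optional


def require_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return GOOGLE_API_KEY, or raise a clear error if it is missing or still the placeholder."""
    # Fall back to the real process environment when no mapping is passed in.
    env = os.environ if env is None else env
    key = env.get("GOOGLE_API_KEY", "").strip()
    if not key or key == "your_api_key_here":
        raise RuntimeError("GOOGLE_API_KEY is not set; add it to backend/.env")
    return key
```

Calling this once before the server starts handling requests would turn a confusing downstream authentication failure into an immediate, readable error.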

2️⃣ Start the Server

uvicorn api:app --reload --port 8000

API available at http://localhost:8000.

3️⃣ Load Chrome Extension

1. Open `chrome://extensions/`
2. Enable Developer mode
3. Click "Load unpacked"
4. Select the `frontend/` folder
5. Pin the extension

4️⃣ Try It Out!

1. Click the extension icon
2. Enter a YouTube Video ID (e.g., `aircAruvnKk`)
3. Type a question, such as "What is this video about?"
4. Click "Ask" to get your answer

Tip: The Video ID is the part after v= in a YouTube URL.
For https://www.youtube.com/watch?v=aircAruvnKk, the ID is aircAruvnKk.
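
Since pasting a full URL is a common mistake, the ID can also be extracted programmatically. This is a rough sketch using only the standard library. The function name `extract_video_id` is hypothetical, and the handling of `youtu.be` short links goes beyond what the tip above describes.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse


def extract_video_id(value: str) -> Optional[str]:
    """Return the video ID from a YouTube URL, or the input itself if it is already a bare ID."""
    value = value.strip()

    if "://" not in value:
        # No URL punctuation at all: treat the input as a bare ID.
        if "/" not in value and "?" not in value:
            return value or None
        # Something like "youtube.com/watch?v=..." without a scheme.
        value = "https://" + value

    parsed = urlparse(value)
    host = (parsed.hostname or "").lower()

    # Short links carry the ID in the path: https://youtu.be/<id>
    if host == "youtu.be":
        return parsed.path.lstrip("/") or None

    # Standard watch URLs carry the ID in the v= query parameter.
    if host == "youtube.com" or host.endswith(".youtube.com"):
        ids = parse_qs(parsed.query).get("v")
        return ids[0] if ids else None

    return None
```

For the example above, `extract_video_id("https://www.youtube.com/watch?v=aircAruvnKk")` returns `"aircAruvnKk"`, and a URL from any other host returns `None`.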

📡 API Reference

`POST /ask`

Send a video ID and question to get a transcript-grounded answer.

Request Body:

{
  "video_id": "aircAruvnKk",
  "question": "What are the main topics discussed?"
}

Response:

{
  "answer": "The video discusses neural networks, specifically..."
}

Error Response (no transcript available):

{
  "detail": "Transcript not available for this video"
}
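
The endpoint can also be called outside the extension, for example from a script. The sketch below uses only the Python standard library and assumes the server is running at the default `http://localhost:8000` from the setup steps. It also assumes the error body shown above arrives with a non-2xx HTTP status, which this README does not state. The function names are illustrative.

```python
import json
import urllib.error
import urllib.request

API_URL = "http://localhost:8000/ask"  # default address from the setup steps


def build_ask_request(video_id: str, question: str, url: str = API_URL) -> urllib.request.Request:
    """Build the POST /ask request with the JSON body documented above."""
    body = json.dumps({"video_id": video_id, "question": question}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_ask_response(raw: bytes) -> str:
    """Return the answer text, or raise with the server's error detail."""
    payload = json.loads(raw.decode("utf-8"))
    if "answer" in payload:
        return payload["answer"]
    raise RuntimeError(payload.get("detail", "Unexpected response from /ask"))


def ask(video_id: str, question: str, url: str = API_URL) -> str:
    """Send a question to the running backend and return the answer."""
    request = build_ask_request(video_id, question, url)
    try:
        with urllib.request.urlopen(request) as response:
            raw = response.read()
    except urllib.error.HTTPError as err:
        # Assumption: error statuses still carry a JSON body with a "detail" field.
        raw = err.read()
    return parse_ask_response(raw)
```

With the server running, `ask("aircAruvnKk", "What are the main topics discussed?")` would return the answer string, or raise `RuntimeError` with the `detail` message when no transcript is available.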

⚙️ How It Works

1. Transcript Loading: `YoutubeLoader` fetches the manual or auto-generated transcript
2. Text Splitting: `RecursiveCharacterTextSplitter` splits the transcript into 1000-character chunks with a 200-character overlap
3. Embedding & Storage: Chunks are embedded with `all-MiniLM-L6-v2` and stored in ChromaDB
4. Retrieval: The 4 most relevant chunks are retrieved for each question
5. RAG Chain: The retrieved context and the question are fed through a LangChain prompt template
6. LLM Answer: Gemini 2.5 Flash generates a grounded answer
7. Response: The answer is returned to the extension and displayed in the chat
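
The splitting, retrieval, and prompt steps above can be illustrated without any of the real dependencies. This is a simplified sketch, not the code in `api.py`: a fixed-size sliding window stands in for `RecursiveCharacterTextSplitter`, a word-count vector stands in for `all-MiniLM-L6-v2` embeddings, an in-memory sort stands in for ChromaDB, and the prompt wording is invented. All function names are hypothetical.

```python
import math
import re
from collections import Counter
from typing import List


def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> List[str]:
    """Fixed-size sliding window with overlap (simplified stand-in for the real splitter)."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


def toy_embed(text: str) -> Counter:
    """Word-count vector (toy replacement for a sentence-embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, chunks: List[str], k: int = 4) -> List[str]:
    """Return the k chunks most similar to the question."""
    query = toy_embed(question)
    ranked = sorted(chunks, key=lambda chunk: cosine(query, toy_embed(chunk)), reverse=True)
    return ranked[:k]


def build_prompt(question: str, context_chunks: List[str]) -> str:
    """Place the retrieved chunks ahead of the question (illustrative wording only)."""
    context = "\n\n".join(context_chunks)
    return (
        "Answer using only the transcript context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
```

With the default settings, consecutive chunks share 200 characters, so a sentence that straddles a chunk boundary still appears whole in at least one chunk. The string returned by `build_prompt` is what would be passed to the LLM in step 6.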

📁 Project Structure

yt_video_chatbot/
├── backend/
│   ├── api.py               # FastAPI backend — full RAG pipeline
│   └── yt_chroma_db/        # ChromaDB persistent storage (auto-generated)
│
└── frontend/
    ├── manifest.json         # Chrome Extension config (Manifest V3)
    ├── popup.html            # Chat UI — dark indigo theme, animations
    └── popup.js              # Extension logic — API calls, chat rendering

🐛 Troubleshooting

| Problem | Solution |
|---|---|
| "Transcript not available" | The video may not have captions; try a different video |
| Server not running | Run `uvicorn api:app --reload --port 8000` from the `backend/` folder |
| API key error | Add `GOOGLE_API_KEY` to `backend/.env` |
| Extension not loading | Enable Developer mode at `chrome://extensions/` |
| Slow first request | The HuggingFace embedding model is downloaded on first run (~90MB) |
| Wrong video ID | Use only the ID part (e.g., `aircAruvnKk`), not the full URL |
📄 License

MIT License — feel free to modify and use for your own projects.

Built with ❤️ for Knowledge Seekers 🎬🤖

Ask a video anything — powered by RAG + Gemini AI.
