nishit546/youtube_RAG

🎬 YouTube RAG Assistant

Chrome Extension to Chat with Any YouTube Video Using AI


Enter a YouTube video ID, ask any question, and get AI-powered answers grounded in the video's transcript, all from a sleek Chrome extension.



🔍 Overview

YouTube RAG Assistant brings the power of Retrieval-Augmented Generation (RAG) to YouTube. Instead of watching an entire video, you can:

  1. Provide a YouTube Video ID
  2. Ask a question in natural language
  3. Get an accurate, transcript-grounded answer powered by Google Gemini 2.5 Flash

The system automatically downloads the video transcript, chunks it, embeds it into a vector store, and runs a full RAG pipeline to answer your question, all in real time.


✨ Key Features

| Feature | Description |
| --- | --- |
| 🎥 Transcript Q&A | Ask anything about a YouTube video and get answers from its transcript |
| 🤖 Gemini 2.5 Flash | Fast, accurate language model for natural-language understanding |
| 🧠 RAG Pipeline | Retrieval-Augmented Generation ensures factual, grounded responses |
| 🔍 Semantic Search | ChromaDB + HuggingFace embeddings for intelligent context retrieval |
| 💬 Chat Interface | Beautiful dark-themed popup with animated chat bubbles |
| ⚡ Real-Time | Processes transcripts and answers on-the-fly |

🏗 Architecture

┌──────────────────────────────────────┐
│        Chrome Extension (MV3)        │
│  ┌────────────┐   ┌───────────────┐  │
│  │ popup.html │   │   popup.js    │  │
│  │ (Chat UI)  │   │ (API calls)   │  │
│  └─────┬──────┘   └───────┬───────┘  │
└────────┼──────────────────┼──────────┘
         │  POST /ask       │
         ▼                  ▼
┌──────────────────────────────────────┐
│         FastAPI Backend (api.py)     │
│                                      │
│  1. YouTube Transcript Loader        │
│         │                            │
│         ▼                            │
│  2. RecursiveCharacterTextSplitter   │
│         │                            │
│         ▼                            │
│  3. ChromaDB Vector Store            │
│     (HuggingFace Embeddings)         │
│         │                            │
│         ▼                            │
│  4. LangChain RAG Chain              │
│     (Retriever → Prompt → LLM)       │
│         │                            │
│         ▼                            │
│  5. Google Gemini 2.5 Flash          │
│         │                            │
│         ▼                            │
│  6. Parsed Answer → JSON Response    │
└──────────────────────────────────────┘
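
To make the "Retriever → Prompt" stage of the diagram more concrete, here is a minimal, dependency-free sketch. It is not the code in api.py: the `embed` function is a toy word-count stand-in for the real HuggingFace embedding model, a sorted list stands in for ChromaDB, and the prompt wording and all function names are assumptions made for illustration.

```python
import math
import re
from collections import Counter
from typing import List


def embed(text: str) -> Counter:
    """Toy stand-in for a real embedding model: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, chunks: List[str], k: int = 4) -> List[str]:
    """Return the k chunks most similar to the question, best first."""
    query = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(query, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(question: str, context_chunks: List[str]) -> str:
    """Assemble a grounded prompt; the wording here is hypothetical."""
    context = "\n\n".join(context_chunks)
    return (
        "Answer the question using only the transcript context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )
```

In the real pipeline the similarity search runs over dense sentence embeddings rather than word counts, so it can match paraphrases that share no words with the question. The overall shape (rank chunks, keep the top few, paste them into a prompt) is the same.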

🛠 Tech Stack

| Layer | Technology | Purpose |
| --- | --- | --- |
| LLM | Google Gemini 2.5 Flash | Answer generation from transcript context |
| RAG Framework | LangChain | Pipeline orchestration: loader, splitter, retriever, chain |
| Transcript | YoutubeLoader (LangChain) | Automatic YouTube transcript extraction |
| Vector DB | ChromaDB | Semantic search over transcript chunks |
| Embeddings | HuggingFace all-MiniLM-L6-v2 | Sentence-level embeddings |
| Backend | FastAPI + Uvicorn | Async REST API |
| Frontend | Chrome Extension (Manifest V3) | Chat interface popup |
| Styling | Vanilla CSS | Dark theme, indigo gradients, smooth animations |

🚀 Getting Started

Prerequisites

| Requirement | Version |
| --- | --- |
| Python | 3.8+ |
| Google Chrome | Latest |
| Google API Key | Get one here |

1️⃣ Backend Setup

cd backend

# Create virtual environment
python -m venv venv

# Activate
venv\Scripts\activate          # Windows
# source venv/bin/activate     # macOS / Linux

# Install dependencies
pip install fastapi uvicorn python-dotenv langchain langchain-google-genai langchain-community chromadb sentence-transformers youtube-transcript-api

# Create .env file
echo GOOGLE_API_KEY=your_api_key_here > .env

2️⃣ Start the Server

uvicorn api:app --reload --port 8000

API available at http://localhost:8000.

3️⃣ Load Chrome Extension

  1. Open chrome://extensions/
  2. Enable Developer mode
  3. Click "Load unpacked"
  4. Select the frontend/ folder
  5. Pin the extension

4️⃣ Try It Out!

  1. Click the extension icon
  2. Enter a YouTube Video ID (e.g., aircAruvnKk)
  3. Type a question: "What is this video about?"
  4. Click "Ask" and get your answer!

Tip: The Video ID is the part after v= in a YouTube URL.
For https://www.youtube.com/watch?v=aircAruvnKk, the ID is aircAruvnKk.
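
If you would rather paste a full URL and extract the ID programmatically, the tip above can be sketched as follows. `extract_video_id` is a hypothetical helper, not part of this repository. It assumes the standard `watch?v=` form shown above plus `youtu.be` short links, and treats any input without a URL scheme as an ID that is already bare.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse


def extract_video_id(url_or_id: str) -> Optional[str]:
    """Return the video ID from a YouTube URL, or None if none is found.

    Hypothetical helper for illustration; not part of this repository.
    """
    text = url_or_id.strip()
    parsed = urlparse(text)
    if not parsed.scheme:
        # No scheme: assume the caller already passed a bare video ID.
        return text or None
    if parsed.hostname == "youtu.be":
        # Short links carry the ID in the path: https://youtu.be/<id>
        return parsed.path.lstrip("/") or None
    # Standard watch URLs carry the ID in the v= query parameter.
    values = parse_qs(parsed.query).get("v")
    return values[0] if values else None


print(extract_video_id("https://www.youtube.com/watch?v=aircAruvnKk"))  # aircAruvnKk
```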


📡 API Reference

POST /ask

Send a video ID and question to get a transcript-grounded answer.

Request Body:

{
  "video_id": "aircAruvnKk",
  "question": "What are the main topics discussed?"
}

Response:

{
  "answer": "The video discusses neural networks, specifically..."
}

Error Response (no transcript available):

{
  "detail": "Transcript not available for this video"
}
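
For testing the endpoint without the extension, a small standard-library client might look like the sketch below. The URL assumes the default port from the setup steps. The function names are hypothetical, and the error handling assumes only that failures come back as JSON with a `detail` field, as in the example above.

```python
import json
import urllib.error
import urllib.request

API_URL = "http://localhost:8000/ask"  # assumes the default port from the setup steps


def build_ask_request(video_id: str, question: str, url: str = API_URL) -> urllib.request.Request:
    """Build the POST /ask request with the JSON body documented above."""
    body = json.dumps({"video_id": video_id, "question": question}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(video_id: str, question: str) -> str:
    """Send the request and return the answer, or the server's error detail."""
    request = build_ask_request(video_id, question)
    try:
        # Generous timeout: the backend loads, chunks, and embeds the transcript first.
        with urllib.request.urlopen(request, timeout=120) as response:
            return json.load(response)["answer"]
    except urllib.error.HTTPError as err:
        # Error responses are assumed to carry a JSON body with a "detail" field.
        payload = json.loads(err.read().decode("utf-8"))
        return f"Error {err.code}: {payload.get('detail')}"
```

With the server running, `ask("aircAruvnKk", "What are the main topics discussed?")` should return the answer text. If the server is not running, `urlopen` raises `urllib.error.URLError`, which this sketch deliberately leaves uncaught.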

⚙️ How It Works

  1. Transcript Loading: YoutubeLoader fetches the auto-generated or manual transcript
  2. Text Splitting: RecursiveCharacterTextSplitter chunks the transcript into 1000-char pieces with 200-char overlap
  3. Embedding & Storage: Chunks are embedded with all-MiniLM-L6-v2 and stored in ChromaDB
  4. Retrieval: The top 4 most relevant chunks are retrieved for each question
  5. RAG Chain: Retrieved context + question are fed through a LangChain prompt template
  6. LLM Answer: Gemini 2.5 Flash generates a grounded answer
  7. Response: The answer is returned to the extension and displayed in the chat
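
To illustrate what "1000-char pieces with 200-char overlap" means in step 2, here is a simplified sliding-window sketch. It is a stand-in, not the library's algorithm: RecursiveCharacterTextSplitter prefers to break on separators such as paragraph breaks, newlines, and spaces, so its real chunk boundaries will differ. `chunk_text` is a hypothetical name.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> List[str]:
    """Split text into fixed-size windows that share `overlap` characters."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap  # with the defaults, each window starts 800 chars later
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


transcript = "x" * 2600
print(len(chunk_text(transcript)))  # 3 windows: [0:1000], [800:1800], [1600:2600]
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, as long as it is shorter than the overlap, at the cost of storing some text twice.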

📁 Project Structure

yt_video_chatbot/
├── backend/
│   ├── api.py               # FastAPI backend: full RAG pipeline
│   └── yt_chroma_db/        # ChromaDB persistent storage (auto-generated)
│
└── frontend/
    ├── manifest.json         # Chrome Extension config (Manifest V3)
    ├── popup.html            # Chat UI: dark indigo theme, animations
    └── popup.js              # Extension logic: API calls, chat rendering

🐛 Troubleshooting

| Problem | Solution |
| --- | --- |
| "Transcript not available" | The video may not have captions. Try a different video |
| Server not running | Run uvicorn api:app --reload --port 8000 |
| API key error | Add GOOGLE_API_KEY to backend/.env |
| Extension not loading | Enable Developer mode at chrome://extensions/ |
| Slow first request | HuggingFace embeddings model downloads on first run (~90MB) |
| Wrong video ID | Use only the ID part (e.g., aircAruvnKk), not the full URL |
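
Two of the checks above (API key present, server running) can be scripted. The sketch below is a hypothetical diagnostic, not part of the repository. It parses `.env` text by hand rather than through python-dotenv, and it only tests that something accepts TCP connections on port 8000, not that it is this backend.

```python
import socket
from typing import Dict


def parse_env(text: str) -> Dict[str, str]:
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values


def has_api_key(env_text: str) -> bool:
    """True if GOOGLE_API_KEY is set to something other than the placeholder."""
    key = parse_env(env_text).get("GOOGLE_API_KEY", "")
    return bool(key) and key != "your_api_key_here"


def server_is_up(host: str = "localhost", port: int = 8000, timeout: float = 1.0) -> bool:
    """True if something accepts TCP connections on the backend port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

A typical use would be to read `backend/.env` with `open(...).read()`, pass the text to `has_api_key`, and then call `server_is_up()` before opening the extension.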

📄 License

MIT License. Feel free to modify and use for your own projects.


Built with ❤️ for Knowledge Seekers 🎬🤖

Ask a video anything, powered by RAG + Gemini AI.

About

A Chrome extension that accepts a YouTube video ID, answers questions using that video's transcript, and also shows the timestamps in the video where the relevant content appears.
