diff --git a/.claude/settings.local.json b/.claude/settings.local.json
new file mode 100644
index 000000000..79791ec26
--- /dev/null
+++ b/.claude/settings.local.json
@@ -0,0 +1,8 @@
+{
+  "permissions": {
+    "allow": [
+      "mcp__playwright__browser_navigate",
+      "mcp__playwright__browser_take_screenshot"
+    ]
+  }
+}
diff --git a/.gitignore b/.gitignore
index 41b4384b8..3abcbd692 100644
--- a/.gitignore
+++ b/.gitignore
@@ -28,4 +28,7 @@ uploads/
 
 # OS
 .DS_Store
-Thumbs.db
\ No newline at end of file
+Thumbs.db
+
+# claude code tmp files
+tmpclaude-*-cwd
\ No newline at end of file
diff --git a/.playwright-mcp/current-chat-button.png b/.playwright-mcp/current-chat-button.png
new file mode 100644
index 000000000..a68d2abb0
Binary files /dev/null and b/.playwright-mcp/current-chat-button.png differ
diff --git a/.playwright-mcp/final-chat-button.png b/.playwright-mcp/final-chat-button.png
new file mode 100644
index 000000000..a68d2abb0
Binary files /dev/null and b/.playwright-mcp/final-chat-button.png differ
diff --git a/.playwright-mcp/updated-chat-button.png b/.playwright-mcp/updated-chat-button.png
new file mode 100644
index 000000000..a68d2abb0
Binary files /dev/null and b/.playwright-mcp/updated-chat-button.png differ
diff --git a/CLAUDE.md b/CLAUDE.md
new file mode 100644
index 000000000..86155f54a
--- /dev/null
+++ b/CLAUDE.md
@@ -0,0 +1,402 @@
+# CLAUDE.md
+
+This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
+
+## Project Overview
+
+<<<<<<< HEAD
+This is a Retrieval-Augmented Generation (RAG) system for querying course materials. The system uses ChromaDB for vector storage, Anthropic's Claude for AI generation, and provides a FastAPI-based web interface.
+
+## Guidelines for Claude Code
+
+**IMPORTANT: Do not automatically start the server or run the application.** The developer will manually start the server when ready. Never execute `./run.sh` or `uvicorn` commands unless explicitly requested by the developer.
+
+## Development Commands
+
+### Running the Application
+
+**Quick start:**
+```bash
+./run.sh
+```
+
+**Manual start:**
+```bash
+=======
+This is a Retrieval-Augmented Generation (RAG) system for querying course materials. It combines semantic search (ChromaDB) with AI generation (Claude API) to provide intelligent, context-aware answers about educational content.
+
+**Tech Stack**: FastAPI backend, vanilla JavaScript frontend, ChromaDB vector store, Claude Sonnet 4, sentence-transformers for embeddings.
+
+## Development Commands
+
+**CRITICAL: Always use `uv` to run this service.** This project uses `uv` for Python package management and virtual environment management. Never use `pip`, `python`, or `python -m` directly - always prefix commands with `uv run` or use `uv sync` for dependency management.
+
+### Running the Application
+
+```bash
+# Quick start (recommended)
+./run.sh
+
+# Manual start
+>>>>>>> 2a1fb30e25e5e5b6452f8067f0eebc40daff0b70
+cd backend
+uv run uvicorn app:app --reload --port 8000
+```
+
+<<<<<<< HEAD
+The app will be available at `http://localhost:8000` with API docs at `http://localhost:8000/docs`.
+
+### Dependency Management
+
+**Install dependencies:**
+```bash
+uv sync
+```
+
+**Note:** This project uses `uv` for Python package management, not pip. Python 3.13+ is required.
+
+### Environment Setup
+
+Create a `.env` file in the root directory with:
+```
+ANTHROPIC_API_KEY=your_api_key_here
+```
+
+## Architecture
+
+### RAG System Flow
+
+The system follows this flow for query processing:
+
+1. **Query Reception** (`app.py`) - FastAPI receives user query via POST `/api/query`
+2. **RAG Orchestration** (`rag_system.py`) - Coordinates all components:
+   - Creates/retrieves session for conversation context
+   - Passes query to AI generator with tool access
+3. **Tool-Based Search** (`search_tools.py` + `ai_generator.py`):
+   - Claude decides whether to use the search tool based on query type
+   - If needed, executes `search_course_content` tool
+4. **Vector Search** (`vector_store.py`):
+   - Resolves course names semantically via `course_catalog` collection
+   - Searches content in `course_content` collection with filters
+   - Returns ranked results with metadata
+5. **Response Generation** (`ai_generator.py`) - Claude synthesizes final answer from search results
+6. **Session Management** (`session_manager.py`) - Stores conversation history (last 2 exchanges by default)
+
+### Core Components
+
+**RAGSystem** (`rag_system.py`) - Main orchestrator that wires together:
+- `DocumentProcessor` - Parses course documents and creates chunks
+- `VectorStore` - Manages ChromaDB collections and semantic search
+- `AIGenerator` - Handles Claude API calls with tool support
+- `SessionManager` - Tracks conversation history per session
+- `ToolManager` + `CourseSearchTool` - Provides search capability to Claude
+
+**VectorStore** (`vector_store.py`) - Two ChromaDB collections:
+- `course_catalog` - Course metadata (title, instructor, lessons) for semantic course name resolution
+- `course_content` - Text chunks with metadata (course_title, lesson_number, chunk_index)
+
+**Document Processing** (`document_processor.py`) - Expects structured format:
+```
+Course Title: [title]
+Course Link: [url]
+Course Instructor: [name]
+
+Lesson 1: [title]
+Lesson Link: [url]
+[content...]
+```
+
+Chunks text into ~800 character segments with 100 character overlap, adding context markers like `"Course X Lesson Y content: ..."`.
+
+**AI Tool System** (`search_tools.py`) - Claude uses tool calling to search when needed:
+- Tool decides whether to search based on query type (general knowledge vs course-specific)
+- Supports optional `course_name` and `lesson_number` filters
+- Course names are fuzzy-matched semantically via vector search
+
+### Configuration
+
+All settings in `backend/config.py`:
+- `ANTHROPIC_MODEL`: "claude-sonnet-4-20250514"
+- `EMBEDDING_MODEL`: "all-MiniLM-L6-v2" (sentence-transformers)
+- `CHUNK_SIZE`: 800 chars, `CHUNK_OVERLAP`: 100 chars
+- `MAX_RESULTS`: 5 search results returned
+- `MAX_HISTORY`: 2 conversation exchanges remembered
+
+### Data Flow
+
+Documents → DocumentProcessor (parse + chunk) → VectorStore (embed + index) → RAG query → ToolManager (search if needed) → AIGenerator (synthesize response) → User
+
+### Frontend
+
+Static HTML/CSS/JS in `/frontend`:
+- `index.html` - Main UI
+- `script.js` - API calls to `/api/query` and `/api/courses`
+- `style.css` - Styling
+
+Served by FastAPI's StaticFiles with no-cache headers for development.
+
+## Common Workflows
+
+### Adding New Documents
+
+Place `.txt`, `.pdf`, or `.docx` files in `/docs` folder. On startup, `app.py` automatically loads documents that don't already exist in the vector store (checks by course title).
+
+To force a rebuild:
+```python
+# In backend/app.py startup_event(), add clear_existing=True:
+courses, chunks = rag_system.add_course_folder(docs_path, clear_existing=True)
+```
+
+### Modifying Search Behavior
+
+Search logic is in `vector_store.py` `search()` method:
+- Course name resolution: `_resolve_course_name()` uses semantic search on catalog
+- Content filtering: `_build_filter()` creates ChromaDB where clauses
+- Adjust `MAX_RESULTS` in config to change number of chunks returned
+
+### Changing AI Behavior
+
+System prompt in `ai_generator.py` `SYSTEM_PROMPT` controls:
+- When to use search tool (course-specific vs general knowledge)
+- Response style (concise, educational, no meta-commentary)
+- Tool usage limits (one search per query maximum)
+
+## Project Structure Notes
+
+- **No tests directory** - Tests not yet implemented
+- **ChromaDB persistence** - Vector DB stored in `./chroma_db` (gitignored)
+- **Windows compatibility** - Use Git Bash for shell scripts on Windows
+- **Frontend integration** - Backend serves frontend files, no separate frontend server needed
+=======
+Access points:
+- Web UI: `http://localhost:8000`
+- API docs: `http://localhost:8000/docs`
+
+### Package Management
+
+```bash
+# Install/sync dependencies
+uv sync
+
+# Add a new dependency
+uv add package-name
+
+# Run Python commands in the virtual environment
+uv run python script.py
+```
+
+### Environment Setup
+
+Required `.env` file in root directory:
+```
+ANTHROPIC_API_KEY=your_key_here
+```
+
+## Architecture Overview
+
+### Request Flow (Frontend → Backend → AI → Response)
+
+1. **Frontend** (`frontend/script.js`) sends POST to `/api/query` with `{query, session_id}`
+2. **FastAPI** (`backend/app.py`) routes to RAG system
+3. **RAG System** (`backend/rag_system.py`) orchestrates the entire flow:
+   - Retrieves conversation history from SessionManager
+   - Calls AIGenerator with query, history, and tool definitions
+4. **AI Generator** (`backend/ai_generator.py`) makes **two Claude API calls**:
+   - **Call #1**: Claude decides whether to use the search tool
+   - If tool use requested: ToolManager executes search
+   - **Call #2**: Claude generates final answer using search results
+5. **Tool Execution** (`backend/search_tools.py`):
+   - CourseSearchTool calls VectorStore.search()
+   - Formats results with course/lesson context
+   - Tracks sources for UI display
+6. **Vector Store** (`backend/vector_store.py`):
+   - Resolves fuzzy course names via semantic search in `course_catalog`
+   - Searches content in `course_content` collection with filters
+   - Returns top 5 most relevant chunks
+7. **Response**: Answer + sources returned to frontend, conversation history updated
+
+### Two-Collection Vector Store Design
+
+**Why two collections?**
+- `course_catalog`: Stores course metadata (title, instructor, lessons) for course name resolution
+- `course_content`: Stores actual text chunks for semantic content search
+
+This separation enables fuzzy course name matching (e.g., "MCP" → "Introduction to MCP Servers") before filtering content.
+
+### Document Processing Pipeline
+
+**Location**: `backend/document_processor.py`
+
+Documents in `docs/` folder are processed on startup (`app.py:startup_event`):
+
+1. **Parse metadata** from first 3 lines:
+   ```
+   Course Title: [title]
+   Course Link: [url]
+   Course Instructor: [name]
+   ```
+
+2. **Extract lessons** using regex pattern `Lesson \d+: [title]`
+   - Optionally followed by `Lesson Link: [url]`
+
+3. **Chunk text** (sentence-based):
+   - Chunk size: 800 characters
+   - Overlap: 100 characters
+   - Preserves sentence boundaries
+
+4. **Add context** to chunks:
+   - Format: `"Course {title} Lesson {N} content: {text}"`
+   - Helps AI understand source during retrieval
+5. **Store in ChromaDB**:
+   - Course metadata → `course_catalog` collection
+   - Text chunks → `course_content` collection
+
+**Deduplication**: Existing course titles are checked before adding to avoid duplicates.
+
+### Session Management
+
+**Location**: `backend/session_manager.py`
+
+- Each user gets a unique `session_id` (format: `session_N`)
+- Conversation history limited to **2 exchanges** (4 messages total)
+- History is formatted and injected into Claude's system prompt
+- Enables context-aware multi-turn conversations
+
+### Tool-Based Search Pattern
+
+**Key insight**: Claude autonomously decides when to search, not the application.
+
+The system provides Claude with a `search_course_content` tool definition. Claude:
+1. Analyzes the user query
+2. Decides if search is needed (general knowledge vs. course-specific)
+3. Calls tool with appropriate parameters (query, course_name, lesson_number)
+4. Receives formatted results
+5. Synthesizes final answer
+
+This pattern keeps search logic in AI reasoning, not hardcoded rules.
+
+## Key Configuration Parameters
+
+**Location**: `backend/config.py`
+
+```python
+ANTHROPIC_MODEL = "claude-sonnet-4-20250514"
+EMBEDDING_MODEL = "all-MiniLM-L6-v2"
+CHUNK_SIZE = 800        # Characters per chunk
+CHUNK_OVERLAP = 100     # Character overlap between chunks
+MAX_RESULTS = 5         # Max search results returned
+MAX_HISTORY = 2         # Max conversation exchanges stored
+CHROMA_PATH = "./chroma_db"  # Persistent vector store location
+```
+
+**AI Generator settings** (`ai_generator.py`):
+- Temperature: 0 (deterministic responses)
+- Max tokens: 800
+- System prompt emphasizes brevity and educational value
+
+## Important Implementation Details
+
+### Adding New Documents
+
+Place `.txt` files in `docs/` folder with the expected format. On next server restart, they'll be automatically processed and indexed. The system skips documents with duplicate course titles.
+
+To force rebuild:
+```python
+# In app.py startup_event
+rag_system.add_course_folder(docs_path, clear_existing=True)
+```
+
+### Adding New Tools
+
+1. Create a class inheriting from `Tool` in `search_tools.py`
+2. Implement `get_tool_definition()` and `execute()` methods
+3. Register with ToolManager in `rag_system.py:__init__`
+4. Claude will automatically have access to the new tool
+
+### Data Models
+
+**Location**: `backend/models.py`
+
+- `Course`: Represents a course with title, link, instructor, and lessons list
+- `Lesson`: Individual lesson with number, title, and optional link
+- `CourseChunk`: Text chunk with course context (course_title, lesson_number, chunk_index)
+
+These models ensure type safety and consistent data structure throughout the pipeline.
+
+## Critical Architectural Decisions
+
+### Why Two API Calls to Claude?
+
+The AI Generator makes two sequential API calls when tools are used:
+
+1. **First call**: Claude receives the query + tool definitions, decides to use `search_course_content`
+2. **Tool execution**: System runs the search and gets results
+3. **Second call**: Claude receives the search results and generates the final answer
+
+This pattern is required by Anthropic's tool use API - the tool results must be sent back as a user message for Claude to synthesize the final response.
+
+### Why Sentence-Based Chunking?
+
+The document processor uses sentence boundaries (not fixed character splits) to:
+- Preserve semantic coherence within chunks
+- Avoid cutting sentences mid-way
+- Maintain context through 100-character overlap
+
+This improves retrieval quality compared to naive character-based splitting.
+
+### ChromaDB Persistence
+
+The vector database persists to disk at `./chroma_db/`. This means:
+- Documents are indexed once and survive server restarts
+- No need to re-process documents on every startup
+- Deduplication prevents duplicate indexing
+
+To reset the database, delete the `chroma_db/` directory.
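The sentence-based chunking described in CLAUDE.md (~800-character chunks, 100-character overlap, sentence boundaries preserved) can be sketched as below. This is an illustrative model of the documented behaviour, not the project's actual `document_processor.py`; the regex sentence splitter and the carry-back overlap strategy are assumptions.

```python
import re

CHUNK_SIZE = 800     # characters per chunk, as in backend/config.py
CHUNK_OVERLAP = 100  # character overlap between chunks

def chunk_sentences(text: str, size: int = CHUNK_SIZE,
                    overlap: int = CHUNK_OVERLAP) -> list[str]:
    """Greedily pack whole sentences into ~size-character chunks.

    Sentences worth at least `overlap` characters from the end of each
    chunk are repeated at the start of the next one, so context survives
    chunk boundaries without ever cutting a sentence mid-way.
    """
    sentences = re.split(r'(?<=[.!?])\s+', text.strip())
    chunks: list[str] = []
    current: list[str] = []
    length = 0
    for sentence in sentences:
        if current and length + len(sentence) > size:
            chunks.append(' '.join(current))
            # Carry trailing sentences back in until we cover `overlap` chars
            carry: list[str] = []
            carried = 0
            for prev in reversed(current):
                if carried >= overlap:
                    break
                carry.insert(0, prev)
                carried += len(prev)
            current = carry
            length = carried
        current.append(sentence)
        length += len(sentence)
    if current:
        chunks.append(' '.join(current))
    return chunks
```

A fixed-width splitter would be simpler, but this shape is why retrieved chunks read as complete thoughts rather than fragments.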
+
+## Frontend Architecture
+
+**Location**: `frontend/` directory
+
+- **Vanilla JavaScript** (no framework) - keeps it simple and lightweight
+- **Marked.js** for Markdown rendering of AI responses
+- **Session management**: Stores `session_id` in memory, creates new session on page load
+- **Loading states**: Shows animated dots while waiting for API response
+- **Source attribution**: Displays collapsible `<details>` element with sources
+
+Key files:
+- `index.html`: Layout with sidebar (course stats, suggested questions) and chat area
+- `script.js`: API calls, message rendering, event handling
+- `style.css`: Responsive design with CSS variables
+
+## Troubleshooting
+
+### ChromaDB Issues
+
+If you see ChromaDB errors on startup:
+```bash
+# Delete the database and restart
+rm -rf backend/chroma_db
+./run.sh
+```
+
+### Missing API Key
+
+Error: `anthropic.APIConnectionError` or `401 Unauthorized`
+- Ensure `.env` file exists in root directory
+- Verify `ANTHROPIC_API_KEY` is set correctly
+- Restart the server after adding the key
+
+### Port Already in Use
+
+If port 8000 is occupied:
+```bash
+# Find and kill the process
+lsof -ti:8000 | xargs kill -9
+
+# Or use a different port
+cd backend
+uv run uvicorn app:app --reload --port 8001
+```
+
+>>>>>>> 2a1fb30e25e5e5b6452f8067f0eebc40daff0b70
diff --git a/backend/app.py b/backend/app.py
index 5a69d741d..515f33680 100644
--- a/backend/app.py
+++ b/backend/app.py
@@ -40,10 +40,15 @@ class QueryRequest(BaseModel):
     query: str
     session_id: Optional[str] = None
 
+class Source(BaseModel):
+    """Source citation with optional link"""
+    text: str
+    link: Optional[str] = None
+
 class QueryResponse(BaseModel):
     """Response model for course queries"""
     answer: str
-    sources: List[str]
+    sources: List[Source]
     session_id: str
 
 class CourseStats(BaseModel):
diff --git a/backend/search_tools.py b/backend/search_tools.py
index adfe82352..e5209496c 100644
--- a/backend/search_tools.py
+++ b/backend/search_tools.py
@@ -88,29 +88,39 @@ def execute(self, query: str, course_name: Optional[str] = None, lesson_number:
     def _format_results(self, results: SearchResults) -> str:
         """Format search results with course and lesson context"""
         formatted = []
-        sources = []  # Track sources for the UI
-
+        sources = []  # Track sources for the UI (now List[Dict[str, Optional[str]]])
+
         for doc, meta in zip(results.documents, results.metadata):
             course_title = meta.get('course_title', 'unknown')
             lesson_num = meta.get('lesson_number')
-
+
             # Build context header
             header = f"[{course_title}"
             if lesson_num is not None:
                 header += f" - Lesson {lesson_num}"
             header += "]"
-
-            # Track source for the UI
-            source = course_title
+
+            # Build source text
+            source_text = course_title
             if lesson_num is not None:
-                source += f" - Lesson {lesson_num}"
-            sources.append(source)
-
+                source_text += f" - Lesson {lesson_num}"
+
+            # Retrieve lesson link if lesson_num exists
+            lesson_link = None
+            if lesson_num is not None:
+                lesson_link = self.store.get_lesson_link(course_title, lesson_num)
+
+            # Store as dict with text and optional link
+            sources.append({
+                "text": source_text,
+                "link": lesson_link
+            })
+
             formatted.append(f"{header}\n{doc}")
-
+
         # Store sources for retrieval
         self.last_sources = sources
-
+
         return "\n\n".join(formatted)
 
 class ToolManager:
diff --git a/frontend/index.html b/frontend/index.html
index f8e25a62f..d40bd0c08 100644
--- a/frontend/index.html
+++ b/frontend/index.html
@@ -7,7 +7,7 @@
     <title>Course Materials Assistant</title>
-    <!-- removed line: markup lost in extraction -->
+    <!-- added line: markup lost in extraction -->
@@ -19,6 +19,14 @@
 
             Course Materials Assistant
 
+            <div class="new-chat-section">
+                <button id="newChatButton" class="new-chat-button">
+                    <span class="new-chat-icon">+</span>
+                    <span class="new-chat-text">New Chat</span>
+                </button>
+            </div>
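The `backend/app.py` and `backend/search_tools.py` changes above switch source citations from bare strings to `{text, link}` records. A minimal Python sketch of the new shape (`build_source` is a hypothetical helper for illustration; the real code builds these dicts inline inside `_format_results`):

```python
from typing import Optional

def build_source(course_title: str, lesson_num: Optional[int],
                 lesson_link: Optional[str]) -> dict:
    """Build one source citation in the new {text, link} shape."""
    text = course_title
    if lesson_num is not None:
        text += f" - Lesson {lesson_num}"
    # A lesson-level hit carries its link; course-level hits have link=None,
    # which the frontend renders as plain text instead of an anchor.
    return {"text": text, "link": lesson_link}

print(build_source("Introduction to MCP Servers", 1, "https://example.com/lesson-1"))
print(build_source("Introduction to MCP Servers", None, None))
```

Keeping `link` optional is what lets the frontend stay backward compatible with the legacy string format.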
diff --git a/frontend/script.js b/frontend/script.js
index 562a8a363..cd2e8a1da 100644
--- a/frontend/script.js
+++ b/frontend/script.js
@@ -38,6 +38,12 @@ function setupEventListeners() {
             sendMessage();
         });
     });
+
+    // New chat button
+    const newChatButton = document.getElementById('newChatButton');
+    if (newChatButton) {
+        newChatButton.addEventListener('click', handleNewChat);
+    }
 }
 
 
@@ -115,25 +121,40 @@ function addMessage(content, type, sources = null, isWelcome = false) {
     const messageDiv = document.createElement('div');
     messageDiv.className = `message ${type}${isWelcome ? ' welcome-message' : ''}`;
     messageDiv.id = `message-${messageId}`;
-
+
     // Convert markdown to HTML for assistant messages
     const displayContent = type === 'assistant' ? marked.parse(content) : escapeHtml(content);
-
+
     let html = `
         <div class="message-content">
            ${displayContent}
        </div>
    `;
-
+
     if (sources && sources.length > 0) {
+        // Format sources as clickable links or plain text
+        const formattedSources = sources.map(source => {
+            // Handle new object format {text, link}
+            if (typeof source === 'object' && source.text) {
+                if (source.link) {
+                    return `<a href="${source.link}" target="_blank" class="source-link">${escapeHtml(source.text)}</a>`;
+                } else {
+                    return escapeHtml(source.text);
+                }
+            } else {
+                // Legacy string format (backward compatibility)
+                return escapeHtml(source);
+            }
+        }).join(', ');
+
         html += `
             <details>
                 <summary>Sources</summary>
-                <div class="sources-content">${sources.join(', ')}</div>
+                <div class="sources-content">${formattedSources}</div>
            </details>
        `;
     }
-
+
     messageDiv.innerHTML = html;
     chatMessages.appendChild(messageDiv);
     chatMessages.scrollTop = chatMessages.scrollHeight;
-
+
     return messageId;
 }
 
@@ -152,6 +173,26 @@ async function createNewSession() {
     addMessage('Welcome to the Course Materials Assistant! I can help you with questions about courses, lessons and specific content. What would you like to know?', 'assistant', null, true);
 }
 
+function handleNewChat() {
+    // Confirm if there's an ongoing conversation
+    const hasMessages = chatMessages && chatMessages.children.length > 1; // More than just welcome message
+
+    if (hasMessages) {
+        const confirmed = confirm('Start a new chat? This will clear the current conversation.');
+        if (!confirmed) {
+            return;
+        }
+    }
+
+    // Create new session (sets currentSessionId to null and clears UI)
+    createNewSession();
+
+    // Optional: Focus on input for immediate use
+    if (chatInput) {
+        chatInput.focus();
+    }
+}
+
 // Load course statistics
 async function loadCourseStats() {
     try {
diff --git a/frontend/style.css b/frontend/style.css
index 825d03675..dae6b3780 100644
--- a/frontend/style.css
+++ b/frontend/style.css
@@ -111,6 +111,57 @@ header h1 {
     margin-bottom: 0;
 }
 
+/* New Chat Button Section */
+.new-chat-section {
+    margin-bottom: 1.5rem;
+    padding-bottom: 1.5rem;
+    border-bottom: 1px solid var(--border-color);
+}
+
+.new-chat-button {
+    width: 100%;
+    display: flex;
+    align-items: center;
+    justify-content: flex-start;
+    gap: 0.5rem;
+    padding: 0.5rem 0;
+    background: none;
+    border: none;
+    border-radius: 0;
+    color: var(--text-secondary);
+    font-size: 0.875rem;
+    font-weight: 600;
+    text-transform: uppercase;
+    letter-spacing: 0.5px;
+    cursor: pointer;
+    transition: all 0.2s ease;
+}
+
+.new-chat-button:hover {
+    background: none;
+    border-color: transparent;
+    color: var(--primary-color);
+}
+
+.new-chat-button:focus {
+    outline: none;
+    color: var(--primary-color);
+}
+
+.new-chat-button:active {
+    transform: none;
+}
+
+.new-chat-icon {
+    font-size: 1.25rem;
+    line-height: 1;
+    font-weight: 400;
+}
+
+.new-chat-text {
+    font-size: 0.875rem;
+}
+
 /* Main Chat Area */
 .chat-main {
     flex: 1;
@@ -241,8 +292,62 @@
 }
 
 .sources-content {
-    padding: 0 0.5rem 0.25rem 1.5rem;
+    padding: 0.5rem 0.5rem 0.5rem 1.5rem;
     color: var(--text-secondary);
+    line-height: 1.8;
+}
+
+/* Source link styling - Enhanced visual design */
+.source-link {
+    display: inline-block;
+    position: relative;
+    color: var(--primary-color);
+    text-decoration: none;
+    padding: 0.25rem 0.5rem;
+    margin: 0.15rem 0.25rem 0.15rem 0;
+    border-radius: 0.375rem;
+    background: linear-gradient(135deg, rgba(59, 130, 246, 0.08) 0%, rgba(99, 102, 241, 0.08) 100%);
+    border: 1px solid rgba(59, 130, 246, 0.2);
+    transition: all 0.3s cubic-bezier(0.4, 0, 0.2, 1);
+    font-weight: 500;
+}
+
+.source-link::after {
+    content: "↗";
+    margin-left: 0.3rem;
+    font-size: 0.75em;
+    opacity: 0.6;
+    transition: all 0.3s ease;
+}
+
+.source-link:hover {
+    color: #fff;
+    background: linear-gradient(135deg, var(--primary-color) 0%, var(--primary-hover) 100%);
+    border-color: var(--primary-hover);
+    transform: translateY(-1px);
+    box-shadow: 0 4px 12px rgba(59, 130, 246, 0.3);
+}
+
+.source-link:hover::after {
+    opacity: 1;
+    transform: translate(2px, -2px);
+}
+
+.source-link:active {
+    transform: translateY(0);
+    box-shadow: 0 2px 6px rgba(59, 130, 246, 0.2);
+}
+
+.source-link:visited {
+    color: #8b5cf6;
+    border-color: rgba(139, 92, 246, 0.2);
+    background: linear-gradient(135deg, rgba(139, 92, 246, 0.08) 0%, rgba(168, 85, 247, 0.08) 100%);
+}
+
+.source-link:visited:hover {
+    color: #fff;
+    background: linear-gradient(135deg, #8b5cf6 0%, #a855f7 100%);
+    border-color: #a855f7;
 }
 
 /* Markdown formatting styles */
@@ -665,7 +770,21 @@ details[open] .suggested-header::before {
     .sidebar::-webkit-scrollbar-thumb:hover {
         background: var(--text-secondary);
     }
-
+
+    .new-chat-section {
+        margin-bottom: 1rem;
+        padding-bottom: 1rem;
+    }
+
+    .new-chat-button {
+        padding: 0.625rem 0.875rem;
+        font-size: 0.8rem;
+    }
+
+    .new-chat-icon {
+        font-size: 1.1rem;
+    }
+
     .chat-main {
         order: 1;
     }
diff --git "a/\345\211\215\345\220\216\347\253\257\344\272\244\344\272\222\346\265\201\347\250\213\345\233\276.md" "b/\345\211\215\345\220\216\347\253\257\344\272\244\344\272\222\346\265\201\347\250\213\345\233\276.md"
new file mode 100644
index 000000000..159fc67be
--- /dev/null
+++ "b/\345\211\215\345\220\216\347\253\257\344\272\244\344\272\222\346\265\201\347\250\213\345\233\276.md"
@@ -0,0 +1,212 @@
+# RAG Chatbot Frontend-Backend Interaction Flow
+
+## Full Request Flow Sequence Diagram
+
+```mermaid
+sequenceDiagram
+    participant User as User
+    participant Frontend as Frontend<br/>(script.js)
+    participant FastAPI as FastAPI Server<br/>(app.py)
+    participant RAG as RAG System<br/>(rag_system.py)
+    participant Session as Session Manager<br/>(session_manager.py)
+    participant AIGen as AI Generator<br/>(ai_generator.py)
+    participant Claude as Claude API
+    participant ToolMgr as Tool Manager<br/>(search_tools.py)
+    participant Vector as Vector Store<br/>(vector_store.py)
+    participant Chroma as ChromaDB
+
+    User->>Frontend: Enter question and click send
+    Frontend->>Frontend: Disable input box
+    Frontend->>Frontend: Show user message
+    Frontend->>Frontend: Show loading animation
+
+    Frontend->>+FastAPI: POST /api/query<br/>{query, session_id}
+
+    FastAPI->>Session: Create or fetch session ID
+    Session-->>FastAPI: session_id
+
+    FastAPI->>+RAG: query(query, session_id)
+
+    RAG->>Session: Fetch conversation history
+    Session-->>RAG: conversation_history
+
+    RAG->>+AIGen: generate_response()<br/>(query, history, tools)
+
+    Note over AIGen: Build system prompt<br/>+ conversation history
+
+    AIGen->>+Claude: API call #1<br/>(with tool definitions)
+    Note over Claude: Analyze question<br/>Decide to use tool
+    Claude-->>-AIGen: stop_reason="tool_use"<br/>tool_name="search_course_content"
+
+    AIGen->>+ToolMgr: execute_tool()<br/>(query, course_name, lesson_number)
+
+    ToolMgr->>+Vector: search()<br/>(query, course_name, lesson_number)
+
+    alt Course name provided
+        Vector->>Chroma: Query course_catalog<br/>to resolve course name
+        Chroma-->>Vector: Matching course title
+    end
+
+    Note over Vector: Build filter conditions<br/>(course_title, lesson_number)
+
+    Vector->>Chroma: Query course_content<br/>(vector similarity search)
+    Chroma-->>Vector: Top 5 most relevant chunks
+
+    Vector-->>-ToolMgr: SearchResults<br/>(documents, metadata)
+
+    Note over ToolMgr: Format results<br/>Add course/lesson headers<br/>Save source info
+
+    ToolMgr-->>-AIGen: Formatted search results
+
+    Note over AIGen: Return search results<br/>as tool_result
+
+    AIGen->>+Claude: API call #2<br/>(without tools)
+    Note over Claude: Generate final answer<br/>from search results
+    Claude-->>-AIGen: Final answer text
+
+    AIGen-->>-RAG: response
+
+    RAG->>ToolMgr: Fetch source list
+    ToolMgr-->>RAG: sources[]
+
+    RAG->>Session: Update conversation history<br/>add_exchange()
+
+    RAG-->>-FastAPI: (answer, sources)
+
+    FastAPI-->>-Frontend: JSON Response<br/>{answer, sources, session_id}
+
+    Frontend->>Frontend: Remove loading animation
+    Frontend->>Frontend: Convert Markdown to HTML
+    Frontend->>Frontend: Show AI answer
+    Frontend->>Frontend: Show sources (collapsible)
+    Frontend->>Frontend: Re-enable input box
+
+    Frontend->>User: Show complete answer
+```
+
+## Data Flow Details
+
+### 1. Frontend request phase
+```
+User input → Frontend validation → Build request body → HTTP POST
+```
+
+**Request structure**:
+```json
+{
+  "query": "What is MCP?",
+  "session_id": "session_1"
+}
+```
+
+### 2. Backend processing phase
+```
+FastAPI receives → Session management → RAG system → AI generator
+```
+
+### 3. AI tool-calling phase
+```
+Claude analyzes → Decides to use tool → Executes search → Vector retrieval
+```
+
+### 4. Response return phase
+```
+Format results → Claude generates answer → Extract sources → Return JSON
+```
+
+**Response structure**:
+```json
+{
+  "answer": "MCP stands for Model Context Protocol...",
+  "sources": [
+    "Introduction to MCP Servers - Lesson 1",
+    "Introduction to MCP Servers - Lesson 2"
+  ],
+  "session_id": "session_1"
+}
+```
+
+### 5. Frontend rendering phase
+```
+Receive JSON → Render Markdown → Show answer → Show sources
+```
+
+## Key Interaction Points
+
+### 🔄 Session management
+- **First request**: `session_id = null` → backend creates a new session
+- **Subsequent requests**: reuse the existing `session_id` → preserve conversation context
+- **History limit**: at most 2 exchanges (4 messages) are kept
+
+### 🔧 Tool-calling mechanism
+- **Autonomous decision**: Claude decides on its own whether to search
+- **Two API calls**:
+  1. First: decide on tool use
+  2. Second: generate the answer from tool results
+
+### 🔍 Vector search
+- **Semantic matching**: uses vector similarity rather than keywords
+- **Course resolution**: fuzzy-matches course names (e.g. "MCP" → "Introduction to MCP Servers")
+- **Filtering**: supports filtering by course and lesson
+
+### 📊 Source tracking
+- **Recorded during search**: sources are saved while the tool executes
+- **Formatted display**: `Course Title - Lesson N`
+- **Collapsible UI**: the frontend uses a `<details>` element
+
+## Tech Stack Mapping
+
+| Layer | Component | Technology |
+|-------|-----------|------------|
+| Frontend | UI interaction | HTML + CSS + JavaScript |
+| Frontend | Markdown rendering | Marked.js |
+| Backend | Web framework | FastAPI + Uvicorn |
+| Backend | RAG orchestration | Python (custom) |
+| AI | Language model | Claude Sonnet 4 (Anthropic API) |
+| Storage | Vector database | ChromaDB |
+| Embeddings | Vectorization | all-MiniLM-L6-v2 |
+
+## Performance Optimizations
+
+1. **Pre-built parameters**: the AI Generator pre-builds its base API parameters
+2. **Persistent storage**: ChromaDB persists to disk
+3. **Deduplication**: identical courses are not loaded twice
+4. **Result cap**: at most 5 search results are returned
+5. **History cap**: conversation history is limited to 2 exchanges to reduce token usage
+
+## Error Handling Flow
+
+```mermaid
+graph TD
+    A[Request starts] --> B{Session exists?}
+    B -->|No| C[Create new session]
+    B -->|Yes| D[Fetch history]
+    C --> E[RAG processing]
+    D --> E
+    E --> F{Search succeeded?}
+    F -->|Yes| G[Format results]
+    F -->|No| H[Return error message]
+    G --> I[Claude generates answer]
+    H --> J[Show error]
+    I --> K[Return response]
+    K --> L[Frontend display]
+    J --> L
+```
+
+## File Mapping
+
+| Feature | File path | Line reference |
+|---------|-----------|----------------|
+| Frontend sends request | `frontend/script.js` | 45-96 |
+| Backend receives request | `backend/app.py` | 56-74 |
+| RAG query handling | `backend/rag_system.py` | 102-140 |
+| AI response generation | `backend/ai_generator.py` | 43-135 |
+| Tool execution | `backend/search_tools.py` | 52-114 |
+| Vector search | `backend/vector_store.py` | 61-100 |
+| Session management | `backend/session_manager.py` | 18-56 |
+
+---
+
+**Generated**: 2026-01-13
+**System version**: RAG Chatbot v1.0
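The session policy documented above (MAX_HISTORY = 2 exchanges, i.e. at most 4 stored messages, with the oldest dropped first) can be sketched as follows. This is an illustrative model of the documented behaviour; `SessionHistory` is a hypothetical class, not the actual code in `backend/session_manager.py`.

```python
MAX_HISTORY = 2  # exchanges kept per session, as in backend/config.py

class SessionHistory:
    """Rolling conversation history capped at MAX_HISTORY exchanges."""

    def __init__(self, max_history: int = MAX_HISTORY):
        self.max_messages = max_history * 2  # each exchange = user + assistant
        self.messages: list[tuple[str, str]] = []  # (role, text) pairs

    def add_exchange(self, user_text: str, assistant_text: str) -> None:
        self.messages.append(("user", user_text))
        self.messages.append(("assistant", assistant_text))
        # Keep only the most recent messages; older context is discarded
        self.messages = self.messages[-self.max_messages:]
```

Capping history this way bounds the token cost of the system-prompt injection while still supporting multi-turn follow-up questions.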