19 changes: 10 additions & 9 deletions README.md
Expand Up @@ -3,7 +3,7 @@
<p>
<a href="https://github.com/2002yy/study-agent/actions/workflows/ci.yml"><img src="https://github.com/2002yy/study-agent/actions/workflows/ci.yml/badge.svg" alt="CI"></a>
<img src="https://img.shields.io/badge/python-3.12-blue" alt="Python 3.12">
<img src="https://img.shields.io/badge/tests-277%20passed-green" alt="277 tests passed">
<img src="https://img.shields.io/badge/tests-290%20passed-green" alt="290 tests passed">
</p>

A local AI learning assistant with long-term memory, role-based group chat,
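Expand All @@ -17,7 +17,7 @@ Study Agent is a local-first AI learning assistant; the focus is not simply calling
- **Long-term memory**: Markdown memory + safe writer
- **Context tiers**: fast / light / deep / archive
- **Web search**: RSS / News fetch → article extraction → LLM digest → source tracing
- **RAG MVP**: local Markdown / TXT / DOCX / PDF indexing, keyword / local vector prototype / hybrid / backend-vector retrieval, configurable embedding provider, optional Chroma persistence, citation context, source blocks, Streamlit retrieval/debug panel, chat injection, and FastAPI RAG endpoints
- **RAG MVP**: local Markdown / TXT / DOCX / PDF indexing, keyword / local vector prototype / hybrid / backend-vector retrieval, configurable embedding provider, optional Chroma persistence, a controlled local-knowledge retrieval tool, citation context, source blocks, Streamlit retrieval/debug panel, chat injection, and FastAPI RAG / chat / memory foundation endpoints
- **Engineering security**: SSRF protection, detect-secrets, configuration templates
- **Engineering quality**: pytest test suite, Ruff, GitHub Actions CI, packaging checks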

Expand All @@ -27,11 +27,11 @@ Study Agent is a local-first AI learning assistant; the focus is not simply calling
- **Model routing** with fast / light / deep / archive context tiers
- **Long-term memory** based on Markdown files and safe-writer persistence
- **Web search pipeline**: feed registry → URL safety checks → article extraction → LLM digest → auditable source trace
- **RAG MVP**: local Markdown / TXT / DOCX / PDF indexing, lexical / local vector prototype / hybrid / backend-vector retrieval, configurable embedding providers, optional Chroma persistence, citation-first context formatting, source blocks, a Streamlit retrieval/debug panel, optional chat injection, and FastAPI RAG endpoints
- **RAG MVP**: local Markdown / TXT / DOCX / PDF indexing, lexical / local vector prototype / hybrid / backend-vector retrieval, configurable embedding providers, optional Chroma persistence, a controlled local-knowledge retrieval tool, citation-first context formatting, source blocks, a Streamlit retrieval/debug panel, optional chat injection, and FastAPI RAG / chat / memory foundation endpoints
- **SSRF protection** for article fetching, **detect-secrets** in CI
- **Batched session logging** and multi-layer caching for performance
- **Performance budget**: mode-based `max_tokens` bounds on the main chat, WeChat, and news LLM paths
- **277 pytest tests**, Ruff clean, mypy clean, GitHub Actions CI workflow
- **290 pytest tests**, Ruff clean, mypy clean, GitHub Actions CI workflow

For a detailed breakdown of the stack and engineering highlights, see [Technical Stack & Engineering Highlights](docs/TECH_STACK.md).

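Expand Down Expand Up @@ -109,7 +109,7 @@ Study Agent has a clear positioning: **runs locally on your machine, with long-term memory
| **Role group chat** | Four characters (三月七, 刻晴, 纳西妲, 流萤) discuss in a group chat, each with its own persona |
| **Web search** | Multi-source aggregation across Google News + Bing News + RSSHub, with three-layer extraction of page body text |
| **Source tracing** | Search results are written to the group-chat record so the supporting evidence can be traced back |
| **RAG MVP** | Local Markdown / TXT / DOCX / PDF document indexing; the frontend panel returns cited snippets with file path, line numbers, score, matched terms and score breakdown, which can be injected into single-chat and WeChat group interactive replies; FastAPI provides `/health`, `/rag`, `/rag/index`, `/rag/query` |
| **RAG MVP** | Local Markdown / TXT / DOCX / PDF document indexing; the frontend panel returns cited snippets with file path, line numbers, score, matched terms and score breakdown, which can be injected into single-chat and WeChat group interactive replies; FastAPI provides `/health`, `/rag`, `/rag/index`, `/rag/query`, `/rag/status`, `/rag/upload`, `/rag/local-knowledge` |
| **Post-lesson summary** | After a study session, progress is summarized automatically and written to memory once the user confirms |
| **Long-term memory** | Learner profile, progress tracking, project context, current focus, multi-level memory archives |
| **Multi-provider** | Supports OpenAI / DeepSeek / OpenRouter / SiliconFlow / local models |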
Expand Down Expand Up @@ -233,7 +233,7 @@ RAG_EMBEDDING_PROVIDER=local_hash
│ ├── llm_router.py # model routing dispatch
│ ├── context_builder.py # context building
│ ├── mode_manager.py # mode management (version / performance / atmosphere)
│ ├── api.py # FastAPI health / RAG endpoints
│ ├── api.py # FastAPI health / chat / memory / sessions / RAG endpoints
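│ ├── role_manager.py # role loading and management
│ ├── performance_budget.py # performance budget (tiered max_tokens)
│ ├── memory.py # memory system
Expand All @@ -250,6 +250,7 @@ RAG_EMBEDDING_PROVIDER=local_hash
│ ├── router.py # routing configuration
│ ├── news/ # news aggregation pipeline
│ ├── rag/ # local RAG MVP: loading, chunking, indexing, keyword / vector prototype / embedding / optional backend retrieval
│ ├── tools/ # controlled tool boundary: local knowledge retrieval, etc.
│ └── ui/ # Streamlit UI components
├── tests/ # pytest test suite
├── docs/ # design docs and engineering notes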
Expand All @@ -270,7 +271,7 @@ RAG_EMBEDDING_PROVIDER=local_hash
## Tests

```bash
pytest tests/ -v # current local baseline: 277 passed
pytest tests/ -v # current local baseline: 290 passed
pytest tests/ --cov=src # coverage
ruff check src/ tests/ # linting
mypy --explicit-package-bases src/ # type check
Expand Down Expand Up @@ -312,8 +313,8 @@ CI runs via GitHub Actions on push / pull request, integrating `pytest`, `

Job-search-oriented technical roadmap:

- [ ] FastAPI service layer (partial): `/health`, `/rag`, `/rag/index`, `/rag/query` implemented; `/chat` and `/memory` remain planned
- [x] RAG MVP: Markdown / TXT / DOCX / PDF loading, chunking, local keyword retrieval, local vector prototype, hybrid retrieval, backend-vector retrieval, configurable embedding provider, optional Chroma adapter, citation context, source blocks, Streamlit retrieval panel, optional single-chat and WeChat interactive injection
- [x] FastAPI service layer foundation: `/health`, `/chat`, `/memory/preview`, `/memory/commit`, `/sessions`, `/rag`, `/rag/index`, `/rag/query`, `/rag/status`, `/rag/upload` and `/rag/local-knowledge` implemented; streaming, auth and frontend-specific contracts remain planned
- [x] RAG MVP: Markdown / TXT / DOCX / PDF loading, chunking, local keyword retrieval, local vector prototype, hybrid retrieval, backend-vector retrieval, configurable embedding provider, optional Chroma adapter, controlled local-knowledge retrieval, citation context, source blocks, Streamlit retrieval panel, optional single-chat and WeChat interactive injection
- [ ] RAG document QA (partial): PDF parsing has file-size, page-count, extracted-text and encrypted-file guards; production embedding requires explicit API/env configuration and Chroma remains optional
- [ ] Vector store: Chroma optional adapter implemented; FAISS local prototype and pgvector engineering version remain planned
- [ ] Web UI: TypeScript + Vue3 / React, streaming chat, source panel
2 changes: 1 addition & 1 deletion docs/INTERVIEW_NOTES.md
Expand Up @@ -10,7 +10,7 @@ Study Agent 是一个本地优先的 AI 学习助手,重点在多 Provider 模
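2. **Safe long-term memory writes**: safe writer + preview/confirm mechanism to prevent irreversible memory pollution
3. **Source tracing for web search**: Feed registry / RSS multi-source aggregation → URL safety matrix → three-layer article body extraction → LLM digest → pipeline trace, with sources traceable end to end
4. **Streamlit re-render performance optimization**: multi-layer caching, mode-based batched flushes to disk, token budget control on the main path
5. **CI / Ruff / detect-secrets engineering checks**: 277 pytest tests, Ruff clean, mypy local clean, GitHub Actions workflow, detect-secrets hard-blocks on non-exempted findings
5. **CI / Ruff / detect-secrets engineering checks**: 290 pytest tests, Ruff clean, mypy local clean, GitHub Actions workflow, detect-secrets hard-blocks on non-exempted findings

## Talking Points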

32 changes: 25 additions & 7 deletions docs/RAG.md
Expand Up @@ -19,10 +19,11 @@ Implemented:
- Streamlit retrieval panel for uploads, local paths, indexing, querying and citation preview
- Optional single-chat and WeChat interactive reply injection through the `用于聊天回答` toggle
- UI source blocks for retrieved file paths, line ranges, scores and matched terms
- FastAPI endpoints: `GET /health`, `POST /rag`, `POST /rag/index`, `POST /rag/query`
- FastAPI endpoints: `GET /health`, `POST /rag`, `POST /rag/index`, `POST /rag/query`, `GET /rag/status`, `POST /rag/upload`, `POST /rag/local-knowledge`
- Streamlit knowledge/debug panel with index summary, document rows, chunk preview and score breakdowns
- Optional vector backend interface with local fallback and Chroma adapter
- Configurable embedding providers: deterministic `local_hash` by default, OpenAI-compatible embeddings when explicitly configured
- Controlled local-knowledge retrieval tool with intent gating, deterministic query rewrite and explicit not-found behavior

Not implemented yet:

Expand All @@ -44,7 +45,8 @@ Not implemented yet:
| `src/rag/eval.py` | LLM-free retrieval quality evaluation over gold query fixtures |
| `src/rag/service.py` | Application-facing helpers for indexing, querying and context formatting |
| `src/rag/schema.py` | Dataclasses for documents, chunks, indexes and search results |
| `src/api.py` | FastAPI health and RAG endpoints |
| `src/tools/local_knowledge.py` | Controlled retrieval boundary for agentic local knowledge use |
| `src/api.py` | FastAPI health, chat, memory, session, RAG and local-knowledge endpoints |

## Data Flow

Expand All @@ -56,7 +58,9 @@ local files
-> save_rag_index
-> query_documents
-> build_rag_context
-> optional controlled local-knowledge tool
-> optional single-chat / WeChat interactive prompt injection or FastAPI response
-> optional frontend-facing chat / memory / session API flow
```
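
The flow above can be sketched end to end with a toy keyword retriever. This is a minimal illustration only: `Chunk`, `chunk_text`, `query_chunks` and `format_context` are hypothetical names and do not reflect the actual signatures in `src/rag/`; scoring here is plain term overlap rather than the project's lexical / vector / hybrid retrieval.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    """Hypothetical chunk record: source path, 1-based line range, text."""
    path: str
    start_line: int
    end_line: int
    text: str


def chunk_text(path: str, text: str, lines_per_chunk: int = 2) -> list[Chunk]:
    """Split a document into fixed-size line windows, keeping line ranges."""
    lines = text.splitlines()
    chunks = []
    for i in range(0, len(lines), lines_per_chunk):
        block = lines[i:i + lines_per_chunk]
        chunks.append(Chunk(path, i + 1, i + len(block), "\n".join(block)))
    return chunks


def score(query: str, chunk: Chunk) -> int:
    """Count query terms that also appear in the chunk (case-insensitive)."""
    query_terms = set(query.lower().split())
    chunk_terms = set(chunk.text.lower().split())
    return len(query_terms & chunk_terms)


def query_chunks(query: str, chunks: list[Chunk], top_k: int = 2) -> list[tuple[int, Chunk]]:
    """Return up to top_k chunks with a non-zero score, best first."""
    hits = [(score(query, c), c) for c in chunks]
    hits = [pair for pair in hits if pair[0] > 0]
    hits.sort(key=lambda pair: pair[0], reverse=True)
    return hits[:top_k]


def format_context(hits: list[tuple[int, Chunk]]) -> str:
    """Citation-first formatting: the source location precedes the text."""
    return "\n".join(
        f"[{c.path}:{c.start_line}-{c.end_line}] {c.text}" for _, c in hits
    )
```

Keeping the path and line range on every chunk is what lets the final context carry a citation for each snippet, whichever retrieval mode produced the hit.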

## Retrieval Behavior
Expand Down Expand Up @@ -111,8 +115,10 @@ Regression coverage lives in `tests/test_rag.py` and verifies:
- Local hash-vector and hybrid retrieval behavior
- Citation formatting and context budget behavior
- Streamlit RAG panel helpers for uploaded filenames and local path parsing
- FastAPI `/health`, `/rag`, `/rag/index` and `/rag/query`
- FastAPI `/health`, `/rag`, `/rag/index`, `/rag/query`, `/rag/status`, `/rag/upload` and `/rag/local-knowledge`
- FastAPI `/chat`, `/memory/preview`, `/memory/commit`, `/sessions` and `/sessions/{session_id}/flush`
- Prompt injection behavior for cited RAG context
- Controlled local-knowledge tool behavior for skip / found / not-found / rewrite

`tests/test_rag_eval.py` adds a small gold fixture suite under `tests/fixtures/rag_eval/` and verifies:

Expand Down Expand Up @@ -182,7 +188,19 @@ Goal: turn the Streamlit expander into a usable knowledge panel.

Goal: let the model retrieve when it needs evidence instead of always pre-retrieving.

- Add a `retrieve_local_knowledge(query)` tool boundary.
- Route retrieval only for knowledge-grounded questions.
- Allow query rewrite and second-pass retrieval when first-pass evidence is weak.
- Require explicit "not found in local knowledge" behavior when no source is retrieved.
- [x] Add a `retrieve_local_knowledge(query)` tool boundary.
- [x] Route retrieval only for knowledge-grounded questions through deterministic intent gating.
- [x] Allow deterministic query rewrite and second-pass retrieval when first-pass evidence is weak.
- [x] Require explicit "not found in local knowledge" behavior when no source is retrieved.
- [x] Expose the same boundary through `POST /rag/local-knowledge` for future frontends.
- [ ] Add LLM tool-calling / function-calling integration; current implementation is controlled pre-generation retrieval, not free-form tool use.
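
A controlled retrieval boundary of this kind might look roughly like the sketch below. Everything here is an assumption for illustration: the hint list, stopword set, `ToolResult` shape and the extra `corpus` parameter are hypothetical, and the real `retrieve_local_knowledge` in `src/tools/local_knowledge.py` may differ. The point is the control flow: gate on intent, try once, rewrite deterministically, try again, and report "not found" explicitly.

```python
import re
from dataclasses import dataclass, field

# Hypothetical intent hints and stopwords; the project's real rules may differ.
KNOWLEDGE_HINTS = ("notes", "document", "according to")
STOPWORDS = {"what", "do", "does", "my", "notes", "say", "about",
             "the", "in", "is", "a", "an", "of"}


@dataclass
class ToolResult:
    status: str                      # "skipped" | "found" | "not_found"
    query: str                       # the query that was finally used
    sources: list = field(default_factory=list)


def needs_retrieval(message: str) -> bool:
    """Deterministic intent gate: retrieve only for knowledge-grounded asks."""
    lowered = message.lower()
    return any(hint in lowered for hint in KNOWLEDGE_HINTS)


def rewrite_query(message: str) -> str:
    """Deterministic rewrite: lowercase, strip punctuation and stopwords."""
    words = re.findall(r"[a-z0-9_]+", message.lower())
    return " ".join(w for w in words if w not in STOPWORDS)


def search(query: str, corpus: dict[str, str]) -> list[str]:
    """Strict toy search: every query term must occur in the document."""
    terms = set(query.lower().split())
    if not terms:
        return []
    return [
        path for path, text in corpus.items()
        if terms <= set(re.findall(r"[a-z0-9_]+", text.lower()))
    ]


def retrieve_local_knowledge(message: str, corpus: dict[str, str]) -> ToolResult:
    if not needs_retrieval(message):
        return ToolResult("skipped", message)
    query = message
    hits = search(query, corpus)
    if not hits:                     # weak first pass: rewrite and retry once
        query = rewrite_query(message)
        hits = search(query, corpus)
    if not hits:
        return ToolResult("not_found", query)
    return ToolResult("found", query, hits)
```

Because the gate and the rewrite are both deterministic, the skip / found / not-found / rewrite paths can each be covered by a plain unit test without a model in the loop.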

### P8: Service API Foundation

Goal: expose the current local-first capabilities through stable API boundaries before building a separate web frontend.

- [x] Add RAG status and upload endpoints for index inspection and rebuilds.
- [x] Add a non-streaming `/chat` endpoint that reuses model routing, role prompts, memory bundles, local-knowledge retrieval and session logging.
- [x] Add memory preview / commit endpoints with the same runtime write-mode guard as the Streamlit UI.
- [x] Add session listing and force-flush endpoints for local session inspection.
- [ ] Add streaming chat, auth, CORS policy and frontend-oriented error envelopes before public or LAN deployment.
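
The preview / commit split with a write-mode guard can be illustrated without any web framework. The sketch below is an assumption-laden toy, not the project's API: `MemoryStore`, `write_enabled`, `WriteDisabledError` and the hash-based token are all hypothetical stand-ins for whatever the real endpoints and safe writer do.

```python
import hashlib
from dataclasses import dataclass


class WriteDisabledError(RuntimeError):
    """Raised when a commit is attempted while writes are switched off."""


@dataclass
class MemoryStore:
    content: str = ""
    write_enabled: bool = False      # hypothetical runtime write-mode flag


def preview(store: MemoryStore, addition: str) -> dict:
    """Compute the proposed memory text without touching the store."""
    proposed = store.content + addition + "\n"
    token = hashlib.sha256(proposed.encode("utf-8")).hexdigest()
    return {"proposed": proposed, "token": token, "written": False}


def commit(store: MemoryStore, addition: str, token: str) -> dict:
    """Write only if writes are enabled and the preview is still current."""
    if not store.write_enabled:
        raise WriteDisabledError("memory writes are disabled in this mode")
    expected = preview(store, addition)["token"]
    if token != expected:
        raise ValueError("preview token does not match current memory state")
    store.content = store.content + addition + "\n"
    return {"written": True}
```

Tying the commit to a token derived from the previewed text means a stale preview cannot be committed after memory has changed underneath it.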
66 changes: 37 additions & 29 deletions docs/STUDY_AGENT_OPTIMIZATION_ROADMAP.md
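Expand Up @@ -286,7 +286,7 @@ Study Agent's core competitive edge going forward should come from RAG, not ordinary chat.

Do not let the model call tools freely without limits; first use controllable routing to build a stable agent workflow.

## 9. P1: FastAPI Service Layer
## 9. P8: FastAPI Service Layer

Tearing out Streamlit right away is not recommended. A three-stage approach is recommended: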

Expand All @@ -298,7 +298,7 @@ Streamlit UI → core/chat_engine.py

### Stage 2: Add FastAPI

Minimal endpoints:
The current foundation endpoints are already in place:

```text
GET /health
Expand All @@ -307,12 +307,21 @@ POST /memory/preview
POST /memory/commit
POST /rag/upload
POST /rag/query
GET /rag/status
POST /rag/local-knowledge
GET /sessions
POST /sessions/{session_id}/flush
```

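Still to be added: streaming chat, auth, CORS, a unified error response, OpenAPI examples and Docker deployment configuration.

### Stage 3: Add a frontend

The frontend can use Vue3 or React. Vue3 is recommended first, since its development cost is lower.
For the frontend, React + Vite + TypeScript is recommended once P9 begins. The reasons:

- The React ecosystem is better suited to later work on chat streaming, a citation panel, a debug drawer and splitting out state components.
- The Vite dev server starts quickly, and the production build outputs a static `dist` that can be deployed independently or mounted by FastAPI as a static directory.
- TypeScript pins down data structures such as API response, RAG source, memory preview and session row, reducing silent field drift during frontend-backend integration.

Minimum pages: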

Expand Down Expand Up @@ -368,7 +377,7 @@ GET /sessions
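| RAG tests | chunking, ingestion, retrieval, cited sources |
| Tool tests | news retrieval, file reading, summarization |
| ContextBuilder tests | whether the context is correct in each mode |
| API tests | /chat, /health, /rag/query |
| API tests | /chat, /health, /rag/query, /rag/upload, /rag/status, /memory/preview, /memory/commit, /sessions |
| UI smoke tests | pages open and basic interactions do not crash |

The most critical piece is the Mock Provider. Real models are for demos and actual use; the Mock Provider is for automated tests and CI, so that tests do not depend on external APIs.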
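
A mock provider along these lines could be as small as the sketch below. The `Provider` protocol, the `complete` method and the `answer` helper are hypothetical names chosen for illustration; the project's actual provider interface is not shown in this diff.

```python
from typing import Protocol


class Provider(Protocol):
    """Hypothetical minimal provider interface used by the chat path."""

    def complete(self, prompt: str, max_tokens: int) -> str: ...


class MockProvider:
    """Returns scripted replies and records every call for assertions."""

    def __init__(self, replies: list[str]):
        self.replies = list(replies)
        self.calls: list[tuple[str, int]] = []

    def complete(self, prompt: str, max_tokens: int) -> str:
        self.calls.append((prompt, max_tokens))
        if self.replies:
            return self.replies.pop(0)
        # Deterministic fallback once the scripted replies run out.
        return "[mock] " + prompt[:20]


def answer(provider: Provider, question: str, max_tokens: int = 256) -> str:
    """Code under test: depends only on the Provider interface."""
    return provider.complete(f"Q: {question}", max_tokens).strip()
```

Since `answer` only sees the interface, the same function runs against a real model in a demo and against `MockProvider` in CI, with no network access or API key needed for the latter.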
Expand Down Expand Up @@ -438,23 +447,23 @@ docs/

Tasks:

1. Add FastAPI
2. Implement /health
3. Implement /chat
4. Implement /rag/upload
5. Implement /rag/query
6. Implement /memory/preview
7. Implement /memory/commit
8. Add API tests
9. Docker Compose
1. [x] Add FastAPI
2. [x] Implement /health
3. [x] Implement /chat (currently non-streaming)
4. [x] Implement /rag/upload
5. [x] Implement /rag/query
6. [x] Implement /memory/preview
7. [x] Implement /memory/commit
8. [x] Add API tests
9. [ ] Add streaming chat / auth / CORS / Docker Compose

### v1.0: Frontend productization release

Goal: demo-ready, screenshot-ready, deployable, and résumé-worthy.

Tasks:

1. Vue3 / React frontend
1. React + Vite + TypeScript frontend
2. Chat page
3. File upload page
4. Knowledge base list page
Expand All @@ -479,28 +488,27 @@ docs/

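## 15. Recommended Next Step

As a first step, draw the main flow clearly and split it into modules:
The current main flow can now continue to be consolidated along the FastAPI boundary:

```text
User input
→ UI receives
→ Streamlit or Web UI receives
→ FastAPI /chat
→ memory read
→ context build
→ tool decision
→ local knowledge tool decision
→ provider call
→ stream output
→ response output
→ session record
→ memory write-back confirmation
```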
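
The non-streaming path in this flow can be sketched as a single handler that wires the stages together. The names `ChatState` and `handle_chat`, and the shapes of the `provider` and `retrieve` callables, are hypothetical and are not taken from `src/api.py`; the sketch only shows the ordering of the stages and the fact that memory write-back is queued for confirmation rather than written directly.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ChatState:
    memory: list[str] = field(default_factory=list)          # long-term memory lines
    session_log: list[dict] = field(default_factory=list)    # per-turn records
    pending_memory: list[str] = field(default_factory=list)  # awaits user confirmation


def handle_chat(
    message: str,
    state: ChatState,
    provider: Callable[[str, str], str],
    retrieve: Callable[[str], list[str]],
) -> dict:
    # 1. memory read
    memory_lines = list(state.memory)
    # 2. local knowledge tool decision (the callable may return no sources)
    sources = retrieve(message)
    # 3. context build: memory first, then cited sources
    context = "\n".join(memory_lines + [f"[source] {s}" for s in sources])
    # 4. provider call, 5. response output (non-streaming)
    reply = provider(context, message)
    # 6. session record
    state.session_log.append(
        {"user": message, "assistant": reply, "sources": sources}
    )
    # 7. memory write-back is only proposed here; the user confirms later
    state.pending_memory.append(f"Discussed: {message}")
    return {"reply": reply, "sources": sources, "memory_pending": True}
```

Passing the provider and retriever in as callables keeps the handler testable with stubs and leaves the Streamlit UI and a future web frontend free to share the same core path.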

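Recommended refactoring order:

1. Stabilize the Provider abstraction
2. Stabilize MemoryManager
3. Stabilize ContextBuilder
4. SessionLogger batched writes
5. ToolRouter takes initial shape
6. Streamlit keeps only the UI
7. Then add FastAPI
8. Then add RAG
9. Finally build the frontend
Recommended order of progress:

1. [x] Provider abstraction stabilized
2. [x] Memory / ContextBuilder foundation stabilized
3. [x] SessionLogger batched writes
4. [x] RAG MVP and local knowledge tool
5. [x] FastAPI foundation service layer
6. [ ] streaming chat / auth / CORS / Docker
7. [ ] React + Vite + TypeScript frontend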