Commit 840adf4

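v1.2.0: Security vulnerability fixes and documentation updates
Security fixes:
- Upgrade to mcp>=1.23.0, fixing 3 high-severity CVEs (CVE-2025-53365, CVE-2025-53366, CVE-2025-66416)
- Replace PyPDF2 with pypdf>=6.7.1, fixing CVE-2023-36464
- Add path traversal protection to prevent arbitrary file read attacks
- Sanitize error messages so they no longer leak full paths
Added:
- Add project.urls linking to the GitHub repository
Changed:
- Upgrade to python-docx>=1.2.0
- Upgrade to openpyxl>=3.1.5
- Migrate CI/CD to uv
- Update the dependency version notes in all documentation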
1 parent 0eadb5e commit 840adf4

13 files changed

Lines changed: 161 additions & 94 deletions

.github/workflows/ci.yml

Lines changed: 10 additions & 20 deletions
Original file line number | Diff line number | Diff line change
@@ -17,40 +17,30 @@ jobs:
1717
steps:
1818
- uses: actions/checkout@v4
1919

20-
- name: Set up Python ${{ matrix.python-version }}
21-
uses: actions/setup-python@v5
20+
- name: Install uv
21+
uses: astral-sh/setup-uv@v4
2222
with:
2323
python-version: ${{ matrix.python-version }}
2424

25-
- name: Cache pip dependencies
26-
uses: actions/cache@v4
25+
- name: Set up Python ${{ matrix.python-version }}
26+
uses: actions/setup-python@v5
2727
with:
28-
path: ~/.cache/pip
29-
key: ${{ runner.os }}-pip-${{ matrix.python-version }}-${{ hashFiles('pyproject.toml') }}
30-
restore-keys: |
31-
${{ runner.os }}-pip-${{ matrix.python-version }}-
32-
${{ runner.os }}-pip-
28+
python-version: ${{ matrix.python-version }}
3329

3430
- name: Install dependencies
35-
run: |
36-
python -m pip install --upgrade pip
37-
pip install -e .[dev]
31+
run: uv sync --all-extras
3832

3933
- name: Lint with Ruff
40-
run: |
41-
ruff check mcp_documents_reader.py tests
34+
run: uv run ruff check mcp_documents_reader.py tests
4235

4336
- name: Check formatting with Ruff
44-
run: |
45-
ruff format --check mcp_documents_reader.py tests
37+
run: uv run ruff format --check mcp_documents_reader.py tests
4638

4739
- name: Type check with Basedpyright
48-
run: |
49-
basedpyright mcp_documents_reader.py
40+
run: uv run basedpyright mcp_documents_reader.py
5041

5142
- name: Test with pytest
52-
run: |
53-
pytest --cov-report=xml
43+
run: uv run pytest --cov-report=xml
5444

5545
- name: Upload coverage to Codecov
5646
uses: codecov/codecov-action@v5

.github/workflows/release.yml

Lines changed: 22 additions & 36 deletions
@@ -11,7 +11,7 @@ on:
1111
default: 'false'
1212
type: boolean
1313
version:
14-
description: 'Version to publish (e.g., 1.1.0, required when skip_pypi is true)'
14+
description: 'Version to publish (e.g., 1.2.0, required when skip_pypi is true)'
1515
required: false
1616
type: string
1717

@@ -25,36 +25,30 @@ jobs:
2525
steps:
2626
- uses: actions/checkout@v4
2727

28-
- name: Set up Python ${{ matrix.python-version }}
29-
uses: actions/setup-python@v5
28+
- name: Install uv
29+
uses: astral-sh/setup-uv@v4
3030
with:
3131
python-version: ${{ matrix.python-version }}
3232

33-
- name: Cache pip dependencies
34-
uses: actions/cache@v4
33+
- name: Set up Python ${{ matrix.python-version }}
34+
uses: actions/setup-python@v5
3535
with:
36-
path: ~/.cache/pip
37-
key: ${{ runner.os }}-pip-${{ matrix.python-version }}-${{ hashFiles('pyproject.toml') }}
38-
restore-keys: |
39-
${{ runner.os }}-pip-${{ matrix.python-version }}-
40-
${{ runner.os }}-pip-
36+
python-version: ${{ matrix.python-version }}
4137

4238
- name: Install dependencies
43-
run: |
44-
python -m pip install --upgrade pip
45-
pip install -e .[dev]
39+
run: uv sync --all-extras
4640

4741
- name: Lint with Ruff
48-
run: ruff check mcp_documents_reader.py tests
42+
run: uv run ruff check mcp_documents_reader.py tests
4943

5044
- name: Check formatting with Ruff
51-
run: ruff format --check mcp_documents_reader.py tests
45+
run: uv run ruff format --check mcp_documents_reader.py tests
5246

5347
- name: Type check with Basedpyright
54-
run: basedpyright mcp_documents_reader.py
48+
run: uv run basedpyright mcp_documents_reader.py
5549

5650
- name: Run tests with coverage
57-
run: pytest --cov-report=xml
51+
run: uv run pytest --cov-report=xml
5852

5953
build:
6054
needs: test
@@ -65,30 +59,19 @@ jobs:
6559
fetch-depth: 0
6660
fetch-tags: true
6761

62+
- name: Install uv
63+
uses: astral-sh/setup-uv@v4
64+
6865
- name: Set up Python
6966
uses: actions/setup-python@v5
7067
with:
7168
python-version: "3.10"
7269

73-
- name: Cache pip dependencies
74-
uses: actions/cache@v4
75-
with:
76-
path: ~/.cache/pip
77-
key: ${{ runner.os }}-pip-build-${{ hashFiles('pyproject.toml') }}
78-
restore-keys: |
79-
${{ runner.os }}-pip-build-
80-
${{ runner.os }}-pip-
81-
82-
- name: Install dependencies
83-
run: |
84-
python -m pip install --upgrade pip
85-
pip install build twine
86-
8770
- name: Build package
88-
run: python -m build
71+
run: uv build
8972

9073
- name: Check distribution
91-
run: python -m twine check --strict dist/*
74+
run: uv run twine check --strict dist/*
9275

9376
- name: Upload Artifacts
9477
uses: actions/upload-artifact@v4
@@ -136,17 +119,20 @@ jobs:
136119
137120
- name: Update server.json version
138121
run: |
139-
VERSION=${{ steps.get_version.outputs.VERSION }}
122+
VERSION="${{ steps.get_version.outputs.VERSION }}"
140123
python -c "
141124
import json
125+
import os
142126
with open('server.json', 'r') as f:
143127
data = json.load(f)
144-
data['version'] = '$VERSION'
128+
data['version'] = os.environ.get('VERSION', '')
145129
for pkg in data.get('packages', []):
146-
pkg['version'] = '$VERSION'
130+
pkg['version'] = os.environ.get('VERSION', '')
147131
with open('server.json', 'w') as f:
148132
json.dump(data, f, indent=2)
149133
"
134+
env:
135+
VERSION: ${{ steps.get_version.outputs.VERSION }}
150136

151137
- name: Download mcp-publisher
152138
run: |
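
The `Update server.json version` hunk above stops interpolating `$VERSION` into the inline Python source and instead passes it through the step's `env:` block, reading it with `os.environ.get`. A minimal standalone sketch of that pattern follows. The helper name `update_versions` and the sample dictionary are assumptions made for illustration; the workflow itself edits `server.json` inline.

```python
import json
import os


def update_versions(data: dict, version: str) -> dict:
    """Set the top-level version and every package version (hypothetical helper)."""
    data["version"] = version
    for pkg in data.get("packages", []):
        pkg["version"] = version
    return data


if __name__ == "__main__":
    # The value arrives as data through the environment, so quotes or other
    # special characters in it are never parsed as Python source.
    version = os.environ.get("VERSION", "")
    # Illustrative stand-in for the contents of server.json.
    sample = {"version": "0.0.0", "packages": [{"name": "example", "version": "0.0.0"}]}
    print(json.dumps(update_versions(sample, version), indent=2))
```

Because the version string is never pasted into the program text, a malformed or hostile value can at worst end up as an odd version field; it cannot change what the script executes.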

README.md

Lines changed: 4 additions & 4 deletions
@@ -209,10 +209,10 @@ Read any supported document type.
209209
## Dependencies
210210

211211
### Core Dependencies
212-
- `mcp` >= 0.1.0 - MCP protocol implementation
213-
- `python-docx` >= 0.8.11 - DOCX file reading
214-
- `PyPDF2` >= 3.0.1 - PDF file reading
215-
- `openpyxl` >= 3.0.10 - Excel file reading
212+
- `mcp` >= 1.23.0 - MCP protocol implementation
213+
- `python-docx` >= 1.2.0 - DOCX file reading
214+
- `pypdf` >= 6.7.1 - PDF file reading (replaces PyPDF2)
215+
- `openpyxl` >= 3.1.5 - Excel file reading
216216

217217
### Development Dependencies
218218
- `pytest` >= 8.0.0 - Testing framework

README.zh-CN.md

Lines changed: 4 additions & 4 deletions
@@ -209,10 +209,10 @@ if DocumentReaderFactory.is_supported("file.xlsx"):
209209
## 依赖
210210

211211
### 核心依赖
212-
- `mcp` >= 0.1.0 - MCP 协议实现
213-
- `python-docx` >= 0.8.11 - DOCX 文件读取
214-
- `PyPDF2` >= 3.0.1 - PDF 文件读取
215-
- `openpyxl` >= 3.0.10 - Excel 文件读取
212+
- `mcp` >= 1.23.0 - MCP 协议实现
213+
- `python-docx` >= 1.2.0 - DOCX 文件读取
214+
- `pypdf` >= 6.7.1 - PDF 文件读取(替代 PyPDF2)
215+
- `openpyxl` >= 3.1.5 - Excel 文件读取
216216

217217
### 开发依赖
218218
- `pytest` >= 8.0.0 - 测试框架

docs/en/API.md

Lines changed: 2 additions & 0 deletions
@@ -62,6 +62,8 @@ content = reader.read("/path/to/document.docx")
6262

6363
Reads PDF documents.
6464

65+
> **Note:** Starting from v1.2.0, the PDF reader has been migrated from PyPDF2 to pypdf (more secure and better maintained).
66+
6567
```python
6668
from mcp_documents_reader import PdfReader
6769

docs/en/CHANGELOG.md

Lines changed: 26 additions & 0 deletions
@@ -5,6 +5,32 @@ All notable changes to this project will be documented in this file.
55
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
66
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
77

8+
## [1.2.0] - 2025-03-02
9+
10+
### Security Fixes
11+
12+
- **MCP SDK Security Vulnerabilities**: Upgraded mcp>=1.23.0, fixed 3 high-severity CVEs
13+
- CVE-2025-53365: Unhandled exception in Streamable HTTP Transport leading to DoS
14+
- CVE-2025-53366: FastMCP Server validation error leading to DoS
15+
- CVE-2025-66416: DNS rebinding protection not enabled by default
16+
- **PyPDF2 Security Vulnerability**: Replaced with pypdf>=6.7.1, fixed CVE-2023-36464
17+
- **Path Traversal Protection**: Added explicit path validation to prevent arbitrary file read attacks
18+
- **Error Message Sanitization**: Removed full paths from error messages to prevent information disclosure
19+
20+
### Added
21+
22+
- **PyPI Package Metadata**: Added project.urls linking to GitHub repository
23+
24+
### Changed
25+
26+
- **Dependency Upgrades**:
27+
- mcp>=0.1.0 → mcp>=1.23.0
28+
- PyPDF2>=3.0.1 → pypdf>=6.7.1
29+
- python-docx>=0.8.11 → python-docx>=1.2.0
30+
- openpyxl>=3.0.10 → openpyxl>=3.1.5
31+
- typing_extensions>=4.0.0 → typing_extensions>=4.12.0
32+
- **CI/CD Migration**: Migrated from pip to uv for faster builds
33+
834
## [1.1.0] - 2025-03-01
935

1036
### Fixed

docs/zh/API.md

Lines changed: 2 additions & 0 deletions
@@ -62,6 +62,8 @@ content = reader.read("/path/to/document.docx")
6262

6363
读取 PDF 文档。
6464

65+
> **注意:** 从 v1.2.0 开始,PDF 读取器已从 PyPDF2 迁移到 pypdf(更安全、维护更好)。
66+
6567
```python
6668
from mcp_documents_reader import PdfReader
6769

docs/zh/CHANGELOG.md

Lines changed: 26 additions & 0 deletions
@@ -5,6 +5,32 @@
55
格式基于 [Keep a Changelog](https://keepachangelog.com/zh-CN/1.0.0/)
66
本项目遵循 [语义化版本](https://semver.org/lang/zh-CN/)
77

8+
## [1.2.0] - 2025-03-02
9+
10+
### 安全修复
11+
12+
- **MCP SDK 安全漏洞**:升级 mcp>=1.23.0,修复 3 个高危 CVE
13+
- CVE-2025-53365: Streamable HTTP Transport 未处理异常导致 DoS
14+
- CVE-2025-53366: FastMCP Server 验证错误导致 DoS
15+
- CVE-2025-66416: DNS rebinding 保护默认未启用
16+
- **PyPDF2 安全漏洞**:替换为 pypdf>=6.7.1,修复 CVE-2023-36464
17+
- **路径遍历防护**:添加显式路径验证,防止任意文件读取攻击
18+
- **错误信息脱敏**:移除错误信息中的完整路径,防止信息泄露
19+
20+
### 新增
21+
22+
- **PyPI 包元数据**:添加 project.urls,链接到 GitHub 仓库
23+
24+
### 变更
25+
26+
- **依赖升级**
27+
- mcp>=0.1.0 → mcp>=1.23.0
28+
- PyPDF2>=3.0.1 → pypdf>=6.7.1
29+
- python-docx>=0.8.11 → python-docx>=1.2.0
30+
- openpyxl>=3.0.10 → openpyxl>=3.1.5
31+
- typing_extensions>=4.0.0 → typing_extensions>=4.12.0
32+
- **CI/CD 迁移**:从 pip 迁移到 uv,提升构建速度
33+
834
## [1.1.0] - 2025-03-01
935

1036
### 修复

mcp_documents_reader.py

Lines changed: 34 additions & 9 deletions
@@ -7,7 +7,7 @@
77
from docx import Document as DocxDocument
88
from mcp.server.fastmcp import FastMCP
99
from openpyxl import load_workbook
10-
from PyPDF2 import PdfReader as PyPdfReader
10+
from pypdf import PdfReader as PyPdfReader
1111
from typing_extensions import override
1212

1313
# Directory where documents are stored
@@ -193,12 +193,34 @@ def is_supported(cls, file_path: str) -> bool:
193193

194194

195195
def _get_document_path(ctx: object, filename: str) -> str:
196-
"""Get full document path from context or environment"""
197-
try:
198-
doc_dir = getattr(ctx, "document_directory", DOCUMENT_DIRECTORY)
199-
except Exception:
200-
doc_dir = DOCUMENT_DIRECTORY
201-
return os.path.join(doc_dir, filename)
196+
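"""Get the document path, guarding against path traversal attacks.
197+
198+
Args:
199+
ctx: FastMCP context object
200+
filename: Name of the file
201+
202+
Returns:
203+
str: Safe full file path
204+
205+
Raises:
206+
ValueError: If a path traversal attempt is detected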
207+
"""
208+
doc_dir = getattr(ctx, "document_directory", DOCUMENT_DIRECTORY)
209+
210+
# Use basename to prevent path traversal
211+
safe_filename = os.path.basename(filename)
212+
213+
# Build the full path
214+
full_path = os.path.join(doc_dir, safe_filename)
215+
216+
# Verify the path is inside the allowed directory
217+
real_path = os.path.realpath(full_path)
218+
real_doc_dir = os.path.realpath(doc_dir)
219+
220+
if not real_path.startswith(real_doc_dir + os.sep) and real_path != real_doc_dir:
221+
raise ValueError("Access denied: path outside document directory")
222+
223+
return full_path
202224

203225

204226
@mcp.tool()
@@ -211,10 +233,13 @@ def read_document(ctx: object, filename: str) -> str:
211233
:param filename: Name of the document file to read
212234
:return: Extracted text from the document
213235
"""
214-
doc_path = _get_document_path(ctx, filename)
236+
try:
237+
doc_path = _get_document_path(ctx, filename)
238+
except ValueError:
239+
return "Error: Invalid file path."
215240

216241
if not os.path.exists(doc_path):
217-
return f"Error: File '{filename}' not found at {doc_path}."
242+
return f"Error: File '{filename}' not found."
218243

219244
if not DocumentReaderFactory.is_supported(doc_path):
220245
return f"Error: Unsupported document type for file '{filename}'."
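
The `_get_document_path` change above combines two steps: strip directory components with `os.path.basename`, then confirm the resolved path still sits inside the document directory. A minimal sketch of the same idea is shown here under stated assumptions: the function name `resolve_in_directory` is hypothetical, and the containment test uses `os.path.commonpath` in place of the commit's `startswith` comparison, so this is a variation and not the committed code.

```python
import os
import tempfile


def resolve_in_directory(doc_dir: str, filename: str) -> str:
    """Return a path for filename inside doc_dir, or raise ValueError."""
    # Drop any directory components supplied by the caller.
    safe_name = os.path.basename(filename)
    full_path = os.path.join(doc_dir, safe_name)

    # Resolve symlinks and ".." before comparing against the base directory.
    real_path = os.path.realpath(full_path)
    real_dir = os.path.realpath(doc_dir)

    # commonpath compares whole path components, so a sibling such as
    # "/docs-evil" is not mistaken for something inside "/docs".
    if os.path.commonpath([real_path, real_dir]) != real_dir:
        raise ValueError("Access denied: path outside document directory")
    return full_path


if __name__ == "__main__":
    base = tempfile.mkdtemp()
    # Directory components are stripped, so this stays inside base.
    print(resolve_in_directory(base, "../../etc/passwd"))
    try:
        resolve_in_directory(base, "..")
    except ValueError:
        # Mirror the commit's behaviour: report a generic message, no paths.
        print("Error: Invalid file path.")
```

As in the commit, the caller is expected to catch `ValueError` and return a generic message, so neither the document directory nor the resolved path reaches the client.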

pyproject.toml

Lines changed: 12 additions & 6 deletions
@@ -1,6 +1,6 @@
11
[project]
22
name = "mcp-documents-reader"
3-
version = "1.1.0"
3+
version = "1.2.0"
44
description = "An MCP enabled multi-format document reader supporting DOCX, PDF, TXT, and Excel files"
55
keywords = ["mcp", "model-context-protocol", "document-reader", "pdf", "docx", "excel"]
66
authors = [
@@ -9,13 +9,19 @@ authors = [
99
readme = "README.md"
1010
requires-python = ">=3.10"
1111
dependencies = [
12-
"mcp>=0.1.0",
13-
"python-docx>=0.8.11",
14-
"PyPDF2>=3.0.1",
15-
"openpyxl>=3.0.10",
16-
"typing_extensions>=4.0.0"
12+
"mcp>=1.23.0",
13+
"python-docx>=1.2.0",
14+
"pypdf>=6.7.1",
15+
"openpyxl>=3.1.5",
16+
"typing_extensions>=4.12.0"
1717
]
1818

19+
[project.urls]
20+
Homepage = "https://github.com/xt765/mcp_documents_reader"
21+
Repository = "https://github.com/xt765/mcp_documents_reader"
22+
Issues = "https://github.com/xt765/mcp_documents_reader/issues"
23+
Documentation = "https://github.com/xt765/mcp_documents_reader#readme"
24+
1925
[project.optional-dependencies]
2026
dev = [
2127
"ruff>=0.8.0",
