feat(security): mcp tier + handler rate-limit sweep#212
Conversation
MCP is a distinct interface from the human-facing REST API: server-to-
server, always API-key-authenticated, much chattier per session than
human-driven REST traffic (LLM agents iterate through tool calls inside
a conversation). The api tier's 100/min cap is too tight for legitimate
agent workloads and IP keying collapses two customers behind the same
NAT'd egress into one bucket.
This commit adds the tier; it does not yet wire mcp/route.ts to drop its
redundant apiLimiter.check (that lands in the handler-sweep commits
later in this PR — the policy rule has to exist first).
Source changes:
- lib/security/constants.ts: add LIMITS.MCP = envInt('RATE_LIMIT_MCP', 300)
- lib/security/rate-limit.ts: add mcpLimiter; extend RateLimitTier union
with 'mcp'; add mcp -> mcpLimiter entry to RATE_LIMIT_TIERS registry
- lib/security/rate-limit-policy.ts: insert new rule
{ match: /^\/api\/v1\/mcp(\/|$)/, tier: 'mcp', key: 'api-key' }
BEFORE the api consumer block and the catch-all so MCP requests don't
fall through to session-user keying (which would defeat the point)
Why api-key keying matters:
- Two customers sharing a NAT'd egress IP get independent buckets
(previously collapsed into one bucket under apiLimiter's IP keying)
- One key abused across many IPs is correctly bucketed under a single
api-key bucket (previously each IP got its own 100/min bucket)
- Per-customer budgets remain tunable via the existing McpRateLimiter
per-key sub-cap; the section tier here is the coarse ceiling above
Why 300/min:
- LLM agents fire rapid tool-call sequences within sessions; 100/min
trips on legitimate use, defeating the cap's purpose
- 300/min leaves room for normal agent activity while still rate-
limiting a runaway loop within ~5 seconds
- Override via RATE_LIMIT_MCP for ops who need to tune per deployment
Docs:
- .context/security/rate-limiting.md: new tier in the policy table
(rule #5, renumbers consumer rules); 5-tier section table; explicit
paragraph on why mcp is separate from api; updated Configuration
table with RATE_LIMIT_MCP row; tier example in "Adding a new tier"
section refreshed to show the 5-tier baseline
- .env.example: new RATE_LIMIT_MCP=1000 row commented out
Tests (+6):
- rate-limit.test.ts: mcpLimiter config test (300/min default);
RATE_LIMIT_TIERS.mcp identity assertion; full bucket-exhaustion
enforcement test mirroring the orchestration tier
- rate-limit-policy.test.ts: tier-resolution test for /api/v1/mcp/*
paths; boundary test for bare /api/v1/mcp (no trailing slash);
declared-order test updated (was 9 rules, now 10); first-match-wins
test extended to include mcp ahead of catch-all
- constants.test.ts: MCP default test (300); env-var override test;
four-way independence test extended to include MCP
Coverage:
- All 3 modified source files at 100/100/100/100
- Full suite: 850/850 files, 17,915 tests (+6), 6 skipped, 10 todo
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…lers Section-level rate limiting for /api/v1/admin/** is now enforced centrally by proxy.ts via the policy table (the admin and orchestration tiers from PR #211). Per-handler adminLimiter.check calls became redundant defense- in-depth on PR #211's merge — this commit removes the now-dead code along with its tests. Source changes (139 handlers under app/api/v1/admin/orchestration/): - Remove `const rateLimit = adminLimiter.check(<arg>)` + the `if (!rateLimit.success) return createRateLimitResponse(rateLimit)` check from every handler. - Strip the `const clientIP = getClientIP(request)` declaration when clientIP is no longer used in the handler body (per-handler scope analysis via brace-matching). Preserve it in 57 handlers that still use clientIP for `logAdminAction({ clientIp: clientIP })` audit logging. - Rename the handler `request` arg to `_request` when removing `getClientIP(request)` left it unused (TypeScript's noUnusedLocals). - Clean up imports: drop `adminLimiter`, `createRateLimitResponse`, and `getClientIP` from `@/lib/security/rate-limit` / `@/lib/security/ip` when they're no longer referenced (multi-line import blocks handled). - Update two stale docstrings: `costs/route.ts` and `settings/route.ts` both said "rate-limited via `adminLimiter`" — now point at the middleware policy table. Special-case preservation: - `conversations/export/route.ts` — KEPT its `adminLimiter.check(\`export:${ip}\`)` call. This is a deliberate per-flow custom-token sub-cap on the admin bucket, not a section-level redundancy. The codemod detected the template-literal token shape and skipped the file. - All per-flow sub-limiters (chatLimiter, audioLimiter, imageLimiter, contactLimiter, inboundLimiter, embedChatLimiter, etc.) untouched. Test changes (143 test files under tests/{unit,integration}/.../admin/): - Delete `describe('...Rate [lL]imit{ing}...', ...)` blocks when the block body references adminLimiter (false-positive guard: blocks for other limiters like cspReportLimiter survive). - Delete standalone `it('returns 429 when ...')` / `it('calls adminLimiter.check ...')` blocks whose body references adminLimiter. - Delete `vi.mock('@/lib/security/rate-limit', ...)` blocks that ONLY mocked adminLimiter and createRateLimitResponse — preserve mocks that also carry sub-limiters. - Delete stray `vi.mocked(adminLimiter.check).mockReturnValue(...)` setup lines (with multi-line `();` handling). - Conditionally delete `vi.mocked(createRateLimitResponse).mockReturnValue(...)` setups: only when the file's vi.mock for `@/lib/security/rate-limit` was also deleted (otherwise the symbol is still mocked and the setup is legitimate — preserves cspReportLimiter test in csp-report/route.test.ts). - Clean up dead imports of adminLimiter / createRateLimitResponse from test files (multi-line aware), excluding mock-factory bodies and comment references from the "still used" count. ESLint config: add `.claude/tmp/**` to the ignore list so the throwaway codemod scripts (used to drive this sweep) don't trigger lint errors during `npm run validate` — same rationale as the existing `.claude/worktrees/**` exclusion. Net diff: 283 files changed, +25 / -4,373 lines 139 source files, 143 test files, 1 config file Validation: Type-check: clean Lint: clean (524 pre-existing warnings, 0 new errors) Format: clean Tests: 850/850 files passing, 17,734 tests (-173 deleted dead tests), 6 skipped, 10 todo The 173 deleted tests covered the contract "handler calls adminLimiter.check; on failure returns 429" — that contract has moved to the middleware dispatcher and is covered by tests at the right layer in tests/unit/lib/security/rate-limit-middleware.test.ts (the limiter-exhaustion → 429 + envelope shape suite added in PR #211). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…lers Companion to PR #211's middleware-driven rate limiting and to the prior commit's admin sweep. The api tier (100/min) and the new mcp tier (300/min, from commit 1 of this PR) are enforced centrally by proxy.ts via the policy table, so per-handler apiLimiter.check calls became redundant defense-in-depth. Source changes (13 handlers): - 11 codemod-applied: provider-models/route.ts (+ recommend), chat/conversations/[id]/route.ts (+ share + messages/.../rate), embed/widget-config/route.ts, orchestration/approvals/[id]/status, user/api-keys (+ [keyId]), users/me/preferences, webhooks/trigger/[slug]. - 2 manual: chat/stream/route.ts and mcp/route.ts both have per-flow sub-limiters (chatLimiter / agentChatLimiter / imageLimiter and McpRateLimiter respectively) that stay in the handler as additive caps. Only the redundant apiLimiter.check section call was removed. Special-case docstring updates: - mcp/route.ts file header: was "Rate limited at IP level via apiLimiter, then per-key via McpRateLimiter" — now describes the layered rate-limiting model (proxy applies the mcp tier; the handler's McpRateLimiter is the per-key sub-cap). - mcp/route.ts POST: removed the stale "// 1. IP-level rate limit" numbered comment that pointed at the now-removed block; replaced with a brief note explaining where rate limiting actually happens. - embed/widget-config/route.ts: stripped the now-dead `rateKey = \`${token}:${clientIp}\`` declaration that was only used to build the apiLimiter token. Test changes (14 test files, 13 deleted dead tests): - Delete describe('...rate limit...') and `it()` blocks that referenced apiLimiter in the body. Same logic + safety guard as the admin sweep in the prior commit (the blockReferencesApiLimiter check prevents false positives on tests for sub-limiters like consumerChatLimiter and McpRateLimiter, which still legitimately fire from handlers). - Delete `vi.mock('@/lib/security/rate-limit', ...)` blocks that only mock apiLimiter + createRateLimitResponse — preserve mocks that also carry sub-limiters (e.g. chat/stream/route.test.ts keeps the block because it also mocks consumerChatLimiter + agentChatLimiter). - Strip stray `vi.mocked(apiLimiter.check)` and conditionally-stray `vi.mocked(createRateLimitResponse)` setup lines (the conditional rule: only strip if the vi.mock block was also deleted, so legitimate sub-limiter test setups survive). - Clean up dead imports of apiLimiter and dead helpers (e.g. makeRateLimitResult in chat/conversations/[id]/route.test.ts). Net diff: 27 files changed, +9 / -401 lines 13 source files + 14 test files Validation: Type-check: clean Lint: clean (524 pre-existing warnings, 0 new errors) Format: clean Tests: 850/850 files passing, 17,718 tests (-13 deleted dead tests), 6 skipped, 10 todo The 13 deleted tests covered "handler calls apiLimiter.check; on failure returns 429" — moved to the middleware tests added in PR #211. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…ests
Final pass on PR scope — sweeps the leftover doc/mock-factory references
that the prior commits' codemods couldn't reach (because the surrounding
code was load-bearing for unrelated reasons). All purely cosmetic /
hygiene; no behaviour changes.
Stale docstrings updated (~16 files):
- Test-file JSDoc lines like `* - Rate limited (adminLimiter)` referenced
per-handler limiter calls that no longer exist. Updated to point at the
middleware tier instead: `* - Rate limiting enforced by proxy.ts
(orchestration tier)`. Same treatment for the per-method variants
("on POST", "on PATCH and DELETE") and the inline comment in
`models.test.ts` about the `?refresh=true` refresh path.
- One outdated note in conversations.id.get.test.ts ("No rate-limiting
call on GET (only DELETE has adminLimiter)") deleted entirely —
neither method calls adminLimiter now.
- Two integration-test docstrings in chat.stream.test.ts and the
admin chat/stream route.test.ts that listed adminLimiter as a key
security assertion: rewritten to enumerate the per-flow caps that
do still hold (chatLimiter, agentChatLimiter, imageLimiter) and
point at proxy.ts for the section tier.
Dead mock-factory entries removed (~6 files):
- `vi.mock('@/lib/security/rate-limit', () => ({ adminLimiter: ..., ... }))`
blocks where the `adminLimiter:` / `apiLimiter:` entry was the only
thing the prior codemod hadn't touched because the surrounding mock
block had a sub-limiter that kept it alive. The dead entry itself
contributed nothing (source doesn't import either symbol any more);
removing it doesn't change test behaviour but matches the actual
surface of the rate-limit module.
- `background-execution-crash-flow.test.ts`: the entire
`vi.mock('@/lib/security/rate-limit')` block was dead (only mocked
adminLimiter + createRateLimitResponse) — deleted.
- `webhooks/trigger/route.test.ts`: stripped the dead `apiLimiter:` mock
entry, the now-unused `import { apiLimiter }` line, and a
`(apiLimiter.check as ReturnType<typeof vi.fn>).mockReturnValue(...)`
beforeEach setup (a type-cast pattern the earlier `vi.mocked(...)`
regex didn't catch).
- `chat/stream` and `chat.stream` (admin variant) tests: stripped
apiLimiter/adminLimiter from their `vi.mock` factory bodies.
- `provider-models.test.ts` (integration): the entire rate-limit
vi.mock block was made of dead entries — removed; the inline comment
about "GET uses apiLimiter, POST uses adminLimiter" was stale too,
deleted.
Net diff:
30 files changed, +30 / -55 lines
Validation:
Type-check, lint, format: clean
Tests: 850/850 files passing, 17,718 tests, 6 skipped, 10 todo
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…are test Addresses finding #2 from .reviews/tests-lib-security.md (alignment, 88 confidence). The comment introducing the synthetic-rule injection at rate-limit-middleware.test.ts:432 said "No real RATE_LIMIT_POLICY rule currently uses 'api-key', so we inject one." That was true when written but is now factually wrong — this PR added the MCP rule (tier: 'mcp', key: 'api-key') and PR #211 added the webhooks rule (tier: 'api', key: 'api-key'). The test's logic (inject a synthetic path for bucket isolation) is still correct; only the justification was stale. Updated to: "Inject a synthetic path so this test's bucket is isolated from the live mcp/webhooks rules — both use 'api-key' keying in the production policy and would otherwise share buckets with real traffic." Comment-only change; no test behaviour or assertions touched. Full suite still green (17,718 tests). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
Code reviewFound 3 issues:
🤖 Generated with Claude Code - If this code review was useful, please react with 👍. Otherwise, react with 👎. |
Closes the 3 findings (each at 75 confidence) surfaced by /code-review:
1. Add a dedicated `exportLimiter` per-flow sub-cap (10/min per admin
user) and switch `conversations/export/route.ts` to use it instead of
calling `adminLimiter.check(\`export:${ip}\`)` directly. The original
pattern was a custom-token sub-cap that piggy-backed on the section
limiter — workable but architecturally awkward: both the JSDoc on
`adminLimiter` and `.context/security/rate-limiting.md`'s anti-pattern
section explicitly forbid direct handler calls to section limiters.
The new `exportLimiter` matches the documented per-flow sub-cap shape
(mirrors `contactLimiter`, `uploadLimiter`, etc.), is session-user
keyed so two admins in the same office don't share a bucket, and the
"section limiters never called from handlers" invariant now holds
uniformly across the codebase.
2. Renumber the POST handler step comments in `mcp/route.ts` from
`// 3.` – `// 7.` to `// 1.` – `// 5.`. Commit 587e20b deleted the
original step 1 (IP-level rate limit) and converted step 2 to prose,
but left the remaining sequence starting at 3 — a non-sequitur for
readers following the numbered flow.
3. Add a `vi.mock('@/lib/security/rate-limit', ...)` block to the export
route's test file. Pre-fix, the codemod that swept the test had
removed the rate-limit mock; combined with the source still calling
the real `adminLimiter` directly, this risked LRU bucket bleed
across test invocations under parallelism. With the route now using
`exportLimiter` (mocked explicitly in this commit), the test exercises
a controlled stub. Removed the dead `vi.mock('@/lib/security/ip')`
block at the same time since the route no longer reads
`getClientIP` — exports are session-user keyed.
Source changes:
- lib/security/constants.ts: add LIMITS.EXPORT = 10
- lib/security/rate-limit.ts: declare `exportLimiter` adjacent to other
per-flow sub-caps, with JSDoc explaining the layered model
- app/api/v1/admin/orchestration/conversations/export/route.ts: switch
from adminLimiter+IP keying to exportLimiter+session-user keying;
drop unused getClientIP import
- app/api/v1/mcp/route.ts: renumber 3-7 → 1-5
Tests (+1):
- rate-limit.test.ts: config-check test for exportLimiter (limit=10)
- export/route.test.ts: replace dead ip mock with exportLimiter mock
Docs:
- .context/security/rate-limiting.md: add exportLimiter row to the
per-flow sub-cap catalogue
Validation:
- Type-check, lint, format: clean
- Full suite: 850/850 files, 17,719 tests (+1), 6 skipped, 10 todo
- No section limiter is now called directly from any handler — the
documented invariant holds without exception.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Summary
Follow-up to PR #211. Two related changes:
mcptier (300/min, api-key keyed) to the rate-limit policy. MCP is server-to-server LLM-agent traffic with its own authentication model and bursty session shape — theapitier's 100/min IP-keyed cap was the wrong default. Two customers sharing a NAT'd egress now get independent buckets; the per-customerMcpRateLimitersub-cap inside the handler stays in place as the tunable knob.adminLimiter.check/apiLimiter.checkblocks from ~150 handlers. The middleware enforces every section cap centrally since PR feat(security): middleware-driven rate limiting with policy table #211 merged; the in-handler calls were defense-in-depth that's now duplicate work. Codemod-driven, deterministic —.claude/tmp/sweep-{admin,api}-{limiter,tests}.mjsscript is gitignored but documented in commit messages.Behaviour changes worth reviewer attention
mcptier. Two distinct customers behind one egress IP no longer fight for one bucket. One API key abused across many IPs is now correctly bucketed under a single 300/min cap./api/v1/admin/orchestration/**. This is the orchestration tier's documented design (chatty admin UI); the prior 30/min adminLimiter handler call was tighter than what PR feat(security): middleware-driven rate limiting with policy table #211 established for the orchestration tier. Section-level enforcement now comes solely from the middleware.getClientIP). The middleware tier keys onsession.user.idwith IP fallback. Better behaviour for CGNAT/office shared IPs; same identity model PR feat(security): middleware-driven rate limiting with policy table #211 established for the catch-all.Out of scope (intentionally)
conversations/export/route.tsroute keeps itsadminLimiter.check(\export:${ip}`)` call — that's a deliberate per-flow custom-token sub-cap on the admin bucket, not a section-level redundancy. Codemod detected the template-literal token shape and skipped the file.Test plan
X-RateLimit-Limit: 300on a/api/v1/mcp/*429 response)🤖 Generated with Claude Code