
[integrations] REST API gateway #201

Open
alanshurafa wants to merge 8 commits into NateBJones-Projects:main from alanshurafa:contrib/alanshurafa/rest-api-gateway

@alanshurafa

Collaborator

Summary

Deno/Supabase Edge Function providing a REST API for non-MCP clients — 15 endpoints over thoughts, ingestion, and search.

Hardened per Wave 2.5 review:

  • String IDs — all thought IDs parsed, validated, and forwarded as strings (not `Number()`-coerced). Prevents silent row-identity corruption when IDs cross 2^53 or migrations preserve historical BIGINT values.
  • Robust upsert response parsing — `extractThoughtId()` helper handles scalar, `{id}`, `{thought_id}`, array-wrapped, and null-returning shapes. No more 500 on a legitimate response.
  • Timing-safe auth — `timingSafeEqual()` XOR-accumulator for `x-brain-key` comparison. Early return on missing key; `.trim()` handles copy-paste whitespace.
  • CORS allowlist — `CORS_ALLOWED_ORIGINS` env (comma-separated). Legacy default `*` preserved for backward compat, but README documents the risk for production.
  • Per-key rate limiting — 100 req/min default (`RATE_LIMIT_PER_MIN` env). SHA-256-hashed keys so logs never leak the raw key. In-memory bucket with `Retry-After` response.
  • Ingest proxy hardening — 60s `AbortController` timeout, 1 MB body cap, distinguishes 502 (unreachable), 504 (timeout), invalid-JSON vs forwarded-JSON responses.
  • Sort-column allowlist — rejects arbitrary column names; prevents schema enumeration and ordering by heavy/PII columns.
  • Error opacity — top-level catch returns `{error:"internal_error", code:"GENERIC", error_id:"<uuid>"}` and logs full error server-side with the same UUID. No PostgREST constraint names or data values leak to callers.
  • Sensitivity re-detection on update — `handleUpdateThought` runs `detectSensitivity` + escalation-only `resolveSensitivityTier`. A standard thought rewritten to contain a credit card can no longer stay `standard`.

Why

The stock MCP server is fine for AI clients, but dashboards, scripts, and webhooks need HTTP. This gateway mirrors the core MCP's capture/search/list/update surface over REST with the same auth model (`x-brain-key`) — so any client that can do HTTPS + headers can talk to an Open Brain.
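
To make the "HTTPS + headers" point concrete, here is a minimal sketch of a client-side request builder for `POST /capture`. Only the `/capture` path and the `x-brain-key` header come from this PR; the base URL, the `content` body field, and the helper name `buildCaptureRequest` are assumptions made for illustration.

```typescript
// Hypothetical helper: builds (but does not send) a capture request.
// Assumptions: the function URL and the `content` body field are illustrative.
function buildCaptureRequest(baseUrl: string, brainKey: string, content: string): Request {
  // Strip trailing slashes so the path is joined exactly once.
  const url = `${baseUrl.replace(/\/+$/, "")}/capture`;
  return new Request(url, {
    method: "POST",
    headers: {
      "content-type": "application/json",
      "x-brain-key": brainKey, // auth header named in this PR
    },
    body: JSON.stringify({ content }),
  });
}
```

A script or dashboard would then call `await fetch(buildCaptureRequest(url, key, "note to self"))` and read the returned thought ID as a string.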

Replaces the content from the closed PR #45 bundle with the Wave 2.5 security hardening pass incorporated.

Test plan

  • POST `/capture` with valid key — thought created, ID returned as string
  • POST `/capture` with wrong key — 401 returned in constant time (no timing-based key-length leak)
  • Flood 150 requests in 60s with same key — verify 101st gets 429 + `Retry-After`
  • GET `/thoughts?sort=embedding` — verify 400 (sort column not in allowlist)
  • GET `/thoughts?sort=created_at` — verify 200 + sorted results
  • POST `/update` changing content to include `4111-1111-1111-1111` on a `standard` thought — verify tier escalates to `personal` or `restricted`
  • Unset `CORS_ALLOWED_ORIGINS` → verify `*` still honored with README warning
  • Set `CORS_ALLOWED_ORIGINS="https://app.example.com"` → verify other origins rejected
  • `deno check index.ts` + `_shared/*.ts`

@github-actions github-actions Bot added the integration Contribution: MCP extension or capture source label Apr 18, 2026
@github-actions github-actions Bot added recipe Contribution: step-by-step recipe schema Contribution: database extension labels Apr 22, 2026
@alanshurafa alanshurafa added area: integrations Review area: integrations/MCP/capture sources risk: auth-security Touches auth, secrets, permissions, or security-sensitive behavior review: needs-refresh Branch is stale, conflicted, or needs rebase before review alan-reviewed Reviewed by Alan Shurafa in Community Reviewer role labels May 20, 2026
@alanshurafa

Collaborator Author

Conflicts with main. This is tied to the design discussion in #192; I'll rebase once there's a direction there.

alanshurafa and others added 8 commits May 20, 2026 13:12
… variants

JavaScript Number loses precision beyond 2^53. All ID parsing now uses
string validation instead of Number(). Added extractThoughtId() to
handle all upsert_thought RPC response shapes (scalar, {id}, {thought_id}).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
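
The precision problem and the response-shape handling described in this commit can be sketched as follows. This is not the PR's code: `parseThoughtId` is a hypothetical name, the digit-only validation rule is an assumption, and `extractThoughtId` here is a guess at the behavior based on the shapes listed in the summary (scalar, `{id}`, `{thought_id}`, array-wrapped, null).

```typescript
// Accept only decimal digit strings; never round-trip through Number().
// Assumption: IDs are non-negative BIGINT values rendered in base 10.
function parseThoughtId(raw: unknown): string | null {
  if (typeof raw === "string" && /^\d{1,19}$/.test(raw)) return raw;
  // A numeric scalar is only trustworthy while it is still a safe integer.
  if (typeof raw === "number" && Number.isSafeInteger(raw) && raw >= 0) return String(raw);
  return null;
}

// Normalise the response shapes named in the PR summary.
function extractThoughtId(data: unknown): string | null {
  if (data === null || data === undefined) return null;
  if (Array.isArray(data)) return data.length > 0 ? extractThoughtId(data[0]) : null;
  if (typeof data === "object") {
    const row = data as Record<string, unknown>;
    return parseThoughtId(row.id ?? row.thought_id);
  }
  return parseThoughtId(data);
}

// 2^53 + 1 cannot be represented as a Number, but survives as a string.
console.log(Number("9007199254740993")); // 9007199254740992
console.log(extractThoughtId([{ thought_id: "9007199254740993" }])); // 9007199254740993
```

The first `console.log` shows the silent off-by-one that `Number()` coercion introduces; the second shows the same ID passing through unchanged as a string.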

Why: === short-circuits at the first differing byte. For a public Edge
Function deployed with --no-verify-jwt where MCP_ACCESS_KEY is the only
auth layer, a remote attacker can use response-latency differences to
byte-wise discover the key. Replace with XOR-accumulate over equal-length
byte arrays so comparison time is independent of where inputs differ.
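
A minimal sketch of the XOR-accumulate comparison described above. The function name `timingSafeEqual` appears in the PR summary, but this body is an illustration rather than the PR's implementation, and the `isAuthorized` wrapper is a hypothetical name. Note that this sketch decides a length mismatch up front, so it does not attempt to hide the key length.

```typescript
// XOR-accumulate comparison: every byte is visited, so the running time does
// not depend on the position of the first mismatch.
function timingSafeEqual(a: string, b: string): boolean {
  const enc = new TextEncoder();
  const x = enc.encode(a);
  const y = enc.encode(b);
  // Assumption of this sketch: unequal lengths are rejected immediately.
  if (x.length !== y.length) return false;
  let diff = 0;
  for (let i = 0; i < x.length; i++) {
    diff |= x[i] ^ y[i]; // stays 0 only if every byte pair matches
  }
  return diff === 0;
}

// Hypothetical wrapper mirroring the PR summary: early return on a missing
// key, trim() to tolerate copy-paste whitespace.
function isAuthorized(provided: string | null, expected: string): boolean {
  if (!provided) return false;
  return timingSafeEqual(provided.trim(), expected);
}
```

In a handler this would be called as `isAuthorized(req.headers.get("x-brain-key"), expectedKey)`, where `expectedKey` is the configured access key.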

Why: The gateway is deployed with --no-verify-jwt so MCP_ACCESS_KEY is
the only auth layer. Allow-Origin: * combined with write methods let any
origin attempt a cross-origin call with a leaked key, and the complete
absence of throttling meant a single leaked key could burn Supabase row
quota and OpenRouter credits at fetch speed.

- CORS_ALLOWED_ORIGINS env var (comma-separated) restricts to an
  explicit allowlist; unset keeps legacy * for backward compatibility.
- RATE_LIMIT_PER_MIN env var (default 100) enforces a rolling 60s
  per-key cap with SHA-256 hashed bucket keys and 429 + Retry-After.
- README Security section documents both settings and their defaults.
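
The two settings above could be implemented roughly as follows. This is a sketch under stated assumptions, not the PR's code: the helper names (`resolveAllowOrigin`, `hashKey`, `checkRateLimit`) are hypothetical, and the rolling window is modeled as an in-memory list of timestamps per bucket, which means limits apply per running instance.

```typescript
// Parse CORS_ALLOWED_ORIGINS; an unset or empty value keeps the legacy "*".
// Returns the Access-Control-Allow-Origin value, or null to reject the origin.
function resolveAllowOrigin(envValue: string | undefined, requestOrigin: string | null): string | null {
  const allowed = (envValue ?? "").split(",").map((s) => s.trim()).filter(Boolean);
  if (allowed.length === 0) return "*";
  return requestOrigin !== null && allowed.includes(requestOrigin) ? requestOrigin : null;
}

// SHA-256 hex digest, so the raw key never appears as a bucket key or in logs.
async function hashKey(key: string): Promise<string> {
  const digest = await crypto.subtle.digest("SHA-256", new TextEncoder().encode(key));
  return Array.from(new Uint8Array(digest), (b) => b.toString(16).padStart(2, "0")).join("");
}

const WINDOW_MS = 60 * 1000;
const buckets = new Map<string, number[]>(); // hashed key -> request timestamps

// Returns null when the request is allowed, otherwise the Retry-After value
// in whole seconds. `now` is passed in to keep the function deterministic.
function checkRateLimit(bucket: string, limit: number, now: number): number | null {
  const recent = (buckets.get(bucket) ?? []).filter((t) => now - t < WINDOW_MS);
  if (recent.length >= limit) {
    buckets.set(bucket, recent);
    // The oldest request in the window determines when a slot frees up.
    return Math.max(1, Math.ceil((recent[0] + WINDOW_MS - now) / 1000));
  }
  recent.push(now);
  buckets.set(bucket, recent);
  return null;
}
```

A handler would compute `const bucket = await hashKey(key)`, call `checkRateLimit(bucket, limit, Date.now())`, and respond with 429 plus a `Retry-After` header when the result is not null.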

Why: /ingest and /ingestion-jobs/:id/execute forwarded request bodies
to smart-ingest with no AbortSignal, no size limit, and a bare
response.json() call. Three failure modes: hung upstream holds a worker
slot until the 150s edge timeout kills it; an attacker can POST a 50 MB
blob that the Edge Function will parse, stringify, and forward; an
HTML/text error page from Supabase causes response.json() to throw a
SyntaxError that the top-level catch mislabels as 'Invalid JSON in
request body' — pointing the caller at their own payload instead of
the real upstream failure.

- readJsonWithCap() rejects bodies over 1 MB with 413 before parsing.
- proxyFetchJson() uses AbortController with 60s timeout, distinguishes
  upstream_timeout (504), upstream_unreachable (502),
  upstream_invalid_json (502 + raw text snippet), and upstream_error
  (passes upstream status) from legitimate JSON responses.
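
The proxy side of this change can be sketched as below. The function name `proxyFetchJson` and the four error labels come from the commit message; everything else is an assumption for illustration, including the `classifyUpstream` helper, the `snippet` and `upstream` response fields, the 200-character snippet length, and the injectable `fetchImpl` parameter. The body-size cap (`readJsonWithCap`) is not covered here.

```typescript
type ProxyResult = { status: number; body: unknown };

// Map an upstream reply to the categories named in the commit message.
function classifyUpstream(status: number, text: string): ProxyResult {
  let parsed: unknown;
  try {
    parsed = JSON.parse(text);
  } catch {
    // Non-JSON (for example an HTML error page): report an upstream fault
    // instead of blaming the caller's request body.
    return { status: 502, body: { error: "upstream_invalid_json", snippet: text.slice(0, 200) } };
  }
  if (status >= 400) return { status, body: { error: "upstream_error", upstream: parsed } };
  return { status, body: parsed };
}

// fetchImpl is injectable so the sketch can be exercised without a network.
async function proxyFetchJson(
  url: string,
  init: RequestInit,
  timeoutMs = 60000,
  fetchImpl: typeof fetch = fetch,
): Promise<ProxyResult> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const res = await fetchImpl(url, { ...init, signal: controller.signal });
    return classifyUpstream(res.status, await res.text());
  } catch (_err) {
    // An aborted signal means our timer fired; anything else is treated as a
    // connection failure.
    return controller.signal.aborted
      ? { status: 504, body: { error: "upstream_timeout" } }
      : { status: 502, body: { error: "upstream_unreachable" } };
  } finally {
    clearTimeout(timer);
  }
}
```

Reading the upstream body as text first and parsing it separately is what lets the gateway tell "upstream sent garbage" apart from "caller sent garbage".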

Why: The sort query string was forwarded verbatim to PostgREST's order
param. The service role can read every column on thoughts including
embedding (1536-dim vector — heavy scan) and sensitivity_reasons (JSONB
with PII-adjacent strings). An attacker could enumerate the schema,
force unindexed scans, or partially exfiltrate metadata through sort
order of returned rows.

Allowlist: id, created_at, updated_at, importance, quality_score.
Unknown sort values return 400 with the valid options.
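
A minimal sketch of the allowlist check. The five column names are the ones listed above; the helper name `parseSort`, the `invalid_sort` error label, and the `created_at` default for a missing parameter are assumptions.

```typescript
// Columns callers may sort by, as listed in the commit message.
const SORTABLE_COLUMNS = ["id", "created_at", "updated_at", "importance", "quality_score"] as const;
type SortColumn = (typeof SORTABLE_COLUMNS)[number];

type SortResult =
  | { ok: true; column: SortColumn }
  | { ok: false; status: 400; error: string; allowed: readonly string[] };

// Validate the raw `sort` query value before it reaches PostgREST.
// Assumption: a missing or empty value falls back to created_at.
function parseSort(raw: string | null, fallback: SortColumn = "created_at"): SortResult {
  if (raw === null || raw === "") return { ok: true, column: fallback };
  const match = SORTABLE_COLUMNS.find((c) => c === raw);
  return match
    ? { ok: true, column: match }
    : { ok: false, status: 400, error: "invalid_sort", allowed: SORTABLE_COLUMNS };
}
```

With this in place, `parseSort(url.searchParams.get("sort"))` either yields a known-safe column name or a 400 payload that lists the valid options.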

Why: String(error) on a PostgrestError or an Error wrapped with a
PostgREST message leaked internal SQL text — table names, column names,
and constraint names — to the caller. Under service-role access,
constraint errors include data values. That is an info-leak vector for
any authenticated attacker.

Return a stable opaque payload: { error: 'internal_error', code:
'GENERIC', error_id: <uuid> }. Log the full error server-side tagged
with the same UUID so operators can correlate without exposing internals.
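
The opaque-error pattern can be sketched as follows. The payload shape is the one given above; the helper name `buildOpaqueError` and the log-line format are assumptions.

```typescript
type OpaqueError = {
  body: { error: "internal_error"; code: "GENERIC"; error_id: string };
  logLine: string;
};

// Split an internal failure into what the caller sees and what operators see.
function buildOpaqueError(err: unknown): OpaqueError {
  const errorId = crypto.randomUUID();
  // Full detail stays server-side, tagged with the correlation ID.
  const detail = err instanceof Error ? (err.stack ?? err.message) : String(err);
  return {
    body: { error: "internal_error", code: "GENERIC", error_id: errorId },
    logLine: `[error_id=${errorId}] ${detail}`,
  };
}
```

A top-level catch would log `logLine` with `console.error` and return `new Response(JSON.stringify(body), { status: 500, headers: { "content-type": "application/json" } })`, so the caller receives only the stable payload and the UUID.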

Why: PUT /thought/:id only refreshed content, embedding, type, and
importance. A caller could rewrite a standard thought to include a
credit-card number, SSN, or health identifier and the sensitivity_tier
would stay at standard — so the thought would still be returned in
exclude_restricted=true queries and in semantic search without
filtering.

Now runs detectSensitivity on the new content and applies
resolveSensitivityTier against the existing tier (escalation-only).
Writes sensitivity_tier + sensitivity_reasons into the update only when
the tier actually changes. Adds an opt-in force_sensitivity flag that
lets the caller bypass escalation-only semantics — but even with the
flag the result is clamped to at least what detection returned, so
force_sensitivity cannot hide detected PII.
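
The escalation-only rule and the `force_sensitivity` clamp can be sketched as below. The function names `detectSensitivity` and `resolveSensitivityTier` appear in the commit message, but these bodies are illustrative stand-ins. Assumptions: the tiers are ordered `standard` < `personal` < `restricted`, a card-like number maps to `restricted`, and a forced update carries a caller-requested tier (called `requested` here).

```typescript
const TIERS = ["standard", "personal", "restricted"] as const;
type Tier = (typeof TIERS)[number];

const rank = (t: Tier): number => TIERS.indexOf(t);
const higher = (a: Tier, b: Tier): Tier => (rank(a) >= rank(b) ? a : b);

// Stand-in detector with a single card-number pattern. The real
// detectSensitivity is not shown on this page and presumably covers more.
function detectSensitivity(content: string): { tier: Tier; reasons: string[] } {
  const cardLike = /\b(?:\d[ -]?){13,19}\b/;
  return cardLike.test(content)
    ? { tier: "restricted", reasons: ["card_number_pattern"] }
    : { tier: "standard", reasons: [] };
}

function resolveSensitivityTier(existing: Tier, detected: Tier, requested?: Tier, force = false): Tier {
  // Default: never drop below the stored tier or the detected tier.
  if (!force || requested === undefined) return higher(existing, detected);
  // Forced: the caller may lower the stored tier, but never below detection.
  return higher(requested, detected);
}
```

For example, rewriting a `standard` thought to include `4111-1111-1111-1111` resolves to `restricted`, and a forced request for `standard` on that same content is still clamped to `restricted`.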

@alanshurafa alanshurafa force-pushed the contrib/alanshurafa/rest-api-gateway branch from 8a3d698 to a77389b May 20, 2026 17:13
alanshurafa added a commit to alanshurafa/OB1 that referenced this pull request May 20, 2026
README: adds the "Gemini bulk history sync (Phase B/C)" section covering
how the debugger-based capture works end-to-end, the "Debugging this
browser" banner users will see while syncing, the anti-bot throttling
strategy, and how to pause the flow (toggle Gemini off or dismiss the
debugger banner). Updates the Supported Sites table row for Gemini, the
Usage paragraph, and the Chrome Web Store permission justifications for
`debugger` and `scripting`.

metadata.json: bumps version 1.0.0 → 1.1.0, adds the `gemini-bulk-sync`
tag, updates the `updated` date, and drops the `_todo` field so the
current (stricter) metadata schema validates. The TODO it referenced
(PR NateBJones-Projects#201 slug) is still explained in the README alongside the
prerequisite link.

License remains FSL-1.1-MIT. No new runtime dependencies, no binary
blobs, no telemetry or third-party hosts.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@alanshurafa

Collaborator Author

Rebased onto main — conflicts cleared. The only conflict was a stray repo-wide markdown-lint commit on this branch that collided with main's own lint cleanup; I dropped that commit. The branch's actual changes are untouched. Now mergeable.

@alanshurafa alanshurafa added review: ready-for-maintainer Community reviewer recommends maintainer review and removed review: needs-refresh Branch is stale, conflicted, or needs rebase before review labels May 20, 2026

Labels

alan-reviewed Reviewed by Alan Shurafa in Community Reviewer role area: integrations Review area: integrations/MCP/capture sources integration Contribution: MCP extension or capture source recipe Contribution: step-by-step recipe review: ready-for-maintainer Community reviewer recommends maintainer review risk: auth-security Touches auth, secrets, permissions, or security-sensitive behavior schema Contribution: database extension


1 participant