[schemas] Add smart-ingest-tables schema #364

Open
alanshurafa wants to merge 2 commits into NateBJones-Projects:main from alanshurafa:contrib/alanshurafa/smart-ingest-tables

@alanshurafa

Collaborator

What this adds

schemas/smart-ingest-tables/ — the ingestion_jobs and ingestion_items tables plus the recount_ingestion_job and append_thought_evidence RPCs that the already-merged integrations/smart-ingest Edge Function depends on.

Why

The merged smart-ingest integration's README points to schemas/smart-ingest-tables as a setup step, but that schema was never merged. Anyone following the integration's setup hits missing tables and a missing append_thought_evidence RPC, and the function can't run. This lands the missing dependency.

It consolidates the two earlier drafts — #196 (bigint ids) and #350 (uuid ids) — into one schema. The bigint draft is superseded; this one keeps OB1's public.thoughts(id uuid) contract throughout.

Notes for review

  • UUID-native. Every thought reference (matched_thought_id, result_thought_id, the RPC argument) is uuid. No bigint ids, no sequence grants.
  • Matches what the Edge Function writes. Columns line up with the shipped integration: input_length (not input_bytes), and sequence is nullable because the function omits it and orders items by id.
  • Security. recount_ingestion_job and append_thought_evidence are SECURITY INVOKER with EXECUTE granted to service_role only and revoked from public. RLS is on; anon and authenticated have all access revoked. This is the change that resolves the privilege concern raised on #196 ([schemas] Smart ingest pipeline tables).
  • Safe to re-run. Idempotent throughout, and legacy bigint function overloads are dropped first so an older install upgrades cleanly.
  • Evidence integrity. append_thought_evidence locks the thought row, hashes a structured identity (no concatenation collisions), and normalizes the evidence value before appending.
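The "no concatenation collisions" point can be illustrated with a minimal TypeScript sketch. It is not the SQL in this PR: the `EvidenceIdentity` shape, its field names, and both helper functions are assumptions made up for illustration, and the digest step that would run over the serialized payload is left out.

```typescript
// Hypothetical evidence fields; the real RPC's identity fields may differ.
interface EvidenceIdentity {
  source: string;
  excerpt: string;
}

// Naive identity: plain concatenation loses the boundary between fields.
function concatIdentity(e: EvidenceIdentity): string {
  return e.source + e.excerpt;
}

// Structured identity: a JSON array keeps each field delimited and escaped,
// so different field splits serialize to different payloads.
function structuredIdentity(e: EvidenceIdentity): string {
  return JSON.stringify([e.source, e.excerpt]);
}

const first: EvidenceIdentity = { source: "ab", excerpt: "c" };
const second: EvidenceIdentity = { source: "a", excerpt: "bc" };

// Both naive identities are "abc", so they collide.
const concatCollides = concatIdentity(first) === concatIdentity(second);

// '["ab","c"]' versus '["a","bc"]' stay distinct.
const structuredCollides =
  structuredIdentity(first) === structuredIdentity(second);

console.log({ concatCollides, structuredCollides });
```

Hashing the structured payload instead of the concatenated string means two distinct evidence records cannot share an identity merely because their fields happen to join into the same text.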

Known follow-up (separate PR)

The shipped integration still parses thought and job ids as numbers (extractThoughtId, handleExecuteJob), so on a UUID install it can mark successful upserts as failed and reject UUID job_id. That's an integration-code fix, tracked separately — this PR is the schema only.
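The failure mode can be sketched in a few lines of TypeScript. `parseNumericId` below is a hypothetical stand-in for a number-only id parser, not the integration's actual `extractThoughtId`, and the UUID is an arbitrary example value.

```typescript
// Hypothetical number-only parser, standing in for code that assumes bigint ids.
function parseNumericId(value: unknown): number | null {
  if (typeof value === "number") {
    return Number.isInteger(value) ? value : null;
  }
  if (typeof value === "string" && /^\d+$/.test(value)) {
    return Number(value);
  }
  return null;
}

// A legacy bigint-style id parses as before.
const legacyId = parseNumericId("42");

// A UUID id, as returned by a uuid-keyed table, is rejected outright.
const uuidId = parseNumericId("123e4567-e89b-12d3-a456-426614174000");

console.log({ legacyId, uuidId });
```

A caller that treats a `null` id as "the upsert failed" would then report an error even though the row was written, which is the behaviour the follow-up PR needs to address.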

Testing

  • Markdownlint clean against .github/.markdownlint.jsonc.
  • metadata.json validates against .github/metadata.schema.json.
  • Reviewed for the UUID contract, idempotency, and the SECURITY INVOKER grant model.
  • Schema not yet applied to a live database — that's the next step on this fork before it goes upstream.

alanshurafa and others added 2 commits June 13, 2026 13:56
The merged integrations/smart-ingest README documents schemas/smart-ingest-tables
as its database dependency, but that schema was never merged. Fresh installs that
follow the setup hit missing ingestion_jobs/ingestion_items tables and a missing
append_thought_evidence RPC, so the integration cannot run.

Consolidate the two competing drafts (NateBJones-Projects#196 bigint, NateBJones-Projects#350 uuid) onto one
UUID-native schema that matches OB1's public.thoughts(id uuid) contract and the
columns the shipped Edge Function actually writes (input_length; nullable
sequence). Helper functions are SECURITY INVOKER granted to service_role only,
never anon, which closes the privilege issue that blocked NateBJones-Projects#196. Evidence identity
is hashed over a structured payload to avoid concatenation collisions, the
evidence value is normalized before iteration, and legacy bigint function
overloads are dropped so re-runs over an older install stay clean.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

source_label was NOT NULL, but the Edge Function and dashboard ingest
client create jobs without it, so every default insert failed and no
dry-run items persisted. Make it nullable.

append_thought_evidence forced search_path to public only while calling
unqualified digest(); on Supabase projects pgcrypto lives in extensions,
so the call failed at runtime. Add extensions to the function search_path
(public first) so it resolves in either layout.

Also add a brief More from Nate provenance section to the README.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
github-actions bot added the labels "integration" (Contribution: MCP extension or capture source) and "schema" (Contribution: database extension) on Jun 14, 2026

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 99be1a2e59

create extension if not exists pgcrypto;

create table if not exists public.ingestion_jobs (
id uuid primary key default gen_random_uuid(),

P1: Keep job ids executable by smart-ingest

When users follow the documented dry-run flow, createJob() selects this id and returns it as job_id. I checked the already-shipped execute handler in integrations/smart-ingest/index.ts, and it only accepts typeof body.job_id === "number" before querying ingestion_jobs. With this schema the returned job id is a UUID string, so POST /smart-ingest/execute rejects every persisted dry-run job with "job_id is required". Either keep the table id compatible or update the Edge Function in the same change before presenting this schema as the integration dependency.
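One way the Edge Function side could be updated is sketched below. This is an assumption about a possible fix, not the shipped handler: `parseJobId`, `JobIdResult`, and the second error message are hypothetical names introduced here, and only the "job_id is required" message comes from the review comment.

```typescript
// Canonical 8-4-4-4-12 hex layout, case-insensitive.
const UUID_RE =
  /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

type JobIdResult =
  | { ok: true; jobId: string }
  | { ok: false; error: string };

// Hypothetical validator: accepts a UUID string instead of requiring a number.
function parseJobId(body: unknown): JobIdResult {
  if (typeof body !== "object" || body === null) {
    return { ok: false, error: "job_id is required" };
  }
  const raw = (body as Record<string, unknown>).job_id;
  if (raw === undefined || raw === null) {
    return { ok: false, error: "job_id is required" };
  }
  if (typeof raw !== "string" || !UUID_RE.test(raw.trim())) {
    return { ok: false, error: "job_id must be a UUID string" };
  }
  // Lowercase so lookups compare consistently against stored uuid values.
  return { ok: true, jobId: raw.trim().toLowerCase() };
}

const accepted = parseJobId({
  job_id: "3F2B8C1E-9D4A-4B6F-8A2C-1E5D7F9A0B3C",
});
const rejectedNumber = parseJobId({ job_id: 42 });
const rejectedMissing = parseJobId({});

console.log(accepted, rejectedNumber, rejectedMissing);
```

Returning a distinct message for a malformed id, rather than reusing "job_id is required", would make it easier to tell a missing field apart from a legacy numeric id sent against a UUID install.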
