[schemas] Connector sync state #362

Open
alanshurafa wants to merge 2 commits into NateBJones-Projects:main from alanshurafa:contrib/alanshurafa/connector-sync-state

Conversation

@alanshurafa

Collaborator

What this adds

schemas/connector-sync-state/ — one generic table that any capture connector can use to remember where it left off and whether its last run failed.

Every connector that pulls from an external source on a schedule (email importer, calendar poller, RSS reader, webhook receiver, backfill job) needs the same three things: a resume cursor so the next run doesn't re-read everything, lifecycle timestamps, and a last-error plus error count so a stalled or failing connector can be spotted. Writing that table once per connector gets repetitive and inconsistent. This gives them one shared table and three RPCs that wrap the begin / success / error transitions.

What's in it

  • public.connector_sync_state — one row per (connector, surface, sync_key). Holds cursor_value, high_watermark, the four lifecycle timestamps, last_error, error_count, and free-form counters / metadata jsonb.
  • connector_sync_begin(...) — upserts the row and marks the run started.
  • connector_sync_success(...) — advances the cursor, stamps success, resets the error count, merges counters. A NULL cursor leaves the existing one in place.
  • connector_sync_error(...) — stamps the error and increments the count, but leaves the cursor untouched so the next run retries from the last known-good position instead of skipping the failed window.
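
The three transitions above can be modelled in a few lines. This is a minimal in-memory sketch in Python, not the PR's SQL: the status values, the timestamp field names, and the "later keys overwrite earlier ones" counter merge are all assumptions made for illustration, and the function names `sync_begin`, `sync_success`, and `sync_error` are hypothetical stand-ins for the RPCs.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class SyncState:
    """Stand-in for one connector_sync_state row (several field names assumed)."""
    status: str = "idle"                         # assumed status value
    cursor_value: Optional[str] = None
    last_started_at: Optional[datetime] = None   # assumed timestamp names
    last_success_at: Optional[datetime] = None
    last_error_at: Optional[datetime] = None
    last_error: Optional[str] = None
    error_count: int = 0
    counters: dict = field(default_factory=dict)


# One entry per (connector, surface, sync_key), mirroring the table's key.
rows: dict = {}


def _now() -> datetime:
    return datetime.now(timezone.utc)


def sync_begin(connector: str, surface: str, sync_key: str) -> SyncState:
    # Upsert: create the row on first use, then mark the run as started.
    state = rows.setdefault((connector, surface, sync_key), SyncState())
    state.status = "running"
    state.last_started_at = _now()
    return state


def sync_success(connector, surface, sync_key, cursor=None, counters=None):
    state = rows[(connector, surface, sync_key)]
    if cursor is not None:          # a None cursor keeps the existing one
        state.cursor_value = cursor
    state.status = "ok"
    state.last_success_at = _now()
    state.error_count = 0           # a clean run resets the error count
    state.counters.update(counters or {})  # assumed merge semantics
    return state


def sync_error(connector, surface, sync_key, message: str) -> SyncState:
    state = rows[(connector, surface, sync_key)]
    state.status = "error"
    state.last_error = message
    state.last_error_at = _now()
    state.error_count += 1          # cursor_value is deliberately not touched
    return state
```

Because `sync_error` never writes `cursor_value`, a run that fails after a successful one leaves the cursor at the last known-good position, which is the retry behaviour the description calls out.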

Notes for reviewers

  • No thought ids. This table is keyed by connector identity, not by any thought. The surfaced primary key is a BIGINT id; the canonical public.thoughts.id (a UUID) never appears here, and the schema is standalone — it doesn't touch public.thoughts at all.
  • Operational state, service-role only. RLS is on with no policy (deny by default), all three functions are SECURITY INVOKER, and table/function privileges are granted to service_role only. The table also explicitly REVOKEs from PUBLIC, anon, authenticated, so a Supabase project that blanket-grants new tables to its API roles still can't expose this one.
  • Idempotent and additive. CREATE TABLE IF NOT EXISTS, CREATE INDEX IF NOT EXISTS, CREATE OR REPLACE FUNCTION; no DROP/TRUNCATE/unqualified DELETE. Safe to re-run.
  • Pairs with brain-health-monitoring. The README shows the two queries (erroring connectors, stalled connectors) you'd wrap as a health view to alert on.
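
As a rough illustration of the two health checks mentioned in the last note, the predicates below filter rows that have already been fetched. This is a Python sketch rather than the README's SQL: the `last_success_at` field name, the thresholds, and the choice to treat never-succeeded rows as stalled are assumptions.

```python
from datetime import datetime, timedelta, timezone


def erroring(rows, min_errors=1):
    """Rows whose error_count has reached min_errors (threshold is assumed)."""
    return [r for r in rows if r["error_count"] >= min_errors]


def stalled(rows, now, max_age=timedelta(hours=24)):
    """Rows with no success inside max_age, including rows that never succeeded."""
    cutoff = now - max_age
    return [
        r for r in rows
        if r["last_success_at"] is None or r["last_success_at"] < cutoff
    ]
```

A connector can appear in both lists at once: one that has been failing for two days is both erroring and stalled, so an alerting view would likely want to deduplicate by (connector, surface, sync_key).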

Testing

Applied to a throwaway Postgres 18 and exercised end to end: begin → success advances the cursor and keeps error_count at 0; begin → error increments the count and preserves the cursor; a later clean run resets the count; a NULL-cursor success preserves the prior cursor; distinct surfaces keep independent rows. Verified the status CHECK, RLS-on, SECURITY INVOKER, and that grants resolve to service_role only — including under a blanket ALTER DEFAULT PRIVILEGES ... GRANT ALL ON TABLES TO anon, authenticated, where the explicit revoke keeps anon/authenticated off the table. Re-applied the migration to confirm idempotency. Markdownlint clean against the repo config.

alanshurafa and others added 2 commits June 13, 2026 19:08
Capture connectors all need the same bookkeeping: a resume cursor, a
high-watermark, success/error timestamps, and an error count so a stalled
or failing connector can be detected. Writing that table per connector is
repetitive and drifts. This adds one generic connector_sync_state table
plus begin/success/error RPCs that any connector shares, keyed by
(connector, surface, sync_key) — not by any thought, so no thought id
appears. Pairs with brain-health-monitoring for alerting on bad connectors.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Overlapping connector runs can finish out of order; an older run
completing late would overwrite a newer cursor/high_watermark. Add a
no-regression guard so completions never move them backward, fix the
README cursor-read to fail loud on non-2xx, and add a provenance CTA.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
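
The no-regression guard described in the second commit can be pictured as a "take the later value" rule. The sketch below is a Python illustration, not the migration's SQL, and it assumes the stored values are directly comparable (for example integers or timestamps); how the PR compares opaque cursor strings is not stated here, and the helper name `advance` is hypothetical.

```python
def advance(current, proposed):
    """Return the value to store so that it never moves backward.

    Assumes current and proposed are mutually comparable (ints, datetimes, ...).
    """
    if proposed is None:      # nothing new reported: keep what is stored
        return current
    if current is None:       # first completion: accept the proposed value
        return proposed
    return max(current, proposed)


# Run B finished first and stored 200; the older run A completes late with 100.
stored = advance(None, 200)    # run B
stored = advance(stored, 100)  # run A arrives late and must not win
```

With this rule the order in which overlapping runs complete stops mattering: the stored value ends up as the maximum of everything reported.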
github-actions bot added the "schema" label (Contribution: database extension) on Jun 14, 2026
