[schemas] enhanced-thoughts: harden upsert/match/search RPCs #363

Open
alanshurafa wants to merge 3 commits into NateBJones-Projects:main from alanshurafa:contrib/alanshurafa/enhanced-thoughts-rpc-sync
alanshurafa (Collaborator) commented Jun 14, 2026

What this does

The enhanced-thoughts schema shipped an older generation of the upsert_thought and search_thoughts_text RPCs than the reference brain has been running. This PR closes that gap. Everything is additive and idempotent — re-running schema.sql on a v1 install is a no-op, and the v1 return contract ({id, fingerprint}) plus the status / status_updated_at handling that workflow-status depends on are left exactly as they were.

Changes

  • search_thoughts_text learns date and tier filters. p_filter now reads three reserved control keys — start_date, end_date (ISO 8601), and exclude_restricted (boolean) — and applies them at the data layer. They are peeled off before the metadata @> filter containment check, so any other key keeps its old behavior.

  • upsert_thought gets two dedup/merge guards.

    • Original-fingerprint fallback. When a thought's content is corrected its fingerprint changes; a later reimport of the original text used to insert a stale sibling that outvoted the correction. If the pre-edit fingerprint lives in an append-only metadata.original_fingerprints[] array, the reimport now lands on the corrected row instead.
    • User-edit guard. Keys listed in metadata.user_edits are treated as human-owned and stripped from an incoming automated patch, so a reimport can't clobber a human correction. Guards metadata keys only.
    • To make the fallback possible the function does an explicit lookup and branches INSERT vs UPDATE instead of ON CONFLICT. The unique-index race that ON CONFLICT handled for free is caught explicitly and folded back into the merge path.
  • New opt-in match_thoughts_superseded_aware. Same shape as the core match_thoughts plus a superseded_by column. Thoughts that have been replaced (the target of a supersedes edge in thought_edges) take a 0.8x ranking penalty so fresh thoughts surface above their stale predecessors, without ever being excluded. The core match_thoughts is untouched; callers opt in by name, matching the recency-boosted-match-thoughts pattern. Installed only when schemas/typed-reasoning-edges/ is present; otherwise it's skipped with a NOTICE and the rest of the migration still applies.
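The control-key handling described for search_thoughts_text in the first bullet can be modeled outside the database. What follows is a minimal JavaScript sketch, not the PL/pgSQL in this PR: the function names are hypothetical, the date filter is assumed to apply to a created_at column with an inclusive end_date, and containment is simplified to top-level keys.

```javascript
// Reserved control keys named in the PR description.
const CONTROL_KEYS = ["start_date", "end_date", "exclude_restricted"];

// Hypothetical helper: peel control keys off p_filter so that only the
// remaining keys reach the metadata containment check.
function splitFilter(pFilter) {
  const controls = {};
  const containment = {};
  for (const [key, value] of Object.entries(pFilter ?? {})) {
    if (CONTROL_KEYS.includes(key)) {
      controls[key] = value;
    } else {
      containment[key] = value;
    }
  }
  return { controls, containment };
}

// Simplified stand-in for `metadata @> containment` (top-level keys only).
function containsAll(metadata, containment) {
  return Object.entries(containment).every(
    ([key, value]) => JSON.stringify(metadata[key]) === JSON.stringify(value)
  );
}

// Assumption: dates are compared against row.created_at.
function matchesFilter(row, pFilter) {
  const { controls, containment } = splitFilter(pFilter);
  const created = Date.parse(row.created_at);
  if (controls.start_date && created < Date.parse(controls.start_date)) return false;
  if (controls.end_date && created > Date.parse(controls.end_date)) return false;
  if (controls.exclude_restricted === true && row.sensitivity_tier === "restricted") {
    return false;
  }
  return containsAll(row.metadata, containment);
}
```

Because the control keys never reach containsAll, a filter such as { topic: "db" } behaves the same as it did before the date and tier keys existed.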

ID contract

All ported references use UUID against public.thoughts(id). The superseded-aware RPC returns superseded_by UUID and reads thought_edges where from_thought_id is the newer replacement and to_thought_id is the stale thought — the directional semantics OB1's typed-reasoning-edges already documents.
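To make the edge direction and the 0.8x penalty concrete, here is a small JavaScript model of the ranking step. It is a sketch under assumptions, not the SQL function: the `relation` field name on an edge is hypothetical, and the penalty is applied directly to the returned similarity.

```javascript
const SUPERSEDED_PENALTY = 0.8;

// matches: [{ id, similarity }], edges: rows shaped like thought_edges.
// Assumption: the edge type lives in a field called `relation`.
function rankSupersededAware(matches, edges) {
  // Map each stale thought (to_thought_id) to its newer replacement
  // (from_thought_id), following the direction stated above.
  const supersededBy = new Map();
  for (const edge of edges) {
    if (edge.relation === "supersedes") {
      supersededBy.set(edge.to_thought_id, edge.from_thought_id);
    }
  }
  return matches
    .map((match) => {
      const replacement = supersededBy.get(match.id) ?? null;
      return {
        ...match,
        superseded_by: replacement,
        // Stale rows are penalized, never dropped.
        similarity: replacement
          ? match.similarity * SUPERSEDED_PENALTY
          : match.similarity,
      };
    })
    .sort((a, b) => b.similarity - a.similarity);
}
```

With a stale thought at similarity 0.9 and its replacement at 0.8, the stale row drops to roughly 0.72 and sorts below the replacement while still appearing in the results.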

Importance scale

Some installs use a narrower 0-6 importance scale. Open Brain's upsert already accepts a wider 0-100 range, so it never clipped 0-6 values — switching to 0-6 here would retroactively rescale every existing row, which is a breaking data change, not an additive one. The 0-100 clamp stays; the README documents 0-6 as a subset for cross-scale parity.
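The subset argument can be checked with a one-line clamp. This is an illustrative JavaScript sketch; the function name is hypothetical and the real clamp lives in the SQL function.

```javascript
// Clamp to the 0-100 range the upsert accepts.
function clampImportance(value) {
  return Math.min(100, Math.max(0, value));
}
```

Any value on the 0-6 scale already lies inside 0-100, so it passes through unchanged; only out-of-range values are altered.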

Testing

Applied against a throwaway PostgreSQL instance with stub thoughts + thought_edges tables:

  • Insert returns {id, fingerprint}; re-insert dedupes to the same row.
  • Original-fingerprint reimport lands on the corrected row (no stale sibling; content preserved).
  • User-edit guard keeps a human title against a robot overwrite.
  • start_date and exclude_restricted filters return the expected subset.
  • match_thoughts_superseded_aware returns the stale row at 0.8x with superseded_by set, fresh row at full similarity.
  • Re-running schema.sql is clean; dropping thought_edges triggers the documented skip-with-NOTICE.
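The dedup and user-edit checks listed above can be reproduced against an in-memory model. The sketch below is JavaScript rather than the PL/pgSQL under review, and makes several assumptions: the caller supplies the id (the database would generate it), "fingerprint" and "insert" are hypothetical matched_via values (only original_fingerprint appears in this PR), and reserved keys such as original_fingerprints are not specially protected from the incoming patch.

```javascript
// `rows` stands in for public.thoughts.
function upsertThought(rows, { id, fingerprint, content, metadata = {} }) {
  // 1. Exact fingerprint match first, then the append-only history.
  let existing = rows.find((row) => row.fingerprint === fingerprint);
  let matchedVia = existing ? "fingerprint" : null;
  if (!existing) {
    existing = rows.find((row) =>
      (row.metadata.original_fingerprints ?? []).includes(fingerprint)
    );
    if (existing) matchedVia = "original_fingerprint";
  }

  // 2. No match: plain insert.
  if (!existing) {
    const row = { id, fingerprint, content, metadata: { ...metadata } };
    rows.push(row);
    return { id: row.id, fingerprint: row.fingerprint, matched_via: "insert" };
  }

  // 3. Match: strip human-owned keys from the automated patch, then merge.
  //    Content is left alone, so a corrected row keeps its corrected text.
  const guarded = new Set(existing.metadata.user_edits ?? []);
  const patch = Object.fromEntries(
    Object.entries(metadata).filter(([key]) => !guarded.has(key))
  );
  existing.metadata = { ...existing.metadata, ...patch };
  return {
    id: existing.id,
    fingerprint: existing.fingerprint,
    matched_via: matchedVia,
  };
}
```

Reimporting the original text of a corrected thought resolves to the corrected row, and a key listed in user_edits survives the merge while unguarded keys are still updated.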

Compatibility

Preserves the workflow-status dependency documented in #328: upsert_thought still writes status / status_updated_at, and the README's Prerequisites now call out workflow-status explicitly.

alanshurafa and others added 3 commits June 13, 2026 15:06
Existing installs ran older upsert/search RPCs than the reference brain.
This brings them current without breaking the v1 contracts.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
A metadata-only re-upsert recomputed importance/quality_score/type/
source_type/sensitivity_tier to hardcoded defaults, so the merge-path
COALESCE never fell through to the existing column and silently
overwrote it -- including a sensitivity_tier restricted->standard
privacy downgrade. Split the locals into insert-only defaults (new
rows) and explicit-incoming values that are NULL when the payload omits
the key, so an omitted field preserves the existing column while an
explicit value still updates. Extend the user-edit guard to the
promoted scalar columns so a human-owned field keeps column and
metadata in agreement, strip both source/source_type aliases when
either is guarded, and re-derive the task/idea status seed from the
effective post-guard type so a rejected type can't seed a stale status.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
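The split between insert-only defaults and explicit-incoming values in the commit above can be sketched in JavaScript, with `??` playing the role of COALESCE. Only the 'standard' sensitivity_tier default comes from this PR; every other default value and all function names here are hypothetical.

```javascript
const PROMOTED = ["importance", "quality_score", "type", "source_type", "sensitivity_tier"];

// Used for new rows only. Values other than sensitivity_tier are made up.
const INSERT_DEFAULTS = {
  importance: 50,
  quality_score: 50,
  type: "note",
  source_type: "unknown",
  sensitivity_tier: "standard",
};

// Explicit-incoming values: null whenever the payload omits the key.
function explicitIncoming(payload) {
  return Object.fromEntries(PROMOTED.map((key) => [key, payload[key] ?? null]));
}

// Merge path: COALESCE(incoming, existing) keeps the stored column
// when the payload says nothing about it.
function mergeColumns(existing, payload) {
  const incoming = explicitIncoming(payload);
  return Object.fromEntries(
    PROMOTED.map((key) => [key, incoming[key] ?? existing[key]])
  );
}

// Insert path: COALESCE(incoming, default).
function insertColumns(payload) {
  const incoming = explicitIncoming(payload);
  return Object.fromEntries(
    PROMOTED.map((key) => [key, incoming[key] ?? INSERT_DEFAULTS[key]])
  );
}
```

A metadata-only re-upsert passes an empty payload to mergeColumns, so a restricted row stays restricted instead of falling back to 'standard'.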
…reimports

Restricted rows leaked through search_thoughts_text when the tier lived
only in metadata (column default 'standard'); restrict if either source
says 'restricted', matching how provenance-chains reads the tier.

Original-fingerprint dedup returns the corrected row, but p_content is
the old text; surface matched_via so open-brain-rest skips overwriting
the corrected row's embedding with a stale-text vector.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
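The either-source rule from the commit above amounts to a logical OR over the two places a tier can live. A minimal JavaScript sketch follows; it assumes the metadata key is also named sensitivity_tier, which this PR does not state.

```javascript
// A row counts as restricted if the column or the metadata says so.
// Assumption: the metadata key mirrors the column name.
function isRestricted(row) {
  const metadataTier = row.metadata?.sensitivity_tier;
  return row.sensitivity_tier === "restricted" || metadataTier === "restricted";
}
```

Checking the column alone would miss a row whose column holds the 'standard' default while its metadata marks it restricted, which is the leak the commit describes.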
@github-actions github-actions Bot added the integration (Contribution: MCP extension or capture source) and schema (Contribution: database extension) labels Jun 14, 2026
@alanshurafa alanshurafa changed the title from "[schemas] enhanced-thoughts: sync RPCs with current ExoCortex behavior" to "[schemas] enhanced-thoughts: harden upsert/match/search RPCs" Jun 14, 2026

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c6f972b990

Comment on lines +447 to +449
if (!matchedViaOriginalFingerprint) {
  update.embedding = embedding;
}

P1: Preserve corrected rows after original-fingerprint matches

When matched_via is original_fingerprint, this guard skips only embedding, but the subsequent update still writes metadata, promoted fields, and status derived from the stale original capture. That bypasses the RPC's user_edits / original_fingerprints merge protection and can overwrite the corrected row's human-owned metadata or workflow state whenever a REST reimport hits the fallback; in this branch the update should be skipped or limited to fields that cannot be stale.
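One way to act on this suggestion is to decide the whole follow-up write from matched_via rather than guarding the embedding alone. The JavaScript sketch below is hypothetical: the function name, the shape of the RPC result, and the fields on the capture object are assumptions, not the open-brain-rest code.

```javascript
// Returns the follow-up update to apply after upsert_thought, or null
// when the reimport matched a corrected row and every incoming field
// may be stale.
function buildFollowUpUpdate(rpcResult, capture) {
  if (rpcResult.matched_via === "original_fingerprint") {
    return null; // caller skips the write entirely
  }
  return {
    embedding: capture.embedding,
    metadata: capture.metadata,
    status: capture.status,
  };
}
```

Returning null for the fallback branch leaves the corrected row's metadata, promoted fields, and workflow state untouched, which is the stricter of the two options the review proposes.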

Comment on lines +671 to +672
IF v_type IN ('task', 'idea') THEN
  v_status := 'new';

P1: Preserve existing task statuses on duplicate upserts

On merge paths where the caller omits status, this re-derives v_status = 'new' from the effective existing type, so any duplicate or metadata-only re-upsert of a task/idea that is already planning, active, done, etc. is reset back to new and gets a fresh status_updated_at. That is a regression from preserving omitted fields/status; only seed new on insert or when the existing row's status is null.
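The rule proposed here, seeding 'new' only on insert or when the existing status is null, can be written as a small decision function. This is a JavaScript sketch with hypothetical parameter names, not the PL/pgSQL fix.

```javascript
// Decide the status to write for an upsert.
function resolveStatus({ isInsert, effectiveType, existingStatus, incomingStatus }) {
  // An explicit status from the caller always wins.
  if (incomingStatus != null) return incomingStatus;
  // Merge path with a stored status: preserve it.
  if (!isInsert && existingStatus != null) return existingStatus;
  // Insert, or merge onto a row with no status yet: seed by type.
  return ["task", "idea"].includes(effectiveType) ? "new" : null;
}
```

A duplicate upsert of a task that is already active returns active, so the caller has no status change to record and status_updated_at need not move.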
