diff --git a/schemas/enhanced-thoughts/README.md b/schemas/enhanced-thoughts/README.md
index 5e1c69b9..f7314114 100644
--- a/schemas/enhanced-thoughts/README.md
+++ b/schemas/enhanced-thoughts/README.md
@@ -1,14 +1,23 @@
# Enhanced Thoughts Columns and Utility RPCs
+
+
+
+
+**Created by [@alanshurafa](https://github.com/alanshurafa)**
+
+
+
> Adds structured columns and utility functions to the Open Brain thoughts table for richer classification, full-text search, statistics, and connection discovery.
## What It Does
-This schema extension adds six new columns to the `thoughts` table (`type`, `sensitivity_tier`, `importance`, `quality_score`, `source_type`, `enriched`) so thoughts can be classified, filtered, and ranked without parsing the metadata JSONB every time. It also upgrades `upsert_thought` so metadata-backed writes keep those structured columns in sync. It installs three utility RPC functions:
+This schema extension adds six new columns to the `thoughts` table (`type`, `sensitivity_tier`, `importance`, `quality_score`, `source_type`, `enriched`) so thoughts can be classified, filtered, and ranked without parsing the metadata JSONB every time. It also installs four RPC functions:
- **`search_thoughts_text`** -- Full-text search with boolean operators, ILIKE fallback, pagination, and result counts.
- **`brain_stats_aggregate`** -- Returns total thought count, top types, and top topics as a single JSONB payload.
- **`get_thought_connections`** -- Finds thoughts that share metadata topics or people with a given thought.
+- **`backfill_thought_types(p_allowed_types TEXT[])`** -- Populates the new top-level `type` column from `metadata->>'type'`. The default allowlist covers the canonical eight values (`idea`, `task`, `person_note`, `reference`, `decision`, `lesson`, `meeting`, `journal`). Pass a custom array to accept additional values, or pass `NULL` to backfill whatever `metadata->>'type'` contains.
## Prerequisites
@@ -36,19 +45,33 @@ SUPABASE (from your Open Brain setup)
2. Create a new query and paste the full contents of `schema.sql`
3. Click **Run** to execute the migration
4. Open **Table Editor** and select the `thoughts` table to confirm the new columns appear: `type`, `sensitivity_tier`, `importance`, `quality_score`, `source_type`, `enriched`
-5. Navigate to **Database > Functions** and verify three new functions exist: `search_thoughts_text`, `brain_stats_aggregate`, `get_thought_connections`
-6. Verify `upsert_thought` still exists. The enhanced version mirrors `metadata.type`, `metadata.source`, `metadata.importance`, `metadata.quality_score`, `metadata.sensitivity_tier`, and task/idea status into top-level columns.
-7. If you have existing thoughts with `type` or `source` values stored in the metadata JSONB, the backfill statements at the bottom of the script will have populated the new columns automatically
+5. Navigate to **Database > Functions** and verify the new functions exist: `search_thoughts_text`, `brain_stats_aggregate`, `get_thought_connections`, `backfill_thought_types`
+6. If you have existing thoughts with `type` or `source` values stored in the metadata JSONB, the script automatically calls `backfill_thought_types()` with the default canonical allowlist. If your brain uses non-canonical `type` values, re-run `SELECT backfill_thought_types(ARRAY['your','custom','types']);` or `SELECT backfill_thought_types(NULL);` to accept any value
## Expected Outcome
After running the migration:
-- The `thoughts` table has six new columns with dashboard-friendly defaults.
+- The `thoughts` table has six new columns with sensible defaults:
+ - `sensitivity_tier TEXT DEFAULT 'standard'` (canonical values: `'standard'`, `'personal'`, `'restricted'`)
+ - `importance SMALLINT DEFAULT 3` (scale: 1-5, where 3 is the default)
+ - `quality_score NUMERIC(5,2) DEFAULT 50` (scale: 0-100, where 50 is the default)
+ - `enriched BOOLEAN DEFAULT false`
+ - `type TEXT` (nullable; populated by backfill or writers)
+ - `source_type TEXT` (nullable; populated by backfill or writers)
- New indexes on `type`, `importance`, `source_type`, and a GIN tsvector index on `content` for fast full-text search.
-- Three new RPC functions callable via the Supabase client or REST API.
-- `upsert_thought` remains the canonical write path, but now keeps structured dashboard columns synchronized with metadata payloads.
-- Any existing thoughts with `type` or `source` in their metadata JSONB will have those values copied into the new top-level columns.
+- Four new RPC functions callable via the Supabase client or REST API (`search_thoughts_text`, `brain_stats_aggregate`, `get_thought_connections`, `backfill_thought_types`).
+- Any existing thoughts with `type` or `source` in their metadata JSONB will have those values copied into the new top-level columns (via `backfill_thought_types()` for `type` with the canonical allowlist, plus an inline `UPDATE` for `source_type`).
+
+## Security
+
+This schema follows stock Open Brain's "service_role only" posture:
+
+- `brain_stats_aggregate` and `get_thought_connections` are `SECURITY DEFINER` with `SET search_path = public` (defense in depth against search-path hijacks). They can read the full `thoughts` table regardless of RLS.
+- `search_thoughts_text` is `SECURITY INVOKER` and respects RLS.
+- **None of the three RPCs are granted to `anon`.** Execute privilege is limited to `authenticated` and `service_role`. The publishable anon key cannot call them.
+
+If you want to expose any of these to `anon` (for example, a public-read dashboard), add your own `GRANT EXECUTE ... TO anon;` in a follow-up migration and confirm that `p_exclude_restricted := true` (the default) plus your sensitivity-tier hygiene gives you the exposure surface you actually want. This is an explicit opt-in: the default stance is private.
## Troubleshooting
@@ -59,4 +82,4 @@ Solution: These are safe to ignore. The `ADD COLUMN IF NOT EXISTS` syntax preven
Solution: Confirm your thoughts have content populated. Try a simple query first (single word, no operators). If using boolean operators, ensure the syntax matches websearch format ("quoted phrases", word AND word, -excluded).
**Issue: brain_stats_aggregate returns empty types or topics**
-Solution: The function filters by `created_at`. Pass `p_since_days := 0` for all-time stats. Also confirm that your thoughts have the `type` column populated (run the backfill UPDATE if needed).
+Solution: The function filters by `created_at`. Pass `p_since_days := 0` for all-time stats. Also confirm that your thoughts have the `type` column populated. If you use non-canonical type values in `metadata->>'type'` (anything outside `idea`, `task`, `person_note`, `reference`, `decision`, `lesson`, `meeting`, `journal`), call the backfill RPC with your own allowlist, e.g. `SELECT backfill_thought_types(ARRAY['idea','task','article','quote']);`, or `SELECT backfill_thought_types(NULL);` to accept whatever is present.
diff --git a/schemas/enhanced-thoughts/metadata.json b/schemas/enhanced-thoughts/metadata.json
index 0c26fdc3..757a341b 100644
--- a/schemas/enhanced-thoughts/metadata.json
+++ b/schemas/enhanced-thoughts/metadata.json
@@ -14,5 +14,5 @@
"difficulty": "beginner",
"estimated_time": "15 minutes",
"created": "2026-04-06",
- "updated": "2026-04-06"
+ "updated": "2026-04-17"
}
diff --git a/schemas/enhanced-thoughts/schema.sql b/schemas/enhanced-thoughts/schema.sql
index 3c4d6650..2e1297bf 100644
--- a/schemas/enhanced-thoughts/schema.sql
+++ b/schemas/enhanced-thoughts/schema.sql
@@ -55,7 +55,7 @@ RETURNS TABLE (
total_count BIGINT
)
LANGUAGE plpgsql
-VOLATILE
+STABLE
SET statement_timeout = '25s'
AS $$
BEGIN
@@ -84,7 +84,7 @@ BEGIN
AND (SELECT count(*) FROM tsvector_hits) < (p_limit + p_offset)
AND t.content ILIKE '%' || q.raw_query || '%'
AND t.metadata @> coalesce(p_filter, '{}'::jsonb)
- AND t.id NOT IN (SELECT th.hit_id FROM tsvector_hits th)
+ AND NOT EXISTS (SELECT 1 FROM tsvector_hits th WHERE th.hit_id = t.id)
LIMIT 500
),
all_hits AS (
@@ -118,8 +118,10 @@ BEGIN
ELSE 0
END
)
- + (coalesce(t.importance, 5) / 20.0)::real
- + (coalesce(t.quality_score, 0.50) / 500.0)::real
+ -- importance is 1..5; max bonus 5/20 = 0.25
+ + (coalesce(t.importance, 3) / 20.0)::real
+ -- quality_score is 0..100; max bonus 100/500 = 0.20
+ + (coalesce(t.quality_score, 50) / 500.0)::real
)::real AS rank
FROM public.thoughts t
CROSS JOIN query_input q
@@ -138,8 +140,12 @@ BEGIN
END;
$$;
+-- Do NOT grant to `anon`. Stock Open Brain keeps `thoughts` behind RLS
+-- (service_role only). Broadening execution to the publishable anon key
+-- would expose the entire brain to anyone who knows the project URL.
+-- See README "Security" section.
GRANT EXECUTE ON FUNCTION search_thoughts_text(TEXT, INTEGER, JSONB, INTEGER)
- TO authenticated, anon, service_role;
+ TO authenticated, service_role;
-- ============================================================
-- 3. BRAIN STATS AGGREGATE RPC
@@ -195,8 +201,10 @@ BEGIN
END;
$$;
+-- Do NOT grant to `anon`. This RPC is SECURITY DEFINER and would bypass
+-- RLS on the thoughts table. See README "Security" section.
GRANT EXECUTE ON FUNCTION brain_stats_aggregate(INTEGER, BOOLEAN)
- TO authenticated, anon, service_role;
+ TO authenticated, service_role;
-- ============================================================
-- 4. THOUGHT CONNECTIONS RPC
@@ -220,6 +228,7 @@ RETURNS TABLE (
overlap_count INT
)
LANGUAGE plpgsql
+STABLE
SECURITY DEFINER
SET search_path = public
AS $$
@@ -266,7 +275,7 @@ BEGIN
) AS shared_people
FROM thoughts bt
WHERE bt.id != p_thought_id
- AND (NOT p_exclude_restricted OR bt.sensitivity_tier != 'restricted')
+ AND (NOT p_exclude_restricted OR bt.sensitivity_tier IS DISTINCT FROM 'restricted')
AND (
EXISTS (
SELECT 1 FROM jsonb_array_elements_text(bt.metadata->'topics') val
@@ -288,8 +297,12 @@ BEGIN
END;
$$;
+-- Do NOT grant to `anon`. This RPC is SECURITY DEFINER and exposes
+-- a 200-char content preview plus metadata for any thought by UUID;
+-- granting to anon would let anyone with the project URL pull content.
+-- See README "Security" section.
GRANT EXECUTE ON FUNCTION get_thought_connections(UUID, INT, BOOLEAN)
- TO authenticated, anon, service_role;
+ TO authenticated, service_role;
-- ============================================================
-- 5. BACKFILL EXISTING DATA
@@ -297,10 +310,42 @@ GRANT EXECUTE ON FUNCTION get_thought_connections(UUID, INT, BOOLEAN)
-- exist. Safe to run multiple times (WHERE ... IS NULL guard).
-- ============================================================
--- Backfill type from metadata
-UPDATE thoughts SET type = metadata->>'type'
-WHERE type IS NULL AND metadata->>'type' IS NOT NULL
- AND metadata->>'type' IN ('idea','task','person_note','reference','decision','lesson','meeting','journal');
+-- Backfill `type` from metadata. Wrapped in an RPC so callers can
+-- override the allowlist. Default allowlist matches the canonical
+-- Open Brain type vocabulary; pass NULL to accept any string value
+-- present in metadata->>'type'.
+CREATE OR REPLACE FUNCTION backfill_thought_types(
+ p_allowed_types TEXT[] DEFAULT ARRAY[
+ 'idea','task','person_note','reference',
+ 'decision','lesson','meeting','journal'
+ ]
+)
+RETURNS BIGINT
+LANGUAGE plpgsql
+VOLATILE
+SET search_path = public
+AS $$
+DECLARE
+ v_updated BIGINT;
+BEGIN
+ UPDATE public.thoughts
+ SET type = metadata->>'type'
+ WHERE type IS NULL
+ AND metadata->>'type' IS NOT NULL
+ AND (p_allowed_types IS NULL OR metadata->>'type' = ANY(p_allowed_types));
+
+ GET DIAGNOSTICS v_updated = ROW_COUNT;
+ RETURN v_updated;
+END;
+$$;
+
+-- Do NOT grant to `anon`. This RPC writes to the thoughts table.
+GRANT EXECUTE ON FUNCTION backfill_thought_types(TEXT[])
+ TO authenticated, service_role;
+
+-- Run the backfill with the default allowlist so the paste-and-run
+-- flow still auto-populates `type` for canonical values.
+SELECT backfill_thought_types();
-- Backfill source_type from metadata
UPDATE thoughts SET source_type = metadata->>'source'