brain

A local-first personal knowledge base with task management and memory retrieval, exposing token-budgeted tools to AI agents over MCP.


Why brain?

Long-running AI agents face a hard constraint: context windows are finite, and filling them with irrelevant content wastes both money and reasoning quality. Research on agentic memory systems — MemGPT, Generative Agents, Mem0 — has converged on the same insight: agents need explicit mechanisms to control what they read and how much they read, not just better retrieval.

Existing tools don't solve this well:

  • Knowledge graph tools (Obsidian plugins, etc.) are built for human navigation with no concept of token budgets or machine-readable retrieval APIs.
  • RAG-as-a-service (Pinecone, Weaviate cloud) requires network calls, incurs per-query costs, and sends your notes to third parties — with only simple top-k vector search.
  • Local vector databases (sqlite-vec, Chroma) provide semantic similarity but miss keyword search, recency, link structure, and tag matching. A query for "meeting notes from last Tuesday" fails entirely on pure vector search.

brain takes a different position: run everything locally, combine all retrieval signals into a hybrid score, enforce token budgets at the API level, and treat Markdown files as the durable source of truth.


Overview

brain is a Rust daemon that manages tasks, records, and Markdown notes. It incrementally indexes content into a dual-store system (SQLite for full-text search and metadata, LanceDB for 384-dim vector embeddings) and serves retrieval tools over MCP stdio JSON-RPC.

Token budgeting is a first-class constraint. memory.retrieve supports three levels of detail (L0 extractive abstracts ~100 tokens, L1 LLM summaries ~2000 tokens, L2 full source). Agents choose their budget upfront and get LOD-adjusted content in a single call, spending context window space efficiently.

Hybrid scoring combines six signals — vector similarity, BM25 keyword ranking, recency decay, backlink count, tag match, and importance — into a single relevance score. A strategy parameter shifts signal weights per query type: lookup upweights keyword precision; planning upweights recency and link structure; synthesis upweights semantic similarity.
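A minimal sketch of how such strategy-weighted scoring could work. The weight values below are invented for illustration; brain's actual intent weight profiles are documented in docs/ARCHITECTURE.md.

```python
# Illustrative hybrid scoring: a weighted sum of normalized signals, with
# per-strategy weight profiles. All numbers here are made up for the sketch.
STRATEGY_WEIGHTS = {
    "lookup":    {"vector": 0.15, "bm25": 0.45, "recency": 0.10, "backlinks": 0.10, "tag": 0.10, "importance": 0.10},
    "planning":  {"vector": 0.20, "bm25": 0.15, "recency": 0.30, "backlinks": 0.20, "tag": 0.05, "importance": 0.10},
    "synthesis": {"vector": 0.45, "bm25": 0.15, "recency": 0.10, "backlinks": 0.15, "tag": 0.05, "importance": 0.10},
}

def hybrid_score(signals: dict[str, float], strategy: str = "lookup") -> float:
    """Weighted sum of per-signal scores, each normalized to [0, 1]."""
    weights = STRATEGY_WEIGHTS[strategy]
    return sum(weights[name] * signals.get(name, 0.0) for name in weights)

signals = {"vector": 0.8, "bm25": 0.9, "recency": 0.3,
           "backlinks": 0.5, "tag": 1.0, "importance": 0.4}
print(round(hybrid_score(signals, "lookup"), 3))  # prints: 0.745
```

The same candidate ranks differently under each strategy because only the weights change, not the underlying signals.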

Everything runs on-device. No network calls, no API keys, no ongoing cost after the initial ~130MB model download. The entire index can be rebuilt from the Markdown files at any time.


Features

Tasks

An event-sourced task system with dependencies, labels, and comments. Tasks are automatically indexed into the memory system as they are created or updated, making them semantically searchable alongside notes.
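The core idea of event sourcing is that task state is a fold over an append-only event log. A rough sketch, with event names invented for illustration (not brain's actual tasks.apply_event schema):

```python
# Event-sourcing sketch: current task state is replayed from its event log.
# Event kinds and fields here are illustrative only.
from dataclasses import dataclass, field

@dataclass
class Task:
    title: str = ""
    status: str = "open"
    labels: set = field(default_factory=set)

def apply(task: Task, event: dict) -> Task:
    kind = event["kind"]
    if kind == "created":
        task.title = event["title"]
    elif kind == "status_changed":
        task.status = event["status"]
    elif kind == "label_added":
        task.labels.add(event["label"])
    return task

def replay(events: list[dict]) -> Task:
    task = Task()
    for event in events:
        task = apply(task, event)
    return task

log = [
    {"kind": "created", "title": "Write release notes"},
    {"kind": "label_added", "label": "area:docs"},
    {"kind": "status_changed", "status": "done"},
]
print(replay(log).status)  # prints: done
```

Because the log is the source of truth, derived state (including the memory index entries) can always be rebuilt by replaying it.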

Memory & Search

Hybrid retrieval combining vector search (LanceDB) and keyword search (SQLite FTS5). Supports episodic memory (episodes), procedural memory (procedures), and reflective synthesis (reflections).
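The keyword half of this pairing can be sketched with SQLite FTS5 directly (the vector half, LanceDB, is omitted here). This assumes a Python build whose bundled SQLite has FTS5 enabled, which is standard in recent distributions:

```python
import sqlite3

# Keyword search sketch using SQLite FTS5, the same engine brain uses for
# the BM25 side of hybrid retrieval. Paths and bodies are made up.
con = sqlite3.connect(":memory:")
con.execute("CREATE VIRTUAL TABLE chunks USING fts5(path, body)")
con.executemany(
    "INSERT INTO chunks VALUES (?, ?)",
    [
        ("notes/review.md", "weekly review template and checklist"),
        ("notes/rust.md", "async patterns in rust daemons"),
    ],
)
# bm25() is FTS5's built-in ranking function; lower scores rank higher.
rows = con.execute(
    "SELECT path FROM chunks WHERE chunks MATCH ? ORDER BY bm25(chunks)",
    ("weekly review",),
).fetchall()
print(rows[0][0])  # prints: notes/review.md
```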

Records

Content-addressed storage for typed work products and snapshots. Records can be linked to tasks or note chunks, providing a durable audit trail of documents, analyses, plans, and state captures.
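Content addressing means a record's identifier is derived from a hash of its bytes, so identical content maps to the same ID and payloads can be verified on read. A stdlib-only sketch using SHA-256 (brain uses BLAKE3 hashing elsewhere, per the Model Cache section; the choice of hash here is only to keep the example dependency-free):

```python
import hashlib

# Content-addressing sketch: ID = hash(content). Identical bytes always
# yield the same ID; any mutation yields a different one.
def record_id(content: bytes) -> str:
    return hashlib.sha256(content).hexdigest()[:16]

a = record_id(b"analysis: retrieval latency")
b = record_id(b"analysis: retrieval latency")
c = record_id(b"plan: v0.2 milestones")
print(a == b, a == c)  # prints: True False
```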

Notes

Markdown-based knowledge base. brain watches your note directories and incrementally indexes changes, extracting wiki-links, tags, and heading-aware chunks.
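A rough sketch of that extraction step: pull [[wiki-links]] and #tags with regexes, and split on headings so every chunk keeps its heading context. The real parser is necessarily more careful (code fences, nested structure, edge cases):

```python
import re

# Note-indexing sketch: extract wiki-links and tags, then chunk by heading.
note = """# Projects
Ship [[brain]] v0.2. #infra

## Retrieval
Tune hybrid weights, see [[scoring-notes]]. #search
"""

links = re.findall(r"\[\[([^\]]+)\]\]", note)
tags = re.findall(r"(?<!\S)#([\w:-]+)", note)
# Zero-width split before each heading keeps the heading with its chunk.
chunks = [c.strip() for c in re.split(r"(?m)^(?=#{1,6} )", note) if c.strip()]
print(links, tags, len(chunks))  # prints: ['brain', 'scoring-notes'] ['infra', 'search'] 2
```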

Jobs

An internal job system for deferred and recurring work. Handles background tasks like embedding stale content, generating summaries, and consolidating episodic memories into reflections.
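A minimal sketch of deferred/recurring scheduling with a priority queue ordered by due time, where recurring jobs re-enqueue themselves after each run. Job names are illustrative, not brain's actual job kinds:

```python
import heapq

# Job-queue sketch: (due_time, name, interval) tuples in a min-heap.
# interval=None means one-shot; otherwise the job re-enqueues itself.
def run_due(queue, now):
    ran = []
    while queue and queue[0][0] <= now:
        due, name, interval = heapq.heappop(queue)
        ran.append(name)
        if interval:  # recurring: schedule the next run
            heapq.heappush(queue, (due + interval, name, interval))
    return ran

jobs = []
heapq.heappush(jobs, (10, "embed_stale_chunks", 60))  # recurring
heapq.heappush(jobs, (5, "generate_summary", None))   # one-shot
ran = run_due(jobs, now=10)
print(ran)  # prints: ['generate_summary', 'embed_stale_chunks']
```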

neural_link

Real-time multi-agent coordination. Enables multiple agents to share a coordination room, exchange findings, and resolve blockers during complex multi-step tasks.


Installation

Homebrew (macOS)

brew install benediktms/brain/brain

Shell installer (macOS / Linux)

curl --proto '=https' --tlsv1.2 -LsSf https://github.com/benediktms/brain/releases/latest/download/brain-installer.sh | sh

From source

Requires Rust (stable, edition 2024), just, and protobuf (brew install protobuf or apt install protobuf-compiler).

git clone https://github.com/benediktms/brain.git
cd brain
just install

Quick Start

brain init ~/notes
brain daemon start

brain init creates the brain configuration, downloads the embedding model (~130 MB on first run), and performs the initial index. brain daemon start launches the daemon and registers it to auto-start on login (launchd on macOS, systemd on Linux).

Model Cache

brain uses BGE-small-en-v1.5 for embeddings. The model is downloaded once and verified via BLAKE3 checksums at every startup.

~/.brain/models/bge-small-en-v1.5/
  config.json          BERT config (hidden_size=384)
  tokenizer.json       WordPiece tokenizer
  model.safetensors    Model weights (~130MB, memory-mapped)

Download the model (choose one):

# Option 1: Run the setup script directly (requires curl + installs HuggingFace CLI if needed)
curl -sSL https://raw.githubusercontent.com/benediktms/brain/master/scripts/setup-model.sh | bash

# Option 2: If you have the HuggingFace CLI already installed
hf download BAAI/bge-small-en-v1.5 config.json tokenizer.json model.safetensors \
  --local-dir ~/.brain/models/bge-small-en-v1.5

Override the model location with BRAIN_MODEL_DIR or BRAIN_HOME. If a file is corrupted or swapped, brain will report a checksum mismatch with expected and actual hashes — re-download the model to fix.

Connect to an AI agent

{
  "mcpServers": {
    "brain": {
      "command": "brain",
      "args": ["daemon"]
    }
  }
}

Agent Plugin Integration

brain distributes plugins for agent runtimes. Plugins are the canonical integration surface: install them via the CLI, then restart the target client.

brain plugin install                  # Claude Code, default target
brain plugin install --target claude  # explicit Claude Code install
brain plugin install --target codex   # Codex install

Claude Code installs four domain plugins under ~/.claude/plugins/marketplaces/ and registers them with the claude CLI when available. This provides /tasks:*, /mem:*, /records:*, and /brain:* commands plus the Brain hooks (SessionStart, UserPromptSubmit, PreCompact, Stop, PreToolUse) without mutating your project's .claude/settings.json.

Codex installs one consolidated home-local plugin at ~/.agents/plugins/brain/ and upserts the brain entry in ~/.agents/plugins/marketplace.json. This exposes the bundled Brain skills for tasks, memory, records, and administration through the Codex plugin marketplace. Configure the Brain MCP server in Codex first so the skills can call the Brain MCP tools.

To remove a plugin:

brain plugin uninstall                  # Claude Code, default target
brain plugin uninstall --target codex   # Codex

Restart Claude Code or Codex after installing or uninstalling so the client reloads plugin metadata.

Advanced / manual setup

brain hooks install directly injects hook entries into .claude/settings.json. This path is retained for environments where the Claude Code plugin marketplace is unavailable (CI runners, air-gapped machines):

brain hooks install            # mutates .claude/settings.json
brain hooks install --dry-run  # preview changes without applying

Prefer brain plugin install --target claude for all standard Claude Code setups.


MCP Tools

Tool Description
memory.retrieve Search notes and retrieve content at requested level of detail (L0/L1/L2). Unifies search and expansion in one call. Supports hybrid ranking, metadata filters, and cross-brain search.
memory.write_episode Record an episodic memory (goal, actions, outcome) with tags and importance.
memory.reflect Two-phase: returns source material for synthesis, then stores the agent-generated reflection.
tasks.apply_event Apply a task event (create, update, status change, dependency, label, comment) via event sourcing.
tasks.get Get a single task by ID with full details, relationships, comments, labels, and linked notes. Supports brain param for cross-brain fetch.
tasks.list List tasks filtered by status (all, ready, blocked) or fetch specific tasks by ID. Supports brain param for cross-brain listing.
tasks.close Close one or more tasks by ID. Supports brain param for cross-brain close.
tasks.next Return highest-priority ready tasks, sorted by priority or due date.
records.create_document Create a document record with text (plain) or data (base64) content. Embedded, searchable, and included in scope summaries.
records.create_analysis Create an analysis record with text (plain) or data (base64) content. Embedded, searchable, and included in scope summaries.
records.create_plan Create a plan record with text (plain) or data (base64) content. Embedded, searchable, and included in scope summaries.
records.save_snapshot Save a snapshot record with text (plain) or data (base64) content.
records.get Get a record by ID with full metadata, tags, and links. Supports prefix resolution and brain param for cross-brain access.
records.list List records with optional filters (kind, status, tag, task_id). Supports brain param for cross-brain access.
records.fetch_content Fetch the raw content of a record as base64-encoded data. Supports brain param for cross-brain access.
records.archive Archive a record (metadata-only operation, payload preserved).
records.tag_add Add a tag to a record. Idempotent.
records.tag_remove Remove a tag from a record. Idempotent.
records.link_add Link a record to a task or note chunk.
records.link_remove Remove a link from a record to a task or note chunk.
status Get runtime health metrics: latency percentiles, queue depth, stuck files.
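On the wire, each of these tools is invoked as a JSON-RPC 2.0 request over stdio. The `tools/call` method is standard MCP; the `memory.retrieve` arguments shown (query, strategy, lod) are inferred from this README rather than taken from a schema dump:

```python
import json

# Sketch of an MCP tool invocation as a JSON-RPC 2.0 message.
# Argument names are assumptions based on this README's descriptions.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "memory.retrieve",
        "arguments": {"query": "weekly review template",
                      "strategy": "lookup", "lod": "L1"},
    },
}
line = json.dumps(request)
print(json.loads(line)["params"]["name"])  # prints: memory.retrieve
```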

Record Kinds and Policy

  • records.create_document, records.create_analysis, and records.create_plan create typed records with deterministic policy: they are embedded into LanceDB, searchable via records and memory retrieval, and included in scope summaries.
  • records.save_snapshot creates snapshot records for opaque state capture; snapshots are stored durably but are not embedded or summarized.
  • Other canonical kinds follow the same per-kind policy: summary is embedded + summarized + searchable, implementation and review are embedded + searchable only, and custom defaults to embedded + searchable.
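The per-kind policy above can be summarized as a lookup table. Field names here are illustrative (the canonical policy lives in brain itself), and the snapshot row's non-searchability is an inference from "not embedded or summarized":

```python
# Record-kind policy from the list above as a lookup table (illustrative).
POLICY = {
    "document":       {"embedded": True,  "summarized": True,  "searchable": True},
    "analysis":       {"embedded": True,  "summarized": True,  "searchable": True},
    "plan":           {"embedded": True,  "summarized": True,  "searchable": True},
    "summary":        {"embedded": True,  "summarized": True,  "searchable": True},
    "implementation": {"embedded": True,  "summarized": False, "searchable": True},
    "review":         {"embedded": True,  "summarized": False, "searchable": True},
    "custom":         {"embedded": True,  "summarized": False, "searchable": True},
    # Snapshots are stored durably but opaque; "searchable: False" is inferred.
    "snapshot":       {"embedded": False, "summarized": False, "searchable": False},
}

print(POLICY["snapshot"]["embedded"], POLICY["plan"]["summarized"])  # prints: False True
```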

Unified Retrieval Pattern

Call memory.retrieve once with the LOD you need:

  • L0 (~100 tokens per result) for scanning many results, orienting cheaply
  • L1 (~2000 tokens per result) for balanced summary + detail, most common use
  • L2 (full content) when you need the complete source

Total cost: 100–2,000 tokens per result depending on LOD, vs. 4,000–8,000 tokens for naive top-k expansion.
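A back-of-envelope budget check based on the per-result costs above: pick the richest LOD whose estimated total fits the context budget. The helper and its thresholds are a sketch, not brain's actual selection logic:

```python
# LOD budgeting sketch using the README's approximate per-result costs.
COST_PER_RESULT = {"L0": 100, "L1": 2000}  # L2 is full source, unbounded

def choose_lod(budget_tokens: int, count: int) -> str:
    """Pick the richest fixed-cost LOD that fits the budget for `count` results."""
    if budget_tokens >= COST_PER_RESULT["L1"] * count:
        return "L1"
    return "L0"

print(choose_lod(1000, 5), choose_lod(12000, 5))  # prints: L0 L1
```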


Architecture

graph TB
    subgraph Notes["Markdown Notes (Source of Truth)"]
        MD[("*.md files")]
    end

    subgraph Daemon["brain daemon"]
        subgraph Ingest["Ingest Pipeline"]
            FW[File Watcher] --> HG[Hash Gate<br/>BLAKE3]
            HG --> MP[Markdown Parser] --> CH[Chunker] --> EM[Embedder<br/>BGE-small]
        end

        subgraph Query["Query Engine"]
            HR[Hybrid Ranker] --- IP[Intent Profiles]
            HR --- TB[Token Budget]
        end

        subgraph Server["MCP Server"]
            SM["memory.*"] ~~~ TA["tasks.*"]
            TA ~~~ RC["records.*"]
            RC ~~~ NL["neural_link.*"]
        end

        subgraph Jobs["Job System"]
            JW[Job Worker] --> JS[Job Store]
        end
    end

    subgraph Storage["Dual Store (brain_persistence)"]
        SQLite["SQLite<br/>FTS5 · metadata · links · tasks · records · jobs<br/>all tables scoped by brain_id"]
        LanceDB["LanceDB<br/>384-dim embeddings (per-brain)"]
    end

    MD -->|fs events| FW
    CH -->|metadata| SQLite
    EM -->|vectors| LanceDB
    Server --> Query
    Query --> SQLite
    Query --> LanceDB
    JW --> SQLite

Memory Tiers

quadrantChart
    title Memory Tiers
    x-axis "High Token Cost" --> "Low Token Cost"
    y-axis "Low Recall" --> "High Recall"
    Tier 1 Episodic - Raw Chunks: [0.15, 0.85]
    Tier 1 Episodic - Full Markdown: [0.2, 0.8]
    Tier 1 Episodic - 384-dim Embeddings: [0.25, 0.75]
    Tier 2 Semantic - Tags & Backlinks: [0.45, 0.55]
    Tier 2 Semantic - Tasks & Timestamps: [0.55, 0.45]
    Tier 3 Procedural - Summaries: [0.75, 0.25]
    Tier 3 Procedural - Reflections: [0.8, 0.2]
    Tier 3 Procedural - 2-sent Stubs: [0.85, 0.15]

For full technical details — sequence diagrams, hybrid scoring formula, intent weight profiles, performance targets, storage role separation, and mathematical foundations — see docs/ARCHITECTURE.md.


Usage

brain index                                           # One-shot index all notes
brain watch                                           # Watch and index incrementally
brain memory retrieve "weekly review template"        # Search from CLI
brain memory retrieve --strategy planning "next steps"
brain memory retrieve --count 10 "async patterns"
brain daemon start                                    # Start + register auto-start
brain daemon stop                                     # Stop + deregister
brain daemon status                                   # Check daemon state

Multiple Brains

Brains are named containers with independent notes, indexes, and config. Managed via a central registry at ~/.brain/:

# ~/.brain/state_projection.toml
[brains.personal]
root = "~/notes"
notes = ["~/notes"]

[brains.work]
root = "~/code/my-project"
notes = ["~/code/my-project/docs", "~/code/my-project/notes"]

All derived data lives in ~/.brain/brains/<name>/, not in the project directory.

Cross-brain Operations

Tasks can be created, fetched, and closed across brains:

brain tasks create --title="..." --brain=work           # Create a task in another brain
brain tasks show <id> --brain=work                      # Fetch task details from another brain
brain tasks list --brain=work                           # List tasks from another brain
brain tasks close <id> --brain=work                     # Close a task in another brain
brain tasks label batch-add area:infra BRN-01ABC --brain=work  # Add label to remote tasks

Records can also be accessed across brains:

brain records list --brain=work                         # List records from another brain
brain records get <id> --brain=work                     # Get record details from another brain
brain records fetch-content <id> --brain=work           # Fetch record content from another brain

The MCP tools tasks.create, tasks.get, tasks.list, tasks.close, tasks_labels_batch, records.list, records.get, and records.fetch_content all accept an optional brain parameter for cross-brain operations.


Development

just build        # Build
just check        # Type check
just test         # Run tests
just lint         # Lint
just fmt          # Format
just clean        # Clean build artifacts
just clean-db     # Clean database (forces full reindex)

Workspace Layout

brain/
  brain_lib/          # Core library: domain logic, ports, MCP server
  brain_persistence/  # Persistence adapters: SQLite, LanceDB, object store
  cli/                # Thin binary: wires CLI commands to library functions
  docs/
    ARCHITECTURE.md
    RECORDS.md
    OPERATIONS.md
    RESEARCH.md
  justfile

Release

just tag patch    # 0.1.0 -> 0.1.1
just tag minor    # 0.1.0 -> 0.2.0
just tag major    # 0.1.0 -> 1.0.0
just changelog

Further Reading

  • docs/ARCHITECTURE.md — Full technical architecture, sequence diagrams, storage design, mathematical foundations
  • docs/OPERATIONS.md — Upgrading, backup, recovery, troubleshooting, performance tuning, model management
  • docs/RECORDS.md — Records domain: artifacts, snapshots, content-addressed storage, CLI/MCP reference
  • docs/RESEARCH.md — Research survey of agentic memory systems, retrieval design, token budget design
  • MemGPT — Virtual context management for long-running agents (Packer et al., 2023)
  • Generative Agents — Recency/relevance/importance scoring with reflective synthesis (Park et al., 2023)
  • Mem0 — Memory extraction, consolidation, and tiered retrieval
  • MCP Specification — AI agent tool integration over stdio JSON-RPC

License

MIT
