The master gland doesn't do the work — it regulates the whole system. Pituitary keeps specifications, code, and docs from drifting out of sync.
Pituitary is for maintaining consistency between intent (specs, decisions, policies) and implementation (code, docs, configs) across the full lifecycle. For teams managing 20-100+ specifications and the artifacts around them, it provides automated support for:
- Overlap detection — catching when a new spec covers ground already addressed
- Tradeoff analysis — comparing competing or overlapping specs
- Impact analysis — understanding what changes when a spec is accepted, modified, or deprecated
- Code compliance — validating that code adheres to accepted specs, or flagging gaps
- Documentation sync — keeping non-spec docs aligned with accepted specs
- Spec freshness — detecting specs that may be stale or superseded by decisions recorded in decision-bearing artifacts
The key constraint is that Pituitary should not be defined by one storage or workflow choice. Specifications may originate in local files, git repositories, PDFs, databases, or other systems. Pituitary's job is to normalize those inputs into a common governance model — a temporal, confidence-weighted dependency graph — and run consistency analysis against it, not to own source control, CI, or authoring.
Pituitary core is responsible for:
- Normalizing source material into canonical spec and document records
- Building a searchable index plus an explicit dependency graph
- Running overlap, comparison, impact, compliance, and doc-drift analysis
- Exposing those capabilities through a stable CLI and thin programmatic transports such as MCP
Pituitary core is not responsible for:
- Being tied to GitHub, pull requests, or any specific CI vendor
- Requiring git as the source of truth
- Requiring Markdown frontmatter as the source format
- Owning review workflows, comment posting, or issue tracking
Those concerns belong in adapters and integrations layered around the core.
The first shipping slice should be intentionally narrow. It exists to prove that Pituitary can ingest specs, build a consistent index, and answer the core spec-management questions without being entangled with CI vendors, source-control providers, or document-extraction complexity.
- Local filesystem only
- One metadata format for specs: `spec.toml`
- One body format for specs and docs: Markdown
- One embedded corpus backend: local SQLite + `sqlite-vec` via Stroma, plus a local Pituitary governance DB
- One required transport: CLI
- Five required analysis capabilities:
  - `search_specs`
  - `check_overlap`
  - `compare_specs`
  - `analyze_impact`
  - `check_doc_drift`
- One required compound workflow: `review_spec`
- GitHub PR comments and vendor-specific CI or reporting flows
- PDF ingestion
- Database-backed source adapters
- Incremental index updates
- Stored code-summary embeddings
- Broad provider-backed code-compliance adjudication beyond the shipped deterministic CLI slice; only bounded re-adjudication of deterministic findings is in scope
- An optional MCP server transport that wraps the same analysis packages as the CLI
- A repository CI workflow that runs fmt, readiness, test, race, vet, and lightweight analyzer validation
These are shipped alongside the first slice, but only the CLI is required for first-ship completeness. The CI job is delivery plumbing, not a GitHub-specific product integration surface.
The first ship should use one repo-local config file:
pituitary.toml
schema_version = 3
[workspace]
root = "."
index_path = ".pituitary/pituitary.db"
[[sources]]
name = "specs"
adapter = "filesystem"
kind = "spec_bundle"
path = "specs"
[[sources]]
name = "docs"
adapter = "filesystem"
kind = "markdown_docs"
path = "docs"
include = ["guides/*.md", "runbooks/*.md"]
[[sources]]
name = "contracts"
adapter = "filesystem"
kind = "markdown_contract"
path = "rfcs"
include = ["**/*.md"]

This keeps the first ship explicit and easy to reason about. The indexed config remains explicit even as the repo grows into inferred-contract sources: `pituitary discover` may propose a local `.pituitary/pituitary.toml`, but it must stay conservative, inspectable before write, and never introduce hidden indexing behavior behind the user's back.
When one logical workspace spans multiple repositories, the primary workspace.root may also declare workspace.repo_id, extra [[workspace.repos]] roots, and per-source repo = "..." bindings. Search, drift, impact, status, and rebuild outputs must then surface repo identity alongside source-relative paths so duplicate filenames from sibling repos remain distinguishable.
When the config schema changes, Pituitary should detect older known shapes explicitly and provide a migration path instead of failing with a generic unsupported-section error. The current schema is versioned with schema_version = 3, and legacy [project] configs should be rewritten through pituitary migrate-config. Schema 3 also reserves [sources.options] for adapter-specific typed settings while keeping the kernel-owned source fields explicit.
Inferred markdown_contract records must preserve confidence metadata alongside the normalized artifact so result surfaces can distinguish strong explicit extraction from weaker path/default fallbacks. Search should expose those confidence signals inline, while higher-stakes outputs such as impact and doc-drift should elevate weak inference as warnings instead of silently treating it as equally strong.
The index must also carry enough metadata to reject stale reuse. At minimum, rebuilds should persist the configured embedder fingerprint, a normalized source fingerprint, and a content fingerprint of the indexed artifacts. Search and analysis commands should compare those values against the current workspace before returning results and fail fast with pituitary index --rebuild guidance when they no longer match.
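The staleness check described above can be sketched in Go as follows. This is a minimal illustration, not Pituitary's implementation: `IndexFingerprints`, `CheckFresh`, and `ErrStaleIndex` are hypothetical names, and only the three fingerprint kinds and the rebuild guidance come from the text.

```go
package main

import (
	"errors"
	"fmt"
)

// IndexFingerprints is a hypothetical holder for the three values the text
// says a rebuild should persist at minimum.
type IndexFingerprints struct {
	Embedder string // configured embedder fingerprint
	Source   string // normalized source fingerprint
	Content  string // content fingerprint of the indexed artifacts
}

// ErrStaleIndex tells the caller to rebuild before trusting results.
var ErrStaleIndex = errors.New("index is stale; run `pituitary index --rebuild`")

// CheckFresh compares the fingerprints stored with the index against the ones
// computed from the current workspace and reports the first mismatch.
func CheckFresh(indexed, current IndexFingerprints) error {
	switch {
	case indexed.Embedder != current.Embedder:
		return fmt.Errorf("embedder fingerprint changed: %w", ErrStaleIndex)
	case indexed.Source != current.Source:
		return fmt.Errorf("source fingerprint changed: %w", ErrStaleIndex)
	case indexed.Content != current.Content:
		return fmt.Errorf("content fingerprint changed: %w", ErrStaleIndex)
	}
	return nil
}

func main() {
	indexed := IndexFingerprints{Embedder: "fixture-8d", Source: "s1", Content: "c1"}
	current := IndexFingerprints{Embedder: "fixture-8d", Source: "s1", Content: "c2"}
	fmt.Println(CheckFresh(indexed, current))
}
```

Running the check before every search or analysis command keeps the fail-fast behavior in one place, so no command can quietly serve results from an index built against different inputs.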
Before a rebuild or dry run touches SQLite, Pituitary should validate the explicit spec relation graph. Cycles in depends_on or supersedes, plus contradictory relation combinations, are repository-health failures and should be surfaced with the exact refs involved instead of silently entering the index.
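One way to surface the exact refs in a cycle is a depth-first search that tracks the current traversal stack. The sketch below assumes the relations for a single edge type (for example `depends_on`) have already been collected into a map; `FindCycle` is a hypothetical helper, and it does not cover the contradictory-relation checks mentioned above.

```go
package main

import (
	"fmt"
	"sort"
)

// FindCycle returns the refs forming one cycle (first ref repeated at the
// end), or nil when the graph is acyclic. edges maps a spec ref to the refs
// it points at for a single relation type.
func FindCycle(edges map[string][]string) []string {
	const (
		unvisited = iota
		inProgress
		done
	)
	state := make(map[string]int)
	var stack []string

	var visit func(ref string) []string
	visit = func(ref string) []string {
		state[ref] = inProgress
		stack = append(stack, ref)
		for _, next := range edges[ref] {
			switch state[next] {
			case inProgress:
				// Back edge: the cycle is the stack from next's position onward.
				for i, r := range stack {
					if r == next {
						cycle := append([]string{}, stack[i:]...)
						return append(cycle, next)
					}
				}
			case unvisited:
				if cycle := visit(next); cycle != nil {
					return cycle
				}
			}
		}
		stack = stack[:len(stack)-1]
		state[ref] = done
		return nil
	}

	// Visit refs in sorted order so the reported cycle is deterministic.
	refs := make([]string, 0, len(edges))
	for ref := range edges {
		refs = append(refs, ref)
	}
	sort.Strings(refs)
	for _, ref := range refs {
		if state[ref] == unvisited {
			if cycle := visit(ref); cycle != nil {
				return cycle
			}
		}
	}
	return nil
}

func main() {
	dependsOn := map[string][]string{
		"SPEC-001": {"SPEC-002"},
		"SPEC-002": {"SPEC-003"},
		"SPEC-003": {"SPEC-001"},
	}
	fmt.Println(FindCycle(dependsOn))
}
```

Because the check only needs the declared relations, it can run on the loaded records before any SQLite write, which matches the ordering the paragraph above asks for.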
When teams want more rigor, Pituitary may optionally generate an explicit spec bundle from one inferred contract. That canonicalization flow must preserve the stable inferred ref, preserve source provenance, preview the generated spec.toml and body.md before write, and remain incremental rather than forcing whole-repo migration.
┌──────────────────────────────────────────────────────────────┐
│ Source Systems / Files │
│ │
│ V1: local spec bundles + docs │
│ Later: git repos, PDFs, databases, remote APIs │
└──────────────────────────────┬───────────────────────────────┘
│
▼
┌──────────────────────────────────────────────────────────────┐
│ Source Adapters │
│ │
│ • filesystem adapter (V1) │
│ • pdf adapter (later) │
│ • database adapter (later) │
│ • git / GitHub adapter (later) │
└──────────────────────────────┬───────────────────────────────┘
│ normalized records
▼
┌──────────────────────────────────────────────────────────────┐
│ Core Pipeline │
│ │
│ 1. Normalize records │
│ 2. Chunk text │
│ 3. Generate embeddings │
│ 4. Build relations graph │
│ 5. Publish a Stroma snapshot + rebuild pituitary.db │
└──────────────────────────────┬───────────────────────────────┘
│
▼
┌──────────────────────────────────────────────────────────────┐
│ Local Stores │
│ │
│ • Stroma snapshot: records / chunks / chunks_vec / metadata │
│ • pituitary.db: artifacts / edges / ast_cache / metadata │
└──────────────────────────────┬───────────────────────────────┘
│ queries
▼
┌──────────────────────────────────────────────────────────────┐
│ Analysis Engine │
│ │
│ • check_overlap │
│ • compare_specs │
│ • analyze_impact │
│ • check_compliance │
│ • check_doc_drift │
│ • check_spec_freshness │
│ • search_specs │
│ • review_spec │
└──────────────────────────────┬───────────────────────────────┘
│
▼
┌──────────────────────────────────────────────────────────────┐
│ Transport / Extensions │
│ │
│ V1 required transport: CLI │
│ V1 optional wrapper: MCP │
│ Shipped repo validation: CI │
│ Later integrations: git hooks, PR comments, editors │
└──────────────────────────────────────────────────────────────┘
Pituitary is a tools-only system, not an autonomous agent. Each tool does one job: retrieve context from the index and graph, apply deterministic analysis in the current bootstrap, and return structured output. Orchestration lives outside Pituitary, in the calling client or automation layer.
This keeps the core simple, testable, and composable:
- MCP clients can call individual tools in any order
- CLI automation can invoke the same logic without a separate orchestration runtime
- Deterministic retrieval remains testable without involving an LLM
When to revisit this decision: if the same multi-step workflow keeps reappearing, add a thin compound tool such as review_spec on top of the primitives. Do not move orchestration policy into the storage or analysis layers.
Pituitary should reason over a canonical internal model, not over source-specific files.
At ingestion time, every adapter normalizes inputs into the same conceptual shape:
SpecRecord
  ref          stable Pituitary reference (for example "SPEC-042")
  kind         "spec"
  title
  status       draft | review | accepted | deprecated | superseded
  domain
  authors[]
  tags[]
  relations[]  depends_on / supersedes
  applies_to[] code or config refs governed by the spec
  source_ref   where the record came from
  body_format  markdown | plaintext | extracted_pdf_text | ...
  body_text
  metadata     adapter-specific extras

DocRecord
  ref
  kind         "doc"
  title
  source_ref
  body_format
  body_text
  metadata
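As a rough illustration, the canonical shapes above could map onto Go types like these. The type and field names simply mirror the conceptual model; the `map[string]string` metadata type and the `Validate` method are assumptions made for this sketch, not the project's actual definitions.

```go
package main

import "fmt"

// Relation is one explicit edge declared by a spec.
type Relation struct {
	Type string // "depends_on" or "supersedes"
	Ref  string // canonical ref of the target spec
}

// SpecRecord mirrors the canonical spec shape described above.
type SpecRecord struct {
	Ref        string
	Kind       string // always "spec"
	Title      string
	Status     string
	Domain     string
	Authors    []string
	Tags       []string
	Relations  []Relation
	AppliesTo  []string
	SourceRef  string
	BodyFormat string
	BodyText   string
	Metadata   map[string]string // adapter-specific extras; value type is an assumption
}

// DocRecord mirrors the canonical doc shape.
type DocRecord struct {
	Ref        string
	Kind       string // always "doc"
	Title      string
	SourceRef  string
	BodyFormat string
	BodyText   string
	Metadata   map[string]string
}

var validStatuses = map[string]bool{
	"draft": true, "review": true, "accepted": true,
	"deprecated": true, "superseded": true,
}

// Validate checks only kind and status. Which other fields are mandatory is
// not settled by the text, so this sketch leaves them unchecked.
func (s SpecRecord) Validate() error {
	if s.Kind != "spec" {
		return fmt.Errorf("spec %q: kind must be \"spec\", got %q", s.Ref, s.Kind)
	}
	if !validStatuses[s.Status] {
		return fmt.Errorf("spec %q: invalid status %q", s.Ref, s.Status)
	}
	return nil
}

func main() {
	spec := SpecRecord{
		Ref:        "SPEC-042",
		Kind:       "spec",
		Title:      "Rate Limiting for Public API Endpoints",
		Status:     "accepted",
		Relations:  []Relation{{Type: "supersedes", Ref: "SPEC-008"}},
		AppliesTo:  []string{"code://src/api/middleware/ratelimiter.go"},
		SourceRef:  "file://specs/rate-limit/spec.toml",
		BodyFormat: "markdown",
	}
	fmt.Println(spec.Ref, spec.Validate())
}
```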
Pituitary should distinguish between three different identifier classes:
- `ref`: the canonical identifier used by the index and tool inputs.
  - Specs use their declared spec IDs such as `SPEC-042`.
  - Docs use a canonical doc ref derived from the Markdown path relative to the configured doc source root, for example `docs/guides/api-rate-limits.md` -> `doc://guides/api-rate-limits` when the source root is `docs`.
- `source_ref`: provenance for where a record came from.
  - For the filesystem adapter, this should be a `file://` URI rooted at the workspace, for example `file://specs/rate-limit-v2/spec.toml` or `file://docs/guides/api-rate-limits.md`.
- `applies_to`: logical references governed by a spec.
  - V1 uses canonical scheme-specific refs such as `code://...` and `config://...`.
Tool inputs for indexed artifacts should use canonical ref values, not source_ref values. Provenance should remain available in outputs and stored metadata, but it is not the primary query surface.
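The difference between a canonical doc `ref` and a filesystem `source_ref` can be sketched in Go. Both helper names are hypothetical. Following the example above, `DocRef` takes the path relative to the doc source root (so `docs/` is already stripped), while `FileSourceRef` takes the workspace-relative path.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// DocRef derives a canonical doc ref from a slash-separated Markdown path
// that is relative to the configured doc source root.
func DocRef(sourceRelPath string) string {
	clean := path.Clean(sourceRelPath)
	return "doc://" + strings.TrimSuffix(clean, ".md")
}

// FileSourceRef builds the provenance URI for the filesystem adapter from a
// slash-separated workspace-relative path.
func FileSourceRef(workspaceRelPath string) string {
	return "file://" + path.Clean(workspaceRelPath)
}

func main() {
	fmt.Println(DocRef("guides/api-rate-limits.md"))
	fmt.Println(FileSourceRef("docs/guides/api-rate-limits.md"))
}
```

Tool inputs would carry the first form; the second stays in outputs and stored metadata as provenance.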
- Valid persisted spec statuses are `draft`, `review`, `accepted`, `superseded`, and `deprecated`.
- If spec `A` declares `supersedes = ["B"]`, then persisted fixture data for `B` should normally use `status = "superseded"` once `A` is accepted.
- Default semantic search should include `draft`, `review`, and `accepted` specs, and should exclude `superseded` and `deprecated` specs unless the caller explicitly asks for them.
- Overlap analysis should include `superseded` specs as historical comparison candidates, but it should exclude `deprecated` specs by default.
- Impact analysis should traverse explicit graph relations regardless of status, and should label superseded artifacts as historical findings in the response when they appear.
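The default status rules for search and overlap can be written as two small predicates. This is a sketch under stated assumptions: the function names are hypothetical, and treating `draft`, `review`, and `accepted` as overlap candidates is inferred from the rules above rather than stated outright.

```go
package main

import "fmt"

// IncludeInSearch applies the default search rule. When the caller passes an
// explicit status list, only those statuses are included.
func IncludeInSearch(status string, requested []string) bool {
	if len(requested) > 0 {
		for _, s := range requested {
			if s == status {
				return true
			}
		}
		return false
	}
	switch status {
	case "draft", "review", "accepted":
		return true
	}
	return false
}

// IncludeInOverlap applies the default overlap rule: superseded specs stay in
// as historical candidates, deprecated specs are left out.
func IncludeInOverlap(status string) bool {
	switch status {
	case "draft", "review", "accepted", "superseded":
		return true
	}
	return false
}

func main() {
	for _, status := range []string{"draft", "accepted", "superseded", "deprecated"} {
		fmt.Printf("%-10s search=%v overlap=%v\n",
			status, IncludeInSearch(status, nil), IncludeInOverlap(status))
	}
}
```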
Pituitary v1 should ship exactly one first-party source format for specs:
specs/
  rate-limit/
    spec.toml
    body.md
spec.toml
id = "SPEC-042"
title = "Rate Limiting for Public API Endpoints"
status = "accepted"
domain = "api"
authors = ["emanuele"]
tags = ["rate-limiting", "api", "security"]
body = "body.md"
depends_on = ["SPEC-012", "SPEC-031"]
supersedes = ["SPEC-008"]
applies_to = [
  "code://src/api/middleware/ratelimiter.go",
  "config://src/api/config/limits.yaml",
]
body.md
## Overview
...
## Requirements
...
## Design Decisions
...

Why this is a better v1 choice than YAML frontmatter:
- TOML is much simpler to parse and validate than YAML
- Metadata is not coupled to Markdown as a container format
- The split between `spec.toml` and `body.md` maps cleanly to the internal model
- Future adapters can emit the same model without pretending they have frontmatter
This does not mean TOML is the product's identity. It is only the first adapter format.
Source adapters are the boundary between external systems and the Pituitary core.
Each adapter has four jobs:
- Enumerate source items
- Load raw content
- Normalize into canonical records
- Report stable `source_ref` and content hashes for change detection

The core should not care whether a record came from:
- a local `spec.toml` + `body.md` bundle
- a Markdown doc directory
- a PDF that has been text-extracted
- a database row
- a git revision or pull request diff
The adapter contract keeps that variability out of the analysis engine.
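A minimal Go sketch of such an adapter contract follows. The interface shape, the trimmed-down `Record`, and the in-memory adapter are all assumptions made for illustration; the only parts taken from the text are the four jobs (enumerate, load, normalize, report stable refs and content hashes).

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
)

// Record is a trimmed-down canonical record used only for this sketch.
type Record struct {
	SourceRef   string
	ContentHash string
	BodyText    string
}

// Adapter covers the four jobs listed above.
type Adapter interface {
	Enumerate() ([]string, error)                           // stable source_ref values
	Load(sourceRef string) ([]byte, error)                  // raw content
	Normalize(sourceRef string, raw []byte) (Record, error) // canonical record plus content hash
}

// memoryAdapter is a hypothetical stand-in for a real source system.
type memoryAdapter struct{ files map[string]string }

func (a memoryAdapter) Enumerate() ([]string, error) {
	refs := make([]string, 0, len(a.files))
	for ref := range a.files {
		refs = append(refs, ref)
	}
	sort.Strings(refs) // stable order across runs
	return refs, nil
}

func (a memoryAdapter) Load(ref string) ([]byte, error) {
	body, ok := a.files[ref]
	if !ok {
		return nil, fmt.Errorf("unknown source_ref %q", ref)
	}
	return []byte(body), nil
}

func (a memoryAdapter) Normalize(ref string, raw []byte) (Record, error) {
	sum := sha256.Sum256(raw)
	return Record{SourceRef: ref, ContentHash: hex.EncodeToString(sum[:]), BodyText: string(raw)}, nil
}

// LoadAll drives any adapter without knowing where its records come from.
func LoadAll(a Adapter) ([]Record, error) {
	refs, err := a.Enumerate()
	if err != nil {
		return nil, err
	}
	records := make([]Record, 0, len(refs))
	for _, ref := range refs {
		raw, err := a.Load(ref)
		if err != nil {
			return nil, err
		}
		rec, err := a.Normalize(ref, raw)
		if err != nil {
			return nil, err
		}
		records = append(records, rec)
	}
	return records, nil
}

func main() {
	adapter := memoryAdapter{files: map[string]string{
		"file://docs/guides/api-rate-limits.md": "# API Rate Limits\n",
	}}
	records, err := LoadAll(adapter)
	fmt.Println(len(records), err)
}
```

`LoadAll` only sees the interface, so a PDF or database adapter could be swapped in without touching the code that consumes the records.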
Current scope:
- `filesystem` adapter for spec bundles
- `filesystem` adapter for docs directories
- `filesystem` adapter for inferred Markdown contracts
V1 filesystem enumeration rules:
- For `kind = "spec_bundle"`, recursively walk the configured source root and treat each directory containing a `spec.toml` file as one bundle.
- Selector matching for spec bundles is done against the source-relative `spec.toml` path.
- A valid bundle must contain exactly one `spec.toml`; its `body` field must resolve to exactly one file relative to the bundle directory.
- Nested bundles inside another bundle directory are invalid and should fail with a clear path-specific error.
- For `kind = "markdown_docs"`, recursively index `*.md` files under the configured source root, then apply selectors against source-relative paths.
- For `kind = "markdown_contract"`, recursively index `*.md` files under the configured source root, infer spec metadata from common Markdown fields, and normalize the file into a `SpecRecord`.
- `files` is an optional exact allowlist of source-relative files.
- `include` and `exclude` are optional glob filters over those same source-relative paths.
- If `files` is present, it narrows the candidate set before `include` / `exclude` are applied.
- For `kind = "spec_bundle"`, `files` entries must point to `spec.toml`.
- For `kind = "markdown_docs"` and `kind = "markdown_contract"`, `files` entries must point to `.md` files.
- A doc title should come from the first H1 heading when present; otherwise it should fall back to the filename stem.
- A doc `ref` should be derived from the Markdown path relative to the configured doc source root, without the `.md` suffix.
- An inferred contract title should come from the first H1 heading when present; otherwise it should fall back to the filename stem.
- An inferred contract should use explicit `Ref:` / `ID:` metadata when present; otherwise it should fall back to a stable workspace-relative `contract://...` ref and default `status = "draft"` when no valid status is declared.
Later, as extensions:
- `pdf` adapter
- `database` adapter
- `git` adapter
- `github` adapter
Git and GitHub are therefore integration surfaces, not architectural assumptions.
The indexing pipeline should operate on normalized records, not on source-specific files.
Step 1: Load
├── Ask one or more adapters for canonical records
└── Validate required fields for specs and docs
Step 2: Normalize
├── Persist canonical metadata for each record
├── Extract explicit relations (depends_on, supersedes, applies_to)
└── Attach provenance (adapter, source_ref, content hash)
Step 3: Chunk
├── For markdown, split by heading-aware sections
├── For plaintext or extracted PDF text, split by paragraphs / headings
└── Store chunks keyed to the parent record
Step 4: Embed
├── Generate embeddings for spec and doc chunks
└── Store vectors in chunks_vec keyed by chunk_id
Step 5: Graph Build
├── Add explicit spec-to-spec relations
├── Add spec-to-code refs from applies_to
└── Keep all refs in canonical string form
Step 6: Atomic Swap
├── Publish the rebuilt Stroma snapshot
└── Replace the active pituitary.db with the rebuilt staging DB
Embedding model recommendation: At 20-100 specs, a local model such as nomic-embed-text is sufficient and keeps the system offline-friendly. Cloud embeddings remain easy to swap in behind an Embedder interface.
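To make the `Embedder` seam concrete, here is a minimal Go sketch. The method set is an assumption, and `hashEmbedder` is a hypothetical deterministic stand-in built on SHA-256; it is not the project's fixture embedder and its vectors carry no semantic meaning.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// Embedder is an assumed interface shape: enough for the index to record a
// fingerprint and dimension and to embed chunk text.
type Embedder interface {
	Fingerprint() string
	Dimension() int
	Embed(text string) ([]float32, error)
}

// hashEmbedder derives a fixed-size vector from a SHA-256 digest, so the same
// text always yields the same vector without any network access.
type hashEmbedder struct{ dim int }

func (e hashEmbedder) Fingerprint() string { return fmt.Sprintf("hash-sketch-%dd", e.dim) }

func (e hashEmbedder) Dimension() int { return e.dim }

func (e hashEmbedder) Embed(text string) ([]float32, error) {
	if e.dim < 1 || e.dim > sha256.Size {
		return nil, fmt.Errorf("dimension %d out of range 1..%d", e.dim, sha256.Size)
	}
	sum := sha256.Sum256([]byte(text))
	vec := make([]float32, e.dim)
	for i := range vec {
		// Map each digest byte from 0..255 onto -1..1.
		vec[i] = float32(sum[i])/127.5 - 1
	}
	return vec, nil
}

func main() {
	var embedder Embedder = hashEmbedder{dim: 8}
	vec, err := embedder.Embed("Rate Limiting for Public API Endpoints")
	fmt.Println(embedder.Fingerprint(), len(vec), err)
}
```

A cloud-backed implementation would satisfy the same interface, which is what keeps the swap local to configuration.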
Bootstrap runtime contract (current implementation):
- The current runtime supports two embedder providers: `fixture` and `openai_compatible`.
- The current runtime supports two analysis providers: `disabled` and `openai_compatible`.
- Retrieval, indexing, and candidate shortlisting remain deterministic even when provider-backed analysis is enabled.
- Provider-backed analysis currently applies only to bounded adjudication steps in `compare-specs`, `check-doc-drift`, and `check-compliance`; `review-spec` inherits those refined results.
- Tests and CI should use the deterministic fixture embedder and require no live model credentials.
- Unsupported runtime providers should fail during config validation with clear, intentional errors.
- Provider-backed embeddings and bounded provider-backed analysis are both now part of the runtime contract.
V1 runtime configuration should be explicit in pituitary.toml under [runtime.embedder] and [runtime.analysis], with optional reusable named bases under [runtime.profiles.<name>]:
| Field | Embedder | Analysis | Notes |
|---|---|---|---|
| `provider` | optional, defaults to `fixture` | optional, defaults to `disabled` | Embedder and analysis currently support `openai_compatible`; analysis also supports `disabled` |
| `profile` | optional | optional | Selects one named reusable runtime profile and then applies per-block overrides on top |
| `model` | defaults to `fixture-8d` for `fixture`; required for `openai_compatible` | required for `openai_compatible`, ignored when `disabled` | Part of the embedder fingerprint stored in index metadata |
| `endpoint` | required for `openai_compatible`, ignored for `fixture` | required for `openai_compatible`, ignored when `disabled` | Expected to point at an OpenAI-compatible API root such as `http://host:1234/v1` |
| `api_key_env` | optional | optional | Optional so local servers such as LM Studio can run without credentials |
| `timeout_ms` | optional, defaults to `1000` | optional, defaults to `1000` | Active for `openai_compatible` embedding requests |
| `max_retries` | optional, defaults to `0` | optional, defaults to `0` | Active for retryable `openai_compatible` runtime failures |
pituitary status should surface the resolved runtime assumptions actually in force: active profile, provider, model, endpoint, timeout, and retry settings for both embedder and analysis. pituitary status --check-runtime ... should probe those same resolved values rather than a hidden alternate contract.
Degraded behavior rules:
- The `fixture` embedder must be deterministic, require no network access, and be the default mode for CI and local tests.
- Unsupported embedder or analysis providers must fail during config validation rather than degrading silently.
- Indexed metadata must store both embedder dimension and embedder fingerprint so provider/model changes fail clearly and require a rebuild.
- Provider-backed analysis must preserve the current storage and transport contracts rather than widening them implicitly.
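The "fail during config validation" rule can be sketched in Go as below. `RuntimeBlock` and both validators are hypothetical; the defaults and required fields follow the runtime configuration described above, and everything else (error wording, return shape) is an assumption.

```go
package main

import "fmt"

// RuntimeBlock holds the subset of a runtime config block this sketch checks.
type RuntimeBlock struct {
	Provider string
	Model    string
	Endpoint string
}

// ValidateEmbedder applies the documented embedder defaults and rejects
// providers the runtime does not support.
func ValidateEmbedder(b RuntimeBlock) (RuntimeBlock, error) {
	if b.Provider == "" {
		b.Provider = "fixture"
	}
	switch b.Provider {
	case "fixture":
		if b.Model == "" {
			b.Model = "fixture-8d"
		}
	case "openai_compatible":
		if b.Model == "" || b.Endpoint == "" {
			return b, fmt.Errorf("runtime.embedder: model and endpoint are required for openai_compatible")
		}
	default:
		return b, fmt.Errorf("runtime.embedder: unsupported provider %q", b.Provider)
	}
	return b, nil
}

// ValidateAnalysis does the same for the analysis block, where the default
// provider is "disabled".
func ValidateAnalysis(b RuntimeBlock) (RuntimeBlock, error) {
	if b.Provider == "" {
		b.Provider = "disabled"
	}
	switch b.Provider {
	case "disabled":
		// Model and endpoint are ignored when analysis is disabled.
	case "openai_compatible":
		if b.Model == "" || b.Endpoint == "" {
			return b, fmt.Errorf("runtime.analysis: model and endpoint are required for openai_compatible")
		}
	default:
		return b, fmt.Errorf("runtime.analysis: unsupported provider %q", b.Provider)
	}
	return b, nil
}

func main() {
	embedder, err := ValidateEmbedder(RuntimeBlock{})
	fmt.Println(embedder.Provider, embedder.Model, err)

	_, err = ValidateAnalysis(RuntimeBlock{Provider: "local_llm"})
	fmt.Println(err)
}
```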
Retry and timeout rules:
- `timeout_ms` and `max_retries` remain parsed for both runtime blocks so the config shape does not need a second contract change later.
- For `openai_compatible` embeddings, those fields control the HTTP client timeout and retry behavior.
- For `openai_compatible` analysis, those fields control the HTTP client timeout and retry behavior.
- For `fixture` embeddings and `disabled` analysis, those fields remain inert.
Chunking strategy: The current implementation uses a lightweight internal Markdown scanner that splits on ATX headings, preserves the nested heading path in each section title, and falls back to one title-scoped chunk when a document has no headings. For non-Markdown inputs, adapters should either provide text with lightweight structural markers or let the chunker fall back to paragraph-based splitting.
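A heading-aware splitter along these lines can be sketched in Go. This is an illustrative simplification rather than the internal scanner: `ChunkMarkdown` and the `" > "` path separator are assumptions, and the sketch does not skip `#` lines inside fenced code blocks.

```go
package main

import (
	"fmt"
	"strings"
)

// Chunk is one section of body text plus its nested heading path.
type Chunk struct {
	Section string
	Content string
}

// parseATX reports the level and text of an ATX heading line such as "## Overview".
func parseATX(line string) (int, string, bool) {
	trimmed := strings.TrimRight(line, " \t\r")
	level := 0
	for level < len(trimmed) && trimmed[level] == '#' {
		level++
	}
	if level == 0 || level > 6 {
		return 0, "", false
	}
	if level < len(trimmed) && trimmed[level] != ' ' {
		return 0, "", false
	}
	return level, strings.TrimSpace(trimmed[level:]), true
}

// ChunkMarkdown splits body on ATX headings. Text that appears before any
// heading, or a body with no headings at all, is scoped to the document title.
func ChunkMarkdown(title, body string) []Chunk {
	var chunks []Chunk
	var headingPath []string // headingPath[i] is the active heading at level i+1
	var buf []string
	section := title

	flush := func() {
		text := strings.TrimSpace(strings.Join(buf, "\n"))
		if text != "" {
			chunks = append(chunks, Chunk{Section: section, Content: text})
		}
		buf = buf[:0]
	}

	for _, line := range strings.Split(body, "\n") {
		level, heading, ok := parseATX(line)
		if !ok {
			buf = append(buf, line)
			continue
		}
		flush()
		if level-1 < len(headingPath) {
			headingPath = headingPath[:level-1] // drop this level and anything deeper
		}
		for len(headingPath) < level-1 {
			headingPath = append(headingPath, "") // pad skipped levels
		}
		headingPath = append(headingPath, heading)

		var parts []string
		for _, h := range headingPath {
			if h != "" {
				parts = append(parts, h)
			}
		}
		section = strings.Join(parts, " > ")
	}
	flush()
	return chunks
}

func main() {
	body := "## Overview\nintro\n### Limits\nten per second\n## Design Decisions\nwhy"
	for _, c := range ChunkMarkdown("Rate Limiting", body) {
		fmt.Printf("%q: %q\n", c.Section, c.Content)
	}
}
```

Keeping the heading path in the section title means a retrieved chunk still says where in the spec it came from, even when the chunk text itself is short.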
Per-kind chunking dispatch: Pituitary mixes two semantically distinct record kinds (specs and docs) whose optimal chunking shapes differ — specs benefit from tight heading-bounded chunks, docs benefit from context-preserving parent/leaf hierarchies. internal/chunk.Resolve adapts [runtime.chunking.spec] and [runtime.chunking.doc] TOML blocks into a stroma chunk.Policy, wired as BuildOptions.ChunkPolicy at the single call site in internal/index/rebuild.go. When no per-kind config is present the rebuild passes ChunkPolicy: nil so stroma's default MarkdownPolicy applies, preserving pre-override behavior byte-for-byte. When any kind is configured, the router's fallback for the unconfigured kind is a MarkdownPolicy that carries stroma/index.DefaultMaxChunkSections (the bounded DoS guard that stroma applies automatically on the nil-ChunkPolicy path), and each configured kind also defaults max_sections to that same bound — so enabling the spec side does not silently disable the per-record section cap on docs or vice versa. A negative max_sections is the explicit escape hatch that disables the cap for that one configured kind, and applies nowhere else.
Filtered vector queries: Stroma owns chunks_vec, chunks, and records, and should resolve corpus-local joins behind its library API. Pituitary should ask Stroma for candidate sections, then apply governance filters such as status, domain, and other business metadata against artifacts before ranking the final candidate set.
Indexed state is split into two local artifacts:
- a Stroma snapshot that owns corpus records, chunks, vectors, and neutral metadata
- `pituitary.db`, which owns governance metadata, edges, AST cache, and the pointer to the active Stroma snapshot
At this scale, local SQLite is still enough, keeps deployment simple, and makes staged rebuilds straightforward. In the current Go implementation, vec0 is wired through github.com/mattn/go-sqlite3 plus github.com/asg017/sqlite-vec-go-bindings/cgo, so local and CI builds need a CGO-capable C toolchain.
-- Canonical records from any adapter
CREATE TABLE artifacts (
ref TEXT PRIMARY KEY, -- "SPEC-042", "doc://guides/api-rate-limits"
kind TEXT NOT NULL, -- "spec" | "doc"
title TEXT,
status TEXT, -- NULL for docs
domain TEXT,
source_ref TEXT NOT NULL, -- provenance such as "file://docs/guides/api-rate-limits.md"
adapter TEXT NOT NULL, -- "filesystem", "pdf", "database", ...
body_format TEXT NOT NULL, -- "markdown", "plaintext", ...
content_hash TEXT NOT NULL,
metadata_json TEXT NOT NULL -- adapter-specific metadata
);
-- Chunked body text
CREATE TABLE chunks (
id INTEGER PRIMARY KEY AUTOINCREMENT,
artifact_ref TEXT NOT NULL,
section TEXT,
content TEXT NOT NULL,
FOREIGN KEY (artifact_ref) REFERENCES artifacts(ref)
);
-- sqlite-vec virtual table for similarity search
CREATE VIRTUAL TABLE chunks_vec USING vec0(
chunk_id INTEGER PRIMARY KEY,
embedding float[EMBEDDING_DIM]
);
-- Canonical graph edges
CREATE TABLE edges (
from_ref TEXT NOT NULL,
to_ref TEXT NOT NULL,
edge_type TEXT NOT NULL, -- "depends_on" | "supersedes" | "applies_to"
PRIMARY KEY (from_ref, to_ref, edge_type)
);
CREATE INDEX idx_artifacts_kind_status_domain
ON artifacts(kind, status, domain);
CREATE INDEX idx_chunks_artifact_ref
ON chunks(artifact_ref);
CREATE INDEX idx_edges_from_ref_type
ON edges(from_ref, edge_type);
CREATE INDEX idx_edges_to_ref_type
ON edges(to_ref, edge_type);

| Collection | Source | Used For |
|---|---|---|
| spec artifacts | Canonical spec records | Overlap detection, tradeoff analysis, impact analysis, compliance |
| doc artifacts | Canonical non-spec docs | Documentation drift detection |
Code is intentionally not indexed as a third stored semantic corpus in v1. For compliance checks, Pituitary embeds the current file or diff at request time and searches against spec chunks directly. That preserves the retrieval fallback without adding a second invalidation problem for stored code summaries.
The indexer writes a staged pituitary.db.new and publishes a content-addressed local Stroma snapshot. The live Pituitary DB then points at that immutable snapshot, so governance metadata and corpus retrieval stay aligned without Pituitary querying Stroma tables directly.
pituitary index --rebuild
1. Create pituitary.db.new
2. Load all records from configured adapters
3. Reuse unchanged chunk vectors from the active index when schema + fingerprints still match
4. Publish the content-addressed Stroma snapshot
5. Populate artifacts + edges + snapshot pointer metadata in pituitary.db.new
6. Run integrity checks
7. Rename pituitary.db.new -> pituitary.db
8. On failure: delete pituitary.db.new, keep existing index untouched
To make the swap visible to a running process, tool handlers should open a fresh read-only SQLite connection per request, or explicitly reload when the active index generation changes.
By default, pituitary index --rebuild should reuse unchanged chunk embeddings when the active index has the same schema version, embedder fingerprint, and source fingerprint. pituitary index --rebuild --full remains the escape hatch for a complete re-embed.
If SQLite stops being sufficient, the vector and storage layers can be abstracted behind interfaces. That is a later optimization, not a v1 requirement.
The current bootstrap implementation uses deterministic analysis end to end. References to LLM adjudication below describe the intended extension point for later runtime work, not a shipped requirement today.
All tools follow the same pattern:
- Deterministic retrieval narrows the candidate set using SQL, graph traversal, and vector search
- Deterministic analysis today produces the shipped result; richer provider-backed adjudication may be added later on the narrowed set
This keeps retrieval reproducible, testable, and cheap.
All shipped commands should also share one JSON envelope and one issue-item shape:
{
"request": { "...": "normalized tool input" },
"result": { "...": "tool-specific payload" },
"warnings": [
{
"code": "string",
"message": "human-readable warning",
"path": "optional/workspace-relative/path"
}
],
"errors": [
{
"code": "string",
"message": "human-readable error",
"path": "optional/workspace-relative/path"
}
]
}

Contract rules:
- `request` echoes the normalized input after CLI parsing, using canonical `ref` values rather than `source_ref`.
- `result` is command-specific and should be `null` when a command exits with errors before producing a domain result.
- `pituitary schema <command> --format json` exposes the machine-readable request and response contract for each shipped command.
- Commands that return raw workspace excerpts or evidence should expose `result.content_trust` so callers can treat returned repo text as untrusted input.
- `warnings` and `errors` use the same object shape.
- `path` is optional and should stay workspace-relative when present.
- Common `errors[].code` values in v1 are `validation_error`, `config_error`, `not_found`, `dependency_unavailable`, and `internal_error`.
CLI exit codes should stay simple:
- `0` success, including success-with-warnings
- `2` validation or configuration error
- `3` dependency unavailable, reserved for runtime dependencies such as future provider-backed analysis
- `init` (`pituitary init`)
  - Request: { "path": ".", "config_path": "...", "dry_run": false }
  - Result: { "workspace_root": "...", "config_path": "...", "config_action": "preview" | "wrote", "discover": { ... discovered config proposal ... }, "index": { ... } | null, "status": { ... } | null }
- `status` (`pituitary status`)
  - Request: { "check_runtime": "none" | "embedder" | "analysis" | "all" }
  - Result: { "workspace_root": "...", "config_path": "...", "config_resolution": { "selected_by": "command_flag" | "global_flag" | "env" | "discovered_local", "reason": "...", "candidates": [{ "precedence": 1, "source": "...", "path": "...", "status": "selected" | "shadowed" | "not_set" | "missing", "detail": "..." }] }, "index_path": "...", "index_exists": true, "freshness": { "state": "missing" | "fresh" | "stale" | "incompatible", "action": "run pituitary index --rebuild", "issues": [{ "kind": "...", "message": "...", "indexed": "...", "current": "..." }] }, "spec_count": N, "doc_count": N, "chunk_count": N, "artifact_locations": { "index_dir": "...", "discover_config_path": "...", "canonicalize_bundle_root": "...", "ignore_patterns": [".pituitary/"], "relocation_hints": ["..."] }, "runtime": { ... } | null }
- `index` (`pituitary index --rebuild`)
  - Request: { "rebuild": true, "full": false }
  - Result: { "workspace_root": ".", "index_path": ".pituitary/pituitary.db", "artifact_counts": { "spec": N, "doc": N }, "chunk_count": N, "edge_count": N, "full_rebuild": false, "reused_artifact_count": N, "reused_chunk_count": N, "embedded_chunk_count": N }
- `search_specs` (`pituitary search-specs`)
  - Request: { "query": "...", "filters": { "domain": "...", "statuses": ["accepted"] }, "limit": 10 }
  - Result: { "matches": [{ "ref": "SPEC-042", "title": "...", "section_heading": "...", "score": 0.0, "excerpt": "...", "source_ref": "file://..." }] }
- `check_overlap` (`pituitary check-overlap`)
  - Request: { "spec_ref": "SPEC-042" } or { "spec_record": { ... canonical spec record ... } }
  - Result: { "candidate": { "ref": "SPEC-042", "title": "..." }, "overlaps": [{ "ref": "SPEC-008", "score": 0.0, "overlap_degree": "high", "relationship": "extends", "guidance": "merge_candidate" }], "recommendation": "proceed_with_supersedes" }
- `compare_specs` (`pituitary compare-specs`)
  - Request: { "spec_refs": ["SPEC-008", "SPEC-042"] }
  - Result: { "spec_refs": ["SPEC-008", "SPEC-042"], "comparison": { "shared_scope": [...], "differences": [...], "tradeoffs": [...], "recommendation": "..." } }
- `analyze_impact` (`pituitary analyze-impact`)
  - Request: { "spec_ref": "SPEC-042", "change_type": "accepted" | "modified" | "deprecated" }
  - Result: { "spec_ref": "SPEC-042", "change_type": "accepted", "affected_specs": [...], "affected_refs": [...], "affected_docs": [{ "ref": "doc://guides/api-rate-limits", "score": 0.0, "classification": "semantic_neighbor" | "governed_surface_neighbor", "reasons": ["..."], "evidence": { "spec_ref": "SPEC-042", "spec_source_ref": "file://specs/...", "spec_section": "...", "doc_source_ref": "file://docs/...", "doc_section": "...", "link_reason": "..." }, "suggested_targets": [{ "source_ref": "file://docs/...", "section": "...", "excerpt": "...", "reason": "...", "suggested_bullets": ["..."] }] }] }
- `check_terminology` (`pituitary check-terminology`)
  - Request: { "terms": ["repo", "workflow"], "canonical_terms": ["locality", "continuity"], "spec_ref": "SPEC-LOCALITY", "scope": "all" | "docs" | "specs" } or { "spec_ref": "SPEC-LOCALITY" } when config-backed `[[terminology.policies]]` should supply the governed terms
  - Config may also declare `[terminology].exclude_paths` to skip historically frozen files from terminology sweeps and `compile` without removing them from indexing
  - Result: { "scope": { "mode": "workspace" | "spec_ref", "artifact_kinds": ["doc", "spec"], "spec_ref": "SPEC-LOCALITY" }, "terms": [...], "canonical_terms": [...], "anchor_specs": [...], "findings": [{ "ref": "...", "kind": "doc" | "spec", "terms": [...], "sections": [{ "section": "...", "terms": [...], "matches": [{ "term": "repo", "classification": "historical_alias", "context": "current_state" | "historical", "severity": "warning" | "error" | "ignore", "replacement": "locality", "tolerated": false }], "excerpt": "...", "assessment": "...", "evidence": { "spec_ref": "SPEC-LOCALITY", "section": "...", "score": 0.0 } | null }] }], "tolerated": [{ "ref": "...", "kind": "doc" | "spec", "terms": [...], "sections": [...] }] }
check_doc_drift(pituitary check-doc-drift)- Request: exactly one of
{ "doc_ref": "doc://guides/api-rate-limits" },{ "doc_refs": ["doc://guides/api-rate-limits"] },{ "scope": "all" }, or{ "diff_text": "..." }; the CLI also accepts--diff-file PATH|-and resolves that into the same diff-backed request shape - Result:
{ "scope": { "mode": "doc_ref" | "doc_refs" | "all" | "diff", "doc_refs": [...] }, "changed_files": [{ "path": "...", "added_line_count": 1, "removed_line_count": 1 }], "implicated_specs": [{ "ref": "SPEC-042", "reasons": ["..."], "files": ["..."], "score": 0.0 }], "implicated_docs": [{ "doc_ref": "...", "reasons": ["..."], "files": ["..."], "score": 0.0 }], "drift_items": [{ "doc_ref": "...", "findings": [{ "spec_ref": "SPEC-042", "code": "...", "classification": "semantic_contradiction" | "role_mismatch", "message": "...", "rationale": "...", "evidence": { "spec_ref": "SPEC-042", "spec_source_ref": "file://specs/...", "spec_section": "...", "doc_source_ref": "file://docs/...", "doc_section": "...", "link_reason": "..." }, "confidence": { "level": "high" | "medium" | "low", "score": 0.0 } }] }], "assessments": [{ "doc_ref": "...", "status": "drift" | "aligned" | "possible_drift", "rationale": "...", "evidence": { ... }, "confidence": { ... } }], "remediation": { "items": [{ "doc_ref": "...", "suggestions": [{ "spec_ref": "SPEC-042", "code": "...", "classification": "semantic_contradiction" | "role_mismatch", "summary": "...", "link_reason": "...", "target_source_ref": "file://docs/...", "target_section": "...", "target_excerpt": "...", "suggested_bullets": ["..."], "evidence": { "spec_source_ref": "file://specs/...", "spec_section": "...", "doc_source_ref": "file://docs/...", "doc_section": "...", "expected": "...", "observed": "...", "link_reason": "..." }, "suggested_edit": { ... } }] }] } }
- `fix` (`pituitary fix`)
  - Request: exactly one of `{ "path": "docs/guides/api-rate-limits.md", "dry_run": true }`, `{ "scope": "all", "dry_run": true }`, or `{ "doc_refs": ["doc://guides/api-rate-limits"], "apply": true }`
  - Result:
{ "selector": "docs/guides/api-rate-limits.md" | "all" | "doc_refs", "applied": false, "files": [{ "doc_ref": "...", "path": "...", "status": "planned" | "applied" | "skipped", "reason": "...", "warnings": ["..."], "edits": [{ "code": "...", "action": "replace_claim", "replace": "...", "with": "...", "line": N, "start_byte": N, "end_byte": N, "before": "...", "after": "..." }] }], "planned_file_count": N, "planned_edit_count": N, "applied_file_count": N, "applied_edit_count": N, "guidance": ["..."] }
- `review_spec` (`pituitary review-spec`)
  - Request: `{ "spec_ref": "SPEC-042" }` or `{ "spec_record": { ... canonical spec record ... } }`
  - Result:
{ "spec_ref": "SPEC-042", "overlap": { ... }, "comparison": { ... } | null, "impact": { ... }, "doc_drift": { ... } }
The shared errors[] shape above applies to every shipped command. Path-specific validation errors should use code = "validation_error" or code = "config_error" with the offending workspace-relative path.
search_specs.limit defaults to 10 when omitted and must stay within 1..50 in v1 so retrieval work stays bounded.
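The limit rule above can be sketched as a small validation helper. This is a minimal illustration, not the shipped code: the function name, the use of a nil pointer for "omitted", and the choice to reject (rather than clamp) out-of-range values are all assumptions.

```go
package main

import (
	"errors"
	"fmt"
)

const (
	defaultSearchLimit = 10 // applied when the request omits limit
	minSearchLimit     = 1
	maxSearchLimit     = 50
)

var errLimitOutOfRange = errors.New("limit must be within 1..50")

// normalizeSearchLimit is a hypothetical helper. A nil pointer means the
// caller omitted limit; out-of-range values are rejected, which is an
// assumption about how "must stay within 1..50" is enforced.
func normalizeSearchLimit(limit *int) (int, error) {
	if limit == nil {
		return defaultSearchLimit, nil
	}
	if *limit < minSearchLimit || *limit > maxSearchLimit {
		return 0, fmt.Errorf("validation_error: %w (got %d)", errLimitOutOfRange, *limit)
	}
	return *limit, nil
}

func main() {
	n, _ := normalizeSearchLimit(nil)
	fmt.Println("omitted limit resolves to", n)

	tooBig := 51
	_, err := normalizeSearchLimit(&tooBig)
	fmt.Println("limit 51:", err)
}
```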
| Tool | First shipping slice | Notes |
|---|---|---|
| `search_specs` | required | First proof that indexing and retrieval work |
| `check_overlap` | required | Primary product value |
| `compare_specs` | required | Used only on overlapping or user-selected specs |
| `analyze_impact` | required | Depends on explicit graph plus doc retrieval |
| `check_terminology` | shipped after first ship | Hybrid lexical-plus-semantic audit for conceptual migrations |
| `check_doc_drift` | required | Markdown docs only in first ship |
| `review_spec` | required | Compound wrapper over the required tools |
| `check_compliance` | shipped after first ship | CLI-first deterministic slice; MCP exposure can follow once the request shape settles |
The first shipping slice was intentionally spec-and-doc centric. Code remained in the model through applies_to references until the core spec workflows were shipped and validated. The current repo now includes a CLI-first deterministic check_compliance slice on top of that same index.
Purpose of `check_overlap`: detect when a new or changed spec overlaps existing specs.
Input:
{ spec_ref: "SPEC-042" }
OR { spec_record: { ... canonical record ... } } // draft not yet indexed
Process:
Phase 1 — retrieval
1. Parse or load the candidate spec body
2. Chunk and embed it
3. Ask Stroma for candidate sections
4. Filter candidates through `artifacts` to keep kind = "spec" and status != "deprecated", while still allowing `superseded` specs as historical candidates
5. Group by artifact_ref and rank by similarity
Phase 2 — adjudication
6. Ask the LLM for overlap degree, affected sections, and whether the new spec extends, contradicts, or duplicates the existing one
Output:
overlaps[]
overlaps[].guidance = merge_candidate | boundary_review
recommendation = proceed_with_supersedes | merge_into_existing | review_boundaries | no_overlap
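Steps 4 and 5 of the retrieval phase can be sketched as a filter, group, and rank pass. The `Candidate` and `RankedSpec` types are hypothetical, and ranking each spec by its best-matching section is an assumption; the source only says "rank by similarity".

```go
package main

import (
	"fmt"
	"sort"
)

// Candidate is a hypothetical shape for one retrieved section.
type Candidate struct {
	ArtifactRef string
	Kind        string // "spec" or "doc"
	Status      string // e.g. "draft", "accepted", "superseded", "deprecated"
	Similarity  float64
}

// RankedSpec groups the retained sections of one spec.
type RankedSpec struct {
	ArtifactRef string
	Best        float64 // highest section similarity (assumed aggregation)
	Sections    int     // number of matching sections
}

// rankOverlapCandidates keeps spec sections that are not deprecated
// (superseded specs stay in as historical candidates), groups them by
// artifact ref, and sorts by best similarity, ties broken by ref.
func rankOverlapCandidates(cands []Candidate) []RankedSpec {
	byRef := map[string]*RankedSpec{}
	for _, c := range cands {
		if c.Kind != "spec" || c.Status == "deprecated" {
			continue
		}
		r, ok := byRef[c.ArtifactRef]
		if !ok {
			r = &RankedSpec{ArtifactRef: c.ArtifactRef, Best: c.Similarity}
			byRef[c.ArtifactRef] = r
		}
		r.Sections++
		if c.Similarity > r.Best {
			r.Best = c.Similarity
		}
	}
	out := make([]RankedSpec, 0, len(byRef))
	for _, r := range byRef {
		out = append(out, *r)
	}
	sort.Slice(out, func(i, j int) bool {
		if out[i].Best != out[j].Best {
			return out[i].Best > out[j].Best
		}
		return out[i].ArtifactRef < out[j].ArtifactRef
	})
	return out
}

func main() {
	ranked := rankOverlapCandidates([]Candidate{
		{ArtifactRef: "SPEC-008", Kind: "spec", Status: "accepted", Similarity: 0.71},
		{ArtifactRef: "SPEC-003", Kind: "spec", Status: "deprecated", Similarity: 0.95},
		{ArtifactRef: "SPEC-001", Kind: "spec", Status: "superseded", Similarity: 0.80},
	})
	for _, r := range ranked {
		fmt.Println(r.ArtifactRef, r.Sections)
	}
}
```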
Purpose of `compare_specs`: compare two or more overlapping specs.
Input:
{ spec_refs: ["SPEC-008", "SPEC-042"] }
Output:
structured comparison of design decisions, tradeoffs,
compatibility, and recommendation
Purpose of `analyze_impact`: determine what changes when a spec changes state or content.
Input:
{ spec_ref: "SPEC-042", change_type: "accepted" | "modified" | "deprecated" }
Process:
1. Traverse edges for dependent specs
2. Collect applies_to refs for code/config impact
3. Search docs semantically for related concepts
4. Use the LLM only to assess severity and explain why
Output:
affected_specs[]
affected_code[]
affected_docs[]
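Step 1 of the process above (traversing edges for dependent specs) might look like the breadth-first walk below. The `Edge` type and the choice to follow dependents transitively are assumptions made for illustration; the source does not state the traversal depth.

```go
package main

import "fmt"

// Edge is a hypothetical relation record: From depends on To.
type Edge struct{ From, To string }

// dependentsOf returns every spec that depends on root, directly or
// transitively, in breadth-first order. The seen set guards against cycles.
func dependentsOf(root string, edges []Edge) []string {
	// Build the reverse adjacency list: spec -> specs that depend on it.
	reverse := map[string][]string{}
	for _, e := range edges {
		reverse[e.To] = append(reverse[e.To], e.From)
	}

	seen := map[string]bool{root: true}
	queue := []string{root}
	var out []string
	for len(queue) > 0 {
		cur := queue[0]
		queue = queue[1:]
		for _, dep := range reverse[cur] {
			if seen[dep] {
				continue
			}
			seen[dep] = true
			out = append(out, dep)
			queue = append(queue, dep)
		}
	}
	return out
}

func main() {
	edges := []Edge{
		{From: "SPEC-050", To: "SPEC-042"},
		{From: "SPEC-061", To: "SPEC-050"},
		{From: "SPEC-008", To: "SPEC-001"},
	}
	fmt.Println(dependentsOf("SPEC-042", edges))
}
```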
Purpose of `check_compliance`: determine whether code matches accepted specs.
Status: shipped in the CLI as a deterministic first follow-on after the first shipping slice.
Input:
{ paths: ["src/api/middleware/ratelimiter.go"] }
OR { diff_text: "..." }
Process:
1. Identify relevant specs:
   a. via applies_to reverse lookups in the graph
   b. via semantic search from the current file or diff into spec chunks
2. Read actual source or use the supplied diff as primary evidence
3. Deterministically classify findings into:
   - compliant
   - conflicting
   - unspecified, with explicit traceability reasons:
     - no_governing_spec
     - weak_traceability
     - traceability_gap
     - insufficient_evidence
4. Surface the limiting factor explicitly:
   - spec_metadata_gap when accepted governance is missing or too weak
   - code_evidence_gap when governance exists but literal code/diff evidence is insufficient
5. Cite spec refs, section headings, changed paths, and applies_to guidance when traceability is the limiting factor
Output:
compliant[]
conflicts[]
unspecified[]
unspecified_summary
runtime.analysis
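One way to picture the `unspecified[]` reasons and `unspecified_summary` is a closed set of reason codes plus a per-reason count. The reason strings come from the process above; the Go types, the function name, and the idea that the summary is a count per reason are assumptions, since the source does not define the summary's shape.

```go
package main

import "fmt"

// UnspecifiedReason enumerates the traceability reasons listed in the process.
type UnspecifiedReason string

const (
	NoGoverningSpec      UnspecifiedReason = "no_governing_spec"
	WeakTraceability     UnspecifiedReason = "weak_traceability"
	TraceabilityGap      UnspecifiedReason = "traceability_gap"
	InsufficientEvidence UnspecifiedReason = "insufficient_evidence"
)

// Valid reports whether r is one of the four documented reasons.
func (r UnspecifiedReason) Valid() bool {
	switch r {
	case NoGoverningSpec, WeakTraceability, TraceabilityGap, InsufficientEvidence:
		return true
	}
	return false
}

// UnspecifiedFinding is a hypothetical record for one unspecified path.
type UnspecifiedFinding struct {
	Path   string
	Reason UnspecifiedReason
}

// summarizeUnspecified counts findings per reason and rejects unknown codes.
// The count-per-reason shape is an assumed form of unspecified_summary.
func summarizeUnspecified(findings []UnspecifiedFinding) (map[UnspecifiedReason]int, error) {
	summary := map[UnspecifiedReason]int{}
	for _, f := range findings {
		if !f.Reason.Valid() {
			return nil, fmt.Errorf("unknown unspecified reason %q for %s", f.Reason, f.Path)
		}
		summary[f.Reason]++
	}
	return summary, nil
}

func main() {
	summary, err := summarizeUnspecified([]UnspecifiedFinding{
		{Path: "src/api/middleware/ratelimiter.go", Reason: WeakTraceability},
		{Path: "src/api/middleware/auth.go", Reason: NoGoverningSpec},
	})
	fmt.Println(summary, err)
}
```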
Purpose of `check_doc_drift`: detect when non-spec docs contradict or lag behind specs.
Input:
{ doc_ref: "doc://guides/api-rate-limits" }
OR { doc_refs: ["doc://guides/api-rate-limits", "doc://runbooks/rate-limit-rollout"] }
OR { scope: "all" }
OR { diff_text: "..." }
Output:
changed_files[]
implicated_specs[]
implicated_docs[]
drift_items[]
runtime.analysis
Exactly one selector must be present in v1: doc_ref, doc_refs, scope, or diff_text. The only valid scope value is "all". The CLI additionally supports --diff-file PATH|- so git diffs can be piped in directly without manually embedding them in JSON.
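The exactly-one-selector rule can be sketched as a request validator. The struct mirrors the documented JSON fields, and the returned mode strings follow the `scope.mode` values in the result shape; the type name, method name, and error wording are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// DocDriftRequest mirrors the four documented selectors.
type DocDriftRequest struct {
	DocRef   string   `json:"doc_ref,omitempty"`
	DocRefs  []string `json:"doc_refs,omitempty"`
	Scope    string   `json:"scope,omitempty"`
	DiffText string   `json:"diff_text,omitempty"`
}

// Validate enforces "exactly one selector" and the single legal scope value.
// It returns the resolved mode: doc_ref, doc_refs, all, or diff.
func (r DocDriftRequest) Validate() (string, error) {
	var modes []string
	if r.DocRef != "" {
		modes = append(modes, "doc_ref")
	}
	if len(r.DocRefs) > 0 {
		modes = append(modes, "doc_refs")
	}
	if r.Scope != "" {
		modes = append(modes, "all")
	}
	if r.DiffText != "" {
		modes = append(modes, "diff")
	}
	if len(modes) != 1 {
		return "", fmt.Errorf("validation_error: expected exactly one selector, got %d", len(modes))
	}
	if r.Scope != "" && r.Scope != "all" {
		return "", errors.New(`validation_error: the only valid scope value is "all"`)
	}
	return modes[0], nil
}

func main() {
	mode, err := DocDriftRequest{Scope: "all"}.Validate()
	fmt.Println(mode, err)

	_, err = DocDriftRequest{DocRef: "doc://guides/api-rate-limits", DiffText: "..."}.Validate()
	fmt.Println(err)
}
```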
Purpose of `search_specs`: general semantic search, across active specs by default.
Input:
{ query: "how do we handle websocket authentication?",
filters: { domain: "api", status: "accepted" } }
Output:
ranked spec sections with excerpts
Unless the caller explicitly asks otherwise, search_specs should search draft, review, and accepted specs and exclude superseded and deprecated specs.
Default semantic retrieval should also down-rank historical provenance and history sections so active normative content wins unless the query explicitly asks for historical context.
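The default status scope can be sketched as below. Modeling the caller's explicit request as a slice of statuses is an assumption (the documented filter example shows a single `status` string), and the helper names are hypothetical. The down-ranking of history sections is not modeled here.

```go
package main

import "fmt"

// defaultSearchStatuses are searched unless the caller asks otherwise;
// superseded and deprecated specs are excluded by default.
var defaultSearchStatuses = []string{"draft", "review", "accepted"}

// effectiveStatuses returns the caller's explicit statuses, or a copy of
// the defaults when none were requested.
func effectiveStatuses(requested []string) []string {
	if len(requested) == 0 {
		return append([]string(nil), defaultSearchStatuses...)
	}
	return requested
}

// includeInSearch reports whether a spec with the given status is in scope.
func includeInSearch(status string, requested []string) bool {
	for _, s := range effectiveStatuses(requested) {
		if s == status {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(includeInSearch("accepted", nil))
	fmt.Println(includeInSearch("superseded", nil))
	fmt.Println(includeInSearch("superseded", []string{"superseded"}))
}
```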
Purpose of `review_spec`: convenience compound tool for the common spec-review workflow.
Input:
{ spec_ref: "SPEC-042" }
OR { spec_record: { ... canonical record ... } }
Process:
1. Run check_overlap
2. If overlaps exist, run compare_specs
3. Run analyze_impact
4. Run check_doc_drift with `doc_refs` from `analyze_impact.affected_docs`
5. Return one combined report
This tool adds convenience, not a new architectural layer.
It should not silently widen doc drift to { scope: "all" } in v1.
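The composition above can be sketched with the four tools injected as functions. All types here are hypothetical stand-ins for the real result shapes. The sketch assumes that when `analyze_impact` reports no affected docs, the drift step is skipped rather than widened to `{ scope: "all" }`.

```go
package main

import "fmt"

// Hypothetical, heavily simplified result shapes.
type OverlapResult struct{ OverlappingRefs []string }
type ImpactResult struct{ AffectedDocs []string }

type ReviewReport struct {
	Overlap    OverlapResult
	Comparison *string // nil when no overlaps were found
	Impact     ImpactResult
	DocDrift   []string // drift findings for the affected docs only
}

// Tools injects the underlying analyses so the wrapper adds no logic of its own.
type Tools struct {
	CheckOverlap  func(specRef string) OverlapResult
	CompareSpecs  func(specRefs []string) string
	AnalyzeImpact func(specRef string) ImpactResult
	CheckDocDrift func(docRefs []string) []string
}

// reviewSpec runs the four steps in order and returns one combined report.
func reviewSpec(t Tools, specRef string) ReviewReport {
	report := ReviewReport{Overlap: t.CheckOverlap(specRef)}
	if len(report.Overlap.OverlappingRefs) > 0 {
		refs := append([]string{specRef}, report.Overlap.OverlappingRefs...)
		cmp := t.CompareSpecs(refs)
		report.Comparison = &cmp
	}
	report.Impact = t.AnalyzeImpact(specRef)
	// Drift is scoped to the docs impact analysis surfaced, never to "all".
	if len(report.Impact.AffectedDocs) > 0 {
		report.DocDrift = t.CheckDocDrift(report.Impact.AffectedDocs)
	}
	return report
}

func main() {
	stub := Tools{
		CheckOverlap:  func(string) OverlapResult { return OverlapResult{OverlappingRefs: []string{"SPEC-008"}} },
		CompareSpecs:  func(refs []string) string { return fmt.Sprintf("compared %d specs", len(refs)) },
		AnalyzeImpact: func(string) ImpactResult { return ImpactResult{AffectedDocs: []string{"doc://guides/api-rate-limits"}} },
		CheckDocDrift: func(docRefs []string) []string { return docRefs },
	}
	report := reviewSpec(stub, "SPEC-042")
	fmt.Println(*report.Comparison, report.DocDrift)
}
```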
Pituitary core ships two first-party surfaces in this repo:
- CLI for local automation and scripts, and the required transport for v1 completeness
- MCP server as an optional thin wrapper over the same analysis packages
MCP must not introduce separate logic, state, or workflows. index remains CLI-only.
CLI examples:
pituitary index --rebuild
pituitary search-specs --query "rate limiting" --format json
pituitary check-overlap --path specs/rate-limit-v2
pituitary check-doc-drift --scope all --format json
pituitary review-spec --path specs/rate-limit-v2 --format json
When MCP is present, its tool names should mirror the shipped analysis tools:
- `check_overlap`
- `compare_specs`
- `analyze_impact`
- `check_doc_drift`
- `search_specs`
- `review_spec`
index remains a CLI-only operation in this architecture. MCP is a query-and-analysis wrapper over an already-built local workspace index, not a second orchestration surface for rebuilds.
Integrations should live outside the core and consume the CLI or MCP surface.
Examples:
- CI runner that calls `pituitary review-spec`
- git hook that rebuilds the index after local changes
- GitHub adapter that turns PR diffs into `diff_text` and posts results back as comments
- editor plugin that opens findings inline
- PDF ingestion adapter that emits canonical records into the indexer
This is the intended layering:
integration -> CLI/MCP -> core analysis engine -> SQLite index
Not:
GitHub workflow logic -> buried inside storage or analysis code
The repo may ship a CI job that runs the checked-in make ci flow, but GitHub-specific review, commenting, and reporting behavior still lives outside the core architecture.
pituitary/
├── go.mod
├── go.sum
├── main.go
├── cmd/
│ ├── index.go # rebuild index from configured adapters
│ ├── check.go # invoke core analysis from the CLI
│ ├── report.go # render JSON / markdown / table output
│ └── serve.go # optional MCP server mode
├── internal/
│ ├── model/
│ │ └── types.go # SpecRecord, DocRecord, relation types
│ ├── mcp/ # optional MCP transport
│ │ ├── server.go # MCP setup and tool registration
│ │ └── tools.go # MCP handlers -> core analysis calls
│ ├── source/
│ │ ├── adapter.go # SourceAdapter interface
│ │ ├── filesystem.go # V1 filesystem adapter
│ │ └── specbundle.go # spec.toml + body.md loader
│ ├── chunk/
│ │ └── markdown.go # heading-aware chunking
│ ├── index/
│ │ ├── store.go # SQLite metadata + vectors + graph
│ │ ├── graph.go # relation traversal helpers
│ │ ├── rebuild.go # full atomic rebuild
│ │ └── embedder.go # local or API embeddings
│ ├── analysis/
│ │ ├── overlap.go
│ │ ├── compare.go
│ │ ├── impact.go
│ │ ├── drift.go
│ │ └── llm.go # provider wrapper behind an interface
│ └── config/
│ └── config.go # adapter and runtime config
├── examples/
│ └── rate-limit/
│ ├── spec.toml
│ └── body.md
└── pituitary.json # optional later MCP manifest
| Choice | Purpose | Why |
|---|---|---|
| `github.com/mark3labs/mcp-go` | Optional MCP server framework | Keeps the shipped MCP wrapper thin over the same analysis packages |
| `github.com/asg017/sqlite-vec-go-bindings` | Vector search | Provides the `vec0` virtual table used by the index |
| `github.com/mattn/go-sqlite3` | SQLite engine | Reliable `database/sql` driver for the cgo-backed sqlite-vec path |
| Go standard library `flag` package | CLI parsing | The current command surface is small enough that stdlib flags keep startup and dependencies minimal |
| Internal restricted TOML parsers in `internal/config` and `internal/source` | `pituitary.toml` and `spec.toml` parsing | The bootstrap only needs a narrow validated TOML subset, so the parser stays internal instead of adding a TOML dependency |
| Internal heading-aware Markdown chunker in `internal/chunk` | Markdown sectioning | Retrieval only needs ATX heading splits plus title-scoped fallback chunks, not a full Markdown AST |
Why this works well in Go:
- Single binary distribution
- Fast startup for the CLI today, with room for an on-demand MCP process later
- Easy parallel embedding calls during rebuilds
- Clean interfaces between adapters, index, and analysis
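To make the "ATX heading splits plus title-scoped fallback chunks" idea from the table above concrete, here is a minimal sketch of a heading-aware splitter. It is not the code in `internal/chunk`: the `Chunk` type and function names are hypothetical, headings with no body text are dropped, closing `#` sequences are not stripped, and headings inside fenced code blocks are not special-cased.

```go
package main

import (
	"fmt"
	"strings"
)

// Chunk is a hypothetical retrieval unit: one heading and the text under it.
type Chunk struct {
	Heading string
	Body    string
}

// atxHeading reports whether line is an ATX heading: up to three leading
// spaces, one to six '#' characters, then a space, tab, or end of line.
func atxHeading(line string) (string, bool) {
	trimmed := strings.TrimLeft(line, " ")
	if len(line)-len(trimmed) > 3 {
		return "", false // four or more spaces is indented code, not a heading
	}
	level := 0
	for level < len(trimmed) && trimmed[level] == '#' {
		level++
	}
	if level == 0 || level > 6 {
		return "", false
	}
	rest := trimmed[level:]
	if rest != "" && rest[0] != ' ' && rest[0] != '\t' {
		return "", false // "#tag" is not a heading
	}
	return strings.TrimSpace(rest), true
}

// chunkMarkdown splits body at ATX headings. Text before the first heading
// falls back to a chunk scoped to the document title.
func chunkMarkdown(title, body string) []Chunk {
	var chunks []Chunk
	heading := title
	var buf []string
	flush := func() {
		text := strings.TrimSpace(strings.Join(buf, "\n"))
		if text != "" {
			chunks = append(chunks, Chunk{Heading: heading, Body: text})
		}
		buf = buf[:0]
	}
	for _, line := range strings.Split(body, "\n") {
		if h, ok := atxHeading(line); ok {
			flush()
			heading = h
			continue
		}
		buf = append(buf, line)
	}
	flush()
	return chunks
}

func main() {
	doc := "Intro text.\n# Limits\nTen requests per second.\n## Bursts\nAllowed briefly."
	for _, c := range chunkMarkdown("Rate limiting", doc) {
		fmt.Printf("%q: %q\n", c.Heading, c.Body)
	}
}
```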
| Goal | Current status | Trigger | Key Data Path |
|---|---|---|---|
| 1. Overlap detection | yes | New or changed spec | Spec record -> embed -> candidate retrieval -> deterministic overlap analysis today |
| 2. Tradeoff analysis | yes | Overlap detected | Spec refs -> full text retrieval -> deterministic comparison today |
| 3. Impact analysis | yes | Spec accepted/modified/deprecated | Graph traversal + doc search -> deterministic impact analysis today |
| 4. Code compliance | yes (CLI) | Changed code or diff | Source/diff -> applies_to lookup + semantic fallback -> deterministic compliance findings today |
| 5. Doc sync | yes | Changed docs or changed spec | Doc chunks vs spec chunks -> deterministic drift detection today |
All tools keep the same discipline: retrieval first, then deterministic analysis today or provider-backed adjudication later.
- Freeze canonical ref, source-ref, and applies-to schemes
- Freeze status and supersession semantics
- Freeze JSON request, response, and error envelopes
- Freeze the bootstrap runtime contract while preserving the config shape for later provider-backed runtime work
- Lock fixture expectations for overlap, impact, and doc drift
- Parse `pituitary.toml`
- Implement the filesystem spec-bundle loader
- Implement the filesystem Markdown-doc loader
- Normalize both into canonical `artifacts`
- Reject invalid records with actionable errors
- Build the SQLite schema
- Implement full atomic rebuild into the split-store layout (`pituitary.db` + Stroma snapshot)
- Chunk Markdown by heading
- Generate embeddings for spec and doc chunks
- Implement filtered retrieval via Stroma search plus Pituitary artifact filters
- Ship `search_specs`
Retrieval-precision benchmark reports that exercise this workstream:
- `docs/development/retrieval-precision-344.md`: doc-level precision@k on the bootstrap corpus, gating the `LateChunkPolicy` default (see #344).
- `docs/development/retrieval-precision-358.md`: chunk-level precision benchmark on the next-iteration corpus (post-#344 follow-up; see #358).
- Implement `check_overlap`
- Implement `compare_specs`
- Implement `analyze_impact`
- Implement CLI-first deterministic `check_compliance`
- Implement `check_doc_drift`
- Implement `review_spec`
- Ship a JSON-first CLI for every required command
- Add Markdown rendering for human-readable reports
- Keep transport code as a thin layer over the same analysis packages
- Keep the shipped MCP wrapper thin and aligned with CLI behavior without blocking the first ship
- Non-filesystem source adapters
- GitHub-specific flows and vendor-specific CI/reporting integrations
- Stored code embeddings or code-summary corpora
The first shipping slice is done when all of the following are true:
- `pituitary index --rebuild` reads `pituitary.toml`, reuses unchanged chunk embeddings when safe, builds a fresh SQLite index in a staging DB, and swaps it atomically.
- A fixture workspace with at least three specs and two docs can be indexed without manual intervention.
- `pituitary search-specs --query "..." --format json` returns ranked spec sections with stable artifact refs.
- `pituitary check-overlap --path specs/<bundle> --format json` detects a known overlapping fixture pair without requiring a ref lookup first.
- `pituitary compare-specs --path path/to/spec-a --path path/to/spec-b --format json` returns a structured comparison for indexed specs.
- `pituitary analyze-impact --path path/to/spec --format json` returns dependent specs and affected docs from the graph and retrieval layers.
- `pituitary check-doc-drift --scope all --format json` flags at least one known contradictory fixture doc.
- `pituitary review-spec --path path/to/spec --format json` composes overlap, comparison, impact, and doc-drift findings in one response.
- All required commands work without GitHub, git metadata, or network-only integrations.
- All required commands fail with clear validation errors when a spec bundle is malformed.
- All shipped commands follow the documented JSON envelope, and unsupported runtime providers fail clearly during config validation.
- Embedding storage: low single-digit MBs
- Full index rebuild: comfortably under 20s with the fixture embedder on the bootstrap corpus
- Per-query latency: typically subsecond to low-single-digit seconds on the bootstrap corpus, depending on command
- Binary size: roughly 15-20MB
- Marginal analysis cost: none in the current bootstrap runtime
The important v1 property is not raw speed. It is that the system stays simple, deterministic in retrieval, and decoupled from any one source or workflow stack.