diff --git a/CHANGELOG.md b/CHANGELOG.md
index c96a8e5..c288541 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -9,7 +9,16 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
 ### Added
 
-- (nothing yet)
+- **Deterministic entities extractor** (`kb.extract.deterministic.entities`): a fully static
+  (tree-sitter) extractor that emits one `entity` artifact per domain class — pydantic `BaseModel`,
+  `@dataclass`, and SQLAlchemy declarative model — with its fields, grounded on the class-definition
+  span. Detection signals and limits are recorded in the payload (transitive bases / imperative
+  SQLAlchemy mapping are documented gaps, not silent losses); `framework_versions` (pydantic /
+  sqlalchemy) is folded into the artifact key. Surfaced via MCP `get_knowledge`/`search_knowledge`.
+- **Tier-1 entities gate** (`kb.eval.tier1_entities_test`): a hand-labeled HARD gate — extracted
+  entities + fields match the oracle, a bare declarative `Base` is not an entity, a `create_model(...)`
+  model is asserted as a known gap, and every entity is grounded on a `class` span. Brings the headline
+  HARD gates to **eight**.
 
 ## [0.2.0] - 2026-06-02
 
diff --git a/DESIGN.md b/DESIGN.md
index f8145fd..b33d72c 100644
--- a/DESIGN.md
+++ b/DESIGN.md
@@ -305,7 +305,7 @@ freshness(current|stale@sha)`, with a deterministic tie-break for reproducible e
 | Module | Responsibility | Key tech |
 |--------|----------------|----------|
 | `kb.structural` | Parse Python without executing it; enumerate symbols/imports/call-sites with per-SHA byte/line ranges; compute content-addressed span identity; incremental reparse. Hidden behind a `StructuralIndex`/`PathEngine` interface so a SCIP backend can replace tree-sitter later. | tree-sitter + tree-sitter-python (canonical bindings) |
-| `kb.extract.deterministic` | No-LLM extractors → exact artifacts (confidence=1.0): import graph; FastAPI API contract (static, cross-file grounded); griffe library surface (planned). | grimp, tree-sitter queries, griffe (static) |
+| `kb.extract.deterministic` | No-LLM extractors → exact artifacts (confidence=1.0): import graph; FastAPI API contract (static, cross-file grounded); domain entities (pydantic/dataclass/SQLAlchemy, static, hand-labeled gate); griffe library surface (planned). | grimp, tree-sitter queries, griffe (static) |
 | `kb.introspect` | Eval-only runtime oracle: runs a FastAPI app in a network-blocked sandbox and emits `app.openapi()` for the Tier-1 API gate. Never on the index path. | subprocess sandbox, fastapi |
 | `kb.embed` | Replaceable embedding adapters + snapshot population for `search_knowledge`. Torch isolated behind the `embed` extra and a lazy import. | sentence-transformers (default), OpenAI (optional), pgvector |
 | `kb.rag` | Frozen pgvector RAG-over-source baseline — the "other arm" of the knowledge-vs-RAG A/B (no provenance/grounding). | deterministic line-window chunker, pgvector |
@@ -382,8 +382,8 @@ Review fact-checked these against current (2026) sources. Caveats are first-clas
 
 ## 14. Roadmap (post-MVP, indicative)
 
-1. Second deterministic family fully (entities via griffe/SQLAlchemy/pydantic; events where a
-   real oracle exists).
+1. Second deterministic family: **entities (pydantic/dataclass/SQLAlchemy) — shipped** (static
+   tree-sitter, hand-labeled Tier-1 gate); events where a real oracle exists (next).
 2. The **one** grounded business-process extractor (named real path + labeler + validator +
    deterministic sub-property gate).
 3. Recursive invalidation (`artifact_depends_on`), multi-branch dedup, freshness precompute.
diff --git a/README.md b/README.md
index 646888d..cad81f7 100644
--- a/README.md
+++ b/README.md
@@ -83,12 +83,12 @@ flowchart LR
 **v0.2 — spine + the first knowledge extractors, MCP serving, and the knowledge-vs-RAG gate.** Everything here grounds what it claims, and nothing it cannot:
 
 - **Provenance spine** — content-addressed `span_id` (LOCKED); tree-sitter spans with a normalized S-expression fingerprint and per-SHA location; a single-Postgres, Alembic-managed store with content-addressed idempotent writes; the ≥ 1 `derived_from` anti-hallucination invariant enforced in-app *and* by a deferred DB trigger; pygit2 git ingest (no checkout) with a diff-based invalidation seed.
-- **Deterministic extractors** — the **import / dependency graph** (grimp resolves the edge, tree-sitter grounds it on the exact import statement, with an honest `approximate` fallback for re-exports / relative / unmappable imports — never a silent loss), and the **FastAPI API-contract** extractor, which grounds a single route **across files** (handler in `routes.py` + `response_model` class in `schemas.py`).
+- **Deterministic extractors** — the **import / dependency graph** (grimp resolves the edge, tree-sitter grounds it on the exact import statement, with an honest `approximate` fallback for re-exports / relative / unmappable imports — never a silent loss); the **FastAPI API-contract** extractor, which grounds a single route **across files** (handler in `routes.py` + `response_model` class in `schemas.py`); and the **domain-entity** extractor (pydantic / dataclass / SQLAlchemy classes and their fields, grounded on the class definition — purely static, with documented detection limits).
 - **`kb introspect`** — a sandboxed, network-blocked `app.openapi()` oracle, eval-only and never on the index path, that the API gate scores the static contract against.
 - **Read-only MCP server** — `find_provenance`, `get_knowledge`, and `search_knowledge`, each returning provenance-carrying units (method + confidence + freshness).
 - **pgvector embeddings + semantic search** — a replaceable embedding provider (sentence-transformers by default, OpenAI optional) populated by a separate `kb embed` pass; torch stays out of the index path.
 - **A frozen RAG-over-source baseline** and the **Tier-3 knowledge-vs-RAG recall gate** — the honest A/B that backs the "knowledge > RAG" thesis.
-- **Seven HARD CI eval gates** (see [Development](#development)).
+- **Eight HARD CI eval gates** (see [Development](#development)).
 
 **Not done yet** (and deliberately not faked): the semantic / **LLM-grounded** extraction layer, the nightly LLM-judged A/B, ADR mining from git history, grounded business-process extraction, incremental re-index on git push, and languages beyond Python. See the [Roadmap](#roadmap).
 
@@ -112,7 +112,7 @@ The base `--extra dev` install stays torch-free; the `embed` extra pulls sentenc
 ### Run the gates
 
 ```bash
-uv run pytest src/kb/eval -q   # the seven HARD gates (spins an ephemeral local Postgres)
+uv run pytest src/kb/eval -q   # the eight HARD gates (spins an ephemeral local Postgres)
 ```
 
 ### Index a commit
@@ -173,12 +173,13 @@ A Python package `kb` (uv, src-layout). Modules and their responsibilities:
 | `kb.git` | pygit2 ingest — reads blobs at a SHA (no checkout) — plus the diff-based invalidation seed. |
 | `kb.extract.deterministic.imports` | Deterministic import / dependency edges: tree-sitter spans grounded by line, grimp edge resolution. |
 | `kb.extract.deterministic.fastapi_contract` | Static FastAPI API-contract extractor; grounds a route across files (handler + `response_model` class), never imports user code. |
+| `kb.extract.deterministic.entities` | Static domain-entity extractor — pydantic / dataclass / SQLAlchemy classes + their fields, grounded on the class definition; detection signals and limits recorded in the payload. |
 | `kb.introspect` | Sandboxed, network-blocked `app.openapi()` oracle — eval-only ground truth for the API gate, never on the index path. |
 | `kb.mcp` | Read-only MCP server and its provenance-carrying records: `find_provenance`, `get_knowledge`, `search_knowledge`. |
 | `kb.embed` | Replaceable embedding adapters (sentence-transformers default, OpenAI optional) + snapshot population. Torch isolated behind the `embed` extra and a lazy import. |
 | `kb.rag` | The frozen pgvector RAG-over-source baseline — the "other arm" of the knowledge-vs-RAG A/B (no provenance, no grounding). |
 | `kb.daemon.cli` | The `kb` CLI: `index`, `embed`, `serve` (MCP), and `introspect` — all functional. |
-| `kb.eval` | Seven HARD CI gates (identity reproducibility, adversarial grounding, Tier-1 import oracle, Tier-1 API oracle, Tier-3 knowledge-vs-RAG recall, Tier-4 one-hop invalidation, invariants) plus the supporting MCP / embed / store suite. |
+| `kb.eval` | Eight HARD CI gates (identity reproducibility, adversarial grounding, Tier-1 import oracle, Tier-1 API oracle, Tier-1 entities oracle, Tier-3 knowledge-vs-RAG recall, Tier-4 one-hop invalidation, invariants) plus the supporting MCP / embed / store suite. |
 
 Core tables: `commit_ref`, `branch_ref`, `code_span`, `span_occurrence`, `artifact` (now with `embedding vector(384)` + `embedding_model_id`), `artifact_derived_from`, `snapshot_entry`, and `rag_chunk` (the baseline arm).
 
@@ -188,18 +189,19 @@ Core tables: `commit_ref`, `branch_ref`, `code_span`, `span_occurrence`, `artifa
 uv sync --extra dev            # venv + install
 uv run ruff check src/kb       # lint
 uv run mypy                    # strict type-check
-uv run pytest src/kb/eval -q   # the seven HARD eval gates
+uv run pytest src/kb/eval -q   # the eight HARD eval gates
 ```
 
-CI (GitHub Actions, workflow **"CI"**, `.github/workflows/ci.yml`) runs ruff, `mypy --strict`, and the eval gates against a `pgvector/pgvector:pg17` service (with the embedding model cached). The **seven HARD gates** that block a merge:
+CI (GitHub Actions, workflow **"CI"**, `.github/workflows/ci.yml`) runs ruff, `mypy --strict`, and the eval gates against a `pgvector/pgvector:pg17` service (with the embedding model cached). The **eight HARD gates** that block a merge:
 
 1. **Identity reproducibility** — formatting / comment / docstring / location changes must NOT change `span_id`; a rename MUST. Pure identity core, no database.
 2. **Adversarial grounding** — an ungrounded artifact is rejected by *both* layers (the app's `GroundingError` and the DB's deferred `artifact_grounded_check` trigger); a genuinely grounded artifact commits cleanly.
 3. **Tier-1 import oracle** — extracted import edges match a hand-labeled oracle, grounded on the actual import statement span; a dynamic import is asserted as a *known* gap, not a silent loss.
 4. **Tier-1 API oracle** — the statically-extracted FastAPI contract equals the app's own `openapi()` (from the sandboxed introspect oracle), and the route's cross-file grounding (handler + `response_model`) is asserted.
-5. **Tier-3 knowledge-vs-RAG recall** — knowbase cross-file recall@k == 1.0 for every contract question (a *structural* floor: one artifact already spans both files, so it holds regardless of embedding quality); the RAG arm is reported but **never asserted**, so a model bump can't redden CI.
-6. **Tier-4 one-hop invalidation** — a content diff invalidates *exactly* the artifacts whose grounding span changed (set-equality: no over-invalidation, no stale survivors); a version bump invalidates everything.
-7. **Invariants** — zero orphans (every snapshot artifact is grounded), and re-indexing the same SHA yields the identical set of artifact ids.
+5. **Tier-1 entities oracle** — extracted pydantic / dataclass / SQLAlchemy entities + their fields match a hand-labeled oracle, each grounded on its class span; a bare declarative `Base` is correctly *not* an entity and a `create_model(...)` model is asserted as a *known* gap.
+6. **Tier-3 knowledge-vs-RAG recall** — knowbase cross-file recall@k == 1.0 for every contract question (a *structural* floor: one artifact already spans both files, so it holds regardless of embedding quality); the RAG arm is reported but **never asserted**, so a model bump can't redden CI.
+7. **Tier-4 one-hop invalidation** — a content diff invalidates *exactly* the artifacts whose grounding span changed (set-equality: no over-invalidation, no stale survivors); a version bump invalidates everything.
+8. **Invariants** — zero orphans (every snapshot artifact is grounded), and re-indexing the same SHA yields the identical set of artifact ids.
 
 The identity rules in `kb.ids` (and `kb.structural`) are **LOCKED**: changing one is a breaking change, gated behind a `NORMALIZATION_VERSION` / `extractor_version` bump so existing digests are invalidated rather than silently colliding.
 
diff --git a/src/kb/daemon/cli.py b/src/kb/daemon/cli.py
index d8372b9..dd8703b 100644
--- a/src/kb/daemon/cli.py
+++ b/src/kb/daemon/cli.py
@@ -11,6 +11,7 @@
 import typer
 
 from kb.daemon.pipeline import index_commit
+from kb.extract.deterministic.entities import EntityExtractor
 from kb.extract.deterministic.fastapi_contract import FastAPIExtractor
 from kb.extract.deterministic.imports import ImportExtractor
 from kb.introspect import introspect_app
@@ -27,7 +28,9 @@ def index(
 ) -> None:
     """Index one commit: ingest, parse spans, run deterministic extractors, write the snapshot."""
     engine = make_engine(db_url)
-    result = index_commit(engine, repo, sha, extractors=[ImportExtractor(), FastAPIExtractor()])
+    result = index_commit(
+        engine, repo, sha, extractors=[ImportExtractor(), FastAPIExtractor(), EntityExtractor()]
+    )
     engine.dispose()
     typer.echo(
         f"indexed {result.sha[:12]}: {result.files_indexed} files, {result.spans} spans, "
diff --git a/src/kb/eval/tier1_entities_test.py b/src/kb/eval/tier1_entities_test.py
new file mode 100644
index 0000000..fe6ecbb
--- /dev/null
+++ b/src/kb/eval/tier1_entities_test.py
@@ -0,0 +1,134 @@
+"""HARD GATE — Tier 1: domain entities vs a hand-labeled oracle (DESIGN.md §4, §9).
+
+The hand-labeled ``EXPECTED_ENTITIES`` / ``EXPECTED_FIELDS`` are the real oracle (importing the
+models to introspect them would execute user code). A bare declarative ``Base`` must NOT be an
+entity, and a dynamically-built model (``create_model``) is a deliberate static-analysis blind spot,
+asserted as a KNOWN gap — not a silent loss. Every entity is grounded on its class-definition span.
+"""
+
+from __future__ import annotations
+
+from pathlib import Path
+
+from sqlalchemy import Engine, select
+
+from kb.daemon.pipeline import index_commit
+from kb.eval._fixtures import make_git_repo
+from kb.extract.deterministic.entities import EntityExtractor
+from kb.store import models as m
+
+# A src-layout module: a pydantic model, a dataclass, a SQLAlchemy model (plus a bare declarative
+# Base that is NOT an entity), and a dynamically-built model (invisible to static parsing).
+FILES = {
+    "src/shop/__init__.py": "",
+    "src/shop/models.py": (
+        "from dataclasses import dataclass\n"
+        "from pydantic import BaseModel, create_model\n"
+        "from sqlalchemy import Column, Integer\n"
+        "from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column\n"
+        "\n\n"
+        "class Order(BaseModel):\n"
+        "    id: int\n"
+        "    total: float = 0.0\n"
+        "    note: str | None = None\n"
+        "\n\n"
+        "@dataclass\n"
+        "class LineItem:\n"
+        "    sku: str\n"
+        "    qty: int = 1\n"
+        "\n\n"
+        "class Base(DeclarativeBase):\n"
+        "    pass\n"
+        "\n\n"
+        "class User(Base):\n"
+        '    __tablename__ = "users"\n'
+        "    id: Mapped[int] = mapped_column(primary_key=True)\n"
+        "    name: Mapped[str] = mapped_column()\n"
+        "    legacy = Column(Integer)\n"
+        "\n\n"
+        'Dynamic = create_model("Dynamic", x=(int, ...))\n'
+    ),
+}
+
+# Hand-labeled oracle: (framework, fq class). `Base` and `Dynamic` are deliberately absent.
+EXPECTED_ENTITIES = {
+    ("pydantic", "shop.models.Order"),
+    ("dataclass", "shop.models.LineItem"),
+    ("sqlalchemy", "shop.models.User"),
+}
+EXPECTED_FIELDS = {
+    "shop.models.Order": {"id", "total", "note"},
+    "shop.models.LineItem": {"sku", "qty"},
+    "shop.models.User": {"id", "name", "legacy"},  # __tablename__ is metadata, not a field
+}
+KNOWN_GAP = "shop.models.Dynamic"  # create_model(): dynamic, invisible to static analysis
+
+
+def _index(engine: Engine, tmp_path: Path) -> str:
+    sha = make_git_repo(tmp_path, [FILES])[0]
+    index_commit(engine, str(tmp_path), sha, extractors=[EntityExtractor()], first_party_root="src")
+    return sha
+
+
+def _entity_payloads(engine: Engine, sha: str) -> list[dict]:
+    join = m.snapshot_entry.join(
+        m.artifact, m.artifact.c.artifact_id == m.snapshot_entry.c.artifact_id
+    )
+    with engine.connect() as conn:
+        return list(
+            conn.execute(
+                select(m.artifact.c.payload)
+                .select_from(join)
+                .where(m.snapshot_entry.c.sha == sha, m.artifact.c.kind == "entity")
+            ).scalars()
+        )
+
+
+def test_entities_match_oracle(engine: Engine, tmp_path: Path) -> None:
+    sha = _index(engine, tmp_path)
+    found = {(p["framework"], p["qualified_name"]) for p in _entity_payloads(engine, sha)}
+    assert found == EXPECTED_ENTITIES
+
+
+def test_fields_match_oracle(engine: Engine, tmp_path: Path) -> None:
+    sha = _index(engine, tmp_path)
+    by_key = {p["qualified_name"]: p for p in _entity_payloads(engine, sha)}
+    for qualified_name, expected in EXPECTED_FIELDS.items():
+        names = {f["name"] for f in by_key[qualified_name]["fields"]}
+        assert names == expected, qualified_name
+
+
+def test_bare_declarative_base_is_not_an_entity(engine: Engine, tmp_path: Path) -> None:
+    sha = _index(engine, tmp_path)
+    keys = {p["qualified_name"] for p in _entity_payloads(engine, sha)}
+    assert "shop.models.Base" not in keys  # no __tablename__, no columns -> not a domain entity
+
+
+def test_dynamic_model_is_a_known_gap(engine: Engine, tmp_path: Path) -> None:
+    sha = _index(engine, tmp_path)
+    keys = {p["qualified_name"] for p in _entity_payloads(engine, sha)}
+    assert KNOWN_GAP not in keys  # documented blind spot, surfaced — not silently "found"
+
+
+def test_entities_grounded_on_class_spans(engine: Engine, tmp_path: Path) -> None:
+    sha = _index(engine, tmp_path)
+    join = (
+        m.snapshot_entry.join(
+            m.artifact, m.artifact.c.artifact_id == m.snapshot_entry.c.artifact_id
+        )
+        .join(
+            m.artifact_derived_from,
+            m.artifact_derived_from.c.artifact_id == m.artifact.c.artifact_id,
+        )
+        .join(m.code_span, m.code_span.c.span_id == m.artifact_derived_from.c.span_id)
+    )
+    with engine.connect() as conn:
+        rows = conn.execute(
+            select(m.artifact.c.payload, m.code_span.c.span_kind)
+            .select_from(join)
+            .where(m.snapshot_entry.c.sha == sha, m.artifact.c.kind == "entity")
+        ).all()
+    assert {row.payload["qualified_name"] for row in rows} == set(EXPECTED_FIELDS)  # all grounded
+    for row in rows:
+        assert row.span_kind == "class"
+        assert row.payload["span_mapping"] == "exact"
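Note on the known gap asserted above: the sketch below illustrates why a `create_model(...)` call is invisible to a purely static pass. It uses the standard-library `ast` module as a stand-in for tree-sitter (a swapped-in technique, not what the extractor uses), and `static_models` is a hypothetical helper that exists only for this illustration.

```python
import ast

# The source is only parsed, never imported, so pydantic does not need to be installed.
SOURCE = '''
from pydantic import BaseModel, create_model

class Order(BaseModel):
    id: int
    total: float = 0.0

Dynamic = create_model("Dynamic", x=(int, ...))
'''


def static_models(source: str) -> dict[str, list[str]]:
    """Map each top-level class statement to its annotated field names."""
    found: dict[str, list[str]] = {}
    for node in ast.parse(source).body:
        # Only `class` statements qualify; `Dynamic = create_model(...)` is a plain
        # assignment whose fields exist only at runtime.
        if isinstance(node, ast.ClassDef):
            found[node.name] = [
                stmt.target.id
                for stmt in node.body
                if isinstance(stmt, ast.AnnAssign) and isinstance(stmt.target, ast.Name)
            ]
    return found


models = static_models(SOURCE)
print(models)
```

A static pass sees `Order` and its annotated fields, while `Dynamic` never appears as a class definition. That is the behaviour the gate pins down as a documented gap.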
diff --git a/src/kb/extract/deterministic/entities.py b/src/kb/extract/deterministic/entities.py
new file mode 100644
index 0000000..9c10f9f
--- /dev/null
+++ b/src/kb/extract/deterministic/entities.py
@@ -0,0 +1,372 @@
+"""Deterministic domain-entity extractor — pydantic / dataclass / SQLAlchemy (DESIGN.md §4, §14).
+
+Produces one ``entity`` artifact per domain class, grounded on that class's span (role
+``class_definition``). Fully static: re-parses each class span's source with tree-sitter (the same
+discipline as the FastAPI contract extractor); it never imports or executes user code.
+
+Detection is best-effort and the signals are recorded in the payload (never a silent guess):
+  * **dataclass**   — a decorator whose dotted name ends in ``dataclass``.
+  * **pydantic**    — a direct base named ``BaseModel`` / ``BaseSettings``.
+  * **sqlalchemy**  — a ``__tablename__`` assignment, or a field via ``Mapped[...]`` /
+                      ``mapped_column(...)`` / ``Column(...)`` (so a bare declarative ``Base`` with
+                      neither is correctly NOT treated as an entity).
+``framework_versions`` (pydantic / sqlalchemy) is read from the ANALYZED repo at the SHA and folded
+into the artifact key, since field interpretation can shift across major versions (DESIGN.md §6).
+"""
+
+from __future__ import annotations
+
+import textwrap
+import tomllib
+from collections.abc import Sequence
+from dataclasses import dataclass
+from pathlib import Path
+from typing import Any
+
+import tree_sitter_python as tsp
+from tree_sitter import Language, Node, Parser
+
+from kb.extract.base import DerivedEdge, ExtractContext, ExtractedArtifact
+from kb.structural.interface import ParsedSpan
+
+EXTRACTOR_ID = "entities"
+EXTRACTOR_VERSION = "1"
+
+_LANGUAGE = Language(tsp.language())
+_PYDANTIC_BASES = frozenset({"BaseModel", "BaseSettings"})
+_SA_COLUMN_CALLS = frozenset({"Column", "mapped_column"})
+_OPTIONAL_MARKERS = ("Optional[", "| None", "None |")
+_VERSIONED = ("pydantic", "sqlalchemy")
+
+
+@dataclass(frozen=True)
+class _RawField:
+    name: str
+    annotation: str | None
+    has_default: bool
+    value_callee: str | None  # innermost name of a call on the RHS, e.g. "mapped_column" | "Column"
+
+
+class EntityExtractor:
+    extractor_id = EXTRACTOR_ID
+    extractor_version = EXTRACTOR_VERSION
+
+    def __init__(self) -> None:
+        self._parser = Parser(_LANGUAGE)
+
+    def extract(self, ctx: ExtractContext) -> list[ExtractedArtifact]:
+        versions = _framework_versions(ctx, _VERSIONED)
+        artifacts: list[ExtractedArtifact] = []
+        for module, spans in ctx.spans_by_module.items():
+            for span in spans:
+                if span.span_kind != "class":
+                    continue
+                art = self._build_artifact(module, span, versions)
+                if art is not None:
+                    artifacts.append(art)
+        return artifacts
+
+    def _build_artifact(
+        self, module: str, span: ParsedSpan, versions: dict[str, str]
+    ) -> ExtractedArtifact | None:
+        root = self._parser.parse(textwrap.dedent(span.raw_text).encode("utf-8")).root_node
+        deco = _first_child_of_type(root, "decorated_definition")
+        cls = (
+            _first_child_of_type(deco, "class_definition")
+            if deco is not None
+            else _first_child_of_type(root, "class_definition")
+        )
+        if cls is None:
+            return None
+
+        decorators = _decorator_names(deco) if deco is not None else []
+        bases = _base_names(cls)
+        body = cls.child_by_field_name("body")
+        tablename, raw_fields, relationships = _parse_body(body)
+
+        framework, signals, limitations = _classify(decorators, bases, tablename, raw_fields)
+        if framework is None:
+            return None
+
+        fields = _select_fields(framework, raw_fields)
+        payload: dict[str, Any] = {
+            "framework": framework,
+            "class_name": span.fq_symbol_path.rsplit(".", 1)[-1],
+            "qualified_name": span.fq_symbol_path,
+            "module": module,
+            "bases": bases,
+            "fields": [
+                {
+                    "name": f.name,
+                    "annotation": f.annotation,
+                    "has_default": f.has_default,
+                    "required": f.required,
+                    "source": f.source,
+                }
+                for f in fields
+            ],
+            "tablename": tablename,
+            "relationships": relationships,
+            "detection_signals": signals,
+            "span_mapping": "exact",
+            "limitations": limitations,
+        }
+        framework_versions = (
+            {} if framework == "dataclass" else {framework: versions.get(framework, "unknown")}
+        )
+        return ExtractedArtifact(
+            kind="entity",
+            logical_key=f"entity:{span.fq_symbol_path}",
+            payload=payload,
+            derived_from=[DerivedEdge(span.span_id, "class_definition")],
+            extractor_id=self.extractor_id,
+            extractor_version=self.extractor_version,
+            framework_versions=framework_versions,
+        )
+
+
+# --- selected field (post-classification) ----------------------------------
+
+
+@dataclass(frozen=True)
+class _Field:
+    name: str
+    annotation: str | None
+    has_default: bool
+    required: bool
+    source: str  # "annotated" | "column"
+
+
+def _select_fields(framework: str, raw: Sequence[_RawField]) -> list[_Field]:
+    out: list[_Field] = []
+    for rf in raw:
+        if _is_dunder(rf.name):
+            continue
+        annotated = rf.annotation is not None
+        is_column = rf.value_callee in _SA_COLUMN_CALLS
+        if framework == "sqlalchemy":
+            is_mapped = rf.annotation is not None and rf.annotation.startswith("Mapped[")
+            if not (is_mapped or is_column):
+                continue
+            source = "annotated" if annotated else "column"
+        else:  # pydantic / dataclass fields are always annotated
+            if not annotated:
+                continue
+            source = "annotated"
+        out.append(
+            _Field(
+                name=rf.name,
+                annotation=rf.annotation,
+                has_default=rf.has_default,
+                required=not rf.has_default and not _is_optional(rf.annotation),
+                source=source,
+            )
+        )
+    return out
+
+
+def _classify(
+    decorators: Sequence[str],
+    bases: Sequence[str],
+    tablename: str | None,
+    raw: Sequence[_RawField],
+) -> tuple[str | None, list[str], list[str]]:
+    is_dataclass = any(d.rsplit(".", 1)[-1] == "dataclass" for d in decorators)
+    has_column_field = any(
+        rf.value_callee in _SA_COLUMN_CALLS
+        or (rf.annotation is not None and rf.annotation.startswith("Mapped["))
+        for rf in raw
+    )
+    is_sqlalchemy = tablename is not None or has_column_field
+    is_pydantic = any(b in _PYDANTIC_BASES for b in bases)
+
+    signals: list[str] = []
+    if is_dataclass:
+        signals.append("dataclass_decorator")
+    if tablename is not None:
+        signals.append("sqlalchemy_tablename")
+    if has_column_field:
+        signals.append("sqlalchemy_column_field")
+    if is_pydantic:
+        signals.append("pydantic_base")
+
+    limitations: list[str] = []
+    if sum((is_dataclass, is_sqlalchemy, is_pydantic)) > 1:
+        limitations.append("multiple_framework_signals")
+
+    # precedence: a dataclass decorator wins; then SQLAlchemy table/columns; then a pydantic base.
+    if is_dataclass:
+        return "dataclass", signals, limitations
+    if is_sqlalchemy:
+        return "sqlalchemy", signals, limitations
+    if is_pydantic:
+        return "pydantic", signals, limitations
+    return None, signals, limitations
+
+
+# --- tree-sitter parsing of the class body ----------------------------------
+
+
+def _parse_body(
+    body: Node | None,
+) -> tuple[str | None, list[_RawField], list[dict[str, str | None]]]:
+    """Return ``(__tablename__ literal, raw fields, relationships)`` from a class ``block``.
+
+    Only DIRECT body statements are inspected, so assignments inside method bodies are not mistaken
+    for fields.
+    """
+    if body is None:
+        return None, [], []
+    tablename: str | None = None
+    fields: list[_RawField] = []
+    relationships: list[dict[str, str | None]] = []
+    for stmt in body.named_children:
+        assign = _unwrap_assignment(stmt)
+        if assign is None:
+            continue
+        left = assign.child_by_field_name("left")
+        if left is None or left.type != "identifier":
+            continue
+        name = _text(left)
+        if name is None:
+            continue
+        right = assign.child_by_field_name("right")
+        callee = _innermost_call_name(right) if right is not None else None
+        if name == "__tablename__":
+            tablename = _string_value(right) if right is not None else None
+            continue
+        if callee == "relationship":
+            relationships.append({"name": name, "target": _first_argument_text(right)})
+        fields.append(
+            _RawField(
+                name=name,
+                annotation=_text(assign.child_by_field_name("type")),
+                has_default=right is not None,
+                value_callee=callee,
+            )
+        )
+    return tablename, fields, relationships
+
+
+def _unwrap_assignment(stmt: Node) -> Node | None:
+    """A class-body field is an ``assignment`` (possibly wrapped in an ``expression_statement``)."""
+    if stmt.type == "assignment":
+        return stmt
+    if stmt.type == "expression_statement":
+        inner = _first_child_of_type(stmt, "assignment")
+        if inner is not None:
+            return inner
+    return None
+
+
+def _decorator_names(deco: Node) -> list[str]:
+    names: list[str] = []
+    for child in deco.named_children:
+        if child.type != "decorator":
+            continue
+        target = child.named_children[0] if child.named_children else None
+        if target is None:
+            continue
+        if target.type == "call":
+            target = target.child_by_field_name("function")
+        text = _text(target)
+        if text is not None:
+            names.append(text)
+    return names
+
+
+def _base_names(cls: Node) -> list[str]:
+    supers = cls.child_by_field_name("superclasses")
+    if supers is None:
+        return []
+    names: list[str] = []
+    for arg in supers.named_children:
+        if arg.type == "keyword_argument":  # e.g. metaclass=...
+            continue
+        text = _text(arg)
+        if text is not None:
+            names.append(text.rsplit(".", 1)[-1])
+    return names
+
+
+# --- small tree-sitter helpers (kept local; mirror the fastapi extractor) ----
+
+
+def _first_child_of_type(node: Node, type_name: str) -> Node | None:
+    for child in node.named_children:
+        if child.type == type_name:
+            return child
+    return None
+
+
+def _innermost_call_name(node: Node) -> str | None:
+    """If ``node`` is a call, return the last dotted segment of its callee (else ``None``)."""
+    if node.type != "call":
+        return None
+    fn = node.child_by_field_name("function")
+    if fn is None:
+        return None
+    text = _text(fn)
+    return text.rsplit(".", 1)[-1] if text is not None else None
+
+
+def _first_argument_text(node: Node | None) -> str | None:
+    if node is None or node.type != "call":
+        return None
+    args = node.child_by_field_name("arguments")
+    if args is None:
+        return None
+    first = next((c for c in args.named_children), None)
+    return _text(first) if first is not None else None
+
+
+def _string_value(node: Node) -> str | None:
+    if node.type != "string":
+        return None
+    contents = [
+        (child.text or b"").decode("utf-8", errors="replace")
+        for child in node.named_children
+        if child.type == "string_content"
+    ]
+    if contents:
+        return "".join(contents)
+    return (node.text or b"").decode("utf-8", errors="replace").strip("\"'")
+
+
+def _is_optional(annotation: str | None) -> bool:
+    return annotation is not None and any(marker in annotation for marker in _OPTIONAL_MARKERS)
+
+
+def _is_dunder(name: str) -> bool:
+    return name.startswith("__") and name.endswith("__")
+
+
+def _text(node: Node | None) -> str | None:
+    if node is None or node.text is None:
+        return None
+    return node.text.decode("utf-8")
+
+
+def _framework_versions(ctx: ExtractContext, names: tuple[str, ...]) -> dict[str, str]:
+    """Best-effort versions of ``names`` from the repo's lockfiles / pyproject at the SHA."""
+    root = Path(ctx.materialized_root)
+    targets = {name.lower(): name for name in names}
+    found: dict[str, str] = {}
+    for lock in ("uv.lock", "poetry.lock"):
+        path = root / lock
+        if path.exists():
+            data = tomllib.loads(path.read_text(encoding="utf-8"))
+            for pkg in data.get("package", []):
+                key = str(pkg.get("name", "")).lower()
+                if key in targets and "version" in pkg and targets[key] not in found:
+                    found[targets[key]] = str(pkg["version"])
+    pyproject = root / "pyproject.toml"
+    if pyproject.exists() and any(name not in found for name in names):
+        data = tomllib.loads(pyproject.read_text(encoding="utf-8"))
+        deps = data.get("project", {}).get("dependencies", []) or []
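+        for spec in deps:  # match the bare name: "pydantic-settings" is not "pydantic"
+            head = spec.translate(str.maketrans(dict.fromkeys("[<>=!~;@(", " "))).split()[:1]
+            canonical = targets.get(head[0].replace("-", "_").lower()) if head else None
+            if canonical is not None and canonical not in found:
+                found[canonical] = f"spec:{spec}"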
+    return {name: found.get(name, "unknown") for name in names}
diff --git a/src/kb/mcp/records.py b/src/kb/mcp/records.py
index 8b91b60..0a6fa34 100644
--- a/src/kb/mcp/records.py
+++ b/src/kb/mcp/records.py
@@ -69,6 +69,10 @@ def summarize(kind: str, payload: dict[str, Any]) -> str:
     if kind == "api_route":
         model = payload.get("response_model_base") or "?"
         return f"{payload.get('method', '?')} {payload.get('path', '?')} -> {model}"
+    if kind == "entity":
+        framework = payload.get("framework", "?")
+        n_fields = len(payload.get("fields", []))
+        return f"{payload.get('qualified_name', '?')} ({framework}, {n_fields} fields)"
     return kind