feat: per-module LLM-grounded descriptions (kb describe) #15
Merged
Extend the key-gated `kb describe` pass to describe each first-party module (file), not just `api_route` / `entity` artifacts. A module is not an artifact, so it is enumerated from its span occurrences at the snapshot SHA (`store.queries.module_targets` -> `ModuleTarget`) and grounded on ALL of the file's spans (module + its classes/functions/imports). `describe.py` is refactored to a shared `_describe_one(...)` reused by the artifact loop and a new module loop; module descriptions use `target_kind="module"` and logical key `desc:module:<fqname>`. The prompt source body is capped while validation still runs over every span. No new invariants: the same deterministic sub-property gate (`grounding.validate_claims`) drops any claim whose cited symbol does not occur in the file's spans, so a module is described only if a real symbol survives. The `semantic_grounding` HARD gate is extended with the module path (adversarial fabricated claim dropped; a module with no matching symbol gets no description). Headline gate count stays nine. README / DESIGN / CHANGELOG updated; `kb index` stays offline.
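The claim-dropping gate described above can be pictured with a minimal sketch. This is not the code from `grounding.validate_claims`, whose signature is not shown in this PR; the `Span` and `Claim` types and both function names below are hypothetical, and matching on the bare symbol name is an assumption.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    """Hypothetical stand-in for an indexed span (module, class, function, import)."""
    kind: str
    name: str


@dataclass(frozen=True)
class Claim:
    """Hypothetical LLM claim: a sentence plus the symbol it cites as evidence."""
    text: str
    symbol: str


def validate_claims_sketch(claims: list[Claim], spans: list[Span]) -> list[Claim]:
    """Keep only claims whose cited symbol occurs among the file's spans."""
    known = {span.name for span in spans}
    return [claim for claim in claims if claim.symbol in known]


def describe_module_sketch(claims: list[Claim], spans: list[Span]) -> str | None:
    """Join the surviving claims, or return None when no claim survives."""
    surviving = validate_claims_sketch(claims, spans)
    if not surviving:
        return None  # no real symbol survived, so the module gets no description
    return " ".join(claim.text for claim in surviving)
```

With spans for `create_user` only, a claim citing `create_user` is kept and a claim citing a fabricated `send_email` is dropped; if every claim cites an unknown symbol, the function returns `None`, which mirrors "a module with no matching symbol gets no description".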
What

Second slice of the LLM-grounded semantic layer: the key-gated `kb describe` pass now also describes each first-party module (file), not just `api_route` / `entity` artifacts.

- `store.queries.module_targets(conn, sha)` + `ModuleTarget`: a module is not an artifact, so it is enumerated from its span occurrences at the snapshot SHA (first-party only, since the pipeline indexes only files under the first-party root). The module name is the fully qualified path of the file's `module` span; the target carries all of the file's spans (module + classes/functions/imports).
- `describe.py` refactor: extracted a shared `_describe_one(...)` reused by the existing artifact loop and a new module loop. Module descriptions use `target_kind="module"` and logical key `desc:module:<fqname>`, grounded on all of the file's spans (role `describes`). The prompt source body is capped (~6000 chars) while validation still runs over every span.
- No new invariants: the same deterministic sub-property gate (`grounding.validate_claims`) drops any claim whose cited symbol does not occur in the file's spans; a module is described only if ≥1 real symbol survives.

Gate
`kb.eval.semantic_grounding_test` is extended with the module path (run on a stub LLM, no API key): a module where the real symbol occurs is described with the fabricated symbol dropped; a module with no matching symbol (e.g. `app.main`, `app.__init__`) gets no description. Headline HARD gate count stays nine (extended the existing gate, no new gate file).

Out of scope
Per-package / whole-repo architecture overviews; business-process / call-graph extraction; real-LLM describe in the CI gate (nightly only). No release in this cycle: the change accumulates in `[Unreleased]`.

Checks
- `ruff check src/kb` + `mypy` clean (65 files)
- `pytest`: 55 passed, 1 skipped (key-gated LLM judge)
- `kb index` stays offline; LLM only in `kb describe`
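For readers who want a concrete picture of the module enumeration, logical key, and prompt cap listed under What, here is a rough sketch. It does not reproduce `store.queries.module_targets` or `ModuleTarget`; the `SpanRow` shape, the in-memory grouping (instead of a query on `conn` and `sha`), the way span sources are joined, and the exact cap constant are all assumptions made for illustration.

```python
from __future__ import annotations

from collections import defaultdict
from dataclasses import dataclass, field

PROMPT_BODY_CAP = 6000  # the PR says "~6000 chars"; the exact value is assumed


@dataclass(frozen=True)
class SpanRow:
    """Hypothetical span row: one indexed occurrence in a file at the snapshot SHA."""
    path: str
    kind: str  # e.g. "module", "class", "function", "import"
    fqname: str
    source: str


@dataclass
class ModuleTargetSketch:
    """Hypothetical analogue of ModuleTarget: module name plus all spans of its file."""
    fqname: str
    spans: list[SpanRow] = field(default_factory=list)


def module_targets_sketch(rows: list[SpanRow]) -> list[ModuleTargetSketch]:
    """Group spans by file and name each target after the file's module span."""
    by_file: dict[str, list[SpanRow]] = defaultdict(list)
    for row in rows:
        by_file[row.path].append(row)
    targets = []
    for path in sorted(by_file):
        spans = by_file[path]
        module_span = next((s for s in spans if s.kind == "module"), None)
        if module_span is None:
            continue  # without a module span there is no name for the target
        targets.append(ModuleTargetSketch(module_span.fqname, spans))
    return targets


def logical_key(target: ModuleTargetSketch) -> str:
    """Build the logical key in the desc:module:<fqname> form quoted in the PR."""
    return f"desc:module:{target.fqname}"


def prompt_body(target: ModuleTargetSketch, cap: int = PROMPT_BODY_CAP) -> str:
    """Cap only the text sent to the LLM; validation would still see target.spans."""
    return "\n".join(s.source for s in target.spans)[:cap]
```

The point of the split is that `prompt_body` truncates what the model reads, while the full `spans` list stays available to the grounding check, so a symbol that falls past the cap can still validate a claim.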