feat: local Ollama backend for /understand and /understand-ollama #475
naraypv wants to merge 11 commits into
Speckit-style specification artifacts under docs/superpowers/:
- spec/2026-06-19-ollama-backend-design.md: module shape, error model, per-phase responsibilities, out-of-scope items
- plans/2026-06-19-ollama-backend-impl.md: task-by-task TDD plan with per-step acceptance checks
No code change in this commit.
- Add graphify-out/ to .gitignore (local AST cache + graph artifacts; regenerated by 'graphify update .')
- Bump core tsconfig lib ES2022 -> ES2024 so OllamaClient can use Promise.withResolvers (available at runtime in Node 22+).
OllamaClient wraps the Ollama HTTP API (/api/version, /api/chat, /api/generate, /api/tags) with:
- exponential-backoff retry on 5xx and connection errors
- AbortSignal-aware timeouts (caller signal composed with timeout)
- structured error classes (OllamaConnectionError, OllamaModelMissingError, OllamaResponseError, OllamaTimeoutError)
- test injection point via fetchImpl
- format:"json" pass-through for structured output
10 unit tests cover isHealthy, chat shape, JSON format, retry on 5xx, no retry on 4xx, timeout, and caller-supplied AbortSignal.
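The retry and timeout behaviour listed above could look roughly like the sketch below. It is not the code from this PR: `HttpStatusError`, `backoffDelay`, `isRetryable`, `fetchWithRetry`, and every default value are hypothetical names and numbers chosen for illustration. Only the ideas (backoff on 5xx and connection errors, no retry on 4xx, a caller signal composed with a timeout, and a `fetchImpl` injection point) come from the commit message.

```javascript
// Hypothetical error type; the PR's real classes (OllamaResponseError etc.) are not shown here.
class HttpStatusError extends Error {
  constructor(status) {
    super(`Ollama responded with HTTP ${status}`);
    this.name = "HttpStatusError";
    this.status = status;
  }
}

// Delay before retry number `attempt` (0-based): base, 2x base, 4x base, ...
function backoffDelay(attempt, baseMs = 200) {
  return baseMs * 2 ** attempt;
}

// Retry 5xx responses and connection errors; never retry 4xx, aborts, or timeouts.
function isRetryable(error) {
  if (error instanceof HttpStatusError) return error.status >= 500;
  return error.name !== "AbortError" && error.name !== "TimeoutError";
}

// Assumed option names and defaults, for illustration only.
async function fetchWithRetry(url, init = {}, opts = {}) {
  const { fetchImpl = fetch, retries = 3, timeoutMs = 30_000, baseMs = 200, signal } = opts;
  for (let attempt = 0; ; attempt++) {
    // Compose the caller's signal (if any) with a fresh per-attempt timeout.
    const timeout = AbortSignal.timeout(timeoutMs);
    const combined = signal ? AbortSignal.any([signal, timeout]) : timeout;
    try {
      const res = await fetchImpl(url, { ...init, signal: combined });
      if (!res.ok) throw new HttpStatusError(res.status);
      return res;
    } catch (error) {
      if (attempt >= retries || !isRetryable(error)) throw error;
      await new Promise((resolve) => setTimeout(resolve, backoffDelay(attempt, baseMs)));
    }
  }
}
```

Passing `fetchImpl` as an option is what lets a unit test swap in a fake that fails twice and then succeeds, without touching the network.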
Adds the optional ollama: { baseUrl, model, concurrency } block to
ProjectConfig so users can persist their Ollama settings in
.understand-anything/config.json instead of passing flags every run.
4 unit tests cover: default config when missing, round-trip of the
ollama block, additive update of existing fields, and graceful
recovery from a corrupted config file.
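A minimal sketch of the behaviours those four tests describe, assuming the config is plain JSON. `DEFAULT_CONFIG`, `parseProjectConfig`, `withOllamaSettings`, and the `autoUpdate` key used in the usage note are hypothetical; the real ProjectConfig shape and loader are not shown in this PR text.

```javascript
// Assumed default; the real default ProjectConfig may carry other fields.
const DEFAULT_CONFIG = {};

// Parse the raw text of config.json. Falls back to the defaults when the file
// is missing (null/undefined) or corrupted (invalid JSON, or not a JSON object).
function parseProjectConfig(rawText) {
  if (rawText == null) return { ...DEFAULT_CONFIG };
  try {
    const parsed = JSON.parse(rawText);
    if (parsed === null || typeof parsed !== "object" || Array.isArray(parsed)) {
      return { ...DEFAULT_CONFIG };
    }
    return { ...DEFAULT_CONFIG, ...parsed };
  } catch {
    return { ...DEFAULT_CONFIG };
  }
}

// Additive update: change ollama fields without dropping unrelated keys
// or ollama fields that the patch does not mention.
function withOllamaSettings(config, ollamaPatch) {
  return { ...config, ollama: { ...config.ollama, ...ollamaPatch } };
}
```

For example, `withOllamaSettings(parseProjectConfig('{"autoUpdate":true}'), { model: "qwen2.5-coder:7b" })` keeps `autoUpdate` and adds the `ollama` block alongside it.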
New bundle at understand-anything-plugin/skills/understand-ollama/:
- SKILL.md: prerequisites (install + ollama pull + ollama serve), CLI flags, what each phase does on the local path, differences from the cloud path
- run-pipeline.mjs: seven-phase driver that mirrors skills/understand/SKILL.md but routes every LLM call to Ollama via OllamaClient; deterministic steps reuse the existing scan-project.mjs, extract-import-map.mjs, compute-batches.mjs, extract-structure.mjs, merge-batch-graphs.py, and build-fingerprints.mjs
- smoke-client.mjs: standalone client smoke test
- validate-output.mjs: schema validation of produced knowledge graph
Default model: qwen2.5-coder:1.5b (works in 1 GB). Recommended for 16 GB GPU: qwen2.5-coder:7b.
Validated end-to-end against real local Ollama on homepage/ fixture: 17 files, 3 LLM-derived layers, 4 tour steps, schema passed.
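To make "routes every LLM call to Ollama" concrete, here is a rough sketch of one such call against Ollama's /api/chat endpoint with JSON-constrained output. The function names (`buildChatRequest`, `parseChatReply`, `askForJson`) are hypothetical and this is not the driver's actual code; the request and response fields follow the public Ollama chat API.

```javascript
// Build a non-streaming chat request that asks Ollama for a JSON reply.
function buildChatRequest(model, prompt) {
  return {
    model,
    messages: [{ role: "user", content: prompt }],
    stream: false,  // one complete response body instead of a stream of chunks
    format: "json", // ask Ollama to constrain the reply to valid JSON
  };
}

// Extract and parse the assistant's reply; returns null when the reply is
// missing or is not valid JSON, so the caller can decide how to recover.
function parseChatReply(responseBody) {
  const text = responseBody?.message?.content;
  if (typeof text !== "string") return null;
  try {
    return JSON.parse(text);
  } catch {
    return null;
  }
}

// fetchImpl is injectable so the function can be exercised without a server.
async function askForJson(baseUrl, model, prompt, fetchImpl = fetch) {
  const res = await fetchImpl(`${baseUrl}/api/chat`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(buildChatRequest(model, prompt)),
  });
  return parseChatReply(await res.json());
}
```

Small local models sometimes return malformed output even with `format: "json"`, which is why the parser returns null instead of throwing.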
Adds a top-of-Phase-2 gate that, when $ARGUMENTS contains --ollama, shells out to skills/understand-ollama/run-pipeline.mjs and exits. The cloud-path dispatch below is unchanged. The argument-hint and Options block are updated to advertise the flag. OLLAMA_URL, OLLAMA_MODEL, and OLLAMA_CONCURRENCY are read from the persisted config.json ollama block (added in the prior commit) and fall back to the driver's own defaults when absent.
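The gate and the settings fallback described above can be sketched as follows. This is an illustration, not the skill's real logic: `wantsOllama`, `resolveOllamaEnv`, and `DRIVER_DEFAULTS` are hypothetical names, and the concurrency default of 2 is an assumption. The model default comes from this PR, and http://localhost:11434 is Ollama's standard local address.

```javascript
// Assumed driver defaults (the concurrency value is a guess for illustration).
const DRIVER_DEFAULTS = {
  baseUrl: "http://localhost:11434",
  model: "qwen2.5-coder:1.5b",
  concurrency: 2,
};

// True only when --ollama appears as its own whitespace-separated argument.
function wantsOllama(argumentString) {
  return argumentString.split(/\s+/).includes("--ollama");
}

// Prefer values from the persisted ollama block; fall back per field.
function resolveOllamaEnv(config, defaults = DRIVER_DEFAULTS) {
  const block = config?.ollama ?? {};
  return {
    OLLAMA_URL: block.baseUrl ?? defaults.baseUrl,
    OLLAMA_MODEL: block.model ?? defaults.model,
    OLLAMA_CONCURRENCY: String(block.concurrency ?? defaults.concurrency),
  };
}
```

Resolving each field separately means a config that only sets `model` still picks up the default URL and concurrency.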
- README.md: new "5. Run fully locally with Ollama" subsection under Quick Start, plus an Ollama row in the Platform Compatibility table
- READMEs/README.ollama.md: stub linking to the main README and the spec/plan; the actual documentation lives in the main README
- Bump plugin version 2.8.0 -> 2.9.0 across all five manifests:
  - understand-anything-plugin/package.json
  - understand-anything-plugin/.claude-plugin/plugin.json
  - .claude-plugin/plugin.json
  - .cursor-plugin/plugin.json
  - .copilot-plugin/plugin.json
Hi, Claude Code and Codex already let the user choose the upstream model provider themselves.
tests/skill/understand-ollama/test_run_pipeline.test.mjs boots a Node http.createServer stub that mimics the Ollama HTTP API surface used by run-pipeline.mjs (/api/version, /api/tags, /api/chat, /api/generate), spawns the full seven-phase driver against a minimal fixture project, then asserts:
- exit code 0
- knowledge-graph.json exists
- structural fields (nodes, project, version, kind) populated
- knowledge graph validates against the dashboard's Zod schema
- meta.json records the ollamaModel and ollamaUrl used
Total test count: 975 (was 974). No new dependencies. Lint clean.
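A stub of the kind described above might look like the following sketch. The names `stubOllamaRoute` and `startStubOllama` are hypothetical and the payloads are placeholders, not captured from a real Ollama server or copied from the test file. The node:http module is loaded lazily so the routing table can be checked on its own.

```javascript
// Pure routing table for the four endpoints the commit message names.
// Payloads are placeholders for illustration only.
function stubOllamaRoute(method, path) {
  if (method === "GET" && path === "/api/version") {
    return { status: 200, body: { version: "0.0.0-stub" } };
  }
  if (method === "GET" && path === "/api/tags") {
    return { status: 200, body: { models: [{ name: "stub-model" }] } };
  }
  if (method === "POST" && path === "/api/chat") {
    return { status: 200, body: { message: { role: "assistant", content: "{}" }, done: true } };
  }
  if (method === "POST" && path === "/api/generate") {
    return { status: 200, body: { response: "{}", done: true } };
  }
  return { status: 404, body: { error: "not found" } };
}

// Start the stub on an ephemeral port and report the base URL to point the driver at.
async function startStubOllama() {
  const http = await import("node:http"); // lazy import keeps the router usable standalone
  const server = http.createServer((req, res) => {
    req.resume(); // discard any request body; the stub ignores prompts
    const { status, body } = stubOllamaRoute(req.method, req.url);
    res.writeHead(status, { "Content-Type": "application/json" });
    res.end(JSON.stringify(body));
  });
  await new Promise((resolve) => server.listen(0, "127.0.0.1", resolve));
  return { server, baseUrl: `http://127.0.0.1:${server.address().port}` };
}
```

Listening on port 0 lets the operating system pick a free port, so parallel test runs do not collide on 11434.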
Thanks for the pointer. I missed that ollama.com ships first-party Claude Code and Codex integrations. A few clarifications on why this PR is still valuable:
1. Claude Code / Codex are two of 14 supported platforms. The plugin's platform compatibility table lists: Claude Code, Cursor, VS Code + GitHub Copilot, Copilot CLI, Codex, OpenCode, OpenClaw, Antigravity, Gemini CLI, Pi Agent, Vibe CLI, Hermes, Cline, KIMI CLI, Trae, Nanobot, Kiro CLI / IDE. Only Claude Code and Codex have a "user picks the upstream model" knob; the other 12+ inherit whatever the platform vendor ships.
2. The Ollama integrations for Claude Code / Codex are client-side model overrides. They still send every byte of the codebase (README, source files, agent prompts) through the Claude Code / Codex process, which on managed / corporate deployments may forward to a vendor gateway regardless of the user-chosen model. The new
3. Privacy and air-gap use cases. Some users (academic, government, medical, IP-sensitive) need to know that no prompt ever traverses a vendor boundary. The Ollama native integrations don't offer that guarantee on managed deployments; I'll add a short note to
…/understand-ollama SKILL.md
Addresses reviewer feedback that Ollama ships first-party integrations for Claude Code and Codex. The new callout sits at the top of the Prerequisites section and points users on those two hosts to the lighter native route, while listing three concrete reasons to use /understand-ollama instead:
1. any other supported host (Cursor, Copilot, OpenCode, Kiro, Gemini CLI, etc.), since Ollama's native integrations only cover two platforms
2. guarantee that no prompt ever leaves the host machine
3. air-gapped or vendor-restricted environments where managed platforms may still forward traffic
…nderstand-ollama callouts
Addresses second review comment (Lum1104) pointing at https://docs.ollama.com/integrations. The prior callout only named Claude Code / Codex and listed Cursor / Copilot / OpenCode / Kiro / Gemini CLI as 'no native integration' hosts; in fact Ollama ships native integrations for Copilot CLI, Cline CLI, OpenCode, VS Code (incl. Copilot), JetBrains, Zed, and Roo Code.
Rewrites both callouts to:
- point at the full integrations index
- list the hosts with a native integration (Claude Code, Codex App, Codex CLI, Copilot CLI, Cline CLI, OpenCode, VS Code, JetBrains, Zed, Roo Code) and recommend the native integration when one exists
- narrow the /understand-ollama recommendation to the hosts that actually have no native integration (Cursor, Gemini CLI, OpenClaw, Hermes, Goose, Kiro CLI / IDE, Antigravity, Pi Agent, Vibe CLI, Trae, Nanobot, Droid, Pool, ...) and to the two non-host reasons (no-vendor-process guarantee; one-command cross-host stability)
No code change.
You're right. I was looking at https://docs.ollama.com/integrations/claude-code only. Pulling the integrations index gives a much wider list: Claude Code, Codex App, Codex CLI, Copilot CLI, Cline CLI, OpenCode, Droid, Goose, Oh My Pi, Pi, Pool (coding agents); OpenClaw, Hermes Agent, Hermes Desktop (assistants); VS Code, Cline, Roo Code, JetBrains, Xcode, Zed (IDEs/editors).
That covers several of the platforms I had wrongly listed as "no native Ollama integration": Copilot CLI, Cline CLI, OpenCode, VS Code (incl. the Copilot entry my README already lists), JetBrains, Zed, Roo Code. Native integration is the lightest path on those hosts and that's now the recommended route in both the SKILL.md and README.
The hosts in the plugin's platform compatibility table that genuinely still have no native Ollama integration are: Cursor, Gemini CLI, Kiro CLI / IDE, Antigravity, Pi Agent, Vibe CLI, Trae, Nanobot, plus the agents my reply listed. That's still a real (if narrower) addressable audience for
The two non-host reasons (a guarantee that no prompt ever leaves the host, and one-command cross-host stability) also still hold regardless of which host is in use.
Pushed as
Adopt upstream's token-usage and local-model README callouts
across all 8 README variants. Keep this branch's `2.9.0`
version bump in all five plugin manifests — the bump is
the version claim of the Ollama backend feature, not a
maintainable post-merge increment.
Resolution summary:
- .claude-plugin/plugin.json keep ours (2.9.0)
- .copilot-plugin/plugin.json keep ours (2.9.0)
- .cursor-plugin/plugin.json keep ours (2.9.0)
- understand-anything-plugin/.claude-plugin/plugin.json keep ours (2.9.0)
- understand-anything-plugin/package.json keep ours (2.9.0)
- README.md + READMEs/README.{es-ES,ja-JP,ko-KR,ru-RU,tr-TR,zh-CN,zh-TW}.md
auto-merged (8 README callouts from 7f5a717 added cleanly
alongside the Ollama callouts added in commit 457e5c7).
Post-merge validation:
- pnpm lint clean
- pnpm --filter @understand-anything/core test -- --run 767/767 pass
- pnpm test 208/208 pass
- node .../smoke-client.mjs pass on real Ollama
- node .../run-pipeline.mjs --project-root homepage 17 files, 2 layers,
6 tour steps, schema OK
Conflicts are resolved. Rebased Resolution strategy: kept this branch's
PR now reports Post-merge local validation on
Adds a fully local LLM backend that drives the Understand Anything analysis pipeline against a user-run Ollama server. No cloud API key, no network egress from the host.
What's new
- /understand-ollama skill: new entry point that runs the seven phases through Ollama.
- --ollama flag on /understand: same path, accessible from the existing entry point.
- OllamaClient in @understand-anything/core: wraps the Ollama HTTP API with retry, timeout, abort, and structured-output helpers.
- ollama: { baseUrl, model, concurrency } block in .understand-anything/config.json.
Why
Today, every LLM call in the pipeline is performed by a host-platform agent (Claude Code, Cursor, Copilot, etc.) that loads the markdown agent definitions from understand-anything-plugin/agents/. The host picks the model; we have no control over it. The local Ollama path gives users a privacy-preserving, cloud-free option while producing the same knowledge-graph.json schema the dashboard already understands.
Architecture
run-pipeline.mjs is a single Node driver that mirrors skills/understand/SKILL.md. Deterministic steps (scan, import-map, batching, structure extraction) reuse the existing bundled scripts. Semantic steps (project narrative, per-file enrichment, layer detection, tour) call OllamaClient for every prompt the cloud path would have dispatched a subagent for. The output schema and intermediate file layout are identical to the cloud path.
Spec + plan
- docs/superpowers/specs/2026-06-19-ollama-backend-design.md
- docs/superpowers/plans/2026-06-19-ollama-backend-impl.md
Tests
- OllamaClient (packages/core/src/__tests__/ollama-client.test.ts)
- ollama config block
- pnpm lint clean
- pnpm --filter @understand-anything/core build clean
- homepage/: 17 files, 3 LLM-derived layers, 4 tour steps, schema validation passed via the dashboard's Zod schema.
Usage