feat: local Ollama backend for /understand and /understand-ollama #475

Open
naraypv wants to merge 11 commits into Egonex-AI:main from naraypv:ollama

naraypv commented Jun 19, 2026

Adds a fully local LLM backend that drives the Understand Anything analysis pipeline against a user-run Ollama server. No cloud API key, no network egress from the host.

What's new

  • /understand-ollama skill — new entry point that runs the seven phases through Ollama.
  • --ollama flag on /understand — same path, accessible from the existing entry point.
  • OllamaClient in @understand-anything/core — wraps the Ollama HTTP API with retry, timeout, abort, and structured-output helpers.
  • Persisted ollama: { baseUrl, model, concurrency } block in .understand-anything/config.json.
  • Bumped to 2.9.0 (additive; cloud path untouched).

Why

Today, every LLM call in the pipeline is performed by a host-platform agent (Claude Code, Cursor, Copilot, etc.) that loads the markdown agent definitions from understand-anything-plugin/agents/. The host picks the model; we have no control over it. The local Ollama path gives users a privacy-preserving, cloud-free option while producing the same knowledge-graph.json schema the dashboard already understands.

Architecture

run-pipeline.mjs is a single Node driver that mirrors skills/understand/SKILL.md. Deterministic steps (scan, import-map, batching, structure extraction) reuse the existing bundled scripts. Semantic steps (project narrative, per-file enrichment, layer detection, tour) call OllamaClient for every prompt the cloud path would have dispatched a subagent for. The output schema and intermediate file layout are identical to the cloud path.
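As a rough illustration of what a semantic step might send to Ollama, the sketch below builds a non-streaming /api/chat request body with format "json" and parses the reply. The helper names (buildChatRequest, parseJsonReply) are hypothetical and are not taken from this PR. Only the request fields (model, messages, stream, format) and the message.content reply field follow the public Ollama API.

```typescript
// Hypothetical helpers; not the PR's OllamaClient implementation.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface ChatRequest {
  model: string;
  messages: ChatMessage[];
  stream: false;  // one JSON reply instead of a stream of chunks
  format: "json"; // ask Ollama to constrain the reply to valid JSON
}

// Build the body for POST {baseUrl}/api/chat.
function buildChatRequest(model: string, system: string, user: string): ChatRequest {
  return {
    model,
    messages: [
      { role: "system", content: system },
      { role: "user", content: user },
    ],
    stream: false,
    format: "json",
  };
}

// Extract and parse the assistant's JSON payload from a chat reply.
// Returns null instead of throwing when the reply or its content is malformed.
function parseJsonReply(replyText: string): unknown {
  try {
    const reply = JSON.parse(replyText) as { message?: { content?: string } };
    const content = reply.message?.content;
    return typeof content === "string" ? JSON.parse(content) : null;
  } catch {
    return null;
  }
}
```

Returning null on malformed output lets a caller decide whether to re-prompt or skip the file, which matters more with small local models than with hosted ones.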

Spec + plan

  • docs/superpowers/specs/2026-06-19-ollama-backend-design.md
  • docs/superpowers/plans/2026-06-19-ollama-backend-impl.md

Tests

  • 10 new unit tests for OllamaClient (packages/core/src/__tests__/ollama-client.test.ts)
  • 4 new round-trip tests for the persisted ollama config block
  • All 974 existing tests still pass
  • pnpm lint clean
  • pnpm --filter @understand-anything/core build clean
  • End-to-end validated against a real local Ollama server (qwen2.5-coder:1.5b default) on homepage/ — 17 files, 3 LLM-derived layers, 4 tour steps, schema validation passed via the dashboard's Zod schema.

Usage

# One-time setup
curl -fsSL https://ollama.com/install.sh | sh
ollama serve &
ollama pull qwen2.5-coder:7b    # 7B for a 16 GB GPU; 1.5B for laptops

# Per project
/understand-ollama
# or
/understand --ollama

naraypv added 7 commits June 18, 2026 23:05
Speckit-style specification artifacts under docs/superpowers/:
- specs/2026-06-19-ollama-backend-design.md: module shape, error model,
  per-phase responsibilities, out-of-scope items
- plans/2026-06-19-ollama-backend-impl.md: task-by-task TDD plan with
  per-step acceptance checks

No code change in this commit.

- Add graphify-out/ to .gitignore (local AST cache + graph artifacts;
  regenerated by 'graphify update .')
- Bump core tsconfig lib ES2022 -> ES2024 so OllamaClient can use
  Promise.withResolvers (runtime in Node 22+).

OllamaClient wraps the Ollama HTTP API (/api/version, /api/chat,
/api/generate, /api/tags) with:
- exponential-backoff retry on 5xx and connection errors
- AbortSignal-aware timeouts (caller signal composed with timeout)
- structured error classes (OllamaConnectionError, OllamaModelMissingError,
  OllamaResponseError, OllamaTimeoutError)
- test injection point via fetchImpl
- format:"json" pass-through for structured output

10 unit tests cover isHealthy, chat shape, JSON format, retry on 5xx,
no retry on 4xx, timeout, and caller-supplied AbortSignal.
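The retry rules described in this commit (retry on 5xx and connection errors, never on 4xx, with exponential backoff) could be sketched roughly as follows. This is not the PR's code: HttpStatusError, isRetryable, backoffDelayMs, withRetry, and the delay and attempt defaults are all assumptions made for illustration.

```typescript
// Hypothetical retry helper; names and default values are assumptions.
class HttpStatusError extends Error {
  readonly status: number;
  constructor(status: number) {
    super(`HTTP ${status}`);
    this.status = status;
  }
}

// Retry 5xx responses and failures without an HTTP status (for example a
// refused connection); never retry 4xx, since repeating a bad request
// cannot succeed.
function isRetryable(err: unknown): boolean {
  if (typeof err === "object" && err !== null && "status" in err) {
    const status = (err as { status: unknown }).status;
    if (typeof status === "number") return status >= 500;
  }
  return true;
}

// Exponential backoff: baseMs, 2*baseMs, 4*baseMs, ... capped at maxMs.
function backoffDelayMs(attempt: number, baseMs = 250, maxMs = 4000): number {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

// Run `op` up to `maxAttempts` times, sleeping between retryable failures.
// `sleep` is injected so tests can run without real timers.
async function withRetry<T>(
  op: () => Promise<T>,
  sleep: (ms: number) => Promise<void>,
  maxAttempts = 3,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await op();
    } catch (err) {
      if (!isRetryable(err) || attempt + 1 >= maxAttempts) throw err;
      await sleep(backoffDelayMs(attempt));
    }
  }
}
```

Injecting sleep mirrors the fetchImpl injection point mentioned above: both keep unit tests fast and deterministic.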
Adds the optional ollama: { baseUrl, model, concurrency } block to
ProjectConfig so users can persist their Ollama settings in
.understand-anything/config.json instead of passing flags every run.

4 unit tests cover: default config when missing, round-trip of the
ollama block, additive update of existing fields, and graceful
recovery from a corrupted config file.
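The four behaviours those tests cover might look roughly like the sketch below. The function names (parseConfig, withOllama) and the index signature on ProjectConfig are assumptions for illustration. Only the ollama field names (baseUrl, model, concurrency) come from the PR description.

```typescript
// Hypothetical shapes; only the ollama field names come from the PR text.
interface OllamaBlock {
  baseUrl?: string;
  model?: string;
  concurrency?: number;
}

interface ProjectConfig {
  ollama?: OllamaBlock;
  [key: string]: unknown; // other persisted fields are carried through as-is
}

// Parse config.json text. Fall back to an empty config when the file is
// missing (undefined) or corrupted (invalid JSON, or not a plain object).
function parseConfig(text: string | undefined): ProjectConfig {
  if (text === undefined) return {};
  try {
    const value: unknown = JSON.parse(text);
    if (typeof value === "object" && value !== null && !Array.isArray(value)) {
      return value as ProjectConfig;
    }
  } catch {
    // invalid JSON: fall through to the empty default
  }
  return {};
}

// Additive update: merge new Ollama settings without dropping other fields
// or previously stored Ollama settings.
function withOllama(config: ProjectConfig, patch: OllamaBlock): ProjectConfig {
  return { ...config, ollama: { ...config.ollama, ...patch } };
}
```

A round trip is then JSON.stringify(withOllama(parseConfig(text), patch)) written back to .understand-anything/config.json.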
New bundle at understand-anything-plugin/skills/understand-ollama/:
- SKILL.md: prerequisites (install + ollama pull + ollama serve),
  CLI flags, what each phase does on the local path, differences
  from the cloud path
- run-pipeline.mjs: seven-phase driver that mirrors
  skills/understand/SKILL.md but routes every LLM call to Ollama
  via OllamaClient; deterministic steps reuse the existing
  scan-project.mjs, extract-import-map.mjs, compute-batches.mjs,
  extract-structure.mjs, merge-batch-graphs.py, and build-fingerprints.mjs
- smoke-client.mjs: standalone client smoke test
- validate-output.mjs: schema validation of produced knowledge graph

Default model: qwen2.5-coder:1.5b (works in 1 GB).
Recommended for 16 GB GPU: qwen2.5-coder:7b.

Validated end-to-end against real local Ollama on homepage/
fixture: 17 files, 3 LLM-derived layers, 4 tour steps, schema passed.

Adds a top-of-Phase-2 gate that, when $ARGUMENTS contains --ollama,
shells out to skills/understand-ollama/run-pipeline.mjs and exits.
The cloud-path dispatch below is unchanged.

The argument-hint and Options block are updated to advertise the flag.
OLLAMA_URL, OLLAMA_MODEL, and OLLAMA_CONCURRENCY are read from the
persisted config.json ollama block (added in the prior commit) and
fall back to the driver's own defaults when absent.
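The fallback described here could be sketched as below. Everything in it is illustrative: resolveSettings and toEnv are hypothetical names, treating OLLAMA_URL, OLLAMA_MODEL, and OLLAMA_CONCURRENCY as environment-style variables is an assumption, and the concurrency default is a placeholder. The base URL is Ollama's standard local address and the model is the default named in this PR.

```typescript
// Hypothetical resolution of persisted settings against driver defaults.
interface ResolvedOllama {
  baseUrl: string;
  model: string;
  concurrency: number;
}

// Assumed defaults: the concurrency value is a placeholder, not the driver's.
const DRIVER_DEFAULTS: ResolvedOllama = {
  baseUrl: "http://localhost:11434",
  model: "qwen2.5-coder:1.5b",
  concurrency: 2,
};

// Prefer each persisted value when it is present and valid; otherwise
// fall back to the corresponding default.
function resolveSettings(
  persisted: Partial<ResolvedOllama> | undefined,
  defaults: ResolvedOllama = DRIVER_DEFAULTS,
): ResolvedOllama {
  const p = persisted ?? {};
  const validConcurrency =
    typeof p.concurrency === "number" && Number.isInteger(p.concurrency) && p.concurrency > 0;
  return {
    baseUrl: p.baseUrl || defaults.baseUrl,
    model: p.model || defaults.model,
    concurrency: validConcurrency ? (p.concurrency as number) : defaults.concurrency,
  };
}

// Map resolved settings onto the variable names mentioned above.
function toEnv(s: ResolvedOllama): Record<string, string> {
  return {
    OLLAMA_URL: s.baseUrl,
    OLLAMA_MODEL: s.model,
    OLLAMA_CONCURRENCY: String(s.concurrency),
  };
}
```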
- README.md: new "5. Run fully locally with Ollama" subsection under
  Quick Start, plus an Ollama row in the Platform Compatibility table
- READMEs/README.ollama.md: stub linking to the main README and the
  spec/plan; the actual documentation lives in the main README
- Bump plugin version 2.8.0 -> 2.9.0 across all five manifests:
  understand-anything-plugin/package.json
  understand-anything-plugin/.claude-plugin/plugin.json
  .claude-plugin/plugin.json
  .cursor-plugin/plugin.json
  .copilot-plugin/plugin.json

Lum1104 commented Jun 19, 2026

Collaborator

Hi, Claude Code and Codex already let users choose the upstream model provider themselves.

FYI, https://docs.ollama.com/integrations/claude-code

tests/skill/understand-ollama/test_run_pipeline.test.mjs boots a Node
http.createServer stub that mimics the Ollama HTTP API surface used by
run-pipeline.mjs (/api/version, /api/tags, /api/chat, /api/generate),
spawns the full seven-phase driver against a minimal fixture project,
then asserts:
- exit code 0
- knowledge-graph.json exists
- structural fields (nodes, project, version, kind) populated
- knowledge graph validates against the dashboard's Zod schema
- meta.json records the ollamaModel and ollamaUrl used

Total test count: 975 (was 974). No new dependencies. Lint clean.
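A stub of this kind can be kept small by separating routing from the HTTP server. The sketch below is a hypothetical pure router (stubOllama is not a name from this PR) that an http.createServer handler could delegate to. The response bodies are trimmed-down versions of the shapes the public Ollama API returns, and the canned values ("0.0.0-stub", "stub-model", "{}") are placeholders.

```typescript
// Hypothetical pure router for an Ollama API stub; not the PR's test code.
interface StubResponse {
  status: number;
  body: unknown;
}

function stubOllama(
  method: string,
  path: string,
  requestBody: { model?: string } = {},
): StubResponse {
  const model = requestBody.model ?? "stub-model";
  const route = `${method.toUpperCase()} ${path}`;
  switch (route) {
    case "GET /api/version":
      return { status: 200, body: { version: "0.0.0-stub" } };
    case "GET /api/tags":
      // Lists installed models; the driver can check its model is present.
      return { status: 200, body: { models: [{ name: model }] } };
    case "POST /api/chat":
      return {
        status: 200,
        body: { model, message: { role: "assistant", content: "{}" }, done: true },
      };
    case "POST /api/generate":
      return { status: 200, body: { model, response: "{}", done: true } };
    default:
      return { status: 404, body: { error: `no stub for ${route}` } };
  }
}
```

Keeping the router pure means the canned responses can be unit-tested directly, while the server wrapper only has to serialize status and body.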

naraypv commented Jun 19, 2026

Author

Thanks for the pointer — I missed that ollama.com ships first-party Claude Code and Codex integrations.

A few clarifications on why this PR is still valuable:

1. Claude Code / Codex are two of the 17 supported platforms. The plugin's platform compatibility table lists: Claude Code, Cursor, VS Code + GitHub Copilot, Copilot CLI, Codex, OpenCode, OpenClaw, Antigravity, Gemini CLI, Pi Agent, Vibe CLI, Hermes, Cline, KIMI CLI, Trae, Nanobot, Kiro CLI / IDE. Only Claude Code and Codex have a "user picks the upstream model" knob; the other 15 inherit whatever the platform vendor ships. /understand-ollama is the only way to get a local Ollama backend on those hosts.

2. The Ollama integrations for Claude Code / Codex are client-side model overrides. They still send every byte of the codebase (README, source files, agent prompts) through the Claude Code / Codex process — which on managed / corporate deployments may forward to a vendor gateway regardless of the user-chosen model. The new /understand-ollama skill runs the entire pipeline as a local Node script against a local HTTP server; nothing leaves the host at any point.

3. Privacy and air-gap use cases. Some users (academic, government, medical, IP-sensitive) need to know that no prompt ever traverses a vendor boundary. The Ollama native integrations don't offer that guarantee on managed deployments; /understand-ollama does.

I'll add a short note to SKILL.md acknowledging the Claude Code / Codex route for users who are fine with it, and recommend /understand-ollama for the other cases. Will push shortly.

…/understand-ollama SKILL.md

Addresses reviewer feedback that Ollama ships first-party integrations
for Claude Code and Codex. The new callout sits at the top of the
Prerequisites section and points users on those two hosts to the
lighter native route, while listing three concrete reasons to use
/understand-ollama instead:
1. any other supported host (Cursor, Copilot, OpenCode, Kiro, Gemini
   CLI, etc.) — Ollama's native integrations only cover two platforms
2. guarantee that no prompt ever leaves the host machine
3. air-gapped or vendor-restricted environments where managed
   platforms may still forward traffic

Lum1104 commented Jun 19, 2026

Collaborator

…nderstand-ollama callouts

Addresses second review comment (Lum1104) pointing at
https://docs.ollama.com/integrations. The prior callout only named
Claude Code / Codex and listed Cursor / Copilot / OpenCode / Kiro /
Gemini CLI as 'no native integration' hosts; in fact Ollama ships
native integrations for Copilot CLI, Cline CLI, OpenCode, VS Code
(incl. Copilot), JetBrains, Zed, and Roo Code.

Rewrites both callouts to:
- point at the full integrations index
- list the hosts with a native integration (Claude Code, Codex App,
  Codex CLI, Copilot CLI, Cline CLI, OpenCode, VS Code, JetBrains,
  Zed, Roo Code) and recommend the native integration when one exists
- narrow the /understand-ollama recommendation to the hosts that
  actually have no native integration (Cursor, Gemini CLI, OpenClaw,
  Hermes, Goose, Kiro CLI / IDE, Antigravity, Pi Agent, Vibe CLI,
  Trae, Nanobot, Droid, Pool, ...) and to the two non-host reasons
  (no-vendor-process guarantee; one-command cross-host stability)

No code change.

naraypv commented Jun 19, 2026

Author

You're right — I was looking at https://docs.ollama.com/integrations/claude-code only. Pulling the integrations index gives a much wider list: Claude Code, Codex App, Codex CLI, Copilot CLI, Cline CLI, OpenCode, Droid, Goose, Oh My Pi, Pi, Pool (coding agents); OpenClaw, Hermes Agent, Hermes Desktop (assistants); VS Code, Cline, Roo Code, JetBrains, Xcode, Zed (IDEs/editors).

That covers several of the platforms I had wrongly listed as "no native Ollama integration" — Copilot CLI, Cline CLI, OpenCode, VS Code (incl. the Copilot entry my README already lists), JetBrains, Zed, Roo Code. Native integration is the lightest path on those hosts and that's now the recommended route in both the SKILL.md and README.

The hosts in the plugin's platform compatibility table that genuinely still have no native Ollama integration are: Cursor, Gemini CLI, Kiro CLI / IDE, Antigravity, Pi Agent, Vibe CLI, Trae, Nanobot, plus the agents my reply listed. That's still a real (if narrower) addressable audience for /understand-ollama.

The two non-host reasons — guarantee that no prompt ever leaves the host, and one-command cross-host stability — also still hold regardless of which host is in use.

Pushed as 457e5c7. Updated both the SKILL.md Prerequisites callout and the README Quick Start section. Thanks for the pointer; I should have started from the integrations index instead of the Claude Code page.

Adopt upstream's token-usage and local-model README callouts
across all 8 README variants. Keep this branch's `2.9.0`
version bump in all five plugin manifests: the bump is
the version claim of the Ollama backend feature, not an
opportunistic post-merge increment.

Resolution summary:
  - .claude-plugin/plugin.json             keep ours (2.9.0)
  - .copilot-plugin/plugin.json            keep ours (2.9.0)
  - .cursor-plugin/plugin.json             keep ours (2.9.0)
  - understand-anything-plugin/.claude-plugin/plugin.json   keep ours (2.9.0)
  - understand-anything-plugin/package.json keep ours (2.9.0)
  - README.md + READMEs/README.{es-ES,ja-JP,ko-KR,ru-RU,tr-TR,zh-CN,zh-TW}.md
    auto-merged (8 README callouts from 7f5a717 added cleanly
    alongside the Ollama callouts added in commit 457e5c7).

Post-merge validation:
  - pnpm lint                                                clean
  - pnpm --filter @understand-anything/core test -- --run    767/767 pass
  - pnpm test                                                208/208 pass
  - node .../smoke-client.mjs                                pass on real Ollama
  - node .../run-pipeline.mjs --project-root homepage        17 files, 2 layers,
                                                             6 tour steps, schema OK

naraypv commented Jun 19, 2026

Author

Conflicts are resolved.

Rebased ollama onto origin/main (7f5a717) and resolved the five version field conflicts in the plugin manifests. README callouts from 7f5a717 (token-usage + local-model hints, in all 8 README variants) auto-merged cleanly alongside the Ollama callouts already on this branch.

Resolution strategy: kept this branch's 2.9.0 in all five manifests. The 2.9.0 bump is the version claim of the Ollama backend feature itself (commits 46aea21, c5ae969), not an opportunistic increment on top of the token-usage docs change. The five conflicts were:

  • .claude-plugin/plugin.json → 2.9.0
  • .copilot-plugin/plugin.json → 2.9.0
  • .cursor-plugin/plugin.json → 2.9.0
  • understand-anything-plugin/.claude-plugin/plugin.json → 2.9.0
  • understand-anything-plugin/package.json → 2.9.0

PR now reports mergeable: MERGEABLE (branch head 0766ff4).

Post-merge local validation on homepage/:

  • pnpm lint clean
  • pnpm --filter @understand-anything/core test -- --run 767/767
  • pnpm test 208/208
  • node .../smoke-client.mjs PASS (real Ollama 0.16.1, qwen2.5-coder:1.5b)
  • node .../run-pipeline.mjs --project-root homepage 17 files -> 17 nodes, 2 layers, 7 tour steps, Zod schema OK, meta.json.gitCommitHash=0766ff4

2 participants