[TRACKING] Feature roadmap #14

Status: ✅ in use · 🔄 in progress · ⬜ to adopt

| Feature we want | Library | Status |
|---|---|---|
| Agent loop, tool calling, structured output, streaming | pydantic-ai | ✅ |
| Provider/model switching (cloud + local Ollama) | pydantic-ai models | ✅ |
| Tool server (MCP) | fastmcp | ✅ |
| Config from env / `.env` | pydantic-settings | ✅ (#10) |
| Recover from failing tool calls | pydantic-ai (`ModelRetry`) | ✅ (#11) |
| CLI commands, flags, subcommands (entry point) | click (reuse `aiida.cmdline` options/param types) | ⬜ currently a raw `input()` loop |
| HITL / approval gate before writes | pydantic-ai built-in tool approval (deferred tools) | 🔄 PR #8 is custom; move onto the primitive |
| Multi-line input + vim keybindings + history | prompt_toolkit (`PromptSession(multiline=True, vi_mode=True)`) | ⬜ replaces raw `input()` |
| Spinner, live output, Markdown rendering | rich (`console.status`, `Live`) | ⬜ replaces the local custom spinner |
| Eval harness (golden queries, scoring, reports) | pydantic-evals (`Dataset`/`Case`/`LLMJudge`) | ⬜ ADR-06; don't hand-roll |
| Tracing / per-run tokens + latency | Pydantic Logfire (OTel) | ⬜ optional, high-leverage for cloud-vs-local study |
| Full TUI (only if we outgrow a REPL) | Textual | ⬜ later, maybe never |
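
For the CLI row, a minimal sketch of what a click entry point could look like in place of the raw `input()` loop. The command names (`ask`, `chat`), the `--model` option, and its default value are hypothetical placeholders, not taken from the codebase. Reusing `aiida.cmdline` options and param types is not shown here.

```python
import click


@click.group()
def cli():
    """Hypothetical top-level entry point; all command names are placeholders."""


@cli.command()
@click.option(
    "--model",
    default="local-model",  # placeholder default, not a real model name
    show_default=True,
    help="Model identifier to pass to the agent.",
)
@click.argument("prompt")
def ask(model, prompt):
    """Send a single PROMPT to the agent and print the reply."""
    # A real implementation would run the pydantic-ai agent here;
    # this stub only echoes its inputs so the wiring can be checked.
    click.echo(f"[{model}] {prompt}")


@cli.command()
def chat():
    """Start the interactive REPL (the current input() loop would move here)."""
    click.echo("interactive mode is not implemented in this sketch")


if __name__ == "__main__":
    cli()
```

With this layout, one-shot use and the interactive session become separate subcommands, so flags such as `--model` are parsed by click rather than inside the loop.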
