A keyboard-first, fully local AI chat interface for the terminal
Powered by Ollama and Textual — no cloud, no API keys, no data leaving your machine.
┌─────────────────────────────────────────────────────────┐
│ OllamaTerm [llama3.2] 🟢 Online │
├─────────────────────────────────────────────────────────┤
│ │
│ You ──────────────────────────────────────────────── │
│ Explain async/await in Python in one paragraph. │
│ │
│ Assistant ────────────────────────────────────────── │
│ async/await is Python's syntax for writing coroutines │
│ — functions that can pause execution with `await`, │
│ yielding control back to the event loop while waiting │
│ for I/O, then resuming where they left off... │
│ │
├─────────────────────────────────────────────────────────┤
│ > Type a message... ^P for help│
└─────────────────────────────────────────────────────────┘
Warning
OllamaTerm is highly experimental software. This project is under active development and may contain bugs, incomplete features, or breaking changes. Use at your own risk, especially when using coding tools that can modify files. Always review changes before committing and maintain backups of important data.
| Feature | OllamaTerm | Web-based chat UIs |
|---|---|---|
| 🔒 Privacy | 100% local — data never leaves your machine | Depends on provider |
| 📡 Offline use | Works after initial model pull | Requires internet |
| 💰 Cost | Free (you own the hardware) | Often metered |
| ⚡ Speed | No network latency to the model | Round-trip to cloud |
| 🎛️ Customization | Full TOML config, rebindable keys | Usually limited |
| 💻 Terminal native | Keyboard-first, scriptable | Browser tab |
Note
All responses stream in real-time with smooth, flicker-free rendering.
- Streaming responses with batched rendering for smooth, flicker-free output
- Animated "thinking" placeholder shown while the model starts generating
- Bounded context window — automatically trims history to stay within token limits
- Retry with backoff for resilient streaming on transient Ollama failures
- Multi-model config — list multiple models and switch at runtime
- Clickable model picker in the status bar, or `ctrl+m` keyboard shortcut
- Auto-pull on startup — optionally pull the configured model if it is not present
- Traffic-light connection indicator — always know if Ollama is reachable
- Save and load conversation history (JSON format)
- Export conversations as Markdown transcripts
- Search messages and cycle through results from the input box
- Copy the latest assistant reply to clipboard in one shortcut
- Conversation picker & auto-save — quickly switch between saved chats (via `/conversations`) with automatic saving enabled by default
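The batched streaming behavior can be sketched like this — a hypothetical `buffer_chunks` helper (illustrative, not OllamaTerm's actual code) that groups incoming stream chunks so the UI repaints once per batch instead of once per token:

```python
from typing import Iterable, Iterator

def buffer_chunks(stream: Iterable[str], batch_size: int = 8) -> Iterator[str]:
    """Group streaming text chunks into batches of `batch_size` so the
    renderer updates once per batch instead of once per chunk."""
    buffer: list[str] = []
    for chunk in stream:
        buffer.append(chunk)
        if len(buffer) >= batch_size:
            yield "".join(buffer)
            buffer.clear()
    if buffer:  # flush whatever remains when the stream ends
        yield "".join(buffer)

# Example: 5 chunks with batch_size=2 -> 3 UI updates instead of 5
updates = list(buffer_chunks(["he", "llo", " ", "wor", "ld"], batch_size=2))
```

The `stream_chunk_size` option in `[ui]` corresponds to the batch size here: larger values mean fewer repaints and smoother output at the cost of slightly chunkier updates.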
Tip
Capabilities are auto-detected from Ollama's /api/show endpoint — no manual configuration needed!
- Auto-detected per model — thinking, tool calling, and vision support are read from Ollama's `/api/show` at load time; no config required
- Seamless model switching — capabilities update instantly when you switch models mid-conversation
- Chain-of-thought reasoning for models that support it (e.g. `qwen3`, `deepseek-r1`, `deepseek-v3.1`, `gpt-oss`)
- Tool calling and a full agent loop for multi-step model actions
- Custom coding tools (`read`, `grep`, `glob`, `ls`, `write`, `edit`, `multiedit`, `apply_patch`, `bash`, `batch`, planning/todo/task tools, and more)
- Web search via Ollama's built-in tools (requires an Ollama API key)
- Vision / image attachments for vision-capable models (e.g. `gemma3`, `llava`)
- Context window alignment — `max_context_tokens` is forwarded to Ollama as `options.num_ctx` so the server-side context window always matches the client-side trim budget
- Command palette (`ctrl+p`) with a searchable list of all actions
- Fully configurable keybinds via TOML
- Structured JSON logging with optional file output for debugging
- Terminal title and window class set on startup for WM rules
- Desktop entry and Hyprland/Ghostty integration examples included
- Slash commands for actions like `/new`, `/save`, `/load`, `/model`, `/preset`, `/image`, `/file`, `/conversations`, and `/help`
- Theme system with a theme picker (`ctrl+t`), built-in themes (e.g. Textual dark/light, Nord, Gruvbox, Tokyo Night), and customizable color themes with persistence
| Requirement | Details |
|---|---|
| 🐍 Python | 3.11 or newer |
| 🦙 Ollama | Installed and on your PATH (install guide); running via `ollama serve` |
| 🌐 Internet | Only needed once, to pull models |
# Clone the repository
git clone https://github.com/Web-Dev-Codi/OllamaTerm.git
cd OllamaTerm
# Create and activate virtual environment
python -m venv .venv
source .venv/bin/activate
# Install the package
pip install -e .

# Or, with dev dependencies
pip install -e '.[dev]'

A PKGBUILD is included for building a native Arch package:

makepkg -si

# Start the Ollama server and pull a model
ollama serve
ollama pull llama3.2

# Create the config directory and copy the example config
mkdir -p ~/.config/ollamaterm
cp config.example.toml ~/.config/ollamaterm/config.toml

# Run the app
ollamaterm
# or
python -m ollama_chat

| Action | How |
|---|---|
| 💬 Send a message | Type in the input field → ctrl+enter |
| 🔄 Switch model | Click Model in the status bar, or ctrl+m |
| 📄 New conversation | ctrl+n |
| 🔍 Search messages | ctrl+f, type query, press again to cycle |
| 📋 Copy last reply | ctrl+y |
| 🎯 Open all actions | ctrl+p |
| ❌ Quit | ctrl+q |
~/.config/ollamaterm/config.toml
If the file does not exist, built-in defaults are used automatically.
Use config.example.toml from the repo as your starting point.
[app]
# Window title shown in the TUI header
title = "OllamaTerm"
# WM window class set on startup (useful for Hyprland/i3 rules)
class = "ollamaterm"
# How often (seconds) to check Ollama connectivity
connection_check_interval_seconds = 15
[ollama]
# Ollama API endpoint
host = "http://localhost:11434"
# Default active model
model = "llama3.2"
# All models available in the picker
models = ["llama3.2", "qwen2.5", "mistral"]
# Request timeout in seconds
timeout = 120
# System prompt injected at the start of every conversation
system_prompt = "You are a helpful assistant."
# Maximum messages kept in history
max_history_messages = 200
# Token budget for context trimming
max_context_tokens = 4096
# Pull the model on startup if not present locally
pull_model_on_start = true
[ui]
font_size = 14
background_color = "#1a1b26"
user_message_color = "#7aa2f7"
assistant_message_color = "#9ece6a"
border_color = "#565f89"
show_timestamps = true
# Number of streaming chunks to buffer before rendering
stream_chunk_size = 8
[theme]
# Theme selection: "textual-dark", "textual-light", "nord", "gruvbox", "tokyo-night",
# "monokai", "dracula", "solarized-light", "solarized-dark", "atom-one-dark", "atom-one-light"
# or "custom" to use the ui colors above
name = "textual-dark"
# Persist theme choice across sessions
persist = true
# Custom theme definitions (optional)
[theme.custom]
# Define custom themes here - example for a "catppuccin" theme:
# [theme.custom.catppuccin]
# primary = "#89B4FA"
# secondary = "#74C7EC"
# accent = "#F5C2E7"
# foreground = "#CDD6F4"
# background = "#1E1E2E"
# surface = "#313244"
# panel = "#45475A"
# success = "#A6E3A1"
# warning = "#F9E2AF"
# error = "#F38BA8"
# dark = true
[keybinds]
send_message = "ctrl+enter"
new_conversation = "ctrl+n"
quit = "ctrl+q"
scroll_up = "ctrl+k"
scroll_down = "ctrl+j"
command_palette = "ctrl+p"
toggle_model_picker = "ctrl+m"
toggle_theme_picker = "ctrl+t"
save_conversation = "ctrl+s"
load_conversation = "ctrl+l"
export_conversation = "ctrl+e"
search_messages = "ctrl+f"
copy_last_message = "ctrl+y"
interrupt_stream = "escape"
[security]
# Set true to allow non-localhost Ollama endpoints
allow_remote_hosts = false
allowed_hosts = ["localhost", "127.0.0.1", "::1"]
[logging]
level = "INFO" # DEBUG | INFO | WARNING | ERROR
structured = true # JSON-formatted log lines
log_to_file = false
log_file_path = "~/.local/state/ollamaterm/app.log"
[persistence]
enabled = true
auto_save = true
directory = "~/.local/state/ollamaterm/conversations"
metadata_path = "~/.local/state/ollamaterm/conversations/index.json"
[tools]
# Enable schema-first custom coding tools
enabled = true
# Base root for file/search/edit tools
workspace_root = "."
# Allow temporary external roots via external-directory tool
allow_external_directories = false
command_timeout_seconds = 30
max_output_lines = 200
max_output_bytes = 50000
max_read_bytes = 200000
max_search_results = 200
default_external_directories = []
[capabilities]
# Show the model's reasoning trace inside the assistant bubble.
# Thinking support itself is auto-detected — this controls only the UI display.
show_thinking = true
# Built-in web_search / web_fetch (requires OLLAMA_API_KEY or web_search_api_key).
# Only active when the model also supports tool calling (auto-detected).
web_search_enabled = false
web_search_api_key = ""
# Max tool-call iterations per message before the agent loop stops.
max_tool_iterations = 10
# NOTE: thinking support, tool calling, and vision are detected automatically
# from Ollama's /api/show response — no manual flags needed.

Note
All keybinds are rebindable in [keybinds]. These are the defaults:
| Keybind | Action |
|---|---|
| ctrl+enter | Send message |
| ctrl+n | New conversation |
| ctrl+q | Quit |
| ctrl+k | Scroll up |
| ctrl+j | Scroll down |
| ctrl+p | Open command palette |
| ctrl+m | Open model picker |
| ctrl+t | Open theme picker |
| ctrl+s | Save conversation (requires persistence enabled) |
| ctrl+l | Load latest saved conversation (requires persistence enabled) |
| ctrl+e | Export Markdown transcript (requires persistence enabled) |
| ctrl+f | Search messages (press again to cycle results) |
| ctrl+y | Copy last assistant message to clipboard |
| escape | Interrupt a streaming response |
Important
Thinking, tool calling, and vision support are detected automatically from
Ollama's /api/show endpoint each time a model is loaded or switched. No
manual configuration is required — the status bar icons (🧠 🔧 👁) reflect
what the active model actually supports.
Note
Requires Ollama ≥ 0.6 for capability metadata. Older Ollama versions fall back gracefully — all features are assumed enabled and gated only by whether the model responds correctly.
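The detection logic amounts to reading the capabilities list from the `/api/show` response. A minimal sketch (the `capabilities` field is per Ollama's API; the helper itself is illustrative, not OllamaTerm's code):

```python
def parse_capabilities(show_response: dict) -> set[str]:
    """Extract the capability set from an /api/show response.
    Ollama >= 0.6 includes a top-level "capabilities" list; older
    versions omit it, in which case this returns an empty set and the
    caller falls back to assuming everything is enabled."""
    return set(show_response.get("capabilities", []))

# e.g. what a tool- and vision-capable model might report
sample = {"capabilities": ["completion", "tools", "vision"]}
caps = parse_capabilities(sample)
supports_tools = "tools" in caps        # drives the 🔧 icon
supports_vision = "vision" in caps      # drives the 👁 icon
supports_thinking = "thinking" in caps  # drives the 🧠 icon
```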
Automatically active when the model reports "thinking" in its capabilities
(e.g. qwen3, deepseek-r1, deepseek-v3.1, gpt-oss). The model's
internal reasoning trace is shown above the final answer when
show_thinking = true in [capabilities].
GPT-OSS note: GPT-OSS requires a string think level rather than a boolean.
OllamaTerm detects GPT-OSS by name and automatically sends think="medium".
Automatically active when the model reports "tools" in its capabilities.
The agent loop allows the model to invoke tools multiple times before producing
a final answer. Control the upper bound with max_tool_iterations in
[capabilities].
In addition to Ollama web tools, OllamaTerm now ships a schema-first local coding toolset designed for agentic workflows:
- File and search tools: `read`, `ls`, `glob`, `grep`, `codesearch`
- Editing tools: `write`, `edit`, `multiedit`, `apply_patch`
- Runtime tools: `bash`, `batch`, `external-directory`
- Planning/state tools: `plan-enter`, `plan-exit`, `plan`, `todo`, `todoread`, `todowrite`, `task`, `question`
- Introspection tools: `registry`, `tool`, `truncation`, `invalid`
These tools are controlled by the [tools] config section and are constrained
by workspace-root path checks, command timeouts, and output truncation limits.
OllamaTerm passes tools to the Ollama Python SDK in two forms:
- JSON function tools generated from the schema-first tool specs (the majority of tools below)
- Python callables for built-in Ollama integrations when enabled (e.g. `web_search`, `web_fetch`)
The model emits `tool_calls`, the app executes them, appends a `tool` role message with the result, and continues the loop until the assistant returns a final answer.
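The loop described above can be sketched as follows, with a stubbed `fake_model` standing in for the Ollama SDK's chat call (everything here is illustrative; the real app uses the SDK and its tool schemas):

```python
import json

def fake_model(messages):
    """Stub for ollama.chat: first turn requests a tool, then answers."""
    if not any(m["role"] == "tool" for m in messages):
        return {"tool_calls": [{"name": "read", "arguments": {"path": "a.txt"}}]}
    return {"content": "done", "tool_calls": []}

def run_tool(name, arguments):
    return json.dumps({"ok": True, "tool": name})  # pretend execution

def agent_loop(messages, max_tool_iterations=10):
    for _ in range(max_tool_iterations):
        reply = fake_model(messages)
        calls = reply.get("tool_calls") or []
        if not calls:  # model produced a final answer
            return reply["content"]
        for call in calls:  # execute each tool, feed the result back
            result = run_tool(call["name"], call["arguments"])
            messages.append({"role": "tool", "content": result})
    return "[stopped: max_tool_iterations reached]"

answer = agent_loop([{"role": "user", "content": "read a.txt"}])
```

The `max_tool_iterations` bound in `[capabilities]` plays the same role as the loop cap here: it stops a model that keeps requesting tools without converging on an answer.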
Warning: This tool suite is experimental. Most tools are untested and may be buggy or missing edge-case handling. Use with caution and review changes carefully, especially file edits. Outputs may be truncated according to configured limits.
- Files & search
  - `list` (built-in) — List files and directories. `path?: string` (default: workspace root)
  - `ls` (custom) — Alternate directory listing with tree-style output. `path?: string`, `ignore?: string[]`
  - `read` — Read a file window. `path: string`, `offset?: int`, `limit?: int`
  - `glob` — Find files by glob. `pattern: string`, `path?: string`, `max_results?: int`
  - `grep` / `codesearch` — Search file contents. `query: string`, `path?: string`, `case_sensitive?: bool`, `fixed_strings?: bool`, `max_results?: int`
- Editing
  - `write` — Atomic full-file write. `path: string`, `content: string`, `overwrite?: bool`, `create_dirs?: bool`
  - `edit` — Single snippet replace. `path: string`, `old_text: string`, `new_text: string`, `replace_all?: bool`
  - `multiedit` — Multiple snippet edits atomically. `path: string`, `edits: { old_text, new_text, replace_all? }[]`
  - `apply_patch` — Apply structured patch hunks. `path: string`, `hunks: { old_text, new_text, replace_all? }[]`
- Runtime
  - `bash` — Run a shell command (capped by time/output limits). `command: string`, `cwd?: string`
  - `batch` — Run a sequence of tool calls. `calls: { name: string, arguments: object }[]`, `continue_on_error?: bool`
  - `external-directory` — Manage the temporary external directory allowlist for this session. `action: string`, `path?: string`
- Planning & state
  - `plan-enter` | `plan-exit` | `plan` — `plan-enter: { goal?: string }`, `plan: { action?: string, content?: string }`
  - `todo` | `todoread` | `todowrite` | `task` — `todo: { item: string }`, `todowrite: { items: string[], mode?: "append"|"replace" }`, `task: { action?: string, name?: string, status?: string }`
  - `question` — Emit a structured clarification question. `prompt: string`, `context?: string`
- Introspection & utility
  - `registry` — List available tools.
  - `tool` — Inspect a tool definition.
  - `truncation` — Show output truncation limits.
  - `invalid` — Always fails (for error-path testing).
- Web (requires a tool-capable model, `web_search_enabled = true`, and an API key)
  - `websearch` — Perform a web search via the Ollama integration. `query: string`, `max_results?: int`
  - `webfetch` — Fetch a URL via the Ollama integration. `url: string`
Notes:
- Directory listing may appear as `list` (built-in) or `ls` (custom) depending on which tool set is active. Both list files; prefer `list` when available.
- File and command tools will prompt for permission. Paths are restricted to the configured workspace by default.
- Large outputs are truncated. Use `offset`/`limit` (for `read`) and `max_results` (for `grep`/`glob`) to scope results.
List files here → Call tool: list { "path": "." }
Search for a string → Call tool: grep { "query": "TODO", "path": "." }
Read a file window → Call tool: read { "path": "src/main.py", "offset": 1, "limit": 120 }
Make an edit → Call tool: edit { "path": "README.md", "old_text": "foo", "new_text": "bar", "replace_all": true }
Set web_search_enabled = true in [capabilities] and provide an Ollama API
key (via web_search_api_key or the OLLAMA_API_KEY environment variable).
Web search also requires the active model to support tool calling
(auto-detected) — it is silently disabled for models that do not.
Automatically active when the model reports "vision" in its capabilities
(e.g. gemma3, llava). Attach images with /image <path> in the input box
or use the Attach button in the toolbar. Use /file <path> or the file attachment button to include non-image context files.
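Under the hood, `/image <path>` amounts to attaching the file to the outgoing chat message. With the Ollama Python SDK the message shape is roughly as follows (a sketch; OllamaTerm's actual code may differ):

```python
from pathlib import Path

def vision_message(prompt: str, image_paths: list[str]) -> dict:
    """Build an Ollama chat message with image attachments.
    The SDK accepts local file paths (or raw bytes) in "images"."""
    return {
        "role": "user",
        "content": prompt,
        "images": [str(Path(p).expanduser()) for p in image_paths],
    }

msg = vision_message("What is in this picture?", ["~/shot.png"])
# sent as: ollama.chat(model="gemma3", messages=[msg])
```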
max_context_tokens (in [ollama]) serves two purposes:
- Client-side — conversation history is trimmed to stay within this token budget before being sent
- Server-side — the value is forwarded to Ollama as `options.num_ctx` so the model's context window matches; without this, Ollama may use a smaller default and silently truncate longer conversations
Increase this value for models with larger native context windows (e.g. set
max_context_tokens = 32768 for llama3.2 or qwen3).
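The client-side trimming can be sketched as a budget walk from newest to oldest message — a hypothetical helper using a crude 4-characters-per-token estimate (OllamaTerm's real estimator may differ):

```python
def estimate_tokens(text: str) -> int:
    return max(1, len(text) // 4)  # crude heuristic: ~4 chars per token

def trim_history(messages: list[dict], max_context_tokens: int) -> list[dict]:
    """Keep the newest messages that fit the token budget, preserving order."""
    kept: list[dict] = []
    budget = max_context_tokens
    for msg in reversed(messages):  # walk newest -> oldest
        cost = estimate_tokens(msg["content"])
        if cost > budget:
            break  # this and anything older no longer fits
        kept.append(msg)
        budget -= cost
    kept.reverse()
    return kept

history = [
    {"role": "user", "content": "x" * 400},       # ~100 tokens, oldest
    {"role": "assistant", "content": "y" * 400},  # ~100 tokens
    {"role": "user", "content": "z" * 40},        # ~10 tokens, newest
]
trimmed = trim_history(history, max_context_tokens=120)
# the oldest message is dropped; the same budget is then forwarded as
# options={"num_ctx": 120} so the server window matches
```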
The app sets the terminal window class from app.class on startup.
For the most reliable behavior on Wayland, also pass the class directly to
your terminal:
ghostty --class=ollamaterm-tui -e ollamaterm

Suggested Hyprland window rules (~/.config/hypr/hyprland.conf):
windowrulev2 = float, class:^(ollamaterm-tui)$
windowrulev2 = size 1200 800, class:^(ollamaterm-tui)$
windowrulev2 = center, class:^(ollamaterm-tui)$
windowrulev2 = opacity 0.95, class:^(ollamaterm-tui)$
bind = $mainMod, O, exec, ghostty --class=ollamaterm-tui -e ollamaterm
Create ~/.local/share/applications/ollamaterm.desktop:
[Desktop Entry]
Type=Application
Name=OllamaTerm
Comment=ChatGPT-style TUI for Ollama local LLMs
Exec=ghostty --class=ollamaterm-tui -e ollamaterm
Icon=utilities-terminal
Terminal=false
Categories=Utility;TerminalEmulator;Development;

Build a wheel from source:

python -m pip install build
python -m build --wheel
# optional: install into current env
python -m pip install --force-reinstall dist/*.whl
# or install for user with pipx
pipx install .

Troubleshooting: if you see BackendUnavailable: Cannot import 'setuptools.build_meta', either run the isolated build above, or install/upgrade in your active environment:
python -m pip install -U setuptools wheel build
# For Python 3.14 pre-releases, you may need:
python -m pip install --pre -U setuptools

This repo ships a PKGBUILD that builds without network access using system makedepends:
sudo pacman -S --needed base-devel python-setuptools python-build python-installer python-wheel
makepkg -si

The build() step uses `python -m build --wheel --no-isolation`, so it relies on the above makedepends instead of downloading during the build.
# Full test suite
pytest -q
# With coverage report
pytest --cov=ollama_chat --cov-report=term-missing -q
# Lint
ruff check .
# Format check
black --check .
# Type check
mypy ollama_chat/

Run all checks before submitting changes:

ruff check . && black --check . && mypy ollama_chat/ && pytest -q

| Symptom | Fix |
|---|---|
| Connection error on startup | Ensure ollama serve is running; verify ollama.host in config |
| "Model not found" warning | Set pull_model_on_start = true, or run ollama pull <model> manually |
| Empty or cut-off response | Check ollama list to confirm the model name; review Ollama logs |
| Thinking / tools / vision not activating | Requires Ollama ≥ 0.6; run ollama show <model> and confirm capabilities is listed |
| Response cuts off mid-conversation | Increase max_context_tokens in [ollama] to match the model's native context window |
| Keybind not responding | Verify the syntax in [keybinds] and restart the app |
| Colors not applied | Use valid hex format: #RRGGBB or #RGB |
| Window class rule not matching | Ensure app.class is set; prefer launching with ghostty --class=ollamaterm-tui |
| Tool loop not stopping | Lower max_tool_iterations in [capabilities] |
| Web search not working | Confirm the model supports tool calling (ollama show <model>); set web_search_enabled = true and provide OLLAMA_API_KEY |
Contributions are always welcome! Here's how you can help:
- Fork the repository
- Create a feature branch (
git checkout -b feature/amazing-feature) - Commit your changes (
git commit -m 'Add amazing feature') - Push to the branch (
git push origin feature/amazing-feature) - Open a Pull Request
Please read the Code of Conduct and review our Contributing Guide for details.
If you find OllamaTerm useful, please consider giving it a star!
Built with ❤️ and lots of ☕ by the OllamaTerm contributors
🚀 Made for the terminal, by terminal lovers