██╗ ██╗ ██████╗ ██╗ ██████╗███████╗██╗ ██████╗
██║ ██║██╔═══██╗██║██╔════╝██╔════╝██║██╔═══██╗
██║ ██║██║ ██║██║██║ █████╗ ██║██║ ██║
╚██╗ ██╔╝██║ ██║██║██║ ██╔══╝ ██║██║ ██║
╚████╔╝ ╚██████╔╝██║╚██████╗███████╗██║╚██████╔╝
╚═══╝ ╚═════╝ ╚═╝ ╚═════╝╚══════╝╚═╝ ╚═════╝
Speak → text, locally, instantly.
voiceio-demo.mp4
# 1. Install system dependencies (Ubuntu/Debian)
sudo apt install pipx ibus gir1.2-ibus-1.0 python3-gi python3-dev portaudio19-dev
# 2. Install voiceio
pipx install python-voiceio
# 3. Run the setup wizard
voiceio setupThat's it. Press Ctrl+Alt+V (or your chosen hotkey) to start dictating.
Fedora
sudo dnf install pipx ibus python3-gobject python3-devel portaudio-devel
pipx install python-voiceio
voiceio setupArch Linux
sudo pacman -S python-pipx ibus python-gobject portaudio
# Note: Arch includes Python headers by default with the python package
pipx install python-voiceio
voiceio setupWindows
# Option A: Install with pip (requires Python 3.11+)
pip install python-voiceio
voiceio setup
# Option B: Download the installer from GitHub Releases (no Python needed)
# https://github.com/Hugo0/voiceio/releases
# Also available as a portable .zip if you prefer no installation.Windows uses pynput for hotkeys and text injection. No extra system dependencies required.
macOS
pipx install python-voiceio
voiceio setupBuild from source
If you want the source code locally to hack on or customize for personal use. PRs are welcome!
git clone https://github.com/Hugo0/voiceio
cd voiceio
uv pip install -e ".[linux,dev]"
# Bootstrap CLI commands onto PATH (creates ~/.local/bin/voiceio)
uv run voiceio setupNote: Source installs live inside a virtualenv, so
voiceioisn't on PATH until setup creates symlinks in~/.local/bin/. Ifvoiceioisn't found after setup, restart your terminal or runexport PATH="$HOME/.local/bin:$PATH".
You can also install with
uv tool install python-voiceioorpip install python-voiceio.
hotkey → mic capture → whisper (local) → text at cursor
pre-buffered streaming IBus / clipboard
Press your hotkey to start recording (1s pre-buffer catches the first syllable). Text streams into the focused app as an underlined preview. Press again to commit. Transcription runs locally via faster-whisper, text is injected through IBus (any GTK/Qt app) with clipboard fallback for terminals.
- Streaming: text appears as you speak, not after you stop
- Works everywhere: IBus input method for GUI apps, clipboard for terminals
- Wayland + X11: evdev hotkeys work on both, no root required
- Pre-buffer: never miss the first syllable
- Voice commands: "new line", "comma", "scratch that", punctuation by name
- Autocorrect: LLM-powered review of recurring Whisper mistakes (
voiceio correct) - Text-to-speech: hear selected text spoken back (Piper, eSpeak, Edge TTS)
- Smart post-processing: numbers ("twenty five" → "25"), punctuation, capitalization
- Auto-healing: falls back to the next working backend if one fails
- Autostart: optional systemd service, restarts on crash
- Self-diagnosing:
voiceio doctorchecks everything,--fixrepairs it
| Model | Size | Speed | Accuracy | Good for |
|---|---|---|---|---|
tiny |
75 MB | ~10x realtime | Basic | Quick notes, low-end hardware |
base |
150 MB | ~7x realtime | Good | Daily use (default) |
small |
500 MB | ~4x realtime | Better | Longer dictation |
medium |
1.5 GB | ~2x realtime | Great | Accuracy-sensitive work |
large-v3 |
3 GB | ~1x realtime | Best | Maximum quality, GPU recommended |
Models download automatically on first use. Switch anytime: voiceio --model small.
voiceio Start the daemon
voiceio setup Interactive setup wizard
voiceio doctor Health check (--fix to auto-repair)
voiceio test Test microphone + live transcription
voiceio demo Interactive guided tour of all features
voiceio toggle Toggle recording on a running daemon
voiceio correct Review and fix recurring transcription errors
voiceio history View transcription history
voiceio update Update to latest version
voiceio service install Autostart on login (systemd / Windows Startup)
voiceio logs View recent logs
voiceio uninstall Remove all system integrations
voiceio setup handles everything interactively. To tweak later, edit the config file or override at runtime:
- Linux/macOS:
~/.config/voiceio/config.toml - Windows:
%LOCALAPPDATA%\voiceio\config\config.toml
voiceio --model large-v3 --language auto -vSee config.example.toml for all options.
voiceio doctor # see what's working
voiceio doctor --fix # auto-fix issues
voiceio logs # check debug output| Problem | Fix |
|---|---|
| No text appears | voiceio doctor --fix - usually a missing IBus component or GNOME input source |
| Hotkey doesn't work on Wayland | sudo usermod -aG input $USER then log out and back in |
| Transcription too slow | Use a smaller model: voiceio --model tiny |
| Want to start fresh | voiceio uninstall then voiceio setup |
| Windows: antivirus blocks hotkeys | pynput uses global keyboard hooks — add an exception for voiceio |
| Windows: no sound feedback | Check voiceio logs for audio device info |
| macOS issues | Experimental — consider aquavoice.com or contribute a PR |
| Platform | Status | Text injection | Hotkeys | Streaming preview |
|---|---|---|---|---|
| Ubuntu / Debian (GNOME, Wayland) | Tested daily | IBus | evdev / GNOME shortcut | Yes |
| Ubuntu / Debian (GNOME, X11) | Supported | IBus | evdev / pynput | Yes |
| Fedora (GNOME) | Supported | IBus | evdev / GNOME shortcut | Yes |
| Arch Linux | Supported | IBus | evdev | Yes |
| KDE / Sway / Hyprland | Should work | IBus / ydotool / wtype | evdev | Yes |
| Windows 10/11 | Experimental | pynput / clipboard | pynput | Type-and-correct (no preedit) |
| macOS | Experimental | pynput / clipboard | pynput | Type-and-correct (no preedit) |
voiceio auto-detects your platform and picks the best available backends. Run voiceio doctor to see what's working on your system.
voiceio uninstall # removes service, IBus, shortcuts, symlinks
pipx uninstall python-voiceio # removes the packageContributions welcome! See CONTRIBUTING.md and open issues.
Now
- macOS polish (IMKit for native preedit, Accessibility API for text injection)
Soon
- Per-app context awareness (detect focused app, adapt formatting/behavior)
- File/audio transcription mode (
voiceio transcribe recording.mp3)
Backlog
- Multiple engine backends (whisper.cpp for Vulkan/AMD, VOSK for low-end hardware)
- Echo cancellation (filter system audio for meeting use)
- Wake word activation ("Hey voiceio") Done
- Text-to-speech output (Piper/eSpeak/Edge TTS — completes the "io")
- LLM auto-audit dictionary (
voiceio correct --auto— scan history with LLM, interactive correction) - LLM post-processing via Ollama (grammar cleanup, spelling fixes on final pass)
- Corrections dictionary — auto-replace misheard words, "correct that" voice command
- Transcription history — searchable log of everything you've dictated
- Number-to-digit conversion ("three hundred forty two" → "342")
- VAD-based silence filtering (Silero VAD, prevents Whisper hallucinations)
- Voice commands — "new line", "new paragraph", "scratch that", punctuation by name
- Custom vocabulary / personal dictionary (bias Whisper via
initial_prompt) - Smart punctuation & capitalization post-processing
- Windows support
- System tray icon with animated states
- Auto-stop on silence
MIT