macinput is a macOS-only automation server that exposes keyboard, mouse, and screenshot primitives through MCP. The project separates two layers:
- Core automation library: direct wrappers around Quartz and
screencapture. - MCP server layer: validated, documented, rate-limited tool wrappers intended for AI agents.
This split keeps the low-level code reusable while allowing the MCP surface to evolve independently.
The exported MCP capabilities are intentionally small:
get_mouse_positionmove_mouseclick_mousescroll_mousepress_keyboard_keykeyboard_key_downkeyboard_key_uptype_text_inputpaste_text_inputcapture_screenshotcleanup_screenshot_fileget_server_settings
This is the practical boundary for a UI-control server:
- enough control to operate arbitrary macOS apps
- small enough to keep agent plans interpretable
- minimal enough to reduce accidental destructive actions
For keyboard tools, the MCP layer accepts either string digits like "3" or numeric digits like 3 and normalizes both to the digit key. This avoids brittle failures when hosts serialize a single digit as a JSON number.
The server applies conservative runtime settings through environment variables:
MACINPUT_DEFAULT_SCREENSHOT_TTLMACINPUT_MAX_SCREENSHOT_TTLMACINPUT_MAX_TYPING_LENGTHMACINPUT_MIN_ACTION_DELAYMACINPUT_DEFAULT_TYPING_INTERVAL
These settings are intended for operators, not end users. They encode host-level policy without forcing custom code changes.
- Grant Accessibility and Screen Recording permissions before first use.
- Run the server over
stdiofor desktop AI clients unless you explicitly need network transport. - Keep the host session unlocked and on the expected Space/Desktop.
- Treat screenshot files as sensitive data and keep TTLs short.
- Prefer a dedicated macOS user account or VM when automation can touch sensitive apps.
- Always observe before acting: take a screenshot on every unknown screen state.
- Use one state-changing action at a time.
- Re-screenshot after a click, shortcut, or text submission.
- Avoid cached coordinates across screen transitions.
- Keep typing short and task-specific.
- Clean up screenshots when the task step completes.
- Stop when the UI diverges from expectation rather than compounding errors.
Use stdio as the default transport for:
- Claude Desktop
- Cursor
- VS Code MCP hosts
- local orchestrators launching subprocesses
Use streamable-http only when:
- the host application cannot spawn subprocess MCP servers
- you need to multiplex the server behind an internal automation gateway
uv syncuv run pytestuv run ruff check .- Validate on a real macOS session with permissions enabled.
- Test one desktop host integration over
stdio. - Cut a version tag and publish the package.
- Python 3.10+ is required for current MCP SDK releases.
- The server requires a real macOS GUI session.
- Headless CI can validate imports, config, and docs, but not UI injection behavior.