macinput MCP engineering notes

Positioning

macinput is a macOS-only automation server that exposes keyboard, mouse, and screenshot primitives through MCP. The project separates two layers:

Core automation library: direct wrappers around Quartz and screencapture.
MCP server layer: validated, documented, rate-limited tool wrappers intended for AI agents.

This split keeps the low-level code reusable while allowing the MCP surface to evolve independently.

Recommended server surface

The exported MCP capabilities are intentionally small:

get_mouse_position
move_mouse
click_mouse
scroll_mouse
press_keyboard_key
keyboard_key_down
keyboard_key_up
type_text_input
paste_text_input
capture_screenshot
cleanup_screenshot_file
get_server_settings

This is the practical boundary for a UI-control server:

enough control to operate arbitrary macOS apps
small enough to keep agent plans interpretable
minimal enough to reduce accidental destructive actions

For keyboard tools, the MCP layer accepts either string digits like "3" or numeric digits like 3 and normalizes both to the digit key. This avoids brittle failures when hosts serialize a single digit as a JSON number.
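The digit normalization described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the function name `normalize_key` and its error behavior are assumptions made for the example.

```python
from typing import Union


def normalize_key(key: Union[str, int]) -> str:
    """Return a string key name for a string key or a single numeric digit.

    Hypothetical helper: the real MCP layer may use different names and rules.
    """
    # bool is a subclass of int, so reject it first to avoid reading True as 1.
    if isinstance(key, bool):
        raise TypeError("key must be a string or a single digit, not bool")
    if isinstance(key, int):
        # A host serialized the digit as a JSON number, e.g. 3 instead of "3".
        if not 0 <= key <= 9:
            raise ValueError(f"numeric key must be a single digit 0-9, got {key}")
        return str(key)
    if isinstance(key, str):
        if key == "":
            raise ValueError("key must not be empty")
        return key
    raise TypeError(f"unsupported key type: {type(key).__name__}")
```

With this sketch, `normalize_key(3)` and `normalize_key("3")` both yield `"3"`, so the downstream key lookup only ever sees strings.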

Safety defaults

The server applies conservative runtime settings through environment variables:

MACINPUT_DEFAULT_SCREENSHOT_TTL
MACINPUT_MAX_SCREENSHOT_TTL
MACINPUT_MAX_TYPING_LENGTH
MACINPUT_MIN_ACTION_DELAY
MACINPUT_DEFAULT_TYPING_INTERVAL

These settings are intended for operators, not end users. They encode host-level policy without requiring code changes.
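One way to read these variables at startup is sketched below. Only the variable names come from the list above. The `ServerSettings` class, the `load_settings` function, the fallback values, the assumption that time values are seconds, and the clamping of the default TTL to the maximum are all illustrative choices, not the project's documented behavior.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ServerSettings:
    """Hypothetical container for the operator-controlled settings."""

    default_screenshot_ttl: float
    max_screenshot_ttl: float
    max_typing_length: int
    min_action_delay: float
    default_typing_interval: float


def load_settings(env: Optional[Mapping[str, str]] = None) -> ServerSettings:
    """Build settings from environment variables.

    The fallback values are placeholders for this sketch, not real defaults.
    """
    env = os.environ if env is None else env
    max_ttl = float(env.get("MACINPUT_MAX_SCREENSHOT_TTL", "300"))
    default_ttl = float(env.get("MACINPUT_DEFAULT_SCREENSHOT_TTL", "30"))
    return ServerSettings(
        # Assumed policy: the default TTL never exceeds the operator's maximum.
        default_screenshot_ttl=min(default_ttl, max_ttl),
        max_screenshot_ttl=max_ttl,
        max_typing_length=int(env.get("MACINPUT_MAX_TYPING_LENGTH", "500")),
        min_action_delay=float(env.get("MACINPUT_MIN_ACTION_DELAY", "0.05")),
        default_typing_interval=float(
            env.get("MACINPUT_DEFAULT_TYPING_INTERVAL", "0.02")
        ),
    )
```

Passing a mapping instead of reading `os.environ` directly keeps the loader testable in headless CI, where no GUI session is available.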

Best practices for users and agent builders

User-side best practices

Grant Accessibility and Screen Recording permissions before first use.
Run the server over stdio for desktop AI clients unless you explicitly need network transport.
Keep the host session unlocked and on the expected Space/Desktop.
Treat screenshot files as sensitive data and keep TTLs short.
Prefer a dedicated macOS user account or VM when automation can touch sensitive apps.

Agent-side best practices

Always observe before acting: take a screenshot whenever the screen state is unknown.
Use one state-changing action at a time.
Re-screenshot after a click, shortcut, or text submission.
Do not reuse cached coordinates across screen transitions.
Keep typing short and task-specific.
Clean up screenshots when the task step completes.
Stop when the UI diverges from expectation rather than compounding errors.
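The agent-side practices above amount to an observe, act, verify loop. The sketch below shows that shape under stated assumptions: `run_steps` and its callable parameters are hypothetical names, and in a real agent `observe` would wrap the `capture_screenshot` tool, `cleanup` would wrap `cleanup_screenshot_file`, and `looks_like` would be whatever check the agent uses to compare a screenshot with its expectation.

```python
from typing import Callable, Sequence, Tuple

# A step pairs one state-changing action with a description of the expected result.
Step = Tuple[Callable[[], None], str]


def run_steps(
    initial: str,
    steps: Sequence[Step],
    observe: Callable[[], str],
    looks_like: Callable[[str, str], bool],
    cleanup: Callable[[str], None],
) -> int:
    """Run steps one at a time and return how many completed as expected."""

    def check(expected: str) -> bool:
        shot = observe()  # take a fresh screenshot, never reuse an old one
        try:
            return looks_like(shot, expected)
        finally:
            cleanup(shot)  # remove the screenshot as soon as it has been used

    # Observe before acting: do nothing if the starting screen is unexpected.
    if not check(initial):
        return 0

    completed = 0
    for action, expected in steps:
        action()  # exactly one state-changing action per iteration
        if not check(expected):
            break  # stop on divergence rather than compounding errors
        completed += 1
    return completed
```

Returning the count of completed steps lets the caller report exactly where the UI diverged from the plan.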

Transport recommendation

Use stdio as the default transport for:

Claude Desktop
Cursor
VS Code MCP hosts
local orchestrators launching subprocesses

Use streamable-http only when:

the host application cannot spawn subprocess MCP servers
you need to multiplex the server behind an internal automation gateway
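The transport rule above reduces to a small decision function. The sketch below is illustrative only: `choose_transport` and its parameter names are assumptions, and the notes do not say how the server actually selects its transport.

```python
from typing import Literal

Transport = Literal["stdio", "streamable-http"]


def choose_transport(
    host_can_spawn_subprocess: bool,
    behind_gateway: bool = False,
) -> Transport:
    """Pick a transport following the recommendation in these notes."""
    # Network transport is justified only by one of the two listed conditions.
    if not host_can_spawn_subprocess or behind_gateway:
        return "streamable-http"
    # Default for desktop clients and local orchestrators.
    return "stdio"
```

If the server is built on the MCP Python SDK's `FastMCP` class (an assumption, since the notes do not name the framework), the returned value could be passed as the `transport` argument of its `run` method.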

Suggested release and operational workflow

uv sync
uv run pytest
uv run ruff check .
Validate on a real macOS session with permissions enabled.
Test one desktop host integration over stdio.
Cut a version tag and publish the package.

Compatibility notes

Python 3.10+ is required for current MCP SDK releases.
The server requires a real macOS GUI session.
Headless CI can validate imports, config, and docs, but not UI injection behavior.
