A proxy for Ollama to easily disable thinking output for Home Assistant integrations.
- Proxies `/api/chat` requests to the Ollama server, automatically setting `think=false` to disable thinking output.
- Streams responses back to the client.
- Supports fetching tags via `/api/tags`.
- Simple health check endpoint at `/`.
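Concretely, the forwarding logic fits in a single FastAPI module. The following is a minimal sketch of the behavior described above, not the project's actual code: it assumes Ollama listens on its default address (`localhost:11434`), and uses the `proxy:app` module/app names from the `uvicorn` command below.

```python
# proxy.py - a minimal sketch, not the project's actual code.
import httpx
from fastapi import FastAPI, Request
from fastapi.responses import StreamingResponse

OLLAMA_HOST = "http://localhost:11434"  # assumed Ollama default; adjust to your setup

app = FastAPI()


@app.post("/api/chat")
async def chat(request: Request):
    payload = await request.json()
    payload["think"] = False  # force-disable thinking output

    async def stream():
        # Forward the request and relay Ollama's NDJSON stream chunk by chunk.
        async with httpx.AsyncClient(timeout=None) as client:
            async with client.stream(
                "POST", f"{OLLAMA_HOST}/api/chat", json=payload
            ) as resp:
                async for chunk in resp.aiter_bytes():
                    yield chunk

    return StreamingResponse(stream(), media_type="application/x-ndjson")


@app.get("/api/tags")
async def tags():
    # Pass the tag list through unchanged.
    async with httpx.AsyncClient() as client:
        resp = await client.get(f"{OLLAMA_HOST}/api/tags")
        return resp.json()


@app.get("/")
async def health():
    return {"status": "ok"}
```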
- Python 3.9+
- FastAPI, httpx, uvicorn
- Install dependencies:

  ```bash
  pip install fastapi httpx uvicorn
  ```

- Configure the Ollama host URL in your code (`OLLAMA_HOST`).

- Run the proxy server:

  ```bash
  uvicorn proxy:app --host 0.0.0.0 --port 11435
  ```
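Once the server is up, a quick hypothetical smoke test (assuming the host and port from the command above) is to hit the health and tags endpoints:

```python
import httpx

BASE = "http://localhost:11435"  # host/port from the uvicorn command above

print(httpx.get(f"{BASE}/").json())          # health check
print(httpx.get(f"{BASE}/api/tags").json())  # models known to the Ollama server
```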
- `POST /api/chat`: Forward chat requests to Ollama with thinking output disabled.
- `GET /api/tags`: Retrieve available tags.
- `GET /`: Health check.
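As an illustration of the chat endpoint, the hypothetical client below streams a reply through the proxy; the model name is a placeholder for whatever is pulled on your Ollama server. Because the proxy overrides `think` before forwarding, the reply contains no thinking output regardless of what the client sends:

```python
import json

import httpx

payload = {
    "model": "qwen3",  # placeholder; use any model available on your Ollama server
    "messages": [{"role": "user", "content": "Is anyone home?"}],
}

# Stream the NDJSON reply through the proxy and print the assistant text.
with httpx.stream(
    "POST", "http://localhost:11435/api/chat", json=payload, timeout=None
) as resp:
    for line in resp.iter_lines():
        if not line:
            continue
        chunk = json.loads(line)
        print(chunk.get("message", {}).get("content", ""), end="", flush=True)
```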