Model Context Protocol (MCP) server for web interaction using crawl4ai.
This project aims to provide AI agents with advanced web browsing and data extraction capabilities via the MCP runner tool.
Install this MCP server in Cursor with one click:
<a href="cursor://anysphere.cursor-deeplink/mcp/install?name=c4a-mcp&config=eyJjb21tYW5kIjogImRvY2tlciIsICJhcmdzIjogWyJydW4iLCAiLWkiLCAiLS1ybSIsICJnaGNyLmlvL2JsZ2h0ci9jNGEtbWNwOmxhdGVzdCJdfQ=="><img src="https://cursor.com/deeplink/mcp-install-dark.png" alt="Add c4a-mcp MCP server to Cursor" style="max-height: 32px;" /></a>
Note: Requires Docker to be installed and running. The server will run in a container from ghcr.io/blghtr/c4a-mcp:latest.
Click the button above to install automatically in Cursor, or use the deeplink:
[Install c4a-mcp in Cursor](https://cursor.com/en-US/install-mcp?name=c4a-mcp&config=eyJjb21tYW5kIjoiZG9ja2VyIHJ1biAtaSAtdiBnaGNyLmlvL2JsZ2h0ci9jNGEtbWNwOmxhdGVzdCJ9)
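The `config` query parameter in the button link is base64-encoded JSON, so it can be inspected before installing. The following is a minimal sketch of decoding and re-encoding it; the helper names `decode_install_link` and `encode_config` are hypothetical and not part of this project or of Cursor.

```python
import base64
import json
from urllib.parse import parse_qs, urlparse


def decode_install_link(link: str) -> dict:
    """Return the MCP server config embedded in an install link."""
    query = parse_qs(urlparse(link).query)
    raw = query["config"][0]
    # parse_qs turns '+' into a space; undo that for standard base64 text.
    raw = raw.replace(" ", "+")
    # Restore any '=' padding that may have been dropped from the URL.
    raw += "=" * (-len(raw) % 4)
    return json.loads(base64.b64decode(raw))


def encode_config(config: dict) -> str:
    """Base64-encode a server config for use as the `config` parameter."""
    return base64.b64encode(json.dumps(config).encode("utf-8")).decode("ascii")


# The link behind the install button above, split only for line length.
button_link = (
    "cursor://anysphere.cursor-deeplink/mcp/install?name=c4a-mcp&config="
    "eyJjb21tYW5kIjogImRvY2tlciIsICJhcmdzIjogWyJydW4iLCAiLWkiLCAiLS1ybSIs"
    "ICJnaGNyLmlvL2JsZ2h0ci9jNGEtbWNwOmxhdGVzdCJdfQ=="
)
config = decode_install_link(button_link)
print(json.dumps(config, indent=2))
```

For the button link, the decoded value is the same `command` and `args` pair used in the manual mcp.json entry.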
Add to your mcp.json file (typically located at ~/.cursor/mcp.json or %APPDATA%\Cursor\User\mcp.json on Windows):
{
"mcpServers": {
"c4a-mcp": {
"command": "docker",
"args": [
"run",
"-i",
"--rm",
"ghcr.io/blghtr/c4a-mcp:latest"
]
}
}
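}
With environment variables (for LLM-based extraction). The bare -e NAME flags tell Docker to forward each variable from the env block into the container: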
{
"mcpServers": {
"c4a-mcp": {
"command": "docker",
"args": [
"run",
"-i",
"--rm",
"-e", "OPENAI_API_KEY",
"-e", "GEMINI_API_KEY",
"ghcr.io/blghtr/c4a-mcp:latest"
],
"env": {
"OPENAI_API_KEY": "your-key-here",
"GEMINI_API_KEY": "your-key-here"
}
}
}
}
Requirements:
- Docker must be installed and running
- For private repositories, authenticate with GitHub Container Registry:
echo $GITHUB_TOKEN | docker login ghcr.io -u USERNAME --password-stdin
- Python 3.11+
- uv for package management
# Install dependencies
uv pip install --system -e ".[dev]"
This project uses pre-commit hooks to automatically format and lint code before commits.
Initial Setup:
# Install pre-commit hooks
uv run pre-commit install
Usage:
Pre-commit hooks will run automatically on git commit. They will:
- Format code with black and ruff format
- Fix linting issues with ruff
- Check YAML/JSON files for syntax errors
- Remove trailing whitespace and fix end-of-file issues
Manual Run:
# Run hooks on all files
uv run pre-commit run --all-files
# Run hooks on staged files only
uv run pre-commit run
# Run all tests
uv run pytest
# Run with verbose output
uv run pytest -v
This project uses GitHub Actions for continuous integration and deployment.
The CI/CD pipeline (/.github/workflows/ci-cd.yml) performs the following:
- Testing: Runs tests on Python 3.11 and 3.12
- Docker Build: Builds Docker image on push to main or tag creation
- Docker Push: Publishes image to GitHub Container Registry (ghcr.io)
Docker images are automatically built and pushed to:
ghcr.io/blghtr/c4a-mcp
Available Tags:
- latest - Latest commit on main branch
- v<version> - Semantic version tags (e.g., v0.1.0)
Usage:
# Pull the latest image
docker pull ghcr.io/blghtr/c4a-mcp:latest
# Run the container
docker run ghcr.io/blghtr/c4a-mcp:latest
Note: For private repositories, you'll need to authenticate:
# Login to GitHub Container Registry
echo $GITHUB_TOKEN | docker login ghcr.io -u USERNAME --password-stdin
# Pull the image
docker pull ghcr.io/blghtr/c4a-mcp:latest
To build the Docker image locally:
# Build the image
docker build -t c4a-mcp:local .
# Run the container
docker run c4a-mcp:local
The MCP server itself does not require any environment variables. However, if you plan to use LLM-based extraction strategies with crawl4ai, you may need to set API keys for your chosen provider:
- OPENAI_API_KEY - For OpenAI models (gpt-4o, gpt-4o-mini, o1-mini, etc.)
- ANTHROPIC_API_KEY - For Anthropic models (claude-3-5-sonnet, etc.)
- GEMINI_API_KEY - For Google Gemini models
- GROQ_API_KEY - For Groq models
- DEEPSEEK_API_KEY - For DeepSeek models
These are only needed if you use LLM-based extraction strategies in your crawl configuration. The server will work fine without them for standard crawling.
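To see which of these keys are visible before starting a crawl that uses LLM-based extraction, a small check like the one below can help. It is only a sketch: the variable names come from the list above, while `PROVIDER_KEYS` and `configured_providers` are hypothetical names, not part of this project or of crawl4ai.

```python
import os

# Variable names are taken from the list above; the labels are only for display.
PROVIDER_KEYS = {
    "OPENAI_API_KEY": "OpenAI",
    "ANTHROPIC_API_KEY": "Anthropic",
    "GEMINI_API_KEY": "Google Gemini",
    "GROQ_API_KEY": "Groq",
    "DEEPSEEK_API_KEY": "DeepSeek",
}


def configured_providers(environ=None) -> list[str]:
    """Return the providers whose API key is set to a non-empty value."""
    env = os.environ if environ is None else environ
    return [label for var, label in PROVIDER_KEYS.items() if env.get(var, "").strip()]


if __name__ == "__main__":
    found = configured_providers()
    if found:
        print("LLM extraction keys found for:", ", ".join(found))
    else:
        print("No provider keys set; standard crawling does not need them.")
```

The check inspects only the current process environment, so it reports what the server would inherit if launched from the same shell.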
The project uses python-dotenv, so you can create a .env file in the project root:
OPENAI_API_KEY=your_key_here
GEMINI_API_KEY=your_key_here
Issue: playwright install fails or browsers are not found.
Solutions:
- Local Development:
  # Run crawl4ai setup command
  uv run crawl4ai-setup
  # Or manually install browsers
  uv run playwright install chromium
- Docker:
  - Ensure the Dockerfile includes all required system libraries (see Dockerfile for full list)
  - Verify Playwright installation step runs:
    RUN playwright install --with-deps chromium
- Check Installation:
  uv run crawl4ai-doctor
Issue: Cannot connect to MCP server or tools not available.
Solutions:
- Verify Server is Running:
  # Start the server
  uv run c4a-mcp
- Check MCP Client Configuration:
  - Ensure the server command points to: c4a-mcp or python -m c4a_mcp
  - Verify transport method (stdio, SSE, etc.) matches your client
- Check Logs:
  - Enable debug logging to see detailed error messages
  - Look for connection errors in the server logs
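When the client reports no tools, it can help to rule out the server itself by sending one MCP `initialize` request by hand. The sketch below assumes the server uses the stdio transport with newline-delimited JSON-RPC messages, that its first stdout line is the reply, and that it accepts protocol version `2024-11-05`; none of this is confirmed by the project. `build_initialize_request` and `probe_server` are hypothetical helper names.

```python
import json
import subprocess
import sys
import threading


def build_initialize_request(request_id: int = 1) -> dict:
    """Build a JSON-RPC initialize request as described in the MCP specification."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "initialize",
        "params": {
            "protocolVersion": "2024-11-05",  # assumed; adjust to your client
            "capabilities": {},
            "clientInfo": {"name": "smoke-test", "version": "0.0.0"},
        },
    }


def probe_server(command: list[str], timeout: float = 30.0) -> dict:
    """Start the server, send one initialize request, and return the parsed reply."""
    proc = subprocess.Popen(
        command, stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True
    )
    # Kill the process if it never answers, which unblocks readline().
    killer = threading.Timer(timeout, proc.kill)
    killer.start()
    try:
        proc.stdin.write(json.dumps(build_initialize_request()) + "\n")
        proc.stdin.flush()
        line = proc.stdout.readline()
    finally:
        killer.cancel()
        proc.stdin.close()
        proc.kill()
        proc.wait()
        proc.stdout.close()
    if not line.strip():
        raise RuntimeError("server produced no reply on stdout")
    return json.loads(line)


# Usage: python probe.py uv run c4a-mcp
if __name__ == "__main__" and len(sys.argv) > 1:
    print(json.dumps(probe_server(sys.argv[1:]), indent=2))
```

A JSON reply containing a `result` object suggests the server starts and speaks the protocol, which points the investigation back at the client configuration.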
Issue: Docker build fails with dependency or permission errors.
Solutions:
- Clear Build Cache:
  docker build --no-cache -t c4a-mcp:local .
- Check System Dependencies:
  - Ensure all Playwright system libraries are included in the Dockerfile
  - Verify Python version matches (3.11+)
- Permission Issues:
  - The container runs as a non-root user (appuser)
  - If you need to modify files, ensure proper ownership
- Network Issues:
  - Check if you can reach PyPI and GitHub Container Registry
  - Consider using build-time network settings if behind a proxy
Issue: Tests fail in CI/CD or locally.
Solutions:
- Install Dev Dependencies:
  uv pip install --system -e ".[dev]"
- Run Tests with Verbose Output:
  uv run pytest -v
- Check Python Version:
  - Ensure Python 3.11+ is installed
  - CI/CD tests on 3.11 and 3.12
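The version check above can be scripted so that a mismatch is caught before running the test suite. This is a minimal sketch; `check_python` is a hypothetical helper, and the version numbers are the ones stated in this document.

```python
import sys

MINIMUM = (3, 11)                 # minimum version stated above
CI_VERSIONS = {(3, 11), (3, 12)}  # versions the CI pipeline tests


def check_python(version=None):
    """Return (ok, message) for a version tuple, defaulting to this interpreter."""
    major, minor = tuple(version or sys.version_info)[:2]
    label = f"Python {major}.{minor}"
    if (major, minor) < MINIMUM:
        return False, f"{label} is older than the required 3.11"
    if (major, minor) not in CI_VERSIONS:
        return True, f"{label} meets the minimum but is not exercised by CI"
    return True, f"{label} matches a version tested in CI"


ok, message = check_python()
print(message)
```

Running it with `uv run python` checks the interpreter that the tests will actually use, which may differ from the system default.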
Issue: Pre-commit hooks fail or skip.
Solutions:
- Update Hooks:
  uv run pre-commit autoupdate
- Run Manually:
  uv run pre-commit run --all-files
- Skip Hooks (not recommended):
  git commit --no-verify