Curate and index documentation from any website into collections like tailwind/, horses/, etc. Reference collection indexes in your AI chats (e.g. @tailwind/INDEX.xml what's a utility?) so that only relevant docs are analysed. Much cleaner than a web-fetch and more focussed than a web-search. Keep your AI context sharp.
| Collection | Collection Index | Description | Scraped | Source |
|---|---|---|---|---|
| 📦 anthropic/ | 📄 anthropic/INDEX.xml | Claude Agent SDK | 2025-11-06 | Official |
| 📦 biome/ | 📄 biome/INDEX.xml | Fast linter/formatter | 2025-11-04 | Official |
| 📦 marimo/ | 📄 marimo/INDEX.xml | Reactive Python notebooks | 2025-11-11 | Official |
| 📦 nextjs/ | 📄 nextjs/INDEX.xml | React framework | 2025-11-05 | Official |
| 📦 playwright/ | 📄 playwright/INDEX.xml | Browser testing | 2025-11-07 | Official |
| 📦 shiny/ | 📄 shiny/INDEX.xml | Python web apps | 2025-11-02 | Official |
| 📦 tailwind/ | 📄 tailwind/INDEX.xml | CSS framework | 2025-10-15 | Official |
| 📦 uv/ | 📄 uv/INDEX.xml | Python projects | 2025-10-15 | Official |
| 📦 vercel/ | 📄 vercel/INDEX.xml | Deployment platform | 2025-10-20 | Official |
| 📦 vitest/ | 📄 vitest/INDEX.xml | Testing framework | 2025-11-05 | Official |
Curate your own collections. For most Anthropic docs use this tool.
# 1. Install UV
# 👉 https://docs.astral.sh/uv/getting-started/installation/
# 2. Clone repository
git clone https://github.com/michellepace/docs-for-ai.git
cd docs-for-ai
# 3. Get free FireCrawl API key
# Visit: https://www.firecrawl.dev/app/api-keys
# 4. Add to shell profile (.zshrc, .bashrc, .profile)
echo 'export API_KEY_MCP_FIRECRAWL=your-api-key-here' >> ~/.zshrc
source ~/.zshrc

| Command | Purpose | .md Files | INDEX `<source>` |
|---|---|---|---|
| `/curate-doc <directory> <url>` | Add / re-scrape doc | ✅ Write | ✅ Add/replace |
| `/rescrape-docs <directory>` | Re-scrape all docs | ✅ Write all | ✅ Selective update |
To curate and keep docs fresh in this repo:
# Curate a new doc from a URL
/curate-doc tailwind https://tailwindcss.com/docs/customizing-colors
# → Scrapes page, writes .md file, adds source to INDEX.xml
# Re-scrape existing doc (refresh content from same URL)
/curate-doc tailwind https://tailwindcss.com/docs/customizing-colors
# → Re-scrapes, writes .md file, replaces source in INDEX.xml
# Start a new collection
/curate-doc reflex https://reflex.dev/docs/getting-started/installation
# → Creates reflex/ directory, README.md, INDEX.xml, and first curated doc
# Re-scrape all docs in collection (monthly maintenance)
/rescrape-docs tailwind
# → Re-scrapes all URLs in INDEX.xml, writes all .md files, replaces all sources

To use the docs (from other projects):
# From a different project
# 1. Give Claude Code access to the repo
/add-dir /home/mp/projects/docs-for-ai
# 2. Then reference as normal to ask your question
@/home/mp/projects/docs-for-ai/tailwind/INDEX.xml what's a utility?

Workflow: Python script scrapes URL → writes .md file → creates INDEX.xml entry with PLACEHOLDER description → Claude Code generates semantic description.

- `/curate-doc`: Always regenerates the description for the doc
- `/rescrape-docs`: Only regenerates descriptions for files with content changes (ignores whitespace)
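The "content changes (ignores whitespace)" check can be pictured with a small sketch. This is only an illustration, not the repository's actual code: the function names `normalise_whitespace` and `content_changed` are hypothetical, and collapsing whitespace runs to a single space is an assumption about what "ignores whitespace" means here.

```python
import re


def normalise_whitespace(text: str) -> str:
    """Collapse every run of whitespace to one space and trim both ends."""
    return re.sub(r"\s+", " ", text).strip()


def content_changed(old_markdown: str, new_markdown: str) -> bool:
    """Return True only when the two versions differ beyond whitespace."""
    return normalise_whitespace(old_markdown) != normalise_whitespace(new_markdown)


# Hypothetical before/after scrapes of the same page
previous = "# Colors\n\nUse the default palette.\n"
reflowed = "# Colors\nUse the   default palette."
rewritten = "# Colors\n\nUse a custom palette.\n"

needs_new_description = content_changed(previous, rewritten)
unchanged_after_reflow = not content_changed(previous, reflowed)
```

Under this assumption, a re-scrape that only reflows lines or changes trailing newlines would keep its existing description, while any change to the words would trigger a new one.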
Usage: Reference @INDEX.xml [question] to let Claude Code use descriptions to find relevant docs.
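To get a feel for what an index exposes, the sketch below parses an INDEX.xml string with Python's standard library and lists each entry's file and description. It assumes the INDEX.xml layout documented in this README; `SAMPLE_INDEX` and `list_entries` are hypothetical names used only for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample that follows the INDEX.xml schema in this README
SAMPLE_INDEX = """<docs_index>
  <source>
    <title>Hello Document Title</title>
    <description>20-30 word dense summary optimised for semantic search...</description>
    <source_url>https://docs.example.com/hello</source_url>
    <local_file>hello-document-title.md</local_file>
    <scraped_at>2025-10-15</scraped_at>
  </source>
</docs_index>"""


def list_entries(index_xml: str) -> list[dict[str, str]]:
    """Return one dict per <source>, keyed by child tag name."""
    root = ET.fromstring(index_xml)
    return [
        {child.tag: (child.text or "") for child in source}
        for source in root.findall("source")
    ]


entries = list_entries(SAMPLE_INDEX)
for entry in entries:
    print(f"{entry['local_file']}: {entry['description']}")
```

The descriptions are the only text that has to be read to decide which .md file is relevant, which is why they are kept short and dense.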
Directory Structure:
uv/
├── INDEX.xml # Index of all docs
├── README.md
├── api-reference.md # Scraped doc
├── getting-started.md # Scraped doc
└── ...
INDEX.xml Schema:
<docs_index>
  <source>
    <title>Hello Document Title</title>
    <description>20-30 word dense summary optimised for semantic search...</description>
    <source_url>https://docs.example.com/hello</source_url>
    <local_file>hello-document-title.md</local_file>
    <scraped_at>2025-10-15</scraped_at>
  </source>
  <!-- Multiple <source> entries, one per .md file -->
</docs_index>

Scripts use the FireCrawl Python SDK for scraping. An MCP server is also configured (.mcp.json, .claude/settings.json) so Claude Code can self-heal when the scripts alone fall short.
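The "add/replace" behaviour for a `<source>` entry can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not the repository's scripts: `slugify` and `upsert_source` are hypothetical helpers, matching entries by `source_url` is an assumption, and the file name rule is inferred from the schema example (`Hello Document Title` → `hello-document-title.md`).

```python
import re
import xml.etree.ElementTree as ET


def slugify(title: str) -> str:
    """Lowercase the title and join its alphanumeric runs with hyphens."""
    return "-".join(re.findall(r"[a-z0-9]+", title.lower()))


def upsert_source(root: ET.Element, title: str, url: str, scraped_at: str) -> ET.Element:
    """Add a <source> for url, replacing any existing entry with the same URL."""
    for existing in root.findall("source"):  # findall returns a list, safe to mutate root
        if existing.findtext("source_url") == url:
            root.remove(existing)
    source = ET.SubElement(root, "source")
    fields = {
        "title": title,
        "description": "PLACEHOLDER",  # filled in later with a semantic summary
        "source_url": url,
        "local_file": f"{slugify(title)}.md",
        "scraped_at": scraped_at,
    }
    for tag, value in fields.items():
        ET.SubElement(source, tag).text = value
    return source


root = ET.Element("docs_index")
upsert_source(root, "Hello Document Title", "https://docs.example.com/hello", "2025-10-15")
# Re-scraping the same URL replaces the entry instead of duplicating it
upsert_source(root, "Hello Document Title", "https://docs.example.com/hello", "2025-11-01")

ET.indent(root)  # pretty-print; requires Python 3.9+
print(ET.tostring(root, encoding="unicode"))
```

Keying on the URL keeps the index at one `<source>` per page, so a re-scrape refreshes `scraped_at` and resets the description to the placeholder rather than adding a second entry.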
