Curate Docs For AI (with Claude Code)

Curate and index documentation from any website into collections such as tailwind/ or horses/. Reference a collection index in your AI chats (e.g. @tailwind/INDEX.xml what's a utility?) so that only the relevant docs are analysed. This is cleaner than a web fetch and more focussed than a web search, and it keeps your AI context sharp.

📦 Repo Collections

Collection	Collection Index	Description	Scraped	Source
📦 `anthropic/`	📄 `anthropic/INDEX.xml`	Claude Agent SDK	2025-11-06	Official
📦 `biome/`	📄 `biome/INDEX.xml`	Fast linter/formatter	2025-11-04	Official
📦 `marimo/`	📄 `marimo/INDEX.xml`	Reactive Python notebooks	2025-11-11	Official
📦 `nextjs/`	📄 `nextjs/INDEX.xml`	React framework	2025-11-05	Official
📦 `playwright/`	📄 `playwright/INDEX.xml`	Browser testing	2025-11-07	Official
📦 `shiny/`	📄 `shiny/INDEX.xml`	Python web apps	2025-11-02	Official
📦 `tailwind/`	📄 `tailwind/INDEX.xml`	CSS framework	2025-10-15	Official
📦 `uv/`	📄 `uv/INDEX.xml`	Python projects	2025-10-15	Official
📦 `vercel/`	📄 `vercel/INDEX.xml`	Deployment platform	2025-10-20	Official
📦 `vitest/`	📄 `vitest/INDEX.xml`	Testing framework	2025-11-05	Official

Curate your own collections. For most Anthropic docs use this tool.

🚀 Setup

# 1. Install uv
# 👉 https://docs.astral.sh/uv/getting-started/installation/

# 2. Clone repository
git clone https://github.com/michellepace/docs-for-ai.git
cd docs-for-ai

# 3. Get free FireCrawl API key
# Visit: https://www.firecrawl.dev/app/api-keys

# 4. Add to shell profile (.zshrc, .bashrc, .profile)
echo 'export API_KEY_MCP_FIRECRAWL=your-api-key-here' >> ~/.zshrc
source ~/.zshrc
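
Before curating, it can help to confirm that the key is actually visible to new processes. The following is a minimal sketch, assuming the scripts read `API_KEY_MCP_FIRECRAWL` from the environment; the helper name `firecrawl_key_status` is hypothetical and not part of this repo.

```python
import os


def firecrawl_key_status(env=None):
    """Return 'missing', 'placeholder', or 'set' for the FireCrawl API key.

    `env` defaults to the real process environment; passing a dict makes
    the check easy to try out without touching your shell profile.
    """
    env = os.environ if env is None else env
    key = env.get("API_KEY_MCP_FIRECRAWL", "").strip()
    if not key:
        return "missing"
    if key == "your-api-key-here":
        # The example value from step 4 was never replaced with a real key.
        return "placeholder"
    return "set"


if __name__ == "__main__":
    print(firecrawl_key_status())
```

Run it in a fresh terminal: `missing` usually means the export line landed in a profile file your shell does not load.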

📖 Curate With Slash Commands

Command	Purpose	.md Files	INDEX `<source>`
`/curate-doc <directory> <url>`	Add / re-scrape doc	✅ Write	✅ Add/replace
`/rescrape-docs <directory>`	Re-scrape all docs	✅ Write all	✅ Selective update

Figure: Complete workflow (curate → scrape → reference). Terminal showing the three steps: (1) running the /curate-doc biome command, (2) curation success output with the scraped documentation and the generated INDEX.xml entry, (3) referencing @biome/INDEX.xml to ask questions.

💡 Usage Examples

To curate and keep docs fresh in this repo:

# Curate a new doc from a URL
/curate-doc tailwind https://tailwindcss.com/docs/customizing-colors
# → Scrapes page, writes .md file, adds source to INDEX.xml

# Re-scrape existing doc (refresh content from same URL)
/curate-doc tailwind https://tailwindcss.com/docs/customizing-colors
# → Re-scrapes, writes .md file, replaces source in INDEX.xml

# Start a new collection
/curate-doc reflex https://reflex.dev/docs/getting-started/installation
# → Creates reflex/ directory, README.md, INDEX.xml, and first curated doc

# Re-scrape all docs in collection (monthly maintenance)
/rescrape-docs tailwind
# → Re-scrapes all URLs in INDEX.xml, writes all .md files, replaces all sources

To use the docs (from other projects):

# From a different project

# 1. Give Claude Code access to the repo
/add-dir /home/mp/projects/docs-for-ai

# 2. Then reference as normal to ask your question
@/home/mp/projects/docs-for-ai/tailwind/INDEX.xml what's a utility?

🏗️ How This Repo Works

Workflow: Python script scrapes URL → writes .md file → creates INDEX.xml entry with PLACEHOLDER description → Claude Code generates semantic description.

/curate-doc - Always regenerates description for the doc
/rescrape-docs - Only regenerates descriptions for files with content changes (ignores whitespace)
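
The "content changes (ignores whitespace)" rule can be pictured as follows. This is a minimal sketch, not the repo's actual comparison: it assumes two versions count as unchanged when they match after every run of whitespace is collapsed to a single space. `content_changed` is a hypothetical helper name.

```python
def _normalise_whitespace(text: str) -> str:
    # str.split() with no arguments splits on any run of whitespace
    # (spaces, tabs, newlines) and drops leading/trailing whitespace.
    return " ".join(text.split())


def content_changed(old_markdown: str, new_markdown: str) -> bool:
    """Return True only if the two docs differ beyond whitespace."""
    return _normalise_whitespace(old_markdown) != _normalise_whitespace(new_markdown)
```

Under this assumption, a re-scrape that only adds blank lines or trailing spaces would keep the existing description, while any changed word would trigger a new one.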

Usage: Reference @INDEX.xml [question] to let Claude Code use descriptions to find relevant docs.
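
To make the lookup concrete, here is a rough sketch of reading an index and shortlisting docs by description. It uses the field names from the INDEX.xml schema in this README; the function names are hypothetical, and plain keyword matching is only a crude stand-in for how Claude Code actually reads the descriptions.

```python
import xml.etree.ElementTree as ET

FIELDS = ("title", "description", "source_url", "local_file", "scraped_at")


def list_sources(index_xml: str) -> list[dict[str, str]]:
    """Parse INDEX.xml text and return one dict per <source> entry."""
    root = ET.fromstring(index_xml)
    return [
        # findtext returns None for a missing child, so fall back to "".
        {field: (source.findtext(field) or "").strip() for field in FIELDS}
        for source in root.findall("source")
    ]


def match_sources(index_xml: str, keyword: str) -> list[str]:
    """Return local_file names whose title or description mentions keyword."""
    needle = keyword.lower()
    return [
        entry["local_file"]
        for entry in list_sources(index_xml)
        if needle in entry["title"].lower() or needle in entry["description"].lower()
    ]
```

The point of the index is the same in either case: only the shortlisted .md files need to be opened, not the whole collection.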

Directory Structure:

uv/
├── INDEX.xml               # Index of all docs
├── README.md
├── api-reference.md        # Scraped doc
├── getting-started.md      # Scraped doc
└── ...

INDEX.xml Schema:

<docs_index>
  <source>
    <title>Hello Document Title</title>
    <description>20-30 word dense summary optimised for semantic search...</description>
    <source_url>https://docs.example.com/hello</source_url>
    <local_file>hello-document-title.md</local_file>
    <scraped_at>2025-10-15</scraped_at>
  </source>
  <!-- Multiple <source> entries, one per .md file -->
</docs_index>
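
The "add/replace" step of the workflow can be sketched against this schema using only the standard library. The details here are assumptions rather than the repo's own script: entries are matched on `source_url`, the file name is derived from the title, and `slugify` and `upsert_source` are hypothetical names.

```python
import re
import xml.etree.ElementTree as ET
from datetime import date


def slugify(title: str) -> str:
    """Lowercase the title and join its alphanumeric runs with hyphens."""
    return "-".join(re.findall(r"[a-z0-9]+", title.lower()))


def upsert_source(root: ET.Element, title: str, source_url: str,
                  scraped_at: str = "") -> ET.Element:
    """Add a <source> for source_url, replacing any existing entry for it."""
    # findall returns a list, so removing children while looping is safe.
    for existing in root.findall("source"):
        if existing.findtext("source_url") == source_url:
            root.remove(existing)

    source = ET.SubElement(root, "source")
    values = {
        "title": title,
        # Placeholder until Claude Code writes the semantic description.
        "description": "PLACEHOLDER",
        "source_url": source_url,
        "local_file": f"{slugify(title)}.md",
        "scraped_at": scraped_at or date.today().isoformat(),
    }
    for tag, text in values.items():
        ET.SubElement(source, tag).text = text
    return source


if __name__ == "__main__":
    root = ET.Element("docs_index")
    upsert_source(root, "Hello Document Title",
                  "https://docs.example.com/hello", "2025-10-15")
    ET.indent(root)  # pretty-printing, available from Python 3.9
    print(ET.tostring(root, encoding="unicode"))
```

Calling `upsert_source` twice with the same URL leaves a single entry, which mirrors the "Add/replace" behaviour listed for `/curate-doc`.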

The scripts use the FireCrawl Python SDK for scraping. An MCP server is also configured (.mcp.json, .claude/settings.json) so that Claude Code can recover from problems the scripts cannot handle on their own.
