CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Project Overview

DeepScientist is a local-first autonomous research studio that manages long-horizon research workflows. It's a Python-based system with a Node.js launcher, web UI, and TUI, designed to keep research projects moving through baselines, experiments, and paper outputs.

Core principle: One quest = one Git repository. All durable state lives in files and Git.

Quick Start Commands

Installation and Setup

# Install from repository
bash install.sh

# Install with LaTeX runtime
bash install.sh --with-tinytex

# Start DeepScientist
ds

# Check system health
ds doctor

# Check specific runner
ds doctor --runner codex
ds doctor --runner claude
ds doctor --runner opencode

Development Commands

# Run tests
pytest

# Build web UI
npm --prefix src/ui install
npm --prefix src/ui run build

# Build TUI
npm --prefix src/tui install
npm --prefix src/tui run build

# Validate Python syntax
python3 -m compileall src/deepscientist

# Validate packaging
npm pack --dry-run --ignore-scripts

Runtime Commands

# Quest management
ds init                    # Create new quest
ds status                  # Show quest status
ds pause                   # Pause quest
ds resume                  # Resume quest

# LaTeX runtime
ds latex status
ds latex install-runtime

# Configuration
ds config                  # Manage configuration

# Memory and baselines
ds memory                  # Memory operations
ds baseline                # Baseline registry operations

Architecture

Launch Chain

User runs ds (npm global bin)
bin/ds.js ensures a uv-managed Python runtime exists at ~/DeepScientist/runtime/python-env
Launcher starts the Python daemon
Daemon serves the web workspace at http://127.0.0.1:20999
Web UI and TUI consume the same daemon API
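
To confirm from a script that the daemon step of the launch chain has completed, it is usually enough to check that something is listening on the daemon port. The following is a minimal sketch that assumes only the host and port given above. `daemon_is_listening` is a hypothetical helper, not part of DeepScientist, and it deliberately avoids calling any daemon API route.

```python
import socket


def daemon_is_listening(host: str = "127.0.0.1", port: int = 20999, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        # create_connection performs the TCP handshake and raises OSError on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

A `False` result only says that nothing accepted a connection on that port; `ds doctor` remains the documented way to check system health.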

Runtime Home: `~/DeepScientist/`

~/DeepScientist/
├── runtime/          # Launcher-managed runtime (uv Python env, tools)
├── config/           # YAML configuration and baseline registry
├── memory/           # Global memory cards
├── quests/           # One quest per Git repository
├── logs/             # Daemon and runtime logs
└── cache/            # Reusable caches (synced skills)
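
The tree above can be expressed as a small path map, which is handy when a script needs to check a fresh installation. This is a sketch only: `RUNTIME_SUBDIRS`, `runtime_layout`, and `missing_dirs` are hypothetical names, and the default home is assumed to be `~/DeepScientist/` as stated in this file.

```python
from pathlib import Path
from typing import Optional

# Subdirectory names taken from the tree above.
RUNTIME_SUBDIRS = ("runtime", "config", "memory", "quests", "logs", "cache")


def runtime_layout(home: Optional[Path] = None) -> dict:
    """Map each documented subdirectory name to its path under the runtime home."""
    root = home if home is not None else Path.home() / "DeepScientist"
    return {name: root / name for name in RUNTIME_SUBDIRS}


def missing_dirs(home: Optional[Path] = None) -> list:
    """Return the documented subdirectories that do not exist on disk yet."""
    return [name for name, path in runtime_layout(home).items() if not path.is_dir()]
```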

Quest Structure

Each quest lives at ~/DeepScientist/quests/<quest_id>/ as a Git repository:

quest_id/
├── quest.yaml                        # Quest metadata
├── brief.md                          # Research brief
├── plan.md                           # Implementation plan
├── status.md                         # Current status
├── SUMMARY.md                        # Quest summary
└── .ds/
    ├── runtime_state.json            # Runtime state
    ├── user_message_queue.json       # User message queue
    ├── events.jsonl                  # Event log
    └── interaction_journal.jsonl     # Interaction history
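
Because all durable quest state lives in these files, a quest can be inspected with plain file reads. The sketch below lists which documented files are absent and reads the event log. It assumes that `events.jsonl` holds one JSON object per line (implied by the extension, not confirmed here) and makes no assumption about event field names. `missing_quest_files` and `read_events` are hypothetical helpers, and a quest may legitimately lack some files early in its life.

```python
import json
from pathlib import Path

# Relative paths taken from the quest tree above.
QUEST_FILES = (
    "quest.yaml",
    "brief.md",
    "plan.md",
    "status.md",
    "SUMMARY.md",
    ".ds/runtime_state.json",
    ".ds/user_message_queue.json",
    ".ds/events.jsonl",
    ".ds/interaction_journal.jsonl",
)


def missing_quest_files(quest_root: Path) -> list:
    """Return the documented files that are absent from a quest directory."""
    return [rel for rel in QUEST_FILES if not (quest_root / rel).is_file()]


def read_events(quest_root: Path) -> list:
    """Parse .ds/events.jsonl, assuming one JSON object per non-empty line."""
    path = quest_root / ".ds" / "events.jsonl"
    if not path.is_file():
        return []
    with path.open(encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]
```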

Core Subsystems

Python Runtime (src/deepscientist/)

cli.py - CLI commands
daemon/ - Web server, API routes, quest execution coordination
quest/ - Quest creation, snapshots, state persistence
artifact/ - Git-backed structured artifacts
memory/ - Global and quest-scoped memory
bash_exec/ - Managed shell sessions
mcp/ - MCP server implementation
runners/ - Runner implementations (Codex, Claude, OpenCode)
bridges/ - Connector transport adaptation
channels/ - Connector delivery and runtime
skills/ - Skill discovery and installation
prompts/ - Prompt builder
runtime_tools/ - Managed local tools (TinyTeX, etc.)

Prompts (src/prompts/)

system.md - Core system prompt
system_copilot.md - Copilot mode prompt
connectors/ - Connector-specific prompts
contracts/ - Contract definitions
benchstore/ - BenchStore prompts
start_setup/ - Setup prompts

Skills (src/skills/)

Stage skills that define the research workflow:

intake-audit/ - Initial quest setup
scout/ - Literature review
baseline/ - Baseline reproduction
idea/ - Idea generation
experiment/ - Experiment execution
analysis-campaign/ - Analysis
write/ - Paper writing
figure-polish/ - Figure refinement
review/ - Paper review
rebuttal/ - Rebuttal generation
finalize/ - Final deliverables

Web UI (src/ui/) - React-based workspace

TUI (src/tui/) - Terminal interface

Key Contracts

Public MCP Surface

Only three public MCP namespaces exist:

memory - Memory operations
artifact - Git-backed artifacts and quest operations
bash_exec - Managed shell execution

Do not add new public namespaces like git, connector, or runtime_tool.
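
One way to keep this rule from eroding is a guard that a test or review script can call. The sketch below is illustrative only: `PUBLIC_MCP_NAMESPACES` and `assert_public_namespace` are hypothetical names, and nothing here reflects how `src/deepscientist/mcp/` actually registers namespaces.

```python
# The three public namespaces listed above.
PUBLIC_MCP_NAMESPACES = frozenset({"memory", "artifact", "bash_exec"})


def assert_public_namespace(namespace: str) -> str:
    """Return the namespace unchanged, or raise if it is not one of the three public ones."""
    if namespace not in PUBLIC_MCP_NAMESPACES:
        allowed = ", ".join(sorted(PUBLIC_MCP_NAMESPACES))
        raise ValueError(f"{namespace!r} is not a public MCP namespace (allowed: {allowed})")
    return namespace
```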

Bash Execution Contract

Critical: All terminal operations MUST use bash_exec(...). Native shell_command is forbidden.

This includes: ls, cat, git, python, npm, uv, file inspection, package management, etc.

Quest Layout Contract

Defined in src/deepscientist/quest/layout.py. Changes require updating:

Quest services
Daemon handlers
UI/TUI consumers
Tests

Skill Contract

Skills are discovered from src/skills/<skill_id>/SKILL.md with frontmatter:

---
name: skill-name
description: One-line purpose
skill_role: stage|companion|custom
skill_order: 60
---
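
A new `SKILL.md` can be sanity-checked before it is committed. The following is a minimal sketch, not the project's own loader: it handles only flat `key: value` lines rather than full YAML, and it assumes that all four fields shown above are required and that `skill_order` is a non-negative integer. `parse_frontmatter` and `validate_skill` are hypothetical names.

```python
def parse_frontmatter(text: str) -> dict:
    """Parse flat `key: value` frontmatter delimited by `---` lines."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("missing opening '---'")
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields  # closing delimiter reached
        if not line.strip():
            continue
        key, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"not a 'key: value' line: {line!r}")
        fields[key.strip()] = value.strip()
    raise ValueError("missing closing '---'")


# Assumption: every field in the example above is mandatory.
REQUIRED_FIELDS = ("name", "description", "skill_role", "skill_order")
ALLOWED_ROLES = ("stage", "companion", "custom")


def validate_skill(fields: dict) -> list:
    """Return a list of problems; an empty list means the fields look valid."""
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in fields]
    if "skill_role" in fields and fields["skill_role"] not in ALLOWED_ROLES:
        problems.append(f"unknown skill_role: {fields['skill_role']}")
    if "skill_order" in fields and not fields["skill_order"].isdigit():
        problems.append("skill_order is not a non-negative integer")
    return problems
```

The authoritative checks are the ones in tests/test_skill_contracts.py; this sketch is only a quick pre-commit aid.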

Runner Contract

Three built-in runners:

codex (primary, battle-tested)
claude (experimental)
opencode (experimental)

Runners are registered in src/deepscientist/runners/ and discovered via registry.
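
For readers unfamiliar with the pattern, a small explicit registry can be as simple as a dictionary plus two functions. The sketch below illustrates the idea only; it is not the code in src/deepscientist/runners/, and `register_runner`, `get_runner`, and the factory signature are all hypothetical.

```python
from typing import Callable

# Hypothetical registry: maps a runner name to a factory that builds the runner.
_RUNNERS = {}


def register_runner(name: str, factory: Callable[[], object]) -> None:
    """Register a factory under a unique name; duplicate names are rejected."""
    if name in _RUNNERS:
        raise ValueError(f"runner already registered: {name}")
    _RUNNERS[name] = factory


def get_runner(name: str) -> object:
    """Build the runner registered under `name`, or raise KeyError listing known names."""
    try:
        factory = _RUNNERS[name]
    except KeyError:
        raise KeyError(f"unknown runner {name!r}; known: {sorted(_RUNNERS)}") from None
    return factory()
```

Keeping registration explicit means every available runner is visible in one place, with no import-time scanning or plugin discovery to debug.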

Development Workflow

Adding a New MCP Tool

Add handler in src/deepscientist/mcp/server.py under appropriate namespace
Update src/deepscientist/runners/codex.py approval policy if needed
Update docs: docs/en/07_MEMORY_AND_MCP.md, docs/en/14_PROMPT_SKILLS_AND_MCP_GUIDE.md
Add tests in tests/test_mcp_servers.py

Adding a New Skill

Create src/skills/<skill_id>/SKILL.md with frontmatter
Update src/deepscientist/skills/registry.py if canonical stage/companion
Update src/deepscientist/prompts/builder.py if stage needs memory plan
Add tests in tests/test_stage_skills.py, tests/test_skill_contracts.py

Adding a New Connector

Add config in src/deepscientist/config/models.py
Add validation in src/deepscientist/config/service.py
Add bridge in src/deepscientist/bridges/connectors.py
Register in src/deepscientist/bridges/builtins.py
Register channel in src/deepscientist/channels/builtins.py
Wire daemon lifecycle in src/deepscientist/daemon/app.py if needed
Add prompt in src/prompts/connectors/<connector>.md if needed
Add tests for config, bridge, and API

Adding a Managed Runtime Tool

Create provider in src/deepscientist/runtime_tools/<tool>.py
Implement: tool_name, status(), install(), resolve_binary()
Register in src/deepscientist/runtime_tools/builtins.py
Access via RuntimeToolService, not direct imports
Install under ~/DeepScientist/runtime/tools/
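
The four members named above suggest a provider shape roughly like the following. This is a sketch under stated assumptions: the return types, the `RuntimeToolProvider` protocol, and the `PathLookupProvider` class are hypothetical, and the real signatures in src/deepscientist/runtime_tools/ may differ. The toy provider only looks for an existing binary on PATH and does not install anything.

```python
import shutil
from typing import Optional, Protocol


class RuntimeToolProvider(Protocol):
    """Assumed shape of a provider; the real signatures may differ."""

    tool_name: str

    def status(self) -> dict: ...
    def install(self) -> None: ...
    def resolve_binary(self) -> Optional[str]: ...


class PathLookupProvider:
    """Toy provider that only looks up an existing executable on PATH."""

    def __init__(self, tool_name: str) -> None:
        self.tool_name = tool_name

    def resolve_binary(self) -> Optional[str]:
        # shutil.which returns the full path to the executable, or None if absent.
        return shutil.which(self.tool_name)

    def status(self) -> dict:
        binary = self.resolve_binary()
        return {"tool": self.tool_name, "installed": binary is not None, "binary": binary}

    def install(self) -> None:
        # A real provider would install under ~/DeepScientist/runtime/tools/.
        raise NotImplementedError("this toy provider does not install anything")
```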

Testing

# Run all tests
pytest

# Run specific test file
pytest tests/test_daemon_api.py

# Run with verbose output
pytest -v

# Run with coverage
pytest --cov=src/deepscientist

Key test files:

tests/test_daemon_api.py - API contract tests
tests/test_mcp_servers.py - MCP tool tests
tests/test_stage_skills.py - Skill tests
tests/test_codex_runner.py - Runner tests
tests/test_connector_bridges.py - Connector tests
tests/test_benchstore.py - BenchStore tests

Code Style and Principles

Keep files simple with direct control flow
Avoid unnecessary abstraction layers
Use small, explicit registries
Prefer file- and Git-based state over hidden runtime state
One quest = one Git repository (never violate this)
Python is the authoritative runtime; npm is the launcher
Prompts and skills carry workflow behavior, not rigid schedulers

Documentation

User docs: docs/en/ and docs/zh/
Architecture: docs/en/90_ARCHITECTURE.md
Development: docs/en/91_DEVELOPMENT.md
Contributing: CONTRIBUTING.md

Update docs when behavior or architecture changes.

Common Pitfalls

Do not use native shell_command - always use bash_exec(...)
Do not add new public MCP namespaces beyond memory, artifact, bash_exec
Do not bypass quest layout contracts - update all consumers together
Do not commit node_modules/, dist/, __pycache__/, or local secrets
Do not make quests anything other than Git repositories
Do not add connector-specific logic to unrelated quest code
Do not put workflow logic in daemon schedulers - use prompts and skills

Release Checklist

Before publishing:

Python tests pass (pytest)
Web and TUI bundles build (npm run ui:build, npm run tui:build)
Packaging validates (npm pack --dry-run --ignore-scripts)
README and docs match current behavior
New config/route/state fields have tests
