Make integration test model fully configurable #34

Description

@w-winter

Problem
The integration test harness documents PI_TEST_MODEL as the way to choose the model used by LLM-backed tests, but the test agent definitions also hardcode a provider-specific model in frontmatter.

Current relevant paths:

  • .pi/skills/run-integration-tests/SKILL.md
    • documents PI_TEST_MODEL
  • test/integration/harness.ts
    • defines TEST_MODEL = process.env.PI_TEST_MODEL ?? ...
    • starts the parent Pi session with --model ${TEST_MODEL}
    • copies test agents from test/integration/agents/ into the temp project .pi/agents/
  • test/integration/agents/test-echo.md
    • hardcodes model: anthropic/claude-haiku-4-5
  • test/integration/agents/test-ping.md
    • hardcodes model: anthropic/claude-haiku-4-5

So PI_TEST_MODEL only reliably affects the parent/orchestrator Pi session. Subagents launched via agent: "test-echo" or agent: "test-ping" can still pick up the hardcoded model from the copied agent frontmatter, regardless of what PI_TEST_MODEL is set to.
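
To make the mismatch concrete, here is a rough sketch of a check that lists which test agents pin a model different from the one requested. Both function names are hypothetical and nothing like this exists in the harness today; it also assumes the agent files use a leading `---` frontmatter block with a single-line `model:` key, which is what the two files above look like.

```typescript
// Hypothetical helper for illustration; not part of the current harness.
// Assumes a leading '---' frontmatter block with a single-line `model:` key.
function readFrontmatterModel(markdown: string): string | undefined {
  const frontmatter = /^---\r?\n([\s\S]*?)\r?\n---/.exec(markdown);
  if (!frontmatter) return undefined;
  const modelLine = /^model[ \t]*:[ \t]*(\S.*?)[ \t]*$/m.exec(frontmatter[1]);
  return modelLine ? modelLine[1] : undefined;
}

// Returns the names of agents whose frontmatter pins a model that differs
// from the requested test model. `agents` maps agent name to file content.
function findPinnedAgents(
  agents: Record<string, string>,
  testModel: string,
): string[] {
  return Object.entries(agents)
    .filter(([, markdown]) => {
      const pinned = readFrontmatterModel(markdown);
      return pinned !== undefined && pinned !== testModel;
    })
    .map(([name]) => name);
}
```

With the current agent files and PI_TEST_MODEL set to a non-Anthropic model, a check like this would flag both test-echo and test-ping.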

That can block contributors and fork maintainers who aren't running Anthropic models over the API or via "extra usage." Hardcoding a different provider-specific model locally works as a temporary hack, but IMO it shouldn't be necessary.

--
Expected behavior

I think there should be one integration-test model setting, and it should apply consistently to both:

  1. the parent/orchestrator Pi session started by the integration harness
  2. subagent sessions launched through test agents such as test-echo and test-ping

So, if someone runs:

PI_TEST_MODEL=openai-codex/gpt-5.4-mini npm run test:integration

or:

PI_TEST_MODEL=anthropic/claude-haiku-4-5 npm run test:integration

then all LLM-backed parts of the integration test run would use that model without requiring edits to checked-in markdown agent files.
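
One possible way to get there, as a sketch only: since the harness already copies the test agents into the temp project, it could rewrite the `model:` line during that copy. `overrideAgentModel` is a hypothetical name, and the frontmatter handling assumes a leading `---` block with a single-line `model:` key. Other approaches, such as dropping `model:` from the agent files entirely, might fit better depending on how Pi resolves subagent models.

```typescript
// Hypothetical helper; a sketch of one option, not the harness's actual code.
// Replaces (or adds) the `model:` key in a markdown file's frontmatter.
function overrideAgentModel(markdown: string, model: string): string {
  // Match a leading frontmatter block delimited by '---' lines.
  const match = /^---\r?\n([\s\S]*?)\r?\n---(\r?\n|$)/.exec(markdown);
  if (!match) {
    // No frontmatter found: prepend one that only sets the model.
    return `---\nmodel: ${model}\n---\n${markdown}`;
  }

  let replaced = false;
  const lines = match[1].split(/\r?\n/).map((line) => {
    if (/^model[ \t]*:/.test(line)) {
      replaced = true;
      return `model: ${model}`;
    }
    return line;
  });
  // Frontmatter exists but has no model key: append one.
  if (!replaced) lines.push(`model: ${model}`);

  // Everything after the closing '---' line is kept unchanged.
  const rest = markdown.slice(match[0].length);
  return `---\n${lines.join("\n")}\n---\n${rest}`;
}
```

The harness could then pass each agent file's content through this with TEST_MODEL before writing it into the temp project's .pi/agents/, leaving the checked-in files untouched.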
