Problem
The integration test harness documents PI_TEST_MODEL as the way to choose the model used by LLM-backed tests, but the test agent definitions also hardcode a provider-specific model in frontmatter.
Current relevant paths:
.pi/skills/run-integration-tests/SKILL.md
test/integration/harness.ts
- defines
TEST_MODEL = process.env.PI_TEST_MODEL ?? ...
- starts the parent Pi session with
--model ${TEST_MODEL}
- copies test agents from
test/integration/agents/ into the temp project .pi/agents/
test/integration/agents/test-echo.md
- hardcodes
model: anthropic/claude-haiku-4-5
test/integration/agents/test-ping.md
- hardcodes
model: anthropic/claude-haiku-4-5
So, PI_TEST_MODEL only reliably affects the parent/orchestrator Pi session. Subagents launched via agent: "test-echo" or agent: "test-ping" can still inherit the hardcoded model from the copied agent frontmatter.
That can create blockers for contributors and fork maintainers who aren't running Anthropic models over API or "extra usage." Hardcoding a different provider-specific model locally works as a temporary hack, but IMO it shouldn't be necessary
--
Expected behavior
I think there should be one integration-test model setting, and it should apply consistently to both:
- the parent/orchestrator Pi session started by the integration harness
- subagent sessions launched through test agents such as
test-echo and test-ping
So, if someone runs:
PI_TEST_MODEL=openai-codex/gpt-5.4-mini npm run test:integration
or:
PI_TEST_MODEL=anthropic/claude-haiku-4-5 npm run test:integration
then all LLM-backed parts of the integration test run would use that model without requiring edits to checked-in markdown agent files.
Problem
The integration test harness documents
PI_TEST_MODELas the way to choose the model used by LLM-backed tests, but the test agent definitions also hardcode a provider-specific model in frontmatter.Current relevant paths:
.pi/skills/run-integration-tests/SKILL.mdPI_TEST_MODELtest/integration/harness.tsTEST_MODEL = process.env.PI_TEST_MODEL ?? ...--model ${TEST_MODEL}test/integration/agents/into the temp project.pi/agents/test/integration/agents/test-echo.mdmodel: anthropic/claude-haiku-4-5test/integration/agents/test-ping.mdmodel: anthropic/claude-haiku-4-5So,
PI_TEST_MODELonly reliably affects the parent/orchestrator Pi session. Subagents launched viaagent: "test-echo"oragent: "test-ping"can still inherit the hardcoded model from the copied agent frontmatter.That can create blockers for contributors and fork maintainers who aren't running Anthropic models over API or "extra usage." Hardcoding a different provider-specific model locally works as a temporary hack, but IMO it shouldn't be necessary
--
Expected behavior
I think there should be one integration-test model setting, and it should apply consistently to both:
test-echoandtest-pingSo, if someone runs:
or:
then all LLM-backed parts of the integration test run would use that model without requiring edits to checked-in markdown agent files.