Priority: Medium
PawBench requires a live LLM endpoint, so it can't run in CI without a real server.
Proposal
- `--mock` mode: Ship recorded responses for each built-in scenario. Tests run against these without needing an endpoint. Good for CI, contributor testing.
- `--docker` mode: Spin up a local vLLM/Ollama container with a small model (qwen3-0.6b) for integration testing. Slow but fully self-contained.
- Record mode: `pawbench --record responses/` saves actual API responses as fixtures for future `--mock` runs.
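
A rough sketch of how record and mock modes could share one fixture format. Everything here is hypothetical and not taken from the PawBench codebase: the `run_scenario` helper, the `mode` values, the one-JSON-file-per-scenario layout, and the `call_endpoint` callable are all assumptions made for illustration.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Callable, Dict


def fixture_path(fixture_dir: Path, scenario: str) -> Path:
    """Assumed layout: one JSON fixture per scenario, e.g. responses/<scenario>.json."""
    return fixture_dir / f"{scenario}.json"


def run_scenario(
    scenario: str,
    prompt: str,
    call_endpoint: Callable[[str], Dict[str, Any]],
    fixture_dir: Path,
    mode: str = "live",
) -> Dict[str, Any]:
    """Return the response for a scenario.

    mode="mock"   -> replay the saved fixture, never touching the endpoint
    mode="record" -> call the endpoint and save the response as a fixture
    mode="live"   -> call the endpoint without saving anything
    """
    path = fixture_path(fixture_dir, scenario)

    if mode == "mock":
        if not path.exists():
            raise FileNotFoundError(f"no recorded fixture for scenario {scenario!r}")
        return json.loads(path.read_text(encoding="utf-8"))["response"]

    response = call_endpoint(prompt)

    if mode == "record":
        fixture_dir.mkdir(parents=True, exist_ok=True)
        payload = {"scenario": scenario, "prompt": prompt, "response": response}
        path.write_text(json.dumps(payload, indent=2), encoding="utf-8")

    return response


if __name__ == "__main__":
    # Stand-in for a real LLM client; a real run would hit the live endpoint.
    def fake_endpoint(prompt: str) -> Dict[str, Any]:
        return {"text": f"echo: {prompt}"}

    with tempfile.TemporaryDirectory() as tmp:
        responses = Path(tmp) / "responses"
        run_scenario("greeting", "hello", fake_endpoint, responses, mode="record")
        replayed = run_scenario("greeting", "ignored", fake_endpoint, responses, mode="mock")
        print(replayed)
```

Storing the prompt next to the response would let a later check flag fixtures that have gone stale after a scenario's prompt changes, though whether that is worth doing is a design call for the maintainers.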