Phase P6 of the orchestration × complexity matrix work.
- New mode: pawbench --context-tier manifest-only forces context compilation to emit only MANIFEST entries (no FULL/SKELETON/SUMMARY)
- One scheduled weekly run for trend data
- Result published alongside standard Pawbench in the leaderboard, with a visible delta column
Gate to P7: ≥ 4 weeks of trend data.
Why: we've assumed FULL/SKELETON tiers help. Fabian's spec was 50 pages of prose with zero code snippets and the winning agent still hit 85%. If manifest-only Pawbench is within 10% of standard Pawbench, the aggressive context tiers are buying us less than we think. Whichever way the comparison lands, it's a publishable finding.
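
A minimal sketch of how the delta column, the 10% check, and the P7 gate could be computed from weekly results. Everything here is an assumption for illustration: the WeeklyRun record, the function names, and the week label format are hypothetical, and "within 10%" is read as relative to the standard score (it could equally mean 10 percentage points).

```python
from dataclasses import dataclass


@dataclass
class WeeklyRun:
    """One scheduled weekly run (hypothetical record shape)."""
    week: str             # week label, e.g. "W1" (format is an assumption)
    standard: float       # standard Pawbench score, in percent
    manifest_only: float  # manifest-only Pawbench score, in percent


def delta(run: WeeklyRun) -> float:
    """Delta column: manifest-only minus standard, in percentage points."""
    return run.manifest_only - run.standard


def within_threshold(run: WeeklyRun, threshold: float = 0.10) -> bool:
    """True if manifest-only is within `threshold` of standard.

    Assumption: the threshold is relative to the standard score.
    """
    return abs(delta(run)) <= threshold * run.standard


def gate_to_p7(runs: list[WeeklyRun], min_weeks: int = 4) -> bool:
    """Gate to P7: at least `min_weeks` distinct weeks of trend data."""
    return len({r.week for r in runs}) >= min_weeks
```

With this reading, a standard score of 80 and a manifest-only score of 74 gives a delta of -6 points, which is inside the 8-point band (10% of 80). A manifest-only score of 70 would fall outside it. These numbers are illustrative only, not measured results.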