Merged
1 change: 0 additions & 1 deletion .github/workflows/ci.yml
@@ -19,7 +19,6 @@ jobs:
 --exclude-loopback
 --max-redirects 5
 --accept '200..=204,206,301..=308,403,429'
---exclude 'github.com/zenprocess/switchyard'
 './**/*.md'
 fail: true

14 changes: 8 additions & 6 deletions README.md
@@ -53,14 +53,16 @@ CACP is part of the [standra.ai](https://standra.ai) open standards stack:
 - **[Pawbench](https://github.com/zenprocess/pawbench)** — reference benchmark that scores LLMs against CACP-formatted prompts and responses, including the orchestration × complexity matrix.
 - **[ServingCard](https://servingcard.dev)** — model serving config standard.

-## Recent additive fields (spec 009)
+## Additive fields for orchestration & complexity reporting

-CACP responses can carry these spec 009 fields (see [switchyard spec 009](https://github.com/zenprocess/switchyard/blob/main/specs/009-pawbench-orchestration-axis/spec.md) for the operational mapping):
+CACP responses can carry these additional fields for runners that report multi-dimensional dispatch results. Vocabularies are normative in [Axiom §17](https://github.com/zenprocess/axiom/blob/main/spec.md):

-- `complexity_tier` — `display` / `crud` / `transactional` / `cross_cutting`
-- `verification_runs[]` — N-run AC re-verification with per-run verdict + prompt hash
-- `artifact_quality` — static-analysis score over changed files
-- `fixture_gap` (status) — AC un-evaluable due to missing setup, NOT counted against the agent
+- `complexity_tier` — `display` / `crud` / `transactional` / `cross_cutting`. Stratifies aggregate scores so display-tier passes don't mask transactional-tier cliffs.
+- `verification_runs[]` — N-run AC re-verification, one record per run with `verdict`, `prompt_hash`, `elapsed_ms`. Surfaces verifier flake.
+- `artifact_quality` — static-analysis score over the *changed files only*; orthogonal to AC pass.
+- `fixture_gap` (terminal status) — AC un-evaluable due to missing setup (seed data, env, services). **Not counted against the agent** — counts against the scenario author.
+
+[Pawbench](https://github.com/zenprocess/pawbench) is the reference implementation of these fields end-to-end.

 ## License

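For readers of the README hunk above, the following is a minimal Python sketch of how a consumer might use the new fields. It is not taken from CACP, Axiom, or Pawbench. The field names `complexity_tier`, `verification_runs`, `verdict`, `prompt_hash`, `elapsed_ms` and the `fixture_gap` status come from the diff; the `Result` and `VerificationRun` classes, the `status` attribute name, the `"pass"` / `"fail"` values, and both helper functions are assumptions made for illustration only.

```python
from collections import defaultdict
from dataclasses import dataclass, field

# Tier vocabulary as listed in the README diff.
TIERS = ("display", "crud", "transactional", "cross_cutting")


@dataclass
class VerificationRun:
    verdict: str      # assumed values: "pass" / "fail"
    prompt_hash: str
    elapsed_ms: int


@dataclass
class Result:
    # Hypothetical container; the real CACP response layout may differ.
    complexity_tier: str
    status: str       # assumed values: "pass", "fail", or "fixture_gap"
    verification_runs: list[VerificationRun] = field(default_factory=list)


def is_flaky(result: Result) -> bool:
    """True when repeated verification runs disagree on the verdict."""
    return len({run.verdict for run in result.verification_runs}) > 1


def pass_rate_by_tier(results: list[Result]) -> dict[str, float]:
    """Pass rate per complexity tier, skipping fixture_gap results."""
    passed: dict[str, int] = defaultdict(int)
    scored: dict[str, int] = defaultdict(int)
    for r in results:
        if r.status == "fixture_gap":
            continue  # un-evaluable: not counted for or against the agent
        scored[r.complexity_tier] += 1
        if r.status == "pass":
            passed[r.complexity_tier] += 1
    return {t: passed[t] / scored[t] for t in TIERS if scored[t]}


results = [
    Result("display", "pass"),
    Result("display", "pass"),
    Result(
        "transactional",
        "fail",
        [VerificationRun("pass", "abc", 1200), VerificationRun("fail", "abc", 1350)],
    ),
    Result("transactional", "fixture_gap"),
]
rates = pass_rate_by_tier(results)
print(rates)
print(is_flaky(results[2]))
```

Reporting one rate per tier keeps the two display-tier passes from hiding the transactional-tier failure, and the `fixture_gap` result is left out of the denominator rather than scored as a failure.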