Clawmark: lower catalog trust gate to the 0-5 scale (threshold 3.0) by ryan10sa-star · Pull Request #12 · relayforge-ai/carapace-protocol

ryan10sa-star · 2026-05-22T02:12:22Z

Summary

Companion to aria-registry#8. ARIA is migrating Clawmark scores to the canonical 0–5 scale (CANONICAL_CLAWMARK_STANDARD.md). The Carapace SDK catalog gate compared clawmark_score against 80 (the old 0–100 scale), so once ARIA serves 0–5 scores, run_gate_check / runGateCheck would fail closed for every tool: no score on the new scale can reach 80.

Scope note: the original Phase B spec said "do not change the Carapace SDK". The maintainer explicitly authorized this change — leaving the threshold at 80 guarantees a broken gate the moment ARIA flips to 0–5, so the SDK gate must move in lockstep.

The gate threshold is lowered to 3.0 (= beta or above).

Changes

carapace/catalog.py — CatalogEntry.clawmark_score is now float; from_dict coerces ARIA's null (unscored) to 0.0; run_gate_check score_threshold 80 → 3.0.
typescript/src/catalog.ts — CatalogEntry.clawmark_score is now number | null; runGateCheck scoreThreshold 80 → 3.0; the gate treats null/unscored as 0.
tests/test_v05_phase_b.py — catalog fixtures converted 0–100 → 0–5 (85→4.5, 50→2.0, 90→4.8, 75/80→4.0); custom-threshold test uses 1.0; new test_from_dict_handles_null_score.
typescript/test/v05_receipts.test.js — makeCatalogState fixtures converted 0–100 → 0–5.
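The Python-side changes listed above can be sketched roughly as follows. This is a minimal illustration, not the actual contents of carapace/catalog.py: the `tool_id` field, the `passes_clawmark_gate` helper, and the use of `>=` for the comparison are assumptions made for the example; only `CatalogEntry`, `from_dict`, `clawmark_score`, the null-to-0.0 coercion, and the 3.0 threshold come from this PR.

```python
from dataclasses import dataclass
from typing import Any, Mapping

# The PR moves the gate threshold from 80 (0-100 scale) to 3.0 (0-5 scale).
DEFAULT_SCORE_THRESHOLD = 3.0


@dataclass(frozen=True)
class CatalogEntry:
    """Trimmed-down, hypothetical stand-in for the SDK's CatalogEntry."""

    tool_id: str           # assumed field name, for illustration only
    clawmark_score: float  # 0-5 scale; unscored entries are stored as 0.0

    @classmethod
    def from_dict(cls, data: Mapping[str, Any]) -> "CatalogEntry":
        raw = data.get("clawmark_score")
        # ARIA sends null for unscored tools; coerce it to 0.0 so the
        # gate comparison never sees None.
        score = 0.0 if raw is None else float(raw)
        return cls(tool_id=str(data.get("tool_id", "")), clawmark_score=score)


def passes_clawmark_gate(
    entry: CatalogEntry, score_threshold: float = DEFAULT_SCORE_THRESHOLD
) -> bool:
    """Hypothetical helper: True when the entry meets the threshold.

    Assumes an inclusive comparison ("3.0 or above").
    """
    return entry.clawmark_score >= score_threshold
```

Under these assumptions an unscored (null) entry is treated the same as a score of 0.0 and fails the default gate, which matches the behaviour described for both the Python and TypeScript sides.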

Notes

certification_tier is left on CatalogEntry for backward compatibility — ARIA no longer emits it (it becomes ""), but nothing breaks.
No change to gate ordering, fail-open semantics, or the receipt API.

Test plan

pytest tests/test_v05_phase_b.py
node --test typescript/test/v05_receipts.test.js (after tsc build of typescript/)
Confirm against aria-registry#8 that /aria/v1/catalog emits 0–5 clawmark_score
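For the last step of the test plan, a small check like the one below could help spot scores that are still on the old scale. It is only a sketch: it assumes the catalog response has already been decoded into a list of objects that each carry a `clawmark_score` key, which this PR does not confirm, and the function name is made up for the example.

```python
from typing import Any, Iterable, List, Mapping, Tuple


def out_of_range_scores(
    entries: Iterable[Mapping[str, Any]],
) -> List[Tuple[int, Any]]:
    """Return (index, score) for every entry whose score is not on the 0-5 scale.

    A null score (None) is accepted, since ARIA uses it for unscored tools.
    """
    bad = []
    for index, entry in enumerate(entries):
        score = entry.get("clawmark_score")
        if score is None:
            continue
        is_number = isinstance(score, (int, float)) and not isinstance(score, bool)
        if not is_number or not 0 <= score <= 5:
            bad.append((index, score))
    return bad
```

A range check cannot tell a legitimate 0-5 score from an old 0-100 score that happens to be 5 or lower, so an empty result is a necessary signal rather than proof that the migration is complete.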

https://claude.ai/code/session_01Sm3am5LeiJqJgEkW7oD2vu

Generated by Claude Code


ARIA now serves Clawmark scores on the canonical 0-5 scale (CANONICAL_CLAWMARK_STANDARD.md). The Carapace SDK catalog gate compared against 80 (the old 0-100 scale), which would fail-closed for every tool once ARIA flips to 0-5.

- catalog.py: CatalogEntry.clawmark_score is now float; from_dict coerces ARIA's null (unscored) to 0.0; run_gate_check score_threshold 80 -> 3.0.
- catalog.ts: CatalogEntry.clawmark_score is now `number | null`; runGateCheck scoreThreshold 80 -> 3.0; gate treats null as 0.
- test_v05_phase_b.py: catalog fixtures converted 0-100 -> 0-5 (85->4.5, 50->2.0, 90->4.8, 75/80->4.0); custom-threshold test uses 1.0; new test for the null-score coercion.

Companion to aria-registry#8.

makeCatalogState used 0-100 scores (85/50/90); on the 0-5 scale a 50 would pass the gate, so the low-score fixture no longer exercised the clawmark_gate failure. 85->4.5, 50->2.0, 90->4.8. Companion to aria-registry#8.
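The fixture problem described here is plain arithmetic, and a few lines make it concrete. The fixture names and the `gate_passes` helper below are hypothetical, and the inclusive `>=` comparison is an assumption; the scores and thresholds are the ones quoted in this PR.

```python
OLD_THRESHOLD = 80   # gate threshold on the old 0-100 scale
NEW_THRESHOLD = 3.0  # gate threshold on the 0-5 scale


def gate_passes(score: float, threshold: float) -> bool:
    """Hypothetical stand-in for the clawmark gate comparison."""
    return score >= threshold


def gate_results(fixtures: dict, threshold: float) -> dict:
    """Map each fixture name to whether it passes the gate."""
    return {name: gate_passes(score, threshold) for name, score in fixtures.items()}


# Fixture names are illustrative; the values are the ones quoted above.
stale_fixtures = {"high": 85, "low": 50, "other": 90}
converted_fixtures = {"high": 4.5, "low": 2.0, "other": 4.8}

# Stale 0-100 fixtures against the new threshold: everything passes,
# so the "low" case no longer exercises the gate failure.
print(gate_results(stale_fixtures, NEW_THRESHOLD))

# Converted fixtures against the new threshold: "low" fails again.
print(gate_results(converted_fixtures, NEW_THRESHOLD))

# 0-5 scores against the old threshold: every tool is rejected.
print(gate_results(converted_fixtures, OLD_THRESHOLD))
```

The third call shows the other direction of the mismatch: leaving the threshold at 80 while scores move to 0-5 rejects every tool.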

The CI `python` job runs pytest in `python/`, which holds a second copy of the SDK (python/carapace, python/tests) byte-identical to the one at the repo root. The previous commit only updated the root copy, so the CI-tested package still gated at 80 against the 0-5 test fixtures and the python job failed.

Apply the same 0-5 threshold change (score_threshold 80 -> 3.0, float clawmark_score, null-score coercion, fixtures converted to 0-5) to python/carapace/catalog.py and python/tests/test_v05_phase_b.py so the two copies stay in sync.
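One way to catch this kind of drift before CI does is a byte-for-byte comparison of the mirrored files. The sketch below is not part of this PR: the `find_drift` helper is hypothetical, and the file list covers only the two pairs named in the commit message.

```python
import filecmp
from pathlib import Path
from typing import Iterable, List, Tuple

# The two mirrored pairs named in the commit message (root copy, python/ copy).
MIRRORED_FILES = [
    ("carapace/catalog.py", "python/carapace/catalog.py"),
    ("tests/test_v05_phase_b.py", "python/tests/test_v05_phase_b.py"),
]


def find_drift(
    repo_root: str, pairs: Iterable[Tuple[str, str]] = MIRRORED_FILES
) -> List[Tuple[str, str]]:
    """Return the pairs whose two copies are missing or not byte-identical."""
    root = Path(repo_root)
    drifted = []
    for root_copy, python_copy in pairs:
        a, b = root / root_copy, root / python_copy
        both_exist = a.is_file() and b.is_file()
        # shallow=False forces a content comparison instead of trusting
        # file size and modification time.
        if not both_exist or not filecmp.cmp(a, b, shallow=False):
            drifted.append((root_copy, python_copy))
    return drifted
```

Calling `find_drift(".")` from the repository root and failing when the result is non-empty would flag a commit that updates only one of the two copies.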

Copilot

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

ryan10sa-star added 3 commits May 21, 2026 19:11

Clawmark: update TS catalog test fixtures to the 0-5 scale

b9b78c0

ryan10sa-star marked this pull request as ready for review May 22, 2026 02:32

Copilot AI review requested due to automatic review settings May 22, 2026 02:32

ryan10sa-star merged commit 32b8a0e into main May 22, 2026
1 of 2 checks passed

Copilot started reviewing on behalf of ryan10sa-star May 22, 2026 02:32

Copilot AI reviewed May 22, 2026

ryan10sa-star mentioned this pull request May 23, 2026

FIX: carapace python CI failures after 0-5 Clawmark rescale #14

Merged

3 tasks

ryan10sa-star deleted the claude/laughing-thompson-QNxrw branch June 14, 2026 21:46

ryan10sa-star merged 3 commits into main from claude/laughing-thompson-QNxrw
