
chore(release): v2.21.0 + v2.22.0 (M71 "Honored" + M72 "Mirror") #14

Merged
nesquikm merged 37 commits into main from chore/ste-283-honored-contracts-enforcement
May 15, 2026

Conversation

Owner

@nesquikm nesquikm commented May 13, 2026

Summary

This PR bundles two releases on a single feature branch — M71 + M72 are merged together once M72's new fixture group 8 smoke pass confirms M71's hooks fire correctly at runtime.

M71 — Honored Contracts enforcement (v2.21.0 "Honored")

STE-283 + STE-285 — the toolkit's first explicit anti-falsification design after the STE-220→STE-270 6-FR chain.

  • STE-283 (prose layer): TDD Orchestrator Contract callout at /implement Phase 2 step 8 + Rationalization Prevention table + new docs/honored-contracts.md catalog with 3 seeded contract entries (/implement → /tdd, /spec-write → spec-research, /brainstorm → AskUserQuestion-first).
  • STE-285 (byte-checkable layer): opt-in /setup --hooks menu, 4 seeded Process-category hook scripts under templates/hooks/process/, shared _lib/session.sh (single-line atomic grep -F | grep -F against $CLAUDE_SESSION_FILE, NFR-10 stderr emitter), install_hooks.ts settings.json merge helper (idempotent on match, conflict-surfaced on diff, malformed-JSON SyntaxError), per-hook HOOK_REGISTRATIONS (PreToolUse Bash for three hooks, UserPromptSubmit * for the brainstorm reminder), new docs/hooks-reference.md user manual.
  • Per-project scope (lands in <project>/.claude/settings.json, not ~/.claude/) supersedes the session-wide-bundling rejections (STE-262/STE-270/STE-276).

M72 — STE-285 hooks E2E smoke coverage (v2.22.0 "Mirror")

STE-286 — closes the symmetric coverage gap left by M71. STE-283's TDD Orchestrator Contract already had runtime evidence via fixture group 7 (STE-225 TDD orchestrator forks runtime); STE-285's seeded hooks shipped with per-script unit tests but no end-to-end probe of the install → harness dispatch → NFR-10 refusal pipeline.

  • /smoke-test Phase 2.X fixture group 8 (STE-285 hooks runtime regression), positioned after fixture group 7, mirrors group 7's shape. Diagnostic line "STE-285 runtime regression: <fixture-name>", per the STE-231 AC.5 precedent.
  • Install verification half: drives claude -p /dev-process-toolkit:setup --hooks=all and asserts the canonical 3 PreToolUse:Bash + 1 UserPromptSubmit:* event/matcher split lands in the test-project .claude/settings.json.
  • Runtime refusal probe half (per-scenario): 4 standalone bash subprocesses per leg (one per scenario: pre-commit-gate-check, pre-commit-tdd-orchestrator, pre-pr-spec-review, pre-spec-write-brainstorm-reminder) with CLAUDE_SESSION_FILE=/tmp/dpt-smoke-empty-session-<scenario>.jsonl set inline; per-scenario stderr asserted via grep -F for Refusing: / Reminder: + dev-process-toolkit:<skill> + hook=<name> tokens.
  • $CLAUDE_SESSION_FILE empirical finding (Phase 1): Claude Code's harness clobbers wrapper-shell env values on PreToolUse / UserPromptSubmit subprocess spawn, so AC-STE-286.4's literal "one marker-stripped claude -p child with rotated $CLAUDE_SESSION_FILE" was replaced by the AC-STE-286.7-accepted fallback (4 standalone bash subprocesses with env set inline). Recorded in specs/frs/archive/STE-286.md Notes.
  • /setup --hooks=<value> non-interactive preselect — bundled STE-285 scope expansion. parsePreselectFlag(arg) parser accepts --hooks=all (all 4 seeded names) and --hooks=<comma-list> (validates each against HOOK_REGISTRATIONS, refuses unknown / empty / whitespace-only / commas-only with an NFR-10-shaped error including the known-hooks hint). Documented in setup SKILL.md § 0c and docs/hooks-reference.md.
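The preselect parsing described in the last bullet can be sketched as follows. This is a minimal illustration, not the shipped parser: the four hook names come from the PR text, but the flat name-only registry, the error wording, and the choice to trim whitespace around list items are all assumptions.

```typescript
// Hypothetical registry: the real HOOK_REGISTRATIONS carries event/matcher data too.
const HOOK_REGISTRATIONS = [
  "pre-commit-gate-check",
  "pre-commit-tdd-orchestrator",
  "pre-pr-spec-review",
  "pre-spec-write-brainstorm-reminder",
] as const;

type HookName = (typeof HOOK_REGISTRATIONS)[number];

function parsePreselectFlag(arg: string): HookName[] {
  const prefix = "--hooks=";
  if (!arg.startsWith(prefix)) {
    throw new Error(`Refusing: expected ${prefix}<value>, got "${arg}".`);
  }
  const value = arg.slice(prefix.length);
  if (value === "all") return [...HOOK_REGISTRATIONS];

  // Empty, whitespace-only, and commas-only values all collapse to zero names.
  const names = value
    .split(",")
    .map((item) => item.trim())
    .filter((item) => item.length > 0);
  const known = HOOK_REGISTRATIONS.join(", ");
  if (names.length === 0) {
    throw new Error(`Refusing: --hooks value is empty. Known hooks: ${known}`);
  }
  const unknown = names.filter(
    (name) => !(HOOK_REGISTRATIONS as readonly string[]).includes(name),
  );
  if (unknown.length > 0) {
    throw new Error(
      `Refusing: unknown hook(s) ${unknown.join(", ")}. Known hooks: ${known}`,
    );
  }
  return names as HookName[];
}
```

Every refusal path carries the known-hooks hint, so a smoke driver that mistypes a name sees the valid choices in the same error.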

Test plan

  • bun test → 2327 pass / 0 fail / 11 skip across 219 files (baseline +103 new test assertions across the two releases).
  • Linear: STE-283 + STE-285 (M71) + STE-286 (M72) all transitioned to Done (per-FR Phase 4 Close).
  • Plans + FRs archived (status: active → archived, full ISO-8601 timestamps); traceability rows appended to specs/requirements.md § 6 for AC-STE-283.1..7 / AC-STE-285.1..7 / AC-STE-286.1..7.
  • Release files bumped twice via /ship-milestone: plugin.json + marketplace.json (2.20.0 → 2.21.0 → 2.22.0), CHANGELOG.md (two new sections), README.md "Latest:" banner, requirements.md "Latest shipped release:".
  • Smoke verification (M72 gate): run /smoke-test against this branch with fixture group 8 enabled — assert both legs (Linear + Jira) emit STE-285 runtime check: PASS and the 4 per-scenario logs land at /tmp/dpt-smoke-<tracker>-hooks-runtime-<scenario>.log. Merge after green.
  • Manual sanity: /setup --hooks on a downstream test project, pick pre-commit-gate-check, attempt git commit without first running /gate-check (post-merge validation; non-blocking).

Advisory notes

  • templates/hooks/_lib/session.sh:35-36 — JSONL grep needles assume compact JSON serialization (no space after colon). Test fixtures match the actual Claude Code emit shape; broader format-fragility risk is documented severity:medium in STE-285 § Risks.
  • M72 ships behind the smoke verification it enables — explicit chicken-and-egg per the brainstorm decision. First post-merge smoke run is the first real exercise of fixture group 8.
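The format-fragility risk in the first advisory note can be illustrated with a small sketch. The needle and the record shape below are hypothetical stand-ins (the real needles live in session.sh); the point is only that a fixed-string match written for compact JSON misses the same record once a serializer adds a space after each colon.

```typescript
// Hypothetical needle, modeled on a fixed-string (grep -F style) match.
const needle = '"name":"Skill"';

// Hypothetical record shape, for illustration only.
const record = {
  type: "tool_use",
  name: "Skill",
  input: { skill: "dev-process-toolkit:gate-check" },
};

const compact = JSON.stringify(record); // no whitespace after ':'
const spaced = compact.replace(/":/g, '": '); // same JSON, space after each key's ':'

const compactHit = compact.includes(needle); // the needle matches compact output
const spacedHit = spaced.includes(needle); // the same needle misses the spaced form

// Parsing before matching is insensitive to serialization whitespace.
function mentionsSkillTool(line: string): boolean {
  try {
    const parsed = JSON.parse(line) as { name?: unknown };
    return parsed.name === "Skill";
  } catch {
    return false;
  }
}
```

A parse-then-compare check such as mentionsSkillTool accepts both serializations, at the cost of needing a JSON parser in the hook.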

Refs: STE-283, STE-285, STE-286, M71, M72

nesquikm added 11 commits May 13, 2026 14:33
Honored Contracts enforcement (M71 first FR) — adds Contract block
+ Rationalization table + catalog plan. Prose-only enforcement layer
for cross-skill mandates like /implement → /tdd, with eyes open to
the STE-220→STE-270 chain saying prose-only gets falsified. If it
falsifies again, the evidence-based gate is the documented escalation
(this time we promise to listen).

Refs: STE-283
Tracker ↔ local FR/milestone reconciliation (M70's second FR) —
helper preflight at /spec-write entry + /gate-check probe at gate
boundary. Auto-imports tracker→local orphans silently; prompts for
the ambiguous kinds. Supersedes STE-119's "M-numbers are local"
decision, which got falsified the same day we set out to enforce it.

Also writes specs/plan/M70.md (the milestone existed on Linear only
until now — the exact bug this FR fixes, surfaced and patched in one
/brainstorm → /spec-write loop).

Refs: STE-284
Honored Contracts byte-checkable layer (M71's second FR) — /setup
offers a multi-select menu of opt-in toolkit-contract enforcement
hooks (Process category only; Safety + Quality out of scope). Hooks
plug into the user's .claude/settings.json via ${CLAUDE_PLUGIN_ROOT}
substitution (verified against Claude Code docs + STE-133 precedent).

Together with STE-283's prose layer, this gives M71 a "prose + opt-in
byte-checkable" enforcement bundle for the same contracts. The third
layer (bundled session-wide) stays where STE-262/270/276 left it:
in the rejection bin.

Refs: STE-285
Add labeled "TDD Orchestrator Contract" callout + Rationalization Prevention
table at the top of /implement Phase 2 step 8; cite docs/honored-contracts.md
catalog (3 seeded entries: /implement → /tdd, /spec-write → spec-research,
/brainstorm → AskUserQuestion-first). Prose layer of M71 Honored Contracts
enforcement bundle; operator-acknowledged residual risk vs. STE-220→STE-270
6-FR falsification chain (escalation path: evidence-based gate / hard mechanic).

Refs: STE-283
Byte-checkable layer of M71 Honored Contracts enforcement bundle. Adds:
- /setup multi-select menu prompt step (all hooks default off)
- --hooks flag for re-running only the hooks step (idempotent, pre-checked)
- 4 seeded Process-category hooks under templates/hooks/process/
  (pre-commit-gate-check, pre-pr-spec-review,
  pre-spec-write-brainstorm-reminder, pre-commit-tdd-orchestrator)
- Shared _lib/session.sh helper (atomic single-line JSONL grep, NFR-10 emit)
- install_hooks.ts settings.json merge helper (key-level merge, idempotent
  on same-matcher-same-command, conflict-surfaced on diff)
- docs/hooks-reference.md user manual

Per-project opt-in, bounded blast radius; supersedes STE-262/STE-270/STE-276
session-wide bundling rejections.

Refs: STE-285
Atomic archival: move STE-283 + STE-285 FRs and the M71 plan file into
archive/, flip status: active → archived + set archived_at timestamps,
mark plan ACs [x], append shipped-AC traceability rows for both FRs to
specs/requirements.md § 6, attach Implementation notes (advisory: STE-285
session.sh JSONL needle assumes compact serialization — documented in the
FR's Risks table).

Refs: M71
AC-STE-285.6 doc-conformance test hardcoded `specs/frs/STE-285.md` —
breaks after Phase 4 archival relocates the FR. Fall back to
`specs/frs/archive/STE-285.md` via existsSync so the assertion holds
for the same spec content across the active → archived transition.

Refs: STE-285
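The active-then-archive fallback in the commit above can be sketched as a small resolver. The function name resolveFrPath and the injected exists parameter are hypothetical; the commit itself calls existsSync directly, and the injection here only keeps the sketch runnable without touching the disk.

```typescript
// `exists` is injected for illustration; a real test would pass existsSync from node:fs.
function resolveFrPath(id: string, exists: (path: string) => boolean): string {
  const active = `specs/frs/${id}.md`;
  const archived = `specs/frs/archive/${id}.md`;
  if (exists(active)) return active; // FR not yet archived
  if (exists(archived)) return archived; // FR relocated by Phase 4 archival
  throw new Error(`FR ${id} not found at ${active} or ${archived}`);
}
```

Checking the active path first means the same assertion holds before and after archival without a test edit at release time.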
M71 — Honored Contracts enforcement: prose layer (STE-283) + opt-in
byte-checkable hook layer (STE-285).

Release: v2.21.0 "Honored"
Refs: M71
M72 (STE-285 hooks E2E smoke coverage) — first and only FR. Adds
/smoke-test fixture group 8 (STE-285 hooks runtime regression) covering
install verification + runtime refusal probe for all 4 seeded hooks via
one shared marker-stripped claude -p child spawn with 4 sequenced
scenarios + $CLAUDE_SESSION_FILE rotation. Bundles a small STE-285
scope expansion: /setup --hooks non-interactive preselect flag
(--hooks=all | --hooks=<comma-list>). Symmetric to STE-225 fixture
group 7 (TDD orchestrator forks runtime, per STE-231).

Refs: STE-286
Adds /smoke-test Phase 2.X fixture group 8 covering the M71 STE-285 opt-in
Process-category hooks end-to-end (install verification + 4 per-scenario NFR-10
refusal probes). Bundles a /setup --hooks=<value> non-interactive preselect
flag (parsePreselectFlag) so the smoke driver can install hooks without the
multi-select default-off menu.

The runtime-probe half uses the AC-STE-286.7-accepted fallback (per-scenario
standalone bash subprocesses with $CLAUDE_SESSION_FILE set inline) because
the Claude Code harness clobbers wrapper-shell env values when spawning
PreToolUse / UserPromptSubmit subprocesses — recorded as the empirical
finding in specs/frs/archive/STE-286.md Notes.

Release: deferred — M72 will ship after the new smoke fixture confirms M71's
hooks fire correctly at runtime.

Refs: STE-286
M72 ships /smoke-test fixture group 8 (STE-285 hooks runtime regression).

Release: v2.22.0 "Mirror"
Refs: M72
@nesquikm nesquikm force-pushed the chore/ste-283-honored-contracts-enforcement branch from b89ef88 to d50f4e1 May 13, 2026 16:13
@nesquikm nesquikm changed the title from chore(release): v2.21.0 to chore(release): v2.21.0 + v2.22.0 (M71 "Honored" + M72 "Mirror") May 13, 2026
nesquikm added 17 commits May 13, 2026 20:32
M73 single-FR: hook install path portability fix — install_hooks.ts emits
literal ${CLAUDE_PLUGIN_ROOT} token in args[0] instead of JS-resolved
absolute path. Aligns implementation with STE-285 AC-STE-285.3.

Refs: STE-288, M73
`additionFor` and `readInstalledHookNames` now use the literal
`${CLAUDE_PLUGIN_ROOT}/templates/hooks/process/<name>.sh` token on
both write and read paths so the same `.claude/settings.json` works
across dev clones and marketplace installs (M71/STE-285 design intent).
Hoisted the literal into the `HOOK_ARGS_PREFIX` constant to avoid
write/read drift. No legacy-shape migration (zero shipped users).

Archives STE-288 + M73 plan; appends AC-STE-288.1..5 traceability row.

Refs: STE-288
STE-288 is a pure bug fix; default Added would mis-bump M73 to minor
when patch is correct. Categorize before /ship-milestone runs.

Refs: STE-288
Hook install path portability — emit literal ${CLAUDE_PLUGIN_ROOT}
token across dev clones and marketplace installs.

Files bumped:
- plugins/dev-process-toolkit/.claude-plugin/plugin.json: 2.22.0 → 2.22.1
- .claude-plugin/marketplace.json: plugins[0].version 2.22.0 → 2.22.1
- CHANGELOG.md: + [2.22.1] "Portable" section (Fixed: STE-288)
- README.md: Latest banner v2.22.0 "Mirror" → v2.22.1 "Portable"
- specs/requirements.md: Latest shipped release v2.22.0 → v2.22.1

FRs included: STE-288 (Hook install path portability).
Total test count at release: 2335 tests, 0 failures, 0 errors.

Release: v2.22.1 "Portable"
Refs: M73
M74 single-FR: ship plugin-bundled hooks/hooks.json, rip out /setup
--hooks installer. Falsifies M71/M72/M73 install-side mechanism after
2026-05-14 empirical discovery that the harness only expands
${CLAUDE_PLUGIN_ROOT} in plugin-bundled hooks.json, not in user
.claude/settings.json. Forward-only; zero shipped users. Release
target v2.22.2.

Refs: STE-289
M71/M72/M73's install-side hook mechanism was structurally impossible:
the Claude Code harness only expands ${CLAUDE_PLUGIN_ROOT} in plugin-
bundled <plugin>/hooks/hooks.json, never in user .claude/settings.json.
Operator's 2026-05-14 run of /setup --hooks=all against v2.22.1 surfaced
the harness refusal "Hook command references ${CLAUDE_PLUGIN_ROOT} but
the hook is not associated with a plugin." Empirical research via the
claude-code-guide agent confirmed the contract.

M74 ships plugins/dev-process-toolkit/hooks/hooks.json (harness auto-
discovers + fires across every project where the plugin is enabled) and
rips out the install-side surface: install_hooks.ts + 5 setup tests + 1
stale capability-rows test + section 0c (Contract-enforcement re-
invocation flag) + section 6a (Toolkit-contract enforcement hooks menu)
+ smoke-test SKILL.md fixture group 8. Per-script hook tests and the 4
hook scripts under templates/hooks/process/ survive — only registration
moves.

Reverses STE-262/STE-270/STE-276's rejection of bundled hooks.json — the
rejection grounds (spawn blast radius, triple-check conflict, no clean
per-session state) all remain mitigated as before. Forward-only per
project_no_users_yet — zero shipped marketplace users; no migration
helper. Release notes will carry a one-line manual cleanup instruction
for the operator's downstream stale entries.

Archived STE-289 + M74 in this commit.

Refs: STE-289
STE-289 is a fix-class FR (the M74 plan explicitly states "patch — pure
fix-class; changelog_category: Fixed on STE-289"). Setting the
frontmatter field so /ship-milestone's inferBump picks patch (v2.22.2)
and the CHANGELOG entry lands under "### Fixed" rather than the
defaulted "### Added". Mirrors c38d44c (same fix for STE-288).

Refs: STE-289
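The bump inference that this commit relies on can be sketched as below. Only two facts come from the commit messages: an unset changelog_category defaults to Added, and Added implies a minor bump while Fixed implies a patch. The signature, the two-category type, and the behavior for an empty list are assumptions; the real inferBump in /ship-milestone may handle more categories.

```typescript
type ChangelogCategory = "Added" | "Fixed";
type Bump = "minor" | "patch";

// Hypothetical signature: one entry per FR in the milestone, undefined when unset.
function inferBump(categories: (ChangelogCategory | undefined)[]): Bump {
  // An unset category falls back to "Added", which is what mis-bumped fix-only FRs.
  const resolved = categories.map((category) => category ?? "Added");
  return resolved.includes("Added") ? "minor" : "patch";
}
```

Under this sketch a single-FR milestone whose FR leaves the field unset resolves to a minor bump, which is why the frontmatter is set before /ship-milestone runs.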
M74 — STE-289 hooks bundling redo: ships plugin-bundled hooks.json,
rips out the install-side /setup --hooks installer that was
structurally impossible (harness only expands ${CLAUDE_PLUGIN_ROOT}
in plugin-bundled hooks.json, never in user .claude/settings.json).

Release: v2.22.2 "Bundled"
Refs: M74
Single fix-class FR for M75. Rewrites the broken
$CLAUDE_SESSION_FILE-reading session.sh as a Bun TS library + 4
per-hook TS modules per the empirically verified 2026-05-14 hook
stdin contract. Hooks have silently failed open since v2.21.0
("Honored") shipped: STE-285's env-var read pattern is never
satisfied because the harness never sets the var; session info is
passed on stdin as JSON with transcript_path.

Release target v2.22.3 (patch — pure fix-class).

Refs: STE-290
STE-285's byte-checkable enforcement layer was inert since v2.21.0 —
session.sh read $CLAUDE_SESSION_FILE which the harness never sets, so
require_skill_tool_use always hit the fail-open branch silently. The
2026-05-14 empirical probe confirmed the harness passes session info
as stdin JSON carrying transcript_path, the actual contract.

Rewrites _lib/session.sh as a Bun TS lib exporting parseHookPayload /
findSkillToolUse / requireSkillToolUse / emitNFR10, plus 4 per-hook TS
modules under _lib/hooks/. Bash shims at process/*.sh shrink to 2-line
`exec bun run "${CLAUDE_PLUGIN_ROOT}/.../<name>.ts"` wrappers. Integration
tests inject stdin via Bun.spawn. NFR-10 byte-stable substrings (Refusing:,
Reminder:, Context:) preserved per STE-286 §104.

Forward-only per project_no_users_yet — no migration helper.

Refs: STE-290
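A minimal sketch of the stdin-based contract described above, under stated assumptions: only the transcript_path field is confirmed by the commit message. The function names match the exported ones, but their signatures here are guesses, and the transcript record shape (an object with name "Skill" and input.skill) is hypothetical.

```typescript
// Assumed payload shape: only `transcript_path` is confirmed by the commit message.
type HookPayload = { transcript_path: string };

function parseHookPayload(stdinText: string): HookPayload | null {
  try {
    const parsed: unknown = JSON.parse(stdinText);
    if (typeof parsed === "object" && parsed !== null) {
      const path = (parsed as Record<string, unknown>)["transcript_path"];
      if (typeof path === "string" && path.length > 0) {
        return { transcript_path: path };
      }
    }
  } catch {
    // Malformed stdin: fall through and report "no payload".
  }
  return null;
}

// Recursively looks for an object shaped like a Skill tool use (hypothetical shape).
function containsSkillUse(node: unknown, skill: string): boolean {
  if (Array.isArray(node)) return node.some((child) => containsSkillUse(child, skill));
  if (typeof node !== "object" || node === null) return false;
  const obj = node as Record<string, unknown>;
  const input = obj["input"];
  if (
    obj["name"] === "Skill" &&
    typeof input === "object" &&
    input !== null &&
    (input as Record<string, unknown>)["skill"] === skill
  ) {
    return true;
  }
  return Object.values(obj).some((child) => containsSkillUse(child, skill));
}

// Takes the transcript's JSONL text; reading the file at transcript_path is left to the caller.
function findSkillToolUse(transcriptJsonl: string, skill: string): boolean {
  return transcriptJsonl.split("\n").some((line) => {
    if (line.trim() === "") return false;
    try {
      return containsSkillUse(JSON.parse(line), skill);
    } catch {
      return false; // skip lines that are not valid JSON
    }
  });
}
```

Keeping both functions free of I/O is one way to let unit tests exercise them directly, while integration tests inject stdin into the spawned hook.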
M75 — STE-290 hook contract fix: ports the byte-checkable enforcement
layer from the never-set $CLAUDE_SESSION_FILE env var to the empirically-
verified stdin transcript_path JSON contract. Layer was inert since
v2.21.0 ("Honored"); now actually enforces.

Release: v2.22.3 "Wired"
Refs: M75
M76 follow-up to M75. Switches 3 Refusing hooks from exit 1 (advisory
per Claude Code 2.1.x harness contract) to exit 2 (blocking + stderr-
to-model). Discovered post-v2.22.3 install: hooks fire, NFR-10 stderr
emits, but Claude Code reports "Failed with non-blocking status code"
and lets the command proceed. brainstorm-reminder UserPromptSubmit
hook stays exit 0 (advisory by STE-285 design).

Refs: STE-291
Claude Code's hook contract: exit 0 = OK, exit 2 = blocking (stderr
fed to the model as feedback), any other non-zero (incl. exit 1) =
advisory (stderr shown to operator only, tool call proceeds).
STE-290 wired the layer to the real harness stdin transcript_path
contract but the 3 Refusing hooks exited 1 on miss, so they emitted
the NFR-10 stderr but didn't actually block. This flip makes them
block.

- pre-commit-gate-check.ts: 1 → 2 on miss path
- pre-pr-spec-review.ts: 1 → 2
- pre-commit-tdd-orchestrator.ts: 1 → 2
- 6 test files (3 unit + 3 integration): refusal-case `.not.toBe(0)`
  → `.toBe(2)` (7 assertion lines; tdd-orchestrator unit file has 2)
- docs/hooks-reference.md: +6-line "Exit-code contract (Claude Code
  2.1.x)" paragraph
- specs/requirements.md: AC-STE-291.1..6 traceability row appended

brainstorm-reminder unchanged (advisory UserPromptSubmit by STE-285
design). Archive sweep included: STE-291.md + M76.md flipped to
archived state in the same atomic commit.

Refs: STE-291
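The exit-code contract stated at the top of this commit can be written down as a small lookup. The helper names and the HookKind type are hypothetical; the mapping itself (0 = OK, 2 = blocking, any other non-zero = advisory) is taken from the commit message.

```typescript
type HookOutcome = "ok" | "blocking" | "advisory";

// How the harness is described as treating a hook's exit code.
function classifyExitCode(code: number): HookOutcome {
  if (code === 0) return "ok";
  if (code === 2) return "blocking"; // tool call blocked, stderr fed to the model
  return "advisory"; // includes exit 1: stderr shown to operator, call proceeds
}

type HookKind = "Refusing" | "Reminder";

// Exit code a hook would use when the required skill invocation is missing.
function missExitCode(kind: HookKind): number {
  return kind === "Refusing" ? 2 : 0; // the reminder hook stays advisory by design
}
```

Read this way, the pre-flip behavior is easy to see: classifyExitCode(1) is "advisory", so the three Refusing hooks printed their refusal but could not stop the command.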
M76 — STE-291 hook refusal exit code: switch the 3 Refusing hooks
(pre-commit-gate-check, pre-pr-spec-review, pre-commit-tdd-orchestrator)
from exit 1 (advisory; Claude Code surfaces stderr to operator only and
proceeds) to exit 2 (blocking; harness blocks the tool call AND feeds
stderr to the model as feedback). STE-290 wired the layer to the real
harness contract; this closes the residual exit-code-semantics gap
discovered during STE-290's empirical verification. Pure fix-class;
forward-only per project_no_users_yet.

Release: v2.22.4 "Blocking"
Refs: M76
STE-226 hardening: tighten /spec-write auto-apply to ignore
autonomous-mode-reminder paraphrases. Sourced from /conformance-loop
iter-1 (2026-05-14) F1; extends the STE-262 + STE-270 triple-layer
defense (FORBIDDEN_PHRASES + regression fixture + runtime byte-grep).
Folds F7 (vacuous post-TIGHTEN); defers F10 (/setup-side, separate FR).

[hook-bypass: F3-iter1] pre-commit-tdd-orchestrator + pre-commit-gate-check
refused specs-only commit; explicit operator consent applied per
/conformance-loop iter-1 finding F3 (M70 fixture-group FR will carve
this out for specs-only commits).

Refs: STE-294
Smoke fixture group: M70 post-iter-1 cohort follow-ups. Bundles 5 ACs
from /conformance-loop iter-1 (2026-05-14) — STE-290 hook carve-out
for spec-only commits (AC.1, ship first), STE-211 stripLinearACFences
on get_issue read path (AC.2), branch_gate_default_applied coverage
via Phase 9 master-merge sub-fix (AC.3, folds F9), probe #37 fence-only
doc gap (AC.4), Linear severity-format canonical normalization (AC.5).

Pairs with STE-294 to cover iter-1 high+medium findings. Follows
STE-231 / STE-286 fixture-group precedent.

Refs: STE-295
Drop the original /conformance-loop iter-1 (2026-05-11) cohort from
M70: medium-severity observability-only follow-ups where the underlying
features work; not urgent enough to block release. M70 now ships three
focused FRs:

- STE-284 (state-emission via tracker ↔ local reconciliation)
- STE-294 (STE-226 hardening — marker as sole auto-apply trigger)
- STE-295 (fixture group bundling F3/F4/F5/F11/severity-format-meta)

STE-280/281/282 cancelled on Linear with statusType=canceled; gate-check
probe family #38–48 + smoke probes catch the same drift if it matters
later. Decision documented in plan Notes so future audits read this
as deliberate scope tightening, not drift.

Refs: M70
nesquikm added 9 commits May 15, 2026 13:35
Five-AC fixture-group FR addressing /conformance-loop iter-1 findings:

- AC.1: pre-commit-tdd-orchestrator hook skips /tdd requirement when
  staged set is spec-only (specs/frs, specs/plan, specs/requirements.md,
  etc.) AND no src/test paths staged. Mixed spec+src/test still requires
  /tdd. Adds classifyStagedPaths pure helper; gates I/O behind
  import.meta.main so unit tests can import without spawning the hook.
- AC.2: stripLinearACFences extended to mcp__linear__get_issue read
  path via new wrapper at adapters/linear/src/get_issue.ts. Round-trip
  test asserts pushDescription → fetchDescription byte-equality.
- AC.3: smoke-test SKILL.md Phase 2 master-merge step (chore/setup-
  bootstrap → master) lands before branch_gate_default_applied spawn
  so the canonical chain exercises STE-228's default-applied path.
- AC.4: gate-check SKILL.md probe #37 anchor body documents fence-only
  scope (path tokens inside fenced directory-tree blocks only).
- AC.5: smoke-test SKILL.md severity-format canonical normalization
  callout. Regression form with severity-word + colon inside bold +
  trailing period is now byte-checkable absent.

Spec deviation (contradicts): STE-290's existing FR-only-refusal
integration test conflicted with the AC.1 carve-out. Updated 2 tests to
assert FR-only → exit 0 and use src/foo.test.ts for the refusal case.

smoke-test/SKILL.md also carries STE-294 AC.4 fixture-1b cross-tracker
assertion (committed here since AC.3 + AC.5 are the dominant edit).

Refs: STE-295
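The AC.1 carve-out can be sketched as a pure classifier. The helper name matches the commit, but its return type, the exact spec path list (the commit says "etc."), and the conservative handling of an empty staged set are assumptions.

```typescript
// Assumed spec locations; the commit lists these three and then "etc.".
const SPEC_DIR_PREFIXES = ["specs/frs/", "specs/plan/"];
const SPEC_FILES = ["specs/requirements.md"];

function isSpecPath(path: string): boolean {
  return (
    SPEC_FILES.includes(path) ||
    SPEC_DIR_PREFIXES.some((prefix) => path.startsWith(prefix))
  );
}

type StagedClass = "spec-only" | "needs-tdd";

// Pure helper: no git or filesystem access, so unit tests can call it directly.
function classifyStagedPaths(staged: string[]): StagedClass {
  if (staged.length === 0) return "needs-tdd"; // assumption: stay strict when nothing is staged
  return staged.every(isSpecPath) ? "spec-only" : "needs-tdd";
}
```

In this sketch any non-spec path keeps the /tdd requirement, which covers the mixed spec + src/test case the commit calls out.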
Five-AC FR addressing /conformance-loop iter-1 § F1 (Linear-side
fixture 1b auto-applied gates while Jira-side correctly refused).
Extends the STE-220 → STE-226 → STE-237 → STE-262 → STE-270
triple-layer defense pattern.

- AC.1: spec_write_alternate_trigger_scan.ts FORBIDDEN_PHRASES
  extended with `standing instruction` + `default-applied per
  standing` (the two truly-new phrases; `autonomous-mode reminder`
  and `work without stopping` were already present).
- AC.2: spec-write SKILL.md § 0b step 4 + § 7a + ## Rules each carry
  the canonical NOT-a-trigger anchor sentence byte-repeated. The
  literal `NOT acceptable` token is a NEGATION_SIGNATURE in the
  alternate-trigger scan, so the anchor lines self-exclude.
- AC.3: Regression fixture lands at fixtures/socratic-first-turn/
  regression/spec-write-marker-absent-reminder-present-2026-05-14
  .json; new runtime test loads fixture, runs gate-evaluation,
  asserts RequiresInputRefusedError raises with NFR-10 shape
  naming gate site `draft`.
- AC.4: smoke-test SKILL.md fixture 1b assertion changed to
  byte-checkable post-TIGHTEN form `Linear-side AND Jira-side both
  raised RequiresInputRefusedError`; diagnostic line `STE-226
  runtime regression: spec-write marker-absent fixture 1b` (per
  STE-231 AC-5 shape). (Committed with STE-295 since smoke-test
  SKILL.md was the dominant edit there.)
- AC.5: Cross-tracker asymmetry root-cause documented in
  specs/frs/STE-294.md § Notes as LLM stochasticity (AC.5 path b);
  references canonical log paths and cites AC.1–AC.3 triple-layer
  defense as the authoritative byte-check mechanism.

Spec deviation (underspecified): AC.1 listed 4 "new" phrases but 2
were already in FORBIDDEN_PHRASES. Added only the 2 truly-new.

Also adds 4 untracked iter-1 source-capture fixtures (brainstorm /
setup / report-issue / spec-write 2026-05-14.json) as reference
material for the regression-fixture derivation.

Refs: STE-294
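The interaction between FORBIDDEN_PHRASES and the NEGATION_SIGNATURE in AC.1 and AC.2 can be sketched as a line scanner. The four phrases and the NOT acceptable token come from the commit message; the function name, the per-line granularity, and the case-insensitive phrase match are assumptions about spec_write_alternate_trigger_scan.ts.

```typescript
const FORBIDDEN_PHRASES = [
  "standing instruction",
  "default-applied per standing",
  "autonomous-mode reminder",
  "work without stopping",
];

// Lines carrying a negation signature are anchors that forbid the phrase, not uses of it.
const NEGATION_SIGNATURES = ["NOT acceptable"];

type ScanHit = { line: number; phrase: string };

function scanAlternateTriggers(text: string): ScanHit[] {
  const hits: ScanHit[] = [];
  text.split("\n").forEach((line, index) => {
    if (NEGATION_SIGNATURES.some((signature) => line.includes(signature))) return;
    const lowered = line.toLowerCase(); // assumption: phrase match ignores case
    for (const phrase of FORBIDDEN_PHRASES) {
      if (lowered.includes(phrase)) hits.push({ line: index + 1, phrase });
    }
  });
  return hits;
}
```

This shows why the anchor sentences self-exclude: they mention the forbidden phrases, but the NOT acceptable token on the same line removes them from the scan.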
Eight-AC FR closing the partial-scan trap surfaced 2026-05-13
when /conformance-loop iter-1 created M70 + STE-280/281/282 on
Linear with no local files (branch unmerged), and a subsequent
/spec-write on main would have allocated M71 over the top.

- AC.1: nextFreeMilestoneNumber extended with optional `provider`
  param; in tracker mode unions tracker milestones via
  Provider.listMilestones(). Function is now async. Returns
  { next, sources: { active, archived, changelog, tracker } }.
- AC.2: New helper reconcileTrackerLocal(provider, specsDir) at
  adapters/_shared/src/reconcile_tracker_local.ts. Returns
  { trackerOrphans, localOrphans, milestoneMismatches } with
  shared readLocalFRBindings helper (single source of truth
  reused by the gate-check probe).
- AC.3: spec-write SKILL.md § 0.5 Tracker-local reconciliation
  preamble between § 0 and § 0a; auto-imports tracker→local
  orphans via importFromTracker guarded by existsSync (STE-135);
  prompts for local→tracker + milestone-mismatch.
- AC.4: New /gate-check probe tracker_local_reconciliation_drift
  at skills/gate-check/probes/tracker_local_reconciliation_drift
  .ts. Severity tiers: warning (any drift) → error (hard FR-id
  collisions: dangling local binding to non-existent tracker ID
  OR duplicate local bindings).
- AC.5: Three capability keys (tracker_local_reconciled,
  tracker_local_orphan_local, milestone_local_orphan) added to
  both spec-write/SKILL.md § 7 and gate-check/SKILL.md report
  maps with plain-language explanations.
- AC.6: STE-119's tracker-exclusion decision explicitly revisited
  + superseded in specs/frs/STE-284.md § Notes.
- AC.7: Provider interface evolution — new `mode`, listMilestones,
  listActiveFRs methods. LocalProvider returns []/`none`;
  TrackerProvider delegates to driver/`tracker`.
- AC.8: Performance budget (≤ 1500ms entry preflight) — smoke-
  deferrable per feedback_smoke_post_ship_retroactive [~].

Spec deviation (underspecified): FR text named adapter file paths
adapters/{linear,jira,local}/src/provider.ts that don't exist —
provider impls live at adapters/_shared/src/{local,tracker}_provider.ts
(TrackerProvider is generic with driver-pattern). Implemented in the
real architecture; behavior unchanged.

spec-write/SKILL.md also carries STE-294 AC.2's canonical anchor
sentences (3 occurrences). gate-check/SKILL.md also carries STE-295
AC.4's probe #37 fence-only scope sentence.

Refs: STE-284
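The AC.1 change closes the partial-scan trap by adding tracker milestones to the set the allocator scans. A minimal sketch follows, assuming milestones are plain numbers and that local sources are passed in rather than read from disk; the MilestoneProvider interface, the parameter layout, and the computeNext helper are hypothetical. Only the async shape and the { next, sources } return value come from the commit message.

```typescript
type MilestoneSources = {
  active: number[];
  archived: number[];
  changelog: number[];
  tracker: number[];
};

// Hypothetical slice of the Provider interface: only listMilestones is used here.
interface MilestoneProvider {
  listMilestones(): Promise<number[]>;
}

// Pure core: the next number is one past the maximum across every source.
function computeNext(sources: MilestoneSources): number {
  const all = [
    ...sources.active,
    ...sources.archived,
    ...sources.changelog,
    ...sources.tracker,
  ];
  return all.reduce((max, n) => Math.max(max, n), 0) + 1;
}

async function nextFreeMilestoneNumber(
  local: Omit<MilestoneSources, "tracker">,
  provider?: MilestoneProvider,
): Promise<{ next: number; sources: MilestoneSources }> {
  // Without a provider (local mode) the tracker source is simply empty.
  const tracker = provider ? await provider.listMilestones() : [];
  const sources: MilestoneSources = { ...local, tracker };
  return { next: computeNext(sources), sources };
}
```

Returning the per-source lists alongside next lets a caller report which source pushed the number up, for example a milestone that exists only on the tracker.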
Archive M70 "Runtime emission contract closures" with all three FRs
shipped clean. Operations:

- git mv specs/frs/STE-{284,294,295}.md → specs/frs/archive/
- git mv specs/plan/M70.md → specs/plan/archive/
- Frontmatter flip: status active → archived; archived_at populated
  with full ISO-8601 timestamp 2026-05-15T09:37:43Z
- Plan items flipped [ ] → [x] for all three FRs
- 3 shipped-AC traceability rows appended to specs/requirements.md
  § 6 (AC-STE-284.1..8 + AC-STE-294.1..5 + AC-STE-295.1..5)
- Implementation notes section appended to each archived FR
  ("No advisory notes." — Phase 3 review captured zero advisories)

No live traceability links pointed at the archived FRs (rewriteArchiveLinks
returned 0 changes). No plan verify lines required cleanup.

Refs: M70
The AC-STE-284.6 / AC-STE-284.8 / AC-STE-294.5 prose-grep tests
hardcoded `specs/frs/STE-{284,294}.md`. After M70 archival the live
paths no longer exist, breaking the suite. Tests now check the
active path first (existsSync), then fall through to the archive
path. Shipped ACs stay verifiable against the archived FR body.

Refs: M70
M70 ships tracker↔local FR/milestone reconciliation (STE-284) + STE-226
marker-trigger hardening (STE-294) + post-iter-1 fixture-group bundle
(STE-295). Test count: 2445.

Release: v2.23.0 "Reconciled"
Refs: M70
Add M77 single-FR milestone for the /tdd spec-review audit step: a 4th
forked subagent (tdd-spec-reviewer) that runs end-of-FR after REFACTOR,
blocks on missing_acs only, with one bounded auto-retry through
test-writer + implementer scoped to the missing ACs.

Refs: STE-296
Add a fourth forked-subagent stage to the /tdd orchestrator: a read-only
tdd-spec-reviewer that runs once at end of FR after REFACTOR, independently
traces every AC to file:line + test-file:line, and blocks the pipeline only
on missing_acs (binary contract violations) with a bounded single-round
auto-retry through tdd-write-test + tdd-implement scoped to the missing ACs.
Halts non-zero with a new spec-gap failure mode if the second audit still
finds missing ACs; advisory findings (partial_acs, drift_count,
cross_cutting_drift) ride along in the report without halting.

Closes the second-line-of-defense gap STE-225 deliberately left open:
tdd-result validates each subagent's claims; this FR validates reality.

New /gate-check probe #50 tdd_spec_reviewer_subagent_invariants enforces
the new subagent + child skill's frontmatter invariants byte-checkably.
Three new capability keys (tdd_spec_audit_passed /
tdd_spec_audit_missing_recovered / tdd_spec_audit_halted) propagate the
audit outcome through /implement's closing summary.

Refactor extracts shared fence+YAML primitives (tdd_fence_yaml.ts) and
shared probe primitives (tdd_probe_helpers.ts) -- net dedup while the
closed-schema validation stays local to each caller.

Archives M77 + STE-296.

Refs: STE-296
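The bounded single-round retry described above can be sketched as a small control loop. The three outcome names are the capability keys from the commit message and missing_acs / partial_acs are its field names; the function name, the synchronous callbacks, and the report type are hypothetical simplifications of the forked-subagent pipeline.

```typescript
type AuditReport = { missing_acs: string[]; partial_acs: string[] };

type AuditOutcome =
  | "tdd_spec_audit_passed"
  | "tdd_spec_audit_missing_recovered"
  | "tdd_spec_audit_halted";

// `audit` stands in for the reviewer; `retryMissing` for test-writer + implementer.
function runSpecAudit(
  audit: () => AuditReport,
  retryMissing: (missingAcs: string[]) => void,
): { outcome: AuditOutcome; report: AuditReport } {
  const first = audit();
  if (first.missing_acs.length === 0) {
    // partial_acs are advisory: they ride along in the report without blocking.
    return { outcome: "tdd_spec_audit_passed", report: first };
  }
  retryMissing(first.missing_acs); // exactly one retry round, scoped to the missing ACs
  const second = audit();
  const outcome: AuditOutcome =
    second.missing_acs.length === 0
      ? "tdd_spec_audit_missing_recovered"
      : "tdd_spec_audit_halted";
  return { outcome, report: second };
}
```

The loop never audits more than twice, so a persistently missing AC ends in the halted outcome rather than an open-ended retry cycle.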
M77 ships STE-296 (/tdd spec-review audit step). New 4th forked-subagent
stage (tdd-spec-reviewer) runs once at end of FR after REFACTOR,
independently traces every AC to its impl + test, blocks the pipeline
only on missing_acs with a bounded single-round retry, halts non-zero
with new spec-gap failure mode otherwise. New /gate-check probe #50
tdd_spec_reviewer_subagent_invariants + three capability keys
propagating the audit outcome through /implement's closing summary.

Release: v2.24.0 "Audited"
Refs: M77
@nesquikm nesquikm merged commit 3e80f67 into main May 15, 2026
@nesquikm nesquikm deleted the chore/ste-283-honored-contracts-enforcement branch May 15, 2026 16:11