chore(release): v2.21.0 + v2.22.0 (M71 "Honored" + M72 "Mirror") #14
Merged
Honored Contracts enforcement (M71 first FR) — adds Contract block + Rationalization table + catalog plan. Prose-only enforcement layer for cross-skill mandates like /implement → /tdd, with eyes open to the STE-220→STE-270 chain saying prose-only gets falsified. If it falsifies again, the evidence-based gate is the documented escalation (this time we promise to listen). Refs: STE-283
Tracker ↔ local FR/milestone reconciliation (M70's second FR) — helper preflight at /spec-write entry + /gate-check probe at gate boundary. Auto-imports tracker→local orphans silently; prompts for the ambiguous kinds. Supersedes STE-119's "M-numbers are local" decision, which got falsified the same day we set out to enforce it. Also writes specs/plan/M70.md (the milestone existed on Linear only until now — the exact bug this FR fixes, surfaced and patched in one /brainstorm → /spec-write loop). Refs: STE-284
Honored Contracts byte-checkable layer (M71's second FR) — /setup
offers a multi-select menu of opt-in toolkit-contract enforcement
hooks (Process category only; Safety + Quality out of scope). Hooks
plug into the user's .claude/settings.json via ${CLAUDE_PLUGIN_ROOT}
substitution (verified against Claude Code docs + STE-133 precedent).
Together with STE-283's prose layer, this gives M71 a "prose + opt-in
byte-checkable" enforcement bundle for the same contracts. The third
layer (bundled session-wide) stays where STE-262/270/276 left it:
in the rejection bin.
Refs: STE-285
Add labeled "TDD Orchestrator Contract" callout + Rationalization Prevention table at the top of /implement Phase 2 step 8; cite docs/honored-contracts.md catalog (3 seeded entries: /implement → /tdd, /spec-write → spec-research, /brainstorm → AskUserQuestion-first). Prose layer of M71 Honored Contracts enforcement bundle; operator-acknowledged residual risk vs. STE-220→STE-270 6-FR falsification chain (escalation path: evidence-based gate / hard mechanic). Refs: STE-283
Byte-checkable layer of M71 Honored Contracts enforcement bundle. Adds:
- /setup multi-select menu prompt step (all hooks default off)
- --hooks flag for re-running only the hooks step (idempotent, pre-checked)
- 4 seeded Process-category hooks under templates/hooks/process/ (pre-commit-gate-check, pre-pr-spec-review, pre-spec-write-brainstorm-reminder, pre-commit-tdd-orchestrator)
- Shared _lib/session.sh helper (atomic single-line JSONL grep, NFR-10 emit)
- install_hooks.ts settings.json merge helper (key-level merge, idempotent on same-matcher-same-command, conflict-surfaced on diff)
- docs/hooks-reference.md user manual
Per-project opt-in, bounded blast radius; supersedes STE-262/STE-270/STE-276 session-wide bundling rejections.
Refs: STE-285
Atomic archival: move STE-283 + STE-285 FRs and the M71 plan file into archive/, flip status: active → archived + set archived_at timestamps, mark plan ACs [x], append shipped-AC traceability rows for both FRs to specs/requirements.md § 6, attach Implementation notes (advisory: STE-285 session.sh JSONL needle assumes compact serialization — documented in the FR's Risks table). Refs: M71
AC-STE-285.6 doc-conformance test hardcoded `specs/frs/STE-285.md` — breaks after Phase 4 archival relocates the FR. Fall back to `specs/frs/archive/STE-285.md` via existsSync so the assertion holds for the same spec content across the active → archived transition. Refs: STE-285
M71 — Honored Contracts enforcement: prose layer (STE-283) + opt-in byte-checkable hook layer (STE-285). Release: v2.21.0 "Honored" Refs: M71
M72 (STE-285 hooks E2E smoke coverage) — first and only FR. Adds /smoke-test fixture group 8 (STE-285 hooks runtime regression) covering install verification + runtime refusal probe for all 4 seeded hooks via one shared marker-stripped claude -p child spawn with 4 sequenced scenarios + $CLAUDE_SESSION_FILE rotation. Bundles a small STE-285 scope expansion: /setup --hooks non-interactive preselect flag (--hooks=all | --hooks=<comma-list>). Symmetric to STE-225 fixture group 7 (TDD orchestrator forks runtime, per STE-231). Refs: STE-286
Adds /smoke-test Phase 2.X fixture group 8 covering the M71 STE-285 opt-in Process-category hooks end-to-end (install verification + 4 per-scenario NFR-10 refusal probes). Bundles a /setup --hooks=<value> non-interactive preselect flag (parsePreselectFlag) so the smoke driver can install hooks without the multi-select default-off menu. The runtime-probe half uses the AC-STE-286.7-accepted fallback (per-scenario standalone bash subprocesses with $CLAUDE_SESSION_FILE set inline) because the Claude Code harness clobbers wrapper-shell env values when spawning PreToolUse / UserPromptSubmit subprocesses — recorded as the empirical finding in specs/frs/archive/STE-286.md Notes. Release: deferred — M72 will ship after the new smoke fixture confirms M71's hooks fire correctly at runtime. Refs: STE-286
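A minimal sketch of how a `--hooks=<value>` preselect parser along these lines could behave. Only the name `parsePreselectFlag` and the four hook names come from this PR; the `KNOWN_HOOKS` array (standing in for `HOOK_REGISTRATIONS`), the error wording, and the choice to ignore empty items in a comma list are assumptions made for illustration.

```typescript
// Hypothetical stand-in for HOOK_REGISTRATIONS; the real shape is not shown in this PR.
const KNOWN_HOOKS = [
  "pre-commit-gate-check",
  "pre-pr-spec-review",
  "pre-spec-write-brainstorm-reminder",
  "pre-commit-tdd-orchestrator",
] as const;

type HookName = (typeof KNOWN_HOOKS)[number];

// Parses "--hooks=all" or "--hooks=<comma-list>" into a list of hook names.
// Throws on empty, whitespace-only, commas-only, or unknown values.
function parsePreselectFlag(arg: string): HookName[] {
  const prefix = "--hooks=";
  const hint = `Known hooks: ${KNOWN_HOOKS.join(", ")}`;
  if (!arg.startsWith(prefix)) {
    throw new Error(`Refusing: expected ${prefix}<value>. ${hint}`);
  }
  const value = arg.slice(prefix.length);
  if (value === "all") return [...KNOWN_HOOKS];

  // Assumption: blank items between commas are dropped rather than rejected.
  const names = value
    .split(",")
    .map((name) => name.trim())
    .filter((name) => name.length > 0);
  if (names.length === 0) {
    throw new Error(`Refusing: empty --hooks value. ${hint}`);
  }
  const unknown = names.filter(
    (name) => !(KNOWN_HOOKS as readonly string[]).includes(name),
  );
  if (unknown.length > 0) {
    throw new Error(`Refusing: unknown hook(s) ${unknown.join(", ")}. ${hint}`);
  }
  return names as HookName[];
}
```

Throwing rather than returning an empty selection keeps a mistyped flag from silently installing nothing, which matters for a non-interactive smoke driver.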
M72 ships /smoke-test fixture group 8 (STE-285 hooks runtime regression). Release: v2.22.0 "Mirror" Refs: M72
M73 single-FR: hook install path portability fix — install_hooks.ts emits
literal ${CLAUDE_PLUGIN_ROOT} token in args[0] instead of JS-resolved
absolute path. Aligns implementation with STE-285 AC-STE-285.3.
Refs: STE-288, M73
`additionFor` and `readInstalledHookNames` now use the literal
`${CLAUDE_PLUGIN_ROOT}/templates/hooks/process/<name>.sh` token on
both write and read paths so the same `.claude/settings.json` works
across dev clones and marketplace installs (M71/STE-285 design intent).
Hoisted the literal into the `HOOK_ARGS_PREFIX` constant to avoid
write/read drift. No legacy-shape migration (zero shipped users).
Archives STE-288 + M73 plan; appends AC-STE-288.1..5 traceability row.
Refs: STE-288
STE-288 is a pure bug fix; default Added would mis-bump M73 to minor when patch is correct. Categorize before /ship-milestone runs. Refs: STE-288
Hook install path portability — emit literal ${CLAUDE_PLUGIN_ROOT}
token across dev clones and marketplace installs.
Files bumped:
- plugins/dev-process-toolkit/.claude-plugin/plugin.json: 2.22.0 → 2.22.1
- .claude-plugin/marketplace.json: plugins[0].version 2.22.0 → 2.22.1
- CHANGELOG.md: + [2.22.1] "Portable" section (Fixed: STE-288)
- README.md: Latest banner v2.22.0 "Mirror" → v2.22.1 "Portable"
- specs/requirements.md: Latest shipped release v2.22.0 → v2.22.1
FRs included: STE-288 (Hook install path portability).
Total test count at release: 2335 tests, 0 failures, 0 errors.
Release: v2.22.1 "Portable"
Refs: M73
M74 single-FR: ship plugin-bundled hooks/hooks.json, rip out /setup
--hooks installer. Falsifies M71/M72/M73 install-side mechanism after
2026-05-14 empirical discovery that the harness only expands
${CLAUDE_PLUGIN_ROOT} in plugin-bundled hooks.json, not in user
.claude/settings.json. Forward-only; zero shipped users. Release
target v2.22.2.
Refs: STE-289
M71/M72/M73's install-side hook mechanism was structurally impossible:
the Claude Code harness only expands ${CLAUDE_PLUGIN_ROOT} in plugin-
bundled <plugin>/hooks/hooks.json, never in user .claude/settings.json.
Operator's 2026-05-14 run of /setup --hooks=all against v2.22.1 surfaced
the harness refusal "Hook command references ${CLAUDE_PLUGIN_ROOT} but
the hook is not associated with a plugin." Empirical research via the
claude-code-guide agent confirmed the contract.
M74 ships plugins/dev-process-toolkit/hooks/hooks.json (harness auto-
discovers + fires across every project where the plugin is enabled) and
rips out the install-side surface: install_hooks.ts + 5 setup tests + 1
stale capability-rows test + section 0c (Contract-enforcement re-
invocation flag) + section 6a (Toolkit-contract enforcement hooks menu)
+ smoke-test SKILL.md fixture group 8. Per-script hook tests and the 4
hook scripts under templates/hooks/process/ survive — only registration
moves.
Reverses STE-262/STE-270/STE-276's rejection of bundled hooks.json — the
rejection grounds (spawn blast radius, triple-check conflict, no clean
per-session state) all remain mitigated as before. Forward-only per
project_no_users_yet — zero shipped marketplace users; no migration
helper. Release notes will carry a one-line manual cleanup instruction
for the operator's downstream stale entries.
Archived STE-289 + M74 in this commit.
Refs: STE-289
STE-289 is a fix-class FR (the M74 plan explicitly states "patch — pure fix-class; changelog_category: Fixed on STE-289"). Setting the frontmatter field so /ship-milestone's inferBump picks patch (v2.22.2) and the CHANGELOG entry lands under "### Fixed" rather than the defaulted "### Added". Mirrors c38d44c (same fix for STE-288). Refs: STE-289
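For illustration, a sketch of the bump inference this commit relies on. The real `inferBump` in `/ship-milestone` is not shown in this PR; the two-category type, the "missing category defaults to Added" rule, and the `nextVersion` helper are assumptions based only on the behavior described in the commit messages.

```typescript
// Only the two categories named in this PR are modeled here.
type ChangelogCategory = "Added" | "Fixed";
type Bump = "minor" | "patch";

// Assumed rule: a milestone is a patch only if every FR is categorized "Fixed".
// An FR with no changelog_category is treated as "Added" (the described default).
function inferBump(categories: Array<ChangelogCategory | undefined>): Bump {
  const resolved = categories.map((category) => category ?? "Added");
  return resolved.every((category) => category === "Fixed") ? "patch" : "minor";
}

// Hypothetical helper: applies the bump to a "major.minor.patch" string.
function nextVersion(current: string, bump: Bump): string {
  const [major, minor, patch] = current.split(".").map(Number);
  return bump === "minor"
    ? `${major}.${minor + 1}.0`
    : `${major}.${minor}.${patch + 1}`;
}
```

Under these assumptions, leaving the frontmatter field unset on a single-FR milestone yields a minor bump, which is the mis-bump the commit is guarding against.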
M74 — STE-289 hooks bundling redo: ships plugin-bundled hooks.json,
rips out the install-side /setup --hooks installer that was
structurally impossible (harness only expands ${CLAUDE_PLUGIN_ROOT}
in plugin-bundled hooks.json, never in user .claude/settings.json).
Release: v2.22.2 "Bundled"
Refs: M74
Single-FR fix-class FR for M75. Rewrites the broken
$CLAUDE_SESSION_FILE-reading session.sh as a Bun TS library + 4
per-hook TS modules per the empirically-verified 2026-05-14 hook
stdin contract. Hooks have been silently fail-opening since v2.21.0
("Honored") shipped — STE-285's env-var read pattern is never
satisfied because the harness never sets the var; session info is
passed on stdin as JSON with transcript_path.
Release target v2.22.3 (patch — pure fix-class).
Refs: STE-290
STE-285's byte-checkable enforcement layer was inert since v2.21.0 —
session.sh read $CLAUDE_SESSION_FILE which the harness never sets, so
require_skill_tool_use always hit the fail-open branch silently. The
2026-05-14 empirical probe confirmed the harness passes session info
as stdin JSON carrying transcript_path, the actual contract.
Rewrites _lib/session.sh as a Bun TS lib exporting parseHookPayload /
findSkillToolUse / requireSkillToolUse / emitNFR10, plus 4 per-hook TS
modules under _lib/hooks/. Bash shims at process/*.sh shrink to 2-line
`exec bun run "${CLAUDE_PLUGIN_ROOT}/.../<name>.ts"` wrappers. Integration
tests inject stdin via Bun.spawn. NFR-10 byte-stable substrings (Refusing:,
Reminder:, Context:) preserved per STE-286 §104.
Forward-only per project_no_users_yet — no migration helper.
Refs: STE-290
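A rough sketch of the stdin-based lookup described above, kept pure (strings in, values out) so it can run without the harness. The function names `parseHookPayload` and `findSkillToolUse` come from the commit message, and `transcript_path` is the field it names. Everything else is an assumption: in particular, the transcript line shape (`message.content[]` blocks with `type: "tool_use"`, `name: "Skill"`, `input.skill`) is a guess and may not match what Claude Code actually writes.

```typescript
interface HookPayload {
  transcript_path: string;
}

// Parses the JSON the harness passes on stdin. Returns null when the payload is
// malformed or lacks transcript_path, so the caller can choose its fail-open policy.
function parseHookPayload(raw: string): HookPayload | null {
  try {
    const parsed: unknown = JSON.parse(raw);
    if (
      typeof parsed === "object" &&
      parsed !== null &&
      typeof (parsed as Record<string, unknown>).transcript_path === "string"
    ) {
      return parsed as HookPayload;
    }
  } catch (_err) {
    // Malformed stdin falls through to null.
  }
  return null;
}

// Scans transcript JSONL text for a Skill tool_use naming the given skill.
// ASSUMED line shape: { message: { content: [{ type, name, input: { skill } }] } }.
function findSkillToolUse(transcriptJsonl: string, skill: string): boolean {
  for (const line of transcriptJsonl.split("\n")) {
    if (line.trim() === "") continue;
    let entry: any;
    try {
      entry = JSON.parse(line);
    } catch (_err) {
      continue; // Skip lines that are not valid JSON.
    }
    const content = entry?.message?.content;
    if (!Array.isArray(content)) continue;
    for (const block of content) {
      if (
        block?.type === "tool_use" &&
        block?.name === "Skill" &&
        block?.input?.skill === skill
      ) {
        return true;
      }
    }
  }
  return false;
}
```

Parsing each line instead of substring matching means the check does not depend on compact JSON serialization, the fragility noted for the earlier `session.sh` grep needles.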
M75 — STE-290 hook contract fix: ports the byte-checkable enforcement
layer from the never-set $CLAUDE_SESSION_FILE env var to the empirically-
verified stdin transcript_path JSON contract. Layer was inert since
v2.21.0 ("Honored"); now actually enforces.
Release: v2.22.3 "Wired"
Refs: M75
M76 follow-up to M75. Switches 3 Refusing hooks from exit 1 (advisory per Claude Code 2.1.x harness contract) to exit 2 (blocking + stderr-to-model). Discovered post-v2.22.3 install: hooks fire, NFR-10 stderr emits, but Claude Code reports "Failed with non-blocking status code" and lets the command proceed. brainstorm-reminder UserPromptSubmit hook stays exit 0 (advisory by STE-285 design). Refs: STE-291
Claude Code's hook contract: exit 0 = OK, exit 2 = blocking (stderr fed to the model as feedback), any other non-zero (incl. exit 1) = advisory (stderr shown to operator only, tool call proceeds). STE-290 wired the layer to the real harness stdin transcript_path contract but the 3 Refusing hooks exited 1 on miss, so they emitted the NFR-10 stderr but didn't actually block. This flip makes them block.
- pre-commit-gate-check.ts: 1 → 2 on miss path
- pre-pr-spec-review.ts: 1 → 2
- pre-commit-tdd-orchestrator.ts: 1 → 2
- 6 test files (3 unit + 3 integration): refusal-case `.not.toBe(0)` → `.toBe(2)` (7 assertion lines; tdd-orchestrator unit file has 2)
- docs/hooks-reference.md: +6-line "Exit-code contract (Claude Code 2.1.x)" paragraph
- specs/requirements.md: AC-STE-291.1..6 traceability row appended
brainstorm-reminder unchanged (advisory UserPromptSubmit by STE-285 design).
Archive sweep included: STE-291.md + M76.md flipped to archived state in the same atomic commit.
Refs: STE-291
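The exit-code contract stated in this commit message can be written down as a small classifier. This is only a restatement of the three cases listed above; the type and function names are made up for the example.

```typescript
type HookOutcome = "ok" | "blocking" | "advisory";

// Maps a hook's exit code to the outcome described in the commit message:
// 0 = OK, 2 = blocking (stderr fed to the model), anything else = advisory.
function classifyHookExit(code: number): HookOutcome {
  if (code === 0) return "ok";
  if (code === 2) return "blocking";
  return "advisory";
}

// The pre-fix refusal path exited 1, which lands in the advisory bucket.
const beforeFix = classifyHookExit(1);
const afterFix = classifyHookExit(2);
console.log(`exit 1 is ${beforeFix}; exit 2 is ${afterFix}`);
```

Seen this way, the test change from `.not.toBe(0)` to `.toBe(2)` is the important part: the looser assertion passed for exit 1 even though the hook was not blocking.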
M76 — STE-291 hook refusal exit code: switch the 3 Refusing hooks (pre-commit-gate-check, pre-pr-spec-review, pre-commit-tdd-orchestrator) from exit 1 (advisory; Claude Code surfaces stderr to operator only and proceeds) to exit 2 (blocking; harness blocks the tool call AND feeds stderr to the model as feedback). STE-290 wired the layer to the real harness contract; this closes the residual exit-code-semantics gap discovered during STE-290's empirical verification. Pure fix-class; forward-only per project_no_users_yet. Release: v2.22.4 "Blocking" Refs: M76
STE-226 hardening: tighten /spec-write auto-apply to ignore autonomous-mode-reminder paraphrases. Sourced from /conformance-loop iter-1 (2026-05-14) F1; extends the STE-262 + STE-270 triple-layer defense (FORBIDDEN_PHRASES + regression fixture + runtime byte-grep). Folds F7 (vacuous post-TIGHTEN); defers F10 (/setup-side, separate FR). [hook-bypass: F3-iter1] pre-commit-tdd-orchestrator + pre-commit-gate-check refused specs-only commit; explicit operator consent applied per /conformance-loop iter-1 finding F3 (M70 fixture-group FR will carve this out for specs-only commits). Refs: STE-294
Smoke fixture group: M70 post-iter-1 cohort follow-ups. Bundles 5 ACs from /conformance-loop iter-1 (2026-05-14) — STE-290 hook carve-out for spec-only commits (AC.1, ship first), STE-211 stripLinearACFences on get_issue read path (AC.2), branch_gate_default_applied coverage via Phase 9 master-merge sub-fix (AC.3, folds F9), probe #37 fence-only doc gap (AC.4), Linear severity-format canonical normalization (AC.5). Pairs with STE-294 to cover iter-1 high+medium findings. Follows STE-231 / STE-286 fixture-group precedent. Refs: STE-295
Drop the original /conformance-loop iter-1 (2026-05-11) cohort from M70: medium-severity observability-only follow-ups where the underlying features work; not urgent enough to block release. M70 now ships three focused FRs:
- STE-284 (state-emission via tracker ↔ local reconciliation)
- STE-294 (STE-226 hardening: marker as sole auto-apply trigger)
- STE-295 (fixture group bundling F3/F4/F5/F11/severity-format-meta)
STE-280/281/282 cancelled on Linear with statusType=canceled; gate-check probe family #38–48 + smoke probes catch the same drift if it matters later. Decision documented in plan Notes so future audits read this as deliberate scope tightening, not drift.
Refs: M70
Five-AC fixture-group FR addressing /conformance-loop iter-1 findings:
- AC.1: pre-commit-tdd-orchestrator hook skips /tdd requirement when staged set is spec-only (specs/frs, specs/plan, specs/requirements.md, etc.) AND no src/test paths staged. Mixed spec+src/test still requires /tdd. Adds classifyStagedPaths pure helper; gates I/O behind import.meta.main so unit tests can import without spawning the hook.
- AC.2: stripLinearACFences extended to mcp__linear__get_issue read path via new wrapper at adapters/linear/src/get_issue.ts. Round-trip test asserts pushDescription → fetchDescription byte-equality.
- AC.3: smoke-test SKILL.md Phase 2 master-merge step (chore/setup-bootstrap → master) lands before branch_gate_default_applied spawn so the canonical chain exercises STE-228's default-applied path.
- AC.4: gate-check SKILL.md probe #37 anchor body documents fence-only scope (path tokens inside fenced directory-tree blocks only).
- AC.5: smoke-test SKILL.md severity-format canonical normalization callout. Regression form with severity-word + colon inside bold + trailing period is now byte-checkable absent.
Spec deviation (contradicts): STE-290 existing FR-only-refusal integration test conflicted with AC.1 carve-out. Updated 2 tests to assert FR-only → exit 0 and use src/foo.test.ts for the refusal case.
smoke-test/SKILL.md also carries STE-294 AC.4 fixture-1b cross-tracker assertion (committed here since AC.3 + AC.5 are the dominant edit).
Refs: STE-295
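The AC.1 carve-out can be sketched as a pure classifier over the staged path list. The name `classifyStagedPaths` is from the commit message; the return shape, the `requiresTdd` wrapper, and the rule that anything outside `specs/` counts as code (and that an empty staged set is not spec-only) are assumptions for illustration.

```typescript
interface StagedClassification {
  specPaths: string[];
  otherPaths: string[];
  specOnly: boolean;
}

// Splits staged paths into spec files and everything else.
// Assumption: "spec" means any path under specs/; all other paths require /tdd.
function classifyStagedPaths(staged: string[]): StagedClassification {
  const specPaths = staged.filter((path) => path.startsWith("specs/"));
  const otherPaths = staged.filter((path) => !path.startsWith("specs/"));
  return {
    specPaths,
    otherPaths,
    // An empty staged set is deliberately NOT treated as spec-only here.
    specOnly: specPaths.length > 0 && otherPaths.length === 0,
  };
}

// Hypothetical wrapper: the hook would skip the /tdd requirement only for spec-only commits.
function requiresTdd(staged: string[]): boolean {
  return !classifyStagedPaths(staged).specOnly;
}
```

Keeping the classifier free of git and filesystem calls is what lets unit tests import it directly, in line with the commit's note about gating I/O behind `import.meta.main`.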
Five-AC FR addressing /conformance-loop iter-1 § F1 (Linear-side fixture 1b auto-applied gates while Jira-side correctly refused). Extends the STE-220 → STE-226 → STE-237 → STE-262 → STE-270 triple-layer defense pattern.
- AC.1: spec_write_alternate_trigger_scan.ts FORBIDDEN_PHRASES extended with `standing instruction` + `default-applied per standing` (the two truly-new phrases; `autonomous-mode reminder` and `work without stopping` were already present).
- AC.2: spec-write SKILL.md § 0b step 4 + § 7a + ## Rules each carry the canonical NOT-a-trigger anchor sentence byte-repeated. The literal `NOT acceptable` token is a NEGATION_SIGNATURE in the alternate-trigger scan, so the anchor lines self-exclude.
- AC.3: Regression fixture lands at fixtures/socratic-first-turn/regression/spec-write-marker-absent-reminder-present-2026-05-14.json; new runtime test loads fixture, runs gate-evaluation, asserts RequiresInputRefusedError raises with NFR-10 shape naming gate site `draft`.
- AC.4: smoke-test SKILL.md fixture 1b assertion changed to byte-checkable post-TIGHTEN form `Linear-side AND Jira-side both raised RequiresInputRefusedError`; diagnostic line `STE-226 runtime regression: spec-write marker-absent fixture 1b` (per STE-231 AC-5 shape). (Committed with STE-295 since smoke-test SKILL.md was the dominant edit there.)
- AC.5: Cross-tracker asymmetry root-cause documented in specs/frs/STE-294.md § Notes as LLM stochasticity (AC.5 path b); references canonical log paths and cites AC.1–AC.3 triple-layer defense as the authoritative byte-check mechanism.
Spec deviation (underspecified): AC.1 listed 4 "new" phrases but 2 were already in FORBIDDEN_PHRASES. Added only the 2 truly-new.
Also adds 4 untracked iter-1 source-capture fixtures (brainstorm / setup / report-issue / spec-write 2026-05-14.json) as reference material for the regression-fixture derivation.
Refs: STE-294
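A sketch of how a phrase scan with a negation signature might work, to make the "anchor lines self-exclude" point in AC.1 and AC.2 concrete. The four phrases and the `NOT acceptable` token are taken from the commit message; the function name, the line-by-line granularity, and the case-insensitive phrase matching are assumptions, not the behavior of `spec_write_alternate_trigger_scan.ts`.

```typescript
const FORBIDDEN_PHRASES = [
  "autonomous-mode reminder",
  "work without stopping",
  "standing instruction",
  "default-applied per standing",
];

// Lines containing one of these are treated as negations and skipped.
const NEGATION_SIGNATURES = ["NOT acceptable"];

interface ScanHit {
  line: number; // 1-based line number
  phrase: string;
}

// Reports every forbidden phrase that appears on a line without a negation signature.
function scanAlternateTriggers(text: string): ScanHit[] {
  const hits: ScanHit[] = [];
  text.split("\n").forEach((line, index) => {
    if (NEGATION_SIGNATURES.some((signature) => line.includes(signature))) return;
    const lowered = line.toLowerCase(); // Assumption: phrase match is case-insensitive.
    for (const phrase of FORBIDDEN_PHRASES) {
      if (lowered.includes(phrase)) hits.push({ line: index + 1, phrase });
    }
  });
  return hits;
}
```

With this design a skill file can quote a forbidden phrase in order to forbid it, as long as the same line carries the negation token.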
Eight-AC FR closing the partial-scan trap surfaced 2026-05-13
when /conformance-loop iter-1 created M70 + STE-280/281/282 on
Linear with no local files (branch unmerged), and a subsequent
/spec-write on main would have allocated M71 over the top.
- AC.1: nextFreeMilestoneNumber extended with optional `provider`
param; in tracker mode unions tracker milestones via
Provider.listMilestones(). Function is now async. Returns
{ next, sources: { active, archived, changelog, tracker } }.
- AC.2: New helper reconcileTrackerLocal(provider, specsDir) at
adapters/_shared/src/reconcile_tracker_local.ts. Returns
{ trackerOrphans, localOrphans, milestoneMismatches } with
shared readLocalFRBindings helper (single source of truth
reused by the gate-check probe).
- AC.3: spec-write SKILL.md § 0.5 Tracker-local reconciliation
preamble between § 0 and § 0a; auto-imports tracker→local
orphans via importFromTracker guarded by existsSync (STE-135);
prompts for local→tracker + milestone-mismatch.
- AC.4: New /gate-check probe tracker_local_reconciliation_drift
at skills/gate-check/probes/tracker_local_reconciliation_drift.ts.
Severity tiers: warning (any drift) → error (hard FR-id
collisions: dangling local binding to non-existent tracker ID
OR duplicate local bindings).
- AC.5: Three capability keys (tracker_local_reconciled,
tracker_local_orphan_local, milestone_local_orphan) added to
both spec-write/SKILL.md § 7 and gate-check/SKILL.md report
maps with plain-language explanations.
- AC.6: STE-119's tracker-exclusion decision explicitly revisited
+ superseded in specs/frs/STE-284.md § Notes.
- AC.7: Provider interface evolution — new `mode`, listMilestones,
listActiveFRs methods. LocalProvider returns []/`none`;
TrackerProvider delegates to driver/`tracker`.
- AC.8: Performance budget (≤ 1500ms entry preflight) — smoke-
deferrable per feedback_smoke_post_ship_retroactive [~].
Spec deviation (underspecified): FR text named adapter file paths
adapters/{linear,jira,local}/src/provider.ts that don't exist —
provider impls live at adapters/_shared/src/{local,tracker}_provider.ts
(TrackerProvider is generic with driver-pattern). Implemented in the
real architecture; behavior unchanged.
spec-write/SKILL.md also carries STE-294 AC.2's canonical anchor
sentences (3 occurrences). gate-check/SKILL.md also carries STE-295
AC.4's probe #37 fence-only scope sentence.
Refs: STE-284
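To illustrate the three result buckets from AC.2, here is a pure sketch of the set comparison. The real `reconcileTrackerLocal(provider, specsDir)` reads from a provider and the specs directory; this version takes two in-memory lists instead, and the `FRBinding` shape and the `reconcile` name are hypothetical.

```typescript
// Hypothetical binding shape: an FR id plus the milestone it is filed under.
interface FRBinding {
  id: string;
  milestone: string;
}

interface Reconciliation {
  trackerOrphans: string[]; // on the tracker, no local file
  localOrphans: string[]; // local file, nothing on the tracker
  milestoneMismatches: Array<{ id: string; tracker: string; local: string }>;
}

function reconcile(trackerFRs: FRBinding[], localFRs: FRBinding[]): Reconciliation {
  const trackerById = new Map(trackerFRs.map((fr) => [fr.id, fr] as const));
  const localIds = new Set(localFRs.map((fr) => fr.id));

  const trackerOrphans = trackerFRs
    .filter((fr) => !localIds.has(fr.id))
    .map((fr) => fr.id);
  const localOrphans = localFRs
    .filter((fr) => !trackerById.has(fr.id))
    .map((fr) => fr.id);

  const milestoneMismatches: Reconciliation["milestoneMismatches"] = [];
  for (const local of localFRs) {
    const remote = trackerById.get(local.id);
    if (remote !== undefined && remote.milestone !== local.milestone) {
      milestoneMismatches.push({
        id: local.id,
        tracker: remote.milestone,
        local: local.milestone,
      });
    }
  }
  return { trackerOrphans, localOrphans, milestoneMismatches };
}
```

In the flow described by AC.3, the first bucket would be auto-imported while the other two would prompt the operator.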
Archive M70 "Runtime emission contract closures" with all three FRs
shipped clean. Operations:
- git mv specs/frs/STE-{284,294,295}.md → specs/frs/archive/
- git mv specs/plan/M70.md → specs/plan/archive/
- Frontmatter flip: status active → archived; archived_at populated
with full ISO-8601 timestamp 2026-05-15T09:37:43Z
- Plan items flipped [ ] → [x] for all three FRs
- 3 shipped-AC traceability rows appended to specs/requirements.md
§ 6 (AC-STE-284.1..8 + AC-STE-294.1..5 + AC-STE-295.1..5)
- Implementation notes section appended to each archived FR
("No advisory notes." — Phase 3 review captured zero advisories)
No live traceability links pointed at the archived FRs (rewriteArchiveLinks
returned 0 changes). No plan verify lines required cleanup.
Refs: M70
The AC-STE-284.6 / AC-STE-284.8 / AC-STE-294.5 prose-grep tests
hardcoded `specs/frs/STE-{284,294}.md`. After M70 archival the live
paths no longer exist, breaking the suite. Tests now check the
active path first (existsSync), then fall through to the archive
path. Shipped ACs stay verifiable against the archived FR body.
Refs: M70
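The active-then-archive lookup described above could look roughly like this. The helper name `resolveFrPath`, the default `specs` root, and the injectable `exists` predicate (added so the sketch can be exercised without touching disk) are assumptions; only the two path shapes and the use of `existsSync` come from the commit message.

```typescript
import { existsSync } from "node:fs";

// Returns the live FR path if present, otherwise the archived one.
// `exists` defaults to existsSync; tests may pass a fake predicate.
function resolveFrPath(
  id: string,
  specsDir: string = "specs",
  exists: (path: string) => boolean = existsSync,
): string {
  const active = `${specsDir}/frs/${id}.md`;
  if (exists(active)) return active;

  const archived = `${specsDir}/frs/archive/${id}.md`;
  if (exists(archived)) return archived;

  throw new Error(`FR ${id} not found at ${active} or ${archived}`);
}
```

Checking the active path first means the same assertion works both before and after the archival commit moves the file.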
M70 ships tracker↔local FR/milestone reconciliation (STE-284) + STE-226 marker-trigger hardening (STE-294) + post-iter-1 fixture-group bundle (STE-295). Test count: 2445. Release: v2.23.0 "Reconciled" Refs: M70
Add M77 single-FR milestone for the /tdd spec-review audit step: a 4th forked subagent (tdd-spec-reviewer) that runs end-of-FR after REFACTOR, blocks on missing_acs only, with one bounded auto-retry through test-writer + implementer scoped to the missing ACs. Refs: STE-296
Add a fourth forked-subagent stage to the /tdd orchestrator: a read-only tdd-spec-reviewer that runs once at end of FR after REFACTOR, independently traces every AC to file:line + test-file:line, and blocks the pipeline only on missing_acs (binary contract violations) with a bounded single-round auto-retry through tdd-write-test + tdd-implement scoped to the missing ACs. Halts non-zero with a new spec-gap failure mode if the second audit still finds missing ACs; advisory findings (partial_acs, drift_count, cross_cutting_drift) ride along in the report without halting. Closes the second-line-of-defense gap STE-225 deliberately left open: tdd-result validates each subagent's claims; this FR validates reality. New /gate-check probe #50 tdd_spec_reviewer_subagent_invariants enforces the new subagent + child skill's frontmatter invariants byte-checkably. Three new capability keys (tdd_spec_audit_passed / tdd_spec_audit_missing_recovered / tdd_spec_audit_halted) propagate the audit outcome through /implement's closing summary. Refactor extracts shared fence+YAML primitives (tdd_fence_yaml.ts) and shared probe primitives (tdd_probe_helpers.ts) -- net dedup while the closed-schema validation stays local to each caller. Archives M77 + STE-296. Refs: STE-296
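The bounded single-round retry can be sketched as a small control loop. The three outcome strings are the capability keys named above, and `missing_acs` / `partial_acs` are the field names from the commit message; the mapping of each key to a branch is inferred from the key names, and the `audit` and `retry` callbacks are hypothetical stand-ins for the forked subagents. They are synchronous here only to keep the example short.

```typescript
interface AuditResult {
  missing_acs: string[];
  partial_acs: string[]; // advisory only; never halts the pipeline
}

type AuditOutcome =
  | "tdd_spec_audit_passed"
  | "tdd_spec_audit_missing_recovered"
  | "tdd_spec_audit_halted";

// Runs the audit, allows exactly one retry scoped to the missing ACs,
// then audits once more and reports the outcome.
function runSpecAudit(
  audit: () => AuditResult,
  retry: (missingAcs: string[]) => void,
): AuditOutcome {
  const first = audit();
  if (first.missing_acs.length === 0) return "tdd_spec_audit_passed";

  retry(first.missing_acs); // single bounded round: test-writer + implementer

  const second = audit();
  return second.missing_acs.length === 0
    ? "tdd_spec_audit_missing_recovered"
    : "tdd_spec_audit_halted";
}
```

Because the retry is not in a loop, the audit runs at most twice per FR, which keeps the worst-case cost of the extra stage fixed.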
M77 ships STE-296 (/tdd spec-review audit step). New 4th forked-subagent stage (tdd-spec-reviewer) runs once at end of FR after REFACTOR, independently traces every AC to its impl + test, blocks the pipeline only on missing_acs with a bounded single-round retry, halts non-zero with new spec-gap failure mode otherwise. New /gate-check probe #50 tdd_spec_reviewer_subagent_invariants + three capability keys propagating the audit outcome through /implement's closing summary. Release: v2.24.0 "Audited" Refs: M77
Summary
This PR bundles two releases on a single feature branch — M71 + M72 are merged together once M72's new fixture group 8 smoke pass confirms M71's hooks fire correctly at runtime.
M71 — Honored Contracts enforcement (v2.21.0 "Honored")
STE-283 + STE-285 — the toolkit's first explicit anti-falsification design after the STE-220→STE-270 6-FR chain.
- Prose layer (STE-283): `TDD Orchestrator Contract` callout at `/implement` Phase 2 step 8 + Rationalization Prevention table + new `docs/honored-contracts.md` catalog with 3 seeded contract entries (`/implement → /tdd`, `/spec-write → spec-research`, `/brainstorm → AskUserQuestion-first`).
- Byte-checkable layer (STE-285): `/setup --hooks` menu, 4 seeded Process-category hook scripts under `templates/hooks/process/`, shared `_lib/session.sh` (single-line atomic `grep -F | grep -F` against `$CLAUDE_SESSION_FILE`, NFR-10 stderr emitter), `install_hooks.ts` settings.json merge helper (idempotent on match, conflict-surfaced on diff, malformed-JSON SyntaxError), per-hook `HOOK_REGISTRATIONS` (PreToolUse `Bash` for three hooks, UserPromptSubmit `*` for the brainstorm reminder), new `docs/hooks-reference.md` user manual.
- Per-project opt-in (`<project>/.claude/settings.json`, not `~/.claude/`) supersedes the session-wide-bundling rejections (STE-262/STE-270/STE-276).
M72 — STE-285 hooks E2E smoke coverage (v2.22.0 "Mirror")
STE-286 closes the symmetric coverage gap left by M71. STE-283's TDD Orchestrator Contract already had runtime evidence via fixture group 7 (`STE-225 TDD orchestrator forks runtime`); STE-285's seeded hooks shipped with per-script unit tests but no end-to-end probe of the install → harness dispatch → NFR-10 refusal pipeline.
- `/smoke-test` Phase 2.X fixture group 8 (`STE-285 hooks runtime regression`), positioned after fixture group 7, mirrors group 7's shape. Diagnostic `STE-285 runtime regression: <fixture-name>` per STE-231 AC.5 precedent.
- Install verification: runs `claude -p /dev-process-toolkit:setup --hooks=all` and asserts the canonical 3 PreToolUse:Bash + 1 UserPromptSubmit:* event/matcher split lands in the test-project `.claude/settings.json`.
- Runtime probe: 4 standalone `bash` subprocesses per leg (one per scenario: `pre-commit-gate-check`, `pre-commit-tdd-orchestrator`, `pre-pr-spec-review`, `pre-spec-write-brainstorm-reminder`) with `CLAUDE_SESSION_FILE=/tmp/dpt-smoke-empty-session-<scenario>.jsonl` set inline; per-scenario stderr asserted via `grep -F` for `Refusing:` / `Reminder:` + `dev-process-toolkit:<skill>` + `hook=<name>` tokens.
- `$CLAUDE_SESSION_FILE` empirical finding (Phase 1): Claude Code's harness clobbers wrapper-shell env values on PreToolUse / UserPromptSubmit subprocess spawn, so AC-STE-286.4's literal "one marker-stripped `claude -p` child with rotated `$CLAUDE_SESSION_FILE`" was replaced by the AC-STE-286.7-accepted fallback (4 standalone `bash` subprocesses with env set inline). Recorded in `specs/frs/archive/STE-286.md` Notes.
- `/setup --hooks=<value>` non-interactive preselect: bundled STE-285 scope expansion. `parsePreselectFlag(arg)` parser accepts `--hooks=all` (all 4 seeded names) and `--hooks=<comma-list>` (validates each against `HOOK_REGISTRATIONS`, refuses unknown / empty / whitespace-only / commas-only with an NFR-10-shaped error including the known-hooks hint). Documented in setup SKILL.md § 0c and `docs/hooks-reference.md`.
Test plan
- `bun test` → 2327 pass / 0 fail / 11 skip across 219 files (baseline +103 new test assertions across the two releases).
- FRs and plans archived (`status: active → archived`, full ISO-8601 timestamps); traceability rows appended to `specs/requirements.md § 6` for AC-STE-283.1..7 / AC-STE-285.1..7 / AC-STE-286.1..7.
- Version bumps via `/ship-milestone`: plugin.json + marketplace.json (2.20.0 → 2.21.0 → 2.22.0), CHANGELOG.md (two new sections), README.md "Latest:" banner, requirements.md "Latest shipped release:".
- Run `/smoke-test` against this branch with fixture group 8 enabled: assert both legs (Linear + Jira) emit `STE-285 runtime check: PASS` and the 4 per-scenario logs land at `/tmp/dpt-smoke-<tracker>-hooks-runtime-<scenario>.log`. Merge after green.
- Run `/setup --hooks` on a downstream test project, pick `pre-commit-gate-check`, attempt `git commit` without first running `/gate-check` (post-merge validation; non-blocking).
Advisory notes
- `templates/hooks/_lib/session.sh:35-36`: JSONL grep needles assume compact JSON serialization (no space after colon). Test fixtures match the actual Claude Code emit shape; broader format-fragility risk is documented severity:medium in STE-285 § Risks.
Refs: STE-283, STE-285, STE-286, M71, M72