From 448d7054b0e8a6267c91e89d95ac450ede735a5b Mon Sep 17 00:00:00 2001 From: nesquikm Date: Wed, 13 May 2026 14:33:57 +0400 Subject: [PATCH 01/37] chore(specs): write FR STE-283 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Honored Contracts enforcement (M71 first FR) — adds Contract block + Rationalization table + catalog plan. Prose-only enforcement layer for cross-skill mandates like /implement → /tdd, with eyes open to the STE-220→STE-270 chain saying prose-only gets falsified. If it falsifies again, the evidence-based gate is the documented escalation (this time we promise to listen). Refs: STE-283 --- specs/frs/STE-283.md | 110 +++++++++++++++++++++++++++++++++++++++++++ specs/plan/M71.md | 41 ++++++++++++++++ 2 files changed, 151 insertions(+) create mode 100644 specs/frs/STE-283.md create mode 100644 specs/plan/M71.md diff --git a/specs/frs/STE-283.md b/specs/frs/STE-283.md new file mode 100644 index 0000000..f322f67 --- /dev/null +++ b/specs/frs/STE-283.md @@ -0,0 +1,110 @@ +--- +title: Honored Contracts enforcement — /implement → /tdd Contract block + Rationalization table + catalog +milestone: M71 +status: active +archived_at: null +tracker: + linear: STE-283 +created_at: 2026-05-13T10:28:02Z +--- + +# STE-283: Honored Contracts enforcement — /implement → /tdd Contract block + Rationalization table + catalog {#STE-283} + +## Requirement + +When `/implement` Phase 2 step 8 fires, the skill prose mandates invoking `/dev-process-toolkit:tdd ` per FR — spawning three forked subagents (test-writer / implementer / refactorer) with strict `tdd-result` hand-off contracts per STE-225. Cross-project conformance evidence (2026-05-13 finding) shows the prose mandate gets compressed into inline TDD in the main context: tests + implementation written together in the same head, no forked subagents, no per-AC orchestration, no `tdd-result` audit trail. 
+ +Same prose-vs-byte-checkable failure shape as STE-220 → STE-226 → STE-237 → STE-251 → STE-262 → STE-270 (6 FRs of falsification — see `/brainstorm` 2026-05-13 spec-research seed). The architectural lesson is byte-checkable enforcement; the operator decision under this FR is to invest in **maximally-byte-checkable-by-a-human-reviewer prose** within the prose-only constraint. Lifting to runtime byte-grep / gate-check probe / hard mechanic is deferred by explicit operator choice (`/brainstorm` 2026-05-13). + +The fix layers three pieces of prose so they reinforce each other: + +1. A labeled **TDD Orchestrator Contract** callout at `/implement` Phase 2 step 8 (named violation: "Inline TDD Antipattern", auditable evidence shape). +2. An inline **Rationalization Prevention table** preempting the documented excuses (cost, no-N-times-pattern, shipping-over-fidelity). +3. A new `plugins/dev-process-toolkit/docs/honored-contracts.md` catalog enumerating every cross-skill mandate (`/implement → /tdd` is the first instance) under a uniform shape (Mandate / Violation / Evidence / Precedent FRs). + +**Operator-acknowledged residual risk.** Per `/brainstorm` 2026-05-13, sharpened-prose-only was selected with eyes open to the STE-220→STE-270 falsification chain. If post-ship conformance falsifies this layer, the escalation path is evidence-based gate (STE-262/STE-270 pattern) or hard mechanic (STE-225 pattern) — both explicitly out of scope here. + +## Acceptance Criteria + +- AC-STE-283.1: `/implement` Phase 2 step 8 carries a labeled **TDD Orchestrator Contract** callout at the start of the step's prose; block names the violation (`Inline TDD Antipattern`) and states the auditable shape (`N Skill(/dev-process-toolkit:tdd ) tool_use entries where N = FR count in milestone scope`). 
+ - verify: `grep -F "TDD Orchestrator Contract" plugins/dev-process-toolkit/skills/implement/SKILL.md` ≥ 1; `grep -F "Inline TDD Antipattern" plugins/dev-process-toolkit/skills/implement/SKILL.md` ≥ 1. +- AC-STE-283.2: Inline **Rationalization Prevention table** within (or immediately following) the Contract block with ≥ 3 rows preempting documented rationalizations: cost, no-N-times-pattern, shipping-over-fidelity. + - verify: `grep -F "| Excuse | Reality |" plugins/dev-process-toolkit/skills/implement/SKILL.md` ≥ 1; manual review confirms ≥ 3 rows with explicit counters. +- AC-STE-283.3: `plugins/dev-process-toolkit/docs/honored-contracts.md` exists with the uniform catalog shape (Mandate / Violation name / Auditable evidence / Precedent FRs) and contains an entry for `/implement → /tdd`. + - verify: `test -f plugins/dev-process-toolkit/docs/honored-contracts.md`; `grep -cE "^\*\*(Mandate|Violation name|Auditable evidence|Precedent FRs)" plugins/dev-process-toolkit/docs/honored-contracts.md` ≥ 12 (3 entries × 4 labels). +- AC-STE-283.4: `/implement → /tdd` catalog entry cites STE-225 (orchestrator) + the prose-falsification chain (STE-220, STE-226, STE-237, STE-251, STE-262, STE-270). + - verify: `grep -E "STE-225|STE-220|STE-226|STE-237|STE-251|STE-262|STE-270" plugins/dev-process-toolkit/docs/honored-contracts.md` ≥ 7 distinct STE refs. +- AC-STE-283.5: Contract block in `/implement` references the catalog file path so reviewers can navigate from point-of-use. + - verify: `grep -F "docs/honored-contracts.md" plugins/dev-process-toolkit/skills/implement/SKILL.md` ≥ 1 line in Phase 2 step 8. +- AC-STE-283.6: Catalog seeds ≥ 3 contract entries: `/implement → /tdd` (primary), `/spec-write → spec-research` (precedent — STE-230), `/brainstorm → AskUserQuestion-first` (precedent — STE-237). + - verify: `grep -cE "^## " plugins/dev-process-toolkit/docs/honored-contracts.md` ≥ 3. 
+- AC-STE-283.7: Contract block includes a one-line residual-risk note citing the STE-220→STE-270 chain as falsifiable precedent and naming the documented escalation path (evidence-based gate / hard mechanic).
+  - verify: `grep -E "falsif|escalation" plugins/dev-process-toolkit/skills/implement/SKILL.md` ≥ 1 in Phase 2 step 8 referencing STE-220.
+
+## Technical Design
+
+### Files touched
+
+- `plugins/dev-process-toolkit/skills/implement/SKILL.md` — Contract callout (AC.1) + Rationalization table (AC.2) + catalog reference (AC.5) + residual-risk note (AC.7) at top of Phase 2 step 8.
+- `plugins/dev-process-toolkit/docs/honored-contracts.md` — new file: catalog with 3 seeded entries (AC.3 / AC.4 / AC.6).
+
+### Contract block shape
+
+`>` blockquote opener modeled on STE-237's Socratic Loop Contract block in `skills/brainstorm/SKILL.md`. Bolded heading naming the contract, body stating mandate verbatim, closing line citing precedent FRs + catalog path. Sits at the very start of Phase 2 step 8, before the existing step prose (`Execute in TDD order via the multi-agent orchestrator…`).
+
+### Rationalization table shape
+
+`Excuse | Reality` modeled on `/brainstorm`'s end-of-file table. Three seeded rows:
+
+| Excuse | Reality |
+|--------|---------|
+| Milestone spans N FRs / many ACs — orchestrator cost is too high | Cost is not a contract waiver; orchestrator-per-FR IS the milestone-scope pattern |
+| `/implement M` milestone-scope has no clear "use the orchestrator N times" pattern | N-times IS the pattern: one `Skill(/dev-process-toolkit:tdd )` tool_use per FR in scope |
+| Prioritized shipping over process fidelity | Process fidelity IS the ship gate, not its competitor |
+
+Rows append (not replace) as future rationalizations surface in post-ship runs.
+ +### Catalog shape + +`plugins/dev-process-toolkit/docs/honored-contracts.md` carries a top-of-file overview paragraph, then one `## ` heading per contract with four labeled subsections (`**Mandate.**`, `**Violation name.**`, `**Auditable evidence.**`, `**Precedent FRs.**`). Three seeded entries: `/implement → /tdd` (primary), `/spec-write → spec-research` (precedent — STE-230), `/brainstorm → AskUserQuestion-first` (precedent — STE-237). Future "X must invoke Y" rules extend the catalog without re-authoring per-skill Contract blocks. + +### Cross-references + +- `/implement` Contract block ends with: `Catalog entry: docs/honored-contracts.md § /implement → /tdd.` +- Each catalog entry's **Mandate** points back to the skill-prose location (e.g., `plugins/dev-process-toolkit/skills/implement/SKILL.md § Phase 2 step 8`). + +### Out of scope + +- No `/gate-check` probe (operator choice: prose-only). +- No runtime byte-grep enforcement (operator choice). +- No `/implement` skill mechanic change (operator choice). +- No fix that makes `/implement` actually enforce per-FR `/tdd` invocation — this FR makes the contract *visible*, not *enforced*. +- Other "X must invoke Y" contracts beyond the three seeded entries: deferred to follow-up FRs. + +## Testing + +Deliverables are prose edits. No runtime behavior to assert. Test strategy: `bun test` fixture suite greps post-edit file content against each AC's verify-line shape. + +### Test files + +- `plugins/dev-process-toolkit/tests/honored-contracts-implement-tdd.test.ts` — one test per AC verify line. + +### Risks (testing-side) + +- Grep tests are brittle to wording. Mitigated by AC verify lines using literal substrings ("TDD Orchestrator Contract", "Inline TDD Antipattern") chosen as canonical names — wording is intentionally part of the contract. +- Post-ship conformance is the only signal that the prose worked. No in-CI assertion that future agents honor the contract — operator-accepted residual risk. 
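The verify lines for STE-283 reduce to substring and reference checks over two files. A minimal sketch of those checks as pure functions, assuming the file contents are already loaded as strings; `countFixed`, `missingPrecedentRefs`, and `verifyImplementSkill` are hypothetical names for illustration, not helpers the FR defines, and the checks here are whole-file rather than scoped to Phase 2 step 8:

```typescript
// Hypothetical helpers: the FR specifies grep-based verify lines, not these names.

/** Count lines containing a literal substring, roughly `grep -cF`. */
function countFixed(content: string, needle: string): number {
  return content.split("\n").filter((line) => line.includes(needle)).length;
}

/** Precedent FRs that AC-STE-283.4 expects the catalog to cite. */
const PRECEDENT_REFS = [
  "STE-225", "STE-220", "STE-226", "STE-237", "STE-251", "STE-262", "STE-270",
];

/** Return the precedent refs that the catalog text does not mention. */
function missingPrecedentRefs(catalog: string): string[] {
  return PRECEDENT_REFS.filter((ref) => !catalog.includes(ref));
}

/** Checks on SKILL.md mirroring AC-STE-283.1, .2 and .5. */
function verifyImplementSkill(skillMd: string): Record<string, boolean> {
  return {
    "AC-STE-283.1":
      countFixed(skillMd, "TDD Orchestrator Contract") >= 1 &&
      countFixed(skillMd, "Inline TDD Antipattern") >= 1,
    "AC-STE-283.2": countFixed(skillMd, "| Excuse | Reality |") >= 1,
    "AC-STE-283.5": countFixed(skillMd, "docs/honored-contracts.md") >= 1,
  };
}
```

Keeping the checks as pure functions over strings would let a `bun test` suite read each file once and assert on the returned map, so a wording change fails one named AC rather than an opaque grep.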
+
+## Risks
+
+| Category | Risk | Severity | Mitigation |
+|----------|------|----------|------------|
+| Unclear acceptance criteria | AC-STE-283.7's "residual-risk framing" criterion is partly subjective | medium | grep verify on `falsif\|escalation` + STE-220 keywords + manual review during `/implement` Phase 4 |
+| Breaking changes | Contract block added to existing SKILL.md may shift downstream parsing if any tool reads SKILL.md by line offset | low | none known; SKILL.md is prose-consumed, no line-offset parsers |
+| Maintenance drift | Three-layered prose (Contract block + Excuse table + Catalog) may diverge in framing over time | low | catalog is single source of truth; Contract block points to it; on framing divergence, catalog wins |
+
+## Notes
+
+- First FR of M71 ("Honored Contracts enforcement"); intentionally separated from M70's runtime-emission theme (STE-280/281/282).
+- Escalation path if falsified: evidence-based gate (`tdd_orchestrator_invocation_evidence` /gate-check probe) → hard mechanic (STE-225-style first-action contract for Phase 2 step 8).
+- Parked follow-up: tracker ↔ local FR/milestone sync (a separate future-allocated FR after this one ships, candidate for M70 since theme fits runtime-emission) — the meta-bug behind the 2026-05-13 partial-scan trap.
+- Brainstorm session: 2026-05-13 (maximalist 1+2+3 design selected with explicit acknowledgement of STE-220→STE-270 precedent risk).
diff --git a/specs/plan/M71.md b/specs/plan/M71.md
new file mode 100644
index 0000000..aa98018
--- /dev/null
+++ b/specs/plan/M71.md
+---
+status: active
+archived_at: null
+kind: feature
+---
+
+# M71: Honored Contracts enforcement {#M71}
+
+Release target: TBD
+Codename: TBD
+
+## Summary
+
+Sharpened-prose enforcement for cross-skill technique mandates in the toolkit, anchored on `/implement → /tdd` as the first instance.
Adds a labeled **TDD Orchestrator Contract** callout at `/implement` Phase 2 step 8, an inline **Rationalization Prevention table** preempting documented excuses, and a new `plugins/dev-process-toolkit/docs/honored-contracts.md` catalog enumerating every cross-skill mandate under a uniform shape (Mandate / Violation / Evidence / Precedent FRs). + +Intentionally separated from M70's runtime-emission theme (STE-280/281/282). This layer is prose-only and operator-acknowledged as falsifiable per the STE-220→STE-270 precedent chain (6 FRs of prose-only falsification). Escalation path on falsification: evidence-based gate (STE-262/STE-270 pattern) or hard mechanic (STE-225 pattern), both explicitly out of scope here. + +Triggered by: 2026-05-13 cross-project finding — `/implement` Phase 2 step 8's mandate to invoke `/dev-process-toolkit:tdd ` per FR got compressed into inline TDD in the main context (no forked subagents, no `tdd-result` audit trail). See `/brainstorm` 2026-05-13 for the design session (spec-research seed: STE-220, STE-262, STE-270; design choice: maximalist 1+2+3, sharpened-prose-only). + +## In Scope + +- [ ] STE-283 — Honored Contracts enforcement: /implement → /tdd Contract block + Rationalization table + catalog + - verify: 7 ACs covering Contract block presence, Rationalization table, catalog file existence + entries, precedent FR citations, cross-references, residual-risk acknowledgement; tests in `plugins/dev-process-toolkit/tests/honored-contracts-implement-tdd.test.ts` + +## Out of Scope (deferred) + +- `/gate-check` probe enforcement (deferred — operator-acknowledged escalation path on falsification). +- Runtime byte-grep enforcement (deferred — same). +- `/implement` skill mechanic change (deferred — same). 
+- Tracker ↔ local FR/milestone sync FR (parked as future-allocated candidate after this FR ships; thematic fit is M70, not M71) — the meta-bug behind the 2026-05-13 partial-scan trap where I missed M70/STE-280/281/282 on Linear because their local FR files lived on an unmerged branch. + +## Risks + +- **Risk:** The 6-FR precedent chain (STE-220 → STE-270) says prose-only enforcement gets falsified. **Severity:** medium (operator-acknowledged). **Mitigation:** make the prose maximally byte-checkable-by-a-human-reviewer — labeled callout, named violation, auditable evidence shape. Post-ship conformance run is the falsifiability test. On falsification, lift to evidence-based gate per documented escalation path. +- **Risk:** Three-layered prose change (Contract block + Excuse table + Catalog) increases authoring + maintenance surface; future drift across the three artifacts may diverge their framing. **Severity:** low. **Mitigation:** catalog is the single source of truth; the Contract block points to it; if framing diverges, fix the Contract block to match the catalog (catalog wins). + +## Notes + +- M71 is the next free milestone after M70 ("Runtime emission contract closures"), themed prose-enforcement and intentionally distinct. +- This is the first M71 FR. Future FRs extending the Honored Contracts catalog (additional cross-skill mandates) may land in M71 or be deferred — TBD per future `/spec-write` sessions. +- Brainstorm session: 2026-05-13 — operator selected maximalist (Contract block + Rationalization table + Catalog) sharpened-prose design with explicit acknowledgement of the STE-220→STE-270 precedent risk. 
From 1d6ef4f4d35627123f9e381802a02afe0b5cf3e8 Mon Sep 17 00:00:00 2001 From: nesquikm Date: Wed, 13 May 2026 14:58:16 +0400 Subject: [PATCH 02/37] chore(specs): write FR STE-284 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Tracker ↔ local FR/milestone reconciliation (M70's second FR) — helper preflight at /spec-write entry + /gate-check probe at gate boundary. Auto-imports tracker→local orphans silently; prompts for the ambiguous kinds. Supersedes STE-119's "M-numbers are local" decision, which got falsified the same day we set out to enforce it. Also writes specs/plan/M70.md (the milestone existed on Linear only until now — the exact bug this FR fixes, surfaced and patched in one /brainstorm → /spec-write loop). Refs: STE-284 --- specs/frs/STE-284.md | 132 +++++++++++++++++++++++++++++++++++++++++++ specs/plan/M70.md | 45 +++++++++++++++ 2 files changed, 177 insertions(+) create mode 100644 specs/frs/STE-284.md create mode 100644 specs/plan/M70.md diff --git a/specs/frs/STE-284.md b/specs/frs/STE-284.md new file mode 100644 index 0000000..5458697 --- /dev/null +++ b/specs/frs/STE-284.md @@ -0,0 +1,132 @@ +--- +title: Tracker ↔ local FR/milestone reconciliation — helper preflight + /gate-check probe +milestone: M70 +status: active +archived_at: null +tracker: + linear: STE-284 +created_at: 2026-05-13T10:55:45Z +--- + +# STE-284: Tracker ↔ local FR/milestone reconciliation — helper preflight + /gate-check probe {#STE-284} + +## Requirement + +`/spec-write` in tracker mode currently scans only local sources (`specs/plan/`, `specs/plan/archive/`, CHANGELOG) when allocating new milestones/FRs. STE-119 (M31) explicitly excluded the tracker because "M-numbers are local FS IDs, not tracker objects." 
That decision got falsified on 2026-05-13: `/conformance-loop` iter-1 created M70 + STE-280/281/282 directly on Linear with no local files (branch unmerged); a subsequent `/spec-write` on `main` saw empty local plan/frs dirs and would have allocated M71 over the top. + +The partial-scan trap is the canonical failure mode: local FS state and tracker state can disagree across branches, and the toolkit has no mechanism to detect or close that gap automatically. The memory-level mitigation added on 2026-05-13 (`feedback_check_milestones_first.md` updated to require a Linear scan) relies on the next agent honoring prose — the same enforcement layer the STE-220→STE-270 chain repeatedly falsifies. + +This FR closes the gap with **two layers**: entry-time helper preflight at `/spec-write` (auto-imports tracker→local orphans per operator design choice) + `/gate-check` probe at gate boundary (deterministic, catches drift from any source including manual edits and `/spec-write` bypass). Both layers share `reconcileTrackerLocal()` as single source of truth. + +## Acceptance Criteria + +- AC-STE-284.1: `nextFreeMilestoneNumber()` extended with optional `provider` param; in tracker mode unions tracker milestones via `Provider.listMilestones()`. Returns `{ next, sources: { active, archive, changelog, tracker } }`. `mode: none` is vacuous (`tracker: []`). + - verify: `bun test plugins/dev-process-toolkit/adapters/_shared/src/__tests__/next_free_milestone_number_tracker.test.ts` PASSES — covers tracker-only collision (local empty, tracker has M70 → returns 71), both-side collision (both have M70 → returns 71), mode-none vacuous (returns based on local only). +- AC-STE-284.2: New helper `reconcileTrackerLocal(provider, specsDir)` at `plugins/dev-process-toolkit/adapters/_shared/src/reconcile_tracker_local.ts` returns `{ trackerOrphans: TrackerOrphan[], localOrphans: LocalOrphan[], milestoneMismatches: MilestoneMismatch[] }`. Each orphan carries `kind`, `id`, `details`. 
+ - verify: `bun test plugins/dev-process-toolkit/adapters/_shared/src/__tests__/reconcile_tracker_local.test.ts` PASSES — all three orphan kinds + clean-sync + the canonical 2026-05-13 partial-scan reproduction. +- AC-STE-284.3: `/spec-write` § 0 preamble (NEW step `§ 0.5 Tracker-local reconciliation`, between § 0 and § 0a) calls the helper, auto-imports tracker→local orphans via existing `importFromTracker(...)` **guarded by `existsSync` per STE-135**; prompts NFR-10 canonical refusal on decline for local→tracker + milestone-mismatch. + - verify: `bun test plugins/dev-process-toolkit/skills/spec-write/__tests__/preamble_reconcile.test.ts` PASSES — partial-scan reproduction fixture (M70 + STE-280/281/282 on Linear, local empty) auto-imports 3 FRs + plan stub and emits `tracker_local_reconciled` row with count=3. +- AC-STE-284.4: New `/gate-check` probe `tracker_local_reconciliation_drift` registered in `plugins/dev-process-toolkit/skills/gate-check/SKILL.md § probes` at **severity:warning** (any drift) with escalation to **severity:error** on hard FR-id collisions (same tracker ID bound to two local files, or local FR pointing to non-existent tracker ID). + - verify: `bun test plugins/dev-process-toolkit/tests/gate-check-tracker-local-reconciliation-drift.test.ts` PASSES — clean-sync / drift-warning / hard-collision-error cases. 
+- AC-STE-284.5: Three capability keys added to `/spec-write` § 7 + `/gate-check` § Report static plain-language maps: + - `tracker_local_reconciled` — `tracker → local reconciliation imported N FR(s) and M milestone(s) — see specs/frs/.md and specs/plan/.md for the imported content` + - `tracker_local_orphan_local` — `local FR has no tracker binding — run /spec-write --push-to-tracker to sync, or remove the local file if abandoned` + - `milestone_local_orphan` — `local milestone M has no tracker milestone — auto-create via Provider.attachProjectMilestone(M), or remove if abandoned` + - verify: `grep -F "tracker_local_reconciled" plugins/dev-process-toolkit/skills/spec-write/SKILL.md` ≥ 1; same for the other two keys; same three rows in `skills/gate-check/SKILL.md`. +- AC-STE-284.6: STE-119's tracker-exclusion decision explicitly revisited + superseded in this FR's Notes section, with reasoning (Linear/Jira treat M-numbers as real milestone objects since `/conformance-loop` adoption). + - verify: `grep -F "STE-119" specs/frs/STE-284.md` ≥ 1 line in Notes section; manual review confirms supersession framing. +- AC-STE-284.7: `Provider.listMilestones()` interface added to `IProvider` at `plugins/dev-process-toolkit/adapters/_shared/src/provider.ts`; implemented by `LinearProvider` (calls `mcp__linear__list_milestones` filtered to active project), `JiraProvider` (Jira's project-milestone API), `LocalProvider` (returns `[]`). + - verify: `bun test plugins/dev-process-toolkit/adapters/linear/__tests__/list_milestones.test.ts plugins/dev-process-toolkit/adapters/jira/__tests__/list_milestones.test.ts` PASSES; LocalProvider returns empty. +- AC-STE-284.8: Performance budget — entry-time preflight adds ≤ 1500ms to `/spec-write` invocation on a typical project (≤ 50 active FRs, ≤ 10 active milestones). + - verify: smoke test fixture measures wall-clock delta from `/spec-write` start to first `AskUserQuestion` in tracker mode with vs. 
without the preflight; median delta ≤ 1500ms; soft-fail via `tracker_query_slow` capability row on overrun. + +## Technical Design + +### Files touched + +- `plugins/dev-process-toolkit/adapters/_shared/src/next_free_milestone_number.ts` — extend signature to accept optional `provider` and include tracker-side query in the union. +- `plugins/dev-process-toolkit/adapters/_shared/src/reconcile_tracker_local.ts` — new file: canonical reconciliation helper. +- `plugins/dev-process-toolkit/adapters/_shared/src/provider.ts` — add `listMilestones()` to `IProvider`. +- `plugins/dev-process-toolkit/adapters/linear/src/provider.ts` — implement `listMilestones()` via `mcp__linear__list_milestones`. +- `plugins/dev-process-toolkit/adapters/jira/src/provider.ts` — implement `listMilestones()` via Jira's milestone API. +- `plugins/dev-process-toolkit/adapters/local/src/provider.ts` — implement `listMilestones()` returning `[]`. +- `plugins/dev-process-toolkit/skills/spec-write/SKILL.md` — add `§ 0.5 Tracker-local reconciliation` preamble between § 0 and § 0a; update § 7 capability map. +- `plugins/dev-process-toolkit/skills/gate-check/SKILL.md` — add `tracker_local_reconciliation_drift` probe row in § probes; update § Report capability map. +- `plugins/dev-process-toolkit/skills/gate-check/probes/tracker_local_reconciliation_drift.ts` — new file: probe implementation calling `reconcileTrackerLocal()` shared helper. +- 6 test files (per-AC + integration + smoke fixture). 
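The `nextFreeMilestoneNumber()` extension in AC-STE-284.1 is at heart a union over four sources. A minimal sketch of that union step, assuming the local scan and the tracker query have already run and that tracker milestones arrive as names beginning with `M<number>`; `nextFreeFromSources`, `parseMilestoneNumber`, and the `MilestoneSources` type are hypothetical, and "next free" is assumed to mean max + 1:

```typescript
// Hypothetical shapes: the FR names the `sources` keys but not these types.
interface MilestoneSources {
  active: number[];
  archive: number[];
  changelog: number[];
  tracker: number[];
}

/** Extract N from a milestone name that starts with "M<N>"; null if absent. */
function parseMilestoneNumber(name: string): number | null {
  const match = /^M(\d+)\b/.exec(name.trim());
  return match ? Number(match[1]) : null;
}

/**
 * Union local and tracker milestone numbers and pick the next free one.
 * Assumes "next free" is max + 1, and 1 when no milestone exists anywhere.
 */
function nextFreeFromSources(
  local: Omit<MilestoneSources, "tracker">,
  trackerMilestoneNames: string[] = [], // stays empty in `mode: none`
): { next: number; sources: MilestoneSources } {
  const tracker = trackerMilestoneNames
    .map(parseMilestoneNumber)
    .filter((n): n is number => n !== null);
  const sources: MilestoneSources = { ...local, tracker };
  const all = [
    ...sources.active,
    ...sources.archive,
    ...sources.changelog,
    ...sources.tracker,
  ];
  return { next: all.length === 0 ? 1 : Math.max(...all) + 1, sources };
}
```

Defaulting the tracker list to empty mirrors the optional `provider` parameter: callers that pass nothing get the local-only answer, which is the behavior the FR describes for `mode: none`.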
+
+### Reconciliation algorithm
+
+```
+async function reconcileTrackerLocal(provider, specsDir) {
+  if (provider.mode === "none") return empty;
+  const trackerFRs = await provider.listActiveFRs();
+  const trackerMilestones = await provider.listMilestones();
+  const localFRs = read(specsDir/frs/*.md, excluding archive/);
+  const localPlans = read(specsDir/plan/M*.md, excluding archive/);
+  return {
+    trackerOrphans: trackerFRs without local file (existsSync check),
+    localOrphans: local FRs without tracker binding in tracker state,
+    milestoneMismatches: tracker milestones without local plan (or vice versa),
+  };
+}
+```
+
+### Preamble integration in /spec-write
+
+NEW `§ 0.5 Tracker-local reconciliation (tracker mode only)` between § 0 and § 0a:
+
+- Call `reconcileTrackerLocal(provider, specsDir)`.
+- For each `trackerOrphan`: call `importFromTracker(provider, id, specsDir)` — guarded by `existsSync` per STE-135.
+- For each `localOrphan`: prompt user (push to tracker / remove local / leave).
+- For each `milestoneMismatch`: prompt user (resolve direction).
+- Emit capability rows in Step 7 summary.
+
+### Probe implementation
+
+`/gate-check` probe calls the same `reconcileTrackerLocal()` helper. Differences from preamble path:
+
+- Probe NEVER auto-imports (read-only at the gate boundary; mutation is operator's job via /spec-write).
+- Severity logic: any drift → warning; hard FR-id collision → error.
+- Probe output follows standard /gate-check probe shape — substitute `tracker:` or `milestone:` where file:line is unavailable.
+
+### Out of scope
+
+- Cross-branch awareness — probe only sees current working tree. Documented limitation, not a regression. STE-225-style branch-aware scanning deferred.
+- Bidirectional FR-description sync — only presence/absence parity. Per-AC content drift is STE-211's existing flow.
+- Auto-resolution of milestone-mismatches — flagged + prompted only.
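The reconciliation pseudocode and the probe's severity rule can be sketched as pure functions over already-fetched state. This is an illustration under stated assumptions, not the FR's implementation: orphans are reduced to plain ids (the FR's orphans carry `kind`, `id`, `details`), a local FR bound to an unknown tracker id is treated as a hard collision rather than an orphan, and `reconcileSketch`, `hardCollisions`, and `probeSeverity` are hypothetical names:

```typescript
// Hypothetical, simplified shapes; I/O (provider calls, file reads) is assumed done.
interface LocalFr { file: string; trackerId: string | null }
interface Drift {
  trackerOrphans: string[];      // tracker FR ids with no local file bound to them
  localOrphans: string[];        // local FR files with no tracker binding
  milestoneMismatches: string[]; // milestone ids present on one side only
}

function reconcileSketch(
  trackerFrIds: string[],
  trackerMilestones: string[],
  localFrs: LocalFr[],
  localPlans: string[],
): Drift {
  const bound = new Set(localFrs.map((fr) => fr.trackerId));
  const onTracker = new Set(trackerMilestones);
  const onDisk = new Set(localPlans);
  return {
    trackerOrphans: trackerFrIds.filter((id) => !bound.has(id)),
    localOrphans: localFrs
      .filter((fr) => fr.trackerId === null)
      .map((fr) => fr.file),
    milestoneMismatches: [
      ...trackerMilestones.filter((m) => !onDisk.has(m)),
      ...localPlans.filter((m) => !onTracker.has(m)),
    ],
  };
}

/** Hard collisions per AC-STE-284.4: duplicate bindings or bindings to unknown ids. */
function hardCollisions(trackerFrIds: string[], localFrs: LocalFr[]): string[] {
  const known = new Set(trackerFrIds);
  const seen = new Set<string>();
  const collisions: string[] = [];
  for (const fr of localFrs) {
    if (fr.trackerId === null) continue;
    if (!known.has(fr.trackerId) || seen.has(fr.trackerId)) {
      collisions.push(fr.file);
    }
    seen.add(fr.trackerId);
  }
  return collisions;
}

type Severity = "ok" | "warning" | "error";

/** Any drift is a warning; a hard FR-id collision escalates to an error. */
function probeSeverity(drift: Drift, collisions: string[]): Severity {
  if (collisions.length > 0) return "error";
  const driftCount =
    drift.trackerOrphans.length +
    drift.localOrphans.length +
    drift.milestoneMismatches.length;
  return driftCount > 0 ? "warning" : "ok";
}
```

Because both the preamble and the probe would consume the same `Drift` value, only the caller decides whether to mutate (auto-import) or merely report, which matches the read-only stance described for the gate boundary.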
+ +## Testing + +Test strategy: unit tests per helper + integration tests for the preamble + smoke fixture reproducing 2026-05-13's partial-scan trap. MCP adapter tests mock Linear/Jira; shared helper tested against in-memory provider stub. + +### Test files + +- `plugins/dev-process-toolkit/adapters/_shared/src/__tests__/next_free_milestone_number_tracker.test.ts` +- `plugins/dev-process-toolkit/adapters/_shared/src/__tests__/reconcile_tracker_local.test.ts` +- `plugins/dev-process-toolkit/skills/spec-write/__tests__/preamble_reconcile.test.ts` +- `plugins/dev-process-toolkit/tests/gate-check-tracker-local-reconciliation-drift.test.ts` +- `plugins/dev-process-toolkit/adapters/linear/__tests__/list_milestones.test.ts` +- `plugins/dev-process-toolkit/adapters/jira/__tests__/list_milestones.test.ts` + +### Risks (testing-side) + +- Mocking the Linear/Jira MCP for adapter tests is brittle to MCP API changes. Mitigated by isolating MCP calls in adapter implementations and unit-testing adapters with mocked MCP; the shared helper is tested against an in-memory provider stub. +- Performance budget (AC.8) is wall-clock dependent on tracker latency. Mitigated by soft-fail via `tracker_query_slow` capability row on overrun; only median timing counts for the AC. 
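Since only median timing counts for AC-STE-284.8, the smoke fixture's pass/fail arithmetic can be sketched separately from the measurement itself. The sketch below assumes paired runs (same fixture with and without the preflight) and takes the median of the per-pair deltas; the FR does not say whether it means that or the difference of two medians, so this reading is an assumption, and `medianMs` and `checkPreflightBudget` are hypothetical names:

```typescript
const BUDGET_MS = 1500; // budget stated in AC-STE-284.8

/** Median of a non-empty list of millisecond samples. */
function medianMs(samples: number[]): number {
  if (samples.length === 0) throw new Error("no samples");
  const sorted = [...samples].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
}

/**
 * Compare paired wall-clock timings (start of /spec-write to first prompt).
 * Returns the median delta and, on overrun, the soft-fail capability row key.
 */
function checkPreflightBudget(
  withPreflightMs: number[],
  withoutPreflightMs: number[],
): { medianDeltaMs: number; capabilityRow: string | null } {
  if (withPreflightMs.length !== withoutPreflightMs.length) {
    throw new Error("timings must be paired");
  }
  const deltas = withPreflightMs.map((t, i) => t - withoutPreflightMs[i]);
  const medianDeltaMs = medianMs(deltas);
  return {
    medianDeltaMs,
    capabilityRow: medianDeltaMs > BUDGET_MS ? "tracker_query_slow" : null,
  };
}
```

Using the median means a single slow tracker response among several runs does not trip the row, which is consistent with the soft-fail framing in the risk note.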
+
+## Risks
+
+| Category | Risk | Severity | Mitigation |
+|----------|------|----------|------------|
+| Performance impact | +500–1500ms tracker latency per /spec-write entry in tracker mode | medium | AC.8 caps budget; `tracker_query_slow` row on overrun; future session-cache layer |
+| External dependencies | Linear API rate limit on every /spec-write entry | low | 1–3 calls per invocation; project-scoped; generous limit |
+| Breaking changes | `nextFreeMilestoneNumber()` signature extension | low | `provider` param is optional; existing callers unaffected |
+
+## Notes
+
+- **Supersedes STE-119's tracker-exclusion decision.** STE-119 (M31, archived) defined `nextFreeMilestoneNumber()` as mode-agnostic on the grounds that "M-numbers are local FS IDs, not tracker objects." This decision was falsified on 2026-05-13, when it surfaced that `/conformance-loop` iter-1 (2026-05-11) had created `M70 — Runtime emission contract closures` directly on Linear without writing the local plan file (it lived on an unmerged branch). Linear / Jira treat M-numbers as milestone names (real objects), and the toolkit's downstream consumers (`/spec-write`, `/implement`, `/gate-check`) need cross-source visibility to avoid the partial-scan trap.
+- **Second FR of M70's reconciliation theme expansion.** STE-280/281/282 covered runtime emission contract closures in /implement, /spec-write, /tdd; this FR closes the **state-emission contract** — the toolkit's local view of tracker state must converge.
+- **Operator design choice** (per `/brainstorm` 2026-05-13): auto-import tracker→local silently (tracker is authoritative for tracker-bound FRs); prompt for local→tracker + milestone-mismatch (require human decision since direction is ambiguous).
+- **Shared helper between layers.** Both the /spec-write preamble and the /gate-check probe call `reconcileTrackerLocal()` — single source of truth ensures the two layers always agree on what "in sync" means.
+- **Cross-branch limitation (documented).** The reconciliation scan only sees the current working tree. If an unmerged branch carries the FR file, the probe will flag it as a tracker-orphan and offer to import (which would conflict with the branch). The operator must merge the branch or accept the import. Documented limitation, not a regression — STE-225-style branch-aware scanning is out of scope. +- **Brainstorm session:** 2026-05-13 (Approach 3 hybrid: helper preflight + /gate-check probe; auto-import tracker→local silently; defense in depth). diff --git a/specs/plan/M70.md b/specs/plan/M70.md new file mode 100644 index 0000000..62531c5 --- /dev/null +++ b/specs/plan/M70.md @@ -0,0 +1,45 @@ +--- +status: active +archived_at: null +kind: feature +--- + +# M70: Runtime emission contract closures (post-M69 follow-ups) {#M70} + +Release target: TBD +Codename: TBD + +## Summary + +Close runtime emission and state-emission contract gaps surfaced after M69 shipped (v2.20.0 "Byte-Strict"). M70 was originally created on Linear by `/conformance-loop` iter-1 (2026-05-11) along with the first three FRs (STE-280/281/282) covering Pattern-26 first-turn enforcement broadening, /implement orchestrator `tdd-result` visibility, and `/spec-write` spec-research literal token emission. STE-284 (added 2026-05-13) extends the theme to **state-emission**: the toolkit's local view of tracker state must converge with the tracker's authoritative state, enforced via a `/spec-write` preamble + `/gate-check` probe. + +Triggered by: post-M69 `/conformance-loop` findings (STE-280/281/282 — `/conformance-loop` iter-1 2026-05-11) + 2026-05-13 partial-scan trap surfaced manually during the `/brainstorm` → `/spec-write` flow for STE-283 / Honored Contracts (STE-284). 
+ +## In Scope + +- [ ] STE-280 — Broaden Pattern-26 first-turn enforcement to flag tracker MCP writes + git branch checkouts as side-effect violations + - verify: see Linear ticket for ACs; local FR file pending import via STE-284's reconciliation flow (or manual `/spec-write STE-280` to pull it in earlier). +- [ ] STE-281 — Re-emit fenced `tdd-result` hand-off blocks from /implement orchestrator to parent stdout + - verify: see Linear ticket for ACs; local FR file pending import. +- [ ] STE-282 — Emit literal `spec_research_invoked` / `_no_matches` / `_shape_violation` capability rows in /spec-write § 7 closing summary + - verify: see Linear ticket for ACs; local FR file pending import. +- [ ] STE-284 — Tracker ↔ local FR/milestone reconciliation: helper preflight + /gate-check probe + - verify: 8 ACs covering `nextFreeMilestoneNumber` tracker extension, `reconcileTrackerLocal` helper, `/spec-write` § 0.5 preamble, `/gate-check` probe, capability key map updates, STE-119 supersession framing, `Provider.listMilestones()` interface, performance budget; tests in `plugins/dev-process-toolkit/adapters/_shared/src/__tests__/`, `skills/{spec-write,gate-check}/__tests__/`, `adapters/{linear,jira}/__tests__/`. + +## Out of Scope (deferred) + +- Cross-branch awareness (STE-225-style branch-aware scanning) — documented limitation in STE-284; deferred follow-up FR. +- Bidirectional FR-description sync — STE-211's existing per-AC content drift flow is separate. +- Auto-resolution of milestone-mismatches — STE-284 prompts only, never silently resolves. + +## Risks + +- **Risk:** STE-280/281/282 don't have local FR files; M70's plan file refers to them but their full AC details live only on Linear. **Severity:** low. **Mitigation:** STE-284 shipping creates the reconciliation flow that auto-imports them. Until STE-284 ships, the operator can manually run `/spec-write ` per orphan to pull them locally — `importFromTracker` already exists per STE-31 / STE-135. 
+- **Risk:** STE-284 supersedes STE-119's tracker-exclusion decision; cleanup of any other call sites that assumed mode-agnostic milestone allocation may surface during `/implement`. **Severity:** medium. **Mitigation:** grep for `nextFreeMilestoneNumber` call sites + verify each handles the optional `provider` param correctly during `/implement` Phase 2. +- **Risk:** Mixed-theme milestone — STE-280/281/282 are runtime-emission (capability row emission, stdout visibility), STE-284 is state-emission (tracker ↔ local convergence). **Severity:** low. **Mitigation:** the broader theme statement "runtime AND state emission contract closures" covers both. If theme drift becomes problematic, STE-284 can be re-homed to a future milestone. + +## Notes + +- M70 was originally themed "Runtime emission contract closures (post-M69 follow-ups)". STE-284 extends the theme to **state-emission** — the toolkit's local view of tracker state must converge. +- The local plan file `specs/plan/M70.md` is created on 2026-05-13 (this file), 2 days after M70 was created on Linear via `/conformance-loop` iter-1. This is the canonical partial-scan trap reproduction: the milestone existed on the tracker but had no local plan file until manual recovery. STE-284's reconciliation flow will prevent this in the future. +- Brainstorm session for STE-284: 2026-05-13 (Approach 3 hybrid: helper preflight + `/gate-check` probe; auto-import tracker→local silently; defense in depth). From a35b440c9e8bbd595dc6e79fbf87cabd295cce0c Mon Sep 17 00:00:00 2001 From: nesquikm Date: Wed, 13 May 2026 17:33:44 +0400 Subject: [PATCH 03/37] chore(specs): write FR STE-285 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Honored Contracts byte-checkable layer (M71's second FR) — /setup offers a multi-select menu of opt-in toolkit-contract enforcement hooks (Process category only; Safety + Quality out of scope). 
Hooks plug into the user's .claude/settings.json via ${CLAUDE_PLUGIN_ROOT} substitution (verified against Claude Code docs + STE-133 precedent). Together with STE-283's prose layer, this gives M71 a "prose + opt-in byte-checkable" enforcement bundle for the same contracts. The third layer (bundled session-wide) stays where STE-262/270/276 left it: in the rejection bin. Refs: STE-285 --- specs/frs/STE-285.md | 142 +++++++++++++++++++++++++++++++++++++++++++ specs/plan/M71.md | 22 ++++--- 2 files changed, 157 insertions(+), 7 deletions(-) create mode 100644 specs/frs/STE-285.md diff --git a/specs/frs/STE-285.md b/specs/frs/STE-285.md new file mode 100644 index 0000000..44cd540 --- /dev/null +++ b/specs/frs/STE-285.md @@ -0,0 +1,142 @@ +--- +title: Honored Contracts byte-checkable layer — opt-in toolkit-contract enforcement hooks via /setup +milestone: M71 +status: active +archived_at: null +tracker: + linear: STE-285 +created_at: 2026-05-13T13:29:22Z +--- + +# STE-285: Honored Contracts byte-checkable layer — opt-in toolkit-contract enforcement hooks via /setup {#STE-285} + +## Requirement + +STE-283 shipped the prose layer of Honored Contracts enforcement: a TDD Orchestrator Contract callout at `/implement` Phase 2 step 8 + Rationalization Prevention table + `docs/honored-contracts.md` catalog. The operator accepted, eyes open, that the STE-220→STE-270 chain (6 FRs of prose-only falsification) said this prose layer alone would likely get falsified by a subsequent conformance run. + +This FR adds the **byte-checkable** layer — opt-in, per-project, via `/setup`. When a user runs `/setup` on their project, `/setup` offers (after stack detection) a single multi-select menu of toolkit-contract enforcement hooks (all defaulted off). 
Each selected hook lands in the user's `.claude/settings.json` referencing `${CLAUDE_PLUGIN_ROOT}/templates/hooks/process/.sh` — plugin-owned via the existing `${CLAUDE_PLUGIN_ROOT}` substitution pattern (confirmed working per `docs/skill-anatomy.md` and STE-133's commit-msg precedent). + +**Why this layer (not session-wide bundled).** STE-262, STE-270, and STE-276 each rejected bundling PreToolUse hooks in the plugin's auto-active `hooks/hooks.json` on three grounds: (1) session-wide spawn blast radius affecting every plugin user including downstream consumers, (2) conflicts with the triple-check posture, (3) no clean per-session state surface for the hook API. The `/setup`-installed layer is the layer those rejections explicitly opened up: opt-in per project, user-controlled, blast radius bounded to the project the user explicitly bootstrapped. + +**Why Process-only.** Per `/brainstorm` 2026-05-13, the operator's primary motivation is "force Claude Code to obey the plugin rules". Safety hooks (e.g., block destructive bash) and Quality hooks (e.g., format-on-write) are explicitly out of scope — not deferred-with-follow-up, just out. + +## Acceptance Criteria + +- AC-STE-285.1: `/setup` adds a new step (after stack detection, before the final summary report) that fires a single `AskUserQuestion` with multi-select options. Each option is a named toolkit-contract enforcement hook. All options default to off. User picks zero or more. + - verify: `bun test plugins/dev-process-toolkit/skills/setup/__tests__/hooks_menu_prompt.test.ts` PASSES — fixture invokes `/setup`; asserts the menu Q fires with ≥ 4 options, all defaulting off. +- AC-STE-285.2: At least four seeded Process-category hooks ship with the plugin under `plugins/dev-process-toolkit/templates/hooks/process/`: + - `pre-commit-gate-check.sh` — PreToolUse Bash:`git commit*` → require a `Skill(/dev-process-toolkit:gate-check)` tool_use in the current session log; refuse with NFR-10 shape on miss. 
+ - `pre-pr-spec-review.sh` — PreToolUse Bash:`gh pr create*` → require a `Skill(/dev-process-toolkit:spec-review)` tool_use in current session. + - `pre-spec-write-brainstorm-reminder.sh` — UserPromptSubmit on `/dev-process-toolkit:spec-write` invocation → if no `Skill(/dev-process-toolkit:brainstorm)` tool_use in current session AND the FR appears greenfield (heuristic: no resolved tracker ID arg), inject a stderr reminder to consider `/brainstorm` first. + - `pre-commit-tdd-orchestrator.sh` — PreToolUse Bash:`git commit*` → if FR-related files staged (specs/frs/.md or matching test files), require a `Skill(/dev-process-toolkit:tdd)` tool_use in current session; refuse with NFR-10 shape on miss. **Byte-checkable continuation of STE-283's TDD Orchestrator Contract.** + - verify: `test -f` for each script; each has shebang + `chmod +x`; per-hook unit tests in `plugins/dev-process-toolkit/templates/hooks/__tests__/` cover happy + miss paths. +- AC-STE-285.3: Selected hooks write into the user's `.claude/settings.json` via key-level merge (existing entries preserved). Each entry uses exec form (per Claude Code docs recommendation for paths): `"command": "bash"`, `"args": ["${CLAUDE_PLUGIN_ROOT}/templates/hooks/process/.sh"]`. Conflict resolution: if a key with the same matcher + command already exists, no-op (idempotent re-run); if matcher matches but command differs, diff + prompt per STE-133. + - verify: `bun test plugins/dev-process-toolkit/skills/setup/__tests__/hooks_merge_settings.test.ts` PASSES — fixtures cover empty settings.json, settings with unrelated hooks (preserved), settings with same matcher + identical command (no-op), settings with same matcher + different command (diff + prompt). +- AC-STE-285.4: `/setup`'s final summary report includes a `hooks_installed` capability row listing each installed hook by name (or `hooks_skipped` if user picked none). 
Row text per the static plain-language map: + - `hooks_installed` — `Installed N opt-in toolkit-contract enforcement hook(s): — toggle off any hook by editing .claude/settings.json` + - `hooks_skipped` — `User declined opt-in hooks during /setup — run /setup --hooks to reconsider, or edit .claude/settings.json manually` + - verify: `grep -F "hooks_installed" plugins/dev-process-toolkit/skills/setup/SKILL.md` ≥ 1; same for `hooks_skipped`. +- AC-STE-285.5: New `/setup --hooks` flag re-runs only the hooks step (skips stack detection + CLAUDE.md generation). Idempotent: re-running on a project with already-installed hooks shows the menu pre-checked for installed hooks, allowing toggle off or addition. + - verify: `bun test plugins/dev-process-toolkit/skills/setup/__tests__/setup_hooks_flag.test.ts` PASSES — covers idempotent re-run on pre-installed project + adding a new hook + toggling off an existing. +- AC-STE-285.6: STE-262/STE-270/STE-276 cancellation chain explicitly cited in this FR's Notes section under "Why not session-wide bundled" — preserves the decision chain so future agents reading this FR understand the layer choice. + - verify: `grep -E "STE-262|STE-270|STE-276" specs/frs/STE-285.md` returns ≥ 3 distinct refs in Notes section. +- AC-STE-285.7: New `plugins/dev-process-toolkit/docs/hooks-reference.md` enumerates each seeded hook with: name, event, matcher, requirement, NFR-10 refusal shape on miss, override pattern (how to snapshot-copy + edit). Provides the "user manual" for the installed hooks. + - verify: `test -f plugins/dev-process-toolkit/docs/hooks-reference.md`; `grep -cE "^### " plugins/dev-process-toolkit/docs/hooks-reference.md` ≥ 4 (one section per seeded hook). + +## Technical Design + +### Files touched + +- `plugins/dev-process-toolkit/skills/setup/SKILL.md` — add a new step (after stack detection, before final summary) for the hooks menu prompt; add `--hooks` flag handling; update § Report capability map. 
+- `plugins/dev-process-toolkit/templates/hooks/process/pre-commit-gate-check.sh` — new file (AC.2). +- `plugins/dev-process-toolkit/templates/hooks/process/pre-pr-spec-review.sh` — new file (AC.2). +- `plugins/dev-process-toolkit/templates/hooks/process/pre-spec-write-brainstorm-reminder.sh` — new file (AC.2). +- `plugins/dev-process-toolkit/templates/hooks/process/pre-commit-tdd-orchestrator.sh` — new file (AC.2). +- `plugins/dev-process-toolkit/templates/hooks/_lib/session.sh` — shared helper (reads `$CLAUDE_SESSION_FILE`; greps for required `Skill` tool_use; isolates env-var dependency). +- `plugins/dev-process-toolkit/skills/setup/install_hooks.ts` — new helper: writes hook entries into user's `.claude/settings.json` with key-level merge, idempotency, diff-and-prompt on conflict per STE-133. +- `plugins/dev-process-toolkit/docs/hooks-reference.md` — new file (AC.7). +- Test files (per-AC + per-hook unit tests). + +### Hook session-log detection + +The Process-category hooks all need to read the current Claude Code session log to detect whether a particular `Skill` tool_use has fired. Claude Code exposes session JSON via `$CLAUDE_SESSION_FILE` (or equivalent env var) during hook execution. Each hook script: + +1. Reads `$CLAUDE_SESSION_FILE` (JSONL stream of tool calls). +2. Greps for the required `Skill` tool_use shape (e.g., `"name":"Skill"`, `"input":{"skill":"dev-process-toolkit:gate-check"}`). +3. Exits 0 on hit; exits 2 (the exit code Claude Code treats as blocking; other non-zero codes are reported but do not block the tool call) + NFR-10-shape stderr on miss. + +If `$CLAUDE_SESSION_FILE` is unset (e.g., hook invoked outside a Claude Code session, or fresh session with no log yet), the hook exits 0 (fail-open — don't block legitimate non-Claude commits). The session detection is isolated in `templates/hooks/_lib/session.sh` so the env-var dependency is one-file. 
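The detection steps above can be sketched as a pure decision function, which also makes the fail-open rule easy to unit-test without a live session. This is a minimal sketch under stated assumptions: the transcript line shape is taken from the example in this section rather than from a documented format, and `containsSkillUse` / `decide` are hypothetical names, not the contents of `_lib/session.sh`.

```typescript
// Sketch only: the entry shape ("name":"Skill", "input":{"skill":"..."}) is an
// assumption lifted from the example in this spec, not a documented format.
type HookDecision = "allow" | "refuse";

// Recursively look for an object shaped like a Skill tool_use for `skill`.
function containsSkillUse(node: unknown, skill: string): boolean {
  if (Array.isArray(node)) {
    return node.some((child) => containsSkillUse(child, skill));
  }
  if (node === null || typeof node !== "object") {
    return false;
  }
  const record = node as Record<string, unknown>;
  const input = record["input"];
  if (
    record["name"] === "Skill" &&
    input !== null &&
    typeof input === "object" &&
    (input as Record<string, unknown>)["skill"] === skill
  ) {
    return true;
  }
  return Object.values(record).some((child) => containsSkillUse(child, skill));
}

// `sessionLog` is the JSONL text of the session, or undefined when no log is
// available. Missing and empty logs both fail open.
function decide(sessionLog: string | undefined, skill: string): HookDecision {
  if (sessionLog === undefined || sessionLog.trim() === "") {
    return "allow"; // fail-open: do not block commits made outside a session
  }
  for (const line of sessionLog.split("\n")) {
    if (line.trim() === "") continue;
    let entry: unknown;
    try {
      entry = JSON.parse(line);
    } catch {
      continue; // skip malformed lines rather than crash the hook
    }
    if (containsSkillUse(entry, skill)) {
      return "allow";
    }
  }
  return "refuse";
}
```

Returning a decision rather than an exit code keeps the sketch independent of how the shell wrapper maps "refuse" onto a process exit status.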
+ +### Settings.json merge format + +Each seeded hook installs as: + +```json +{ + "hooks": { + "PreToolUse": [ + { + "matcher": "Bash", + "hooks": [ + { + "type": "command", + "command": "bash", + "args": ["${CLAUDE_PLUGIN_ROOT}/templates/hooks/process/.sh"], + "timeout": 5000 + } + ] + } + ] + } +} +``` + +The matcher is event-specific (e.g., `"Bash"` for PreToolUse on git commit, `"*"` for UserPromptSubmit). Merge logic preserves existing entries; appends new ones; refuses to overwrite if same-event same-matcher same-command exists (idempotent). + +### Out of scope + +- Quality category hooks (format-on-write, lint, etc.) — operator decision per `/brainstorm` 2026-05-13. +- Safety category hooks (destructive-op blocks, secret-edit blocks) — operator decision per `/brainstorm` 2026-05-13. +- Bundled (auto-active) hooks in plugin's `hooks/hooks.json` — explicitly rejected per STE-262/STE-270/STE-276 cancellation chain. +- Manifest-driven extensibility (Approach 3 from `/brainstorm`) — deferred; first version uses fixed bundles per `/brainstorm`'s Approach 1 pick. +- Per-hook override via project-side snapshot — documented as a manual workflow in `docs/hooks-reference.md`; no `/setup` support. +- Cross-platform shell coverage beyond bash/zsh — hook scripts are `.sh`, assume POSIX shell. PowerShell variants deferred. + +## Testing + +Test strategy: unit tests per hook script (mocked `$CLAUDE_SESSION_FILE`) + integration tests for `/setup`'s menu prompt + merge helper + smoke test for the canonical reproduction (run `/setup` with hooks chosen, then attempt `git commit` without `/gate-check` → should refuse). 
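The four fixture cases named in AC-STE-285.3 (empty settings, unrelated hooks preserved, identical entry, same matcher with a different command) can be sketched as a pure merge function. This is a minimal sketch, not the shipped `install_hooks.ts`: the type names and `mergeHook` are hypothetical, and the conflict branch only reports the conflict, leaving the STE-133 diff + prompt to the caller.

```typescript
// Sketch only: hypothetical types and helper, assumed for illustration.
interface HookCommand {
  type: "command";
  command: string;
  args?: string[];
  timeout?: number;
}
interface MatcherGroup {
  matcher: string;
  hooks: HookCommand[];
}
interface Settings {
  hooks?: Record<string, MatcherGroup[]>;
  [key: string]: unknown; // unrelated keys must survive the merge
}
type MergeOutcome = "added" | "noop" | "conflict";

// Two entries are "the same command" when command and args both match.
function sameCommand(a: HookCommand, b: HookCommand): boolean {
  return (
    a.command === b.command &&
    JSON.stringify(a.args ?? []) === JSON.stringify(b.args ?? [])
  );
}

function mergeHook(
  settings: Settings,
  event: string,
  matcher: string,
  hook: HookCommand,
): { settings: Settings; outcome: MergeOutcome } {
  // Deep-copy so the caller's object is never mutated.
  const next = JSON.parse(JSON.stringify(settings)) as Settings;
  const groups = next.hooks?.[event] ?? [];
  const group = groups.find((g) => g.matcher === matcher);
  if (group === undefined) {
    // No entry for this event + matcher yet: append, preserving the rest.
    next.hooks = {
      ...(next.hooks ?? {}),
      [event]: [...groups, { matcher, hooks: [hook] }],
    };
    return { settings: next, outcome: "added" };
  }
  if (group.hooks.some((h) => sameCommand(h, hook))) {
    return { settings: next, outcome: "noop" }; // idempotent re-run
  }
  // Same matcher, different command: report only; the caller diffs + prompts.
  return { settings: next, outcome: "conflict" };
}
```

Because several seeded hooks share the `Bash` matcher, a literal reading of the conflict rule would flag sibling toolkit hooks against each other; how the real helper tells them apart is left open here.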
+ +### Test files + +- `plugins/dev-process-toolkit/skills/setup/__tests__/hooks_menu_prompt.test.ts` (AC.1) +- `plugins/dev-process-toolkit/templates/hooks/__tests__/pre-commit-gate-check.test.ts` (AC.2) +- `plugins/dev-process-toolkit/templates/hooks/__tests__/pre-pr-spec-review.test.ts` (AC.2) +- `plugins/dev-process-toolkit/templates/hooks/__tests__/pre-spec-write-brainstorm-reminder.test.ts` (AC.2) +- `plugins/dev-process-toolkit/templates/hooks/__tests__/pre-commit-tdd-orchestrator.test.ts` (AC.2) +- `plugins/dev-process-toolkit/skills/setup/__tests__/hooks_merge_settings.test.ts` (AC.3) +- `plugins/dev-process-toolkit/skills/setup/__tests__/setup_hooks_flag.test.ts` (AC.5) +- `plugins/dev-process-toolkit/skills/setup/__tests__/setup_hooks_smoke.test.ts` (smoke: end-to-end install + commit-block reproduction) + +### Risks (testing-side) + +- `$CLAUDE_SESSION_FILE` exposure is Claude Code internal; if its name or format changes between Claude Code versions, hooks break. Mitigated by the shared `_lib/session.sh` helper that's referenced by all hook scripts — changing the env var name affects one file. +- Heuristic for "FR-related files staged" in `pre-commit-tdd-orchestrator.sh` is fuzzy. Risk: false positives (block legitimate commits) or false negatives (let `/tdd`-bypass through). Mitigated by AC.2's verify line + per-hook unit test covering staged-FR + staged-non-FR + mixed cases. 
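The staged-FR / staged-non-FR / mixed cases mentioned above can be made concrete with a small classifier. This is a sketch under assumptions: the spec does not define "matching test files", so the `__tests__/` directory and `.test.ts` suffix patterns below are guesses, and `isFrRelated` / `classifyStaged` are hypothetical names.

```typescript
type StagedKind = "fr" | "non-fr" | "mixed";

// Assumed patterns, for illustration only: an FR spec directly under
// specs/frs/, or a file that looks like a test.
const FR_SPEC = /^specs\/frs\/[^/]+\.md$/;
const TEST_FILE = /(^|\/)__tests__\/|\.test\.ts$/;

function isFrRelated(path: string): boolean {
  return FR_SPEC.test(path) || TEST_FILE.test(path);
}

// Classify a list of staged paths (e.g. from `git diff --cached --name-only`).
function classifyStaged(paths: string[]): StagedKind {
  const related = paths.filter(isFrRelated).length;
  if (related === 0) return "non-fr"; // also covers an empty staging area
  return related === paths.length ? "fr" : "mixed";
}
```

A hook following the AC-STE-285.2 wording would presumably require the `/tdd` evidence for both `"fr"` and `"mixed"`; keeping the three outcomes distinct mainly helps the unit tests name which case regressed.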
+ +## Risks + +| Category | Risk | Severity | Mitigation | +|----------|------|----------|------------| +| External deps | `$CLAUDE_SESSION_FILE` format is Claude Code internal; format change breaks every hook | medium | shared `_lib/session.sh` helper; per-hook unit tests catch on version bump | +| Heuristic precision | `pre-commit-tdd-orchestrator.sh`'s "FR-related staged" detection is fuzzy | medium | per-hook unit test covers staged-FR + non-FR + mixed cases | +| Fail-open trade | Empty session log → hooks pass; could let bypass-cases through | medium | accept; fail-closed would block non-Claude commits which is unacceptable; opt-in nature means user consents | +| Plugin uninstall | Hooks reference `${CLAUDE_PLUGIN_ROOT}`; uninstall breaks them | low | `docs/hooks-reference.md` documents the cleanup; `/setup --hooks --uninstall` flag could automate (deferred) | +| User override clunky | Project that wants to extend a hook must snapshot-copy + switch settings.json reference | low | documented manual workflow in hooks-reference.md | + +## Notes + +- **Honored Contracts byte-checkable layer.** STE-283 (M71) shipped the prose layer; this FR is the byte-checkable opt-in layer at the user's project scope. Same contracts (TDD Orchestrator, Pattern 26 first-turn, `/gate-check` before `/pr`, `/spec-review` before `/pr`), harder enforcement. +- **Why not session-wide bundled.** STE-262, STE-270, STE-276 each rejected bundling PreToolUse hooks in the plugin's `hooks/hooks.json` on three grounds: (1) session-wide spawn blast radius affecting every plugin user including downstream consumers, (2) conflicts with the triple-check posture, (3) no clean per-session state surface for the hook API. 
This FR is the `/setup`-opt-in layer those rejections explicitly opened up — bounded blast radius (per-project, user consents), no triple-check conflict (each install is a one-time user decision), and the per-session state surface (`$CLAUDE_SESSION_FILE`) is acceptable for opt-in scope even if it would have been unreliable for plugin-wide auto-active hooks. +- **Process-only by design.** Quality + Safety explicitly out of scope per `/brainstorm` 2026-05-13. Re-opening either category requires a new `/brainstorm` + FR. +- **Update propagation.** Plugin updates the hook scripts → user's next session picks them up (no `/setup` re-run needed). Plugin-owned via `${CLAUDE_PLUGIN_ROOT}` substitution; same pattern as STE-133's commit-msg hook. +- **STE-283 → STE-285 relationship.** STE-283 prose + STE-285 hooks together form the M71 Honored Contracts enforcement bundle. Independent ship gates — STE-285 doesn't depend on STE-283 shipping first, but the `docs/honored-contracts.md` catalog from STE-283 cross-references the hook IDs from STE-285. If both ship in M71's release commit, the cross-reference is byte-clean. +- **Brainstorm session:** 2026-05-13 (Approach 1 = single-prompt all-off menu; Process-only after refinement from "all three categories"; M71 milestone fit). diff --git a/specs/plan/M71.md b/specs/plan/M71.md index aa98018..c4c4fa8 100644 --- a/specs/plan/M71.md +++ b/specs/plan/M71.md @@ -11,23 +11,31 @@ Codename: TBD ## Summary -Sharpened-prose enforcement for cross-skill technique mandates in the toolkit, anchored on `/implement → /tdd` as the first instance. Adds a labeled **TDD Orchestrator Contract** callout at `/implement` Phase 2 step 8, an inline **Rationalization Prevention table** preempting documented excuses, and a new `plugins/dev-process-toolkit/docs/honored-contracts.md` catalog enumerating every cross-skill mandate under a uniform shape (Mandate / Violation / Evidence / Precedent FRs). 
+Honored Contracts enforcement for cross-skill technique mandates in the toolkit, anchored on `/implement → /tdd` as the first instance. Ships in **two layers** — prose (STE-283) + byte-checkable opt-in hooks via `/setup` (STE-285) — that together form a complete enforcement bundle for the same set of contracts. -Intentionally separated from M70's runtime-emission theme (STE-280/281/282). This layer is prose-only and operator-acknowledged as falsifiable per the STE-220→STE-270 precedent chain (6 FRs of prose-only falsification). Escalation path on falsification: evidence-based gate (STE-262/STE-270 pattern) or hard mechanic (STE-225 pattern), both explicitly out of scope here. +**STE-283 (prose layer):** labeled **TDD Orchestrator Contract** callout at `/implement` Phase 2 step 8, inline **Rationalization Prevention table**, new `plugins/dev-process-toolkit/docs/honored-contracts.md` catalog enumerating every cross-skill mandate under a uniform shape (Mandate / Violation / Evidence / Precedent FRs). Operator-acknowledged risk: prose alone gets falsified per the STE-220→STE-270 6-FR chain. -Triggered by: 2026-05-13 cross-project finding — `/implement` Phase 2 step 8's mandate to invoke `/dev-process-toolkit:tdd ` per FR got compressed into inline TDD in the main context (no forked subagents, no `tdd-result` audit trail). See `/brainstorm` 2026-05-13 for the design session (spec-research seed: STE-220, STE-262, STE-270; design choice: maximalist 1+2+3, sharpened-prose-only). +**STE-285 (byte-checkable opt-in layer):** `/setup` offers a multi-select menu of toolkit-contract enforcement hooks (all defaulted off); selected entries write to user's `.claude/settings.json` referencing `${CLAUDE_PLUGIN_ROOT}/templates/hooks/process/.sh`. This is the layer STE-262/STE-270/STE-276 explicitly opened up after rejecting bundled session-wide PreToolUse hooks — opt-in per project, bounded blast radius. Process-only; Safety + Quality explicitly out of scope. 
+ +Intentionally separated from M70's runtime-emission theme (STE-280/281/282). Triggered by: 2026-05-13 cross-project finding — `/implement` Phase 2 step 8's mandate to invoke `/dev-process-toolkit:tdd ` per FR got compressed into inline TDD in the main context. See `/brainstorm` 2026-05-13 for the two design sessions (STE-283: maximalist sharpened-prose; STE-285: single-prompt all-off menu, Process-only). ## In Scope - [ ] STE-283 — Honored Contracts enforcement: /implement → /tdd Contract block + Rationalization table + catalog - verify: 7 ACs covering Contract block presence, Rationalization table, catalog file existence + entries, precedent FR citations, cross-references, residual-risk acknowledgement; tests in `plugins/dev-process-toolkit/tests/honored-contracts-implement-tdd.test.ts` +- [ ] STE-285 — Honored Contracts byte-checkable layer: opt-in toolkit-contract enforcement hooks via /setup + - verify: 7 ACs covering /setup menu prompt, ≥ 4 seeded process hooks (pre-commit-gate-check, pre-pr-spec-review, pre-spec-write-brainstorm-reminder, pre-commit-tdd-orchestrator), settings.json merge semantics (STE-133-style idempotent + diff-and-prompt), capability rows, `/setup --hooks` flag, STE-262/270/276 supersession framing, new `docs/hooks-reference.md`; tests in `plugins/dev-process-toolkit/{skills/setup,templates/hooks}/__tests__/`. ## Out of Scope (deferred) -- `/gate-check` probe enforcement (deferred — operator-acknowledged escalation path on falsification). -- Runtime byte-grep enforcement (deferred — same). -- `/implement` skill mechanic change (deferred — same). -- Tracker ↔ local FR/milestone sync FR (parked as future-allocated candidate after this FR ships; thematic fit is M70, not M71) — the meta-bug behind the 2026-05-13 partial-scan trap where I missed M70/STE-280/281/282 on Linear because their local FR files lived on an unmerged branch. 
+- `/gate-check` probe enforcement of TDD Orchestrator Contract — deferred per STE-283's operator-acknowledged escalation path on falsification; STE-285 covers the user-opt-in hook layer instead. +- Runtime byte-grep enforcement of the prose contract at the toolkit-internal level — deferred (escalation path on STE-283 falsification). +- `/implement` skill mechanic change (STE-225-style first-action contract for Phase 2 step 8) — deferred (final escalation if both prose + opt-in hooks falsify). +- Quality category hooks (format-on-write, lint) — explicitly out of M71 scope per `/brainstorm` 2026-05-13; not deferred, just out. +- Safety category hooks (destructive-op blocks, secret-edit blocks) — explicitly out of M71 scope per `/brainstorm` 2026-05-13; not deferred, just out. +- Bundled (auto-active) hooks in plugin's `hooks/hooks.json` — explicitly rejected per STE-262/STE-270/STE-276 cancellation chain. +- Manifest-driven extensibility (Approach 3 from STE-285's `/brainstorm`) — deferred; first version uses fixed bundles. +- Tracker ↔ local FR/milestone sync FR (parked as STE-284 in M70, not M71) — the meta-bug behind the 2026-05-13 partial-scan trap. ## Risks From cc5a04a71435d12f6cd25458085affd970b4ac96 Mon Sep 17 00:00:00 2001 From: nesquikm Date: Wed, 13 May 2026 18:34:01 +0400 Subject: [PATCH 04/37] feat(skills/implement): TDD Orchestrator Contract prose layer MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Add labeled "TDD Orchestrator Contract" callout + Rationalization Prevention table at the top of /implement Phase 2 step 8; cite docs/honored-contracts.md catalog (3 seeded entries: /implement → /tdd, /spec-write → spec-research, /brainstorm → AskUserQuestion-first). Prose layer of M71 Honored Contracts enforcement bundle; operator-acknowledged residual risk vs. STE-220→STE-270 6-FR falsification chain (escalation path: evidence-based gate / hard mechanic). 
Refs: STE-283 --- .../docs/honored-contracts.md | 40 +++ .../skills/implement/SKILL.md | 14 +- .../honored-contracts-implement-tdd.test.ts | 239 ++++++++++++++++++ 3 files changed, 292 insertions(+), 1 deletion(-) create mode 100644 plugins/dev-process-toolkit/docs/honored-contracts.md create mode 100644 plugins/dev-process-toolkit/tests/honored-contracts-implement-tdd.test.ts diff --git a/plugins/dev-process-toolkit/docs/honored-contracts.md b/plugins/dev-process-toolkit/docs/honored-contracts.md new file mode 100644 index 0000000..201d131 --- /dev/null +++ b/plugins/dev-process-toolkit/docs/honored-contracts.md @@ -0,0 +1,40 @@ +# Honored Contracts + +This catalog records **contracts between skills** that have been hardened with named-violation enforcement after at least one observed regression. Each entry follows a uniform four-label shape so the catalog stays byte-checkable and additions stay cheap: + +- **Mandate.** What the caller skill is required to do — phrased as a non-negotiable, not a guideline. +- **Violation name.** A short, search-grep-able label for the antipattern. The name is load-bearing: it lets reviewers, gate checks, and future spec writers cite the failure mode in one token. +- **Auditable evidence.** The byte-checkable shape that proves the contract was honored on a given run. Usually a `tool_use` pattern, a file presence/content check, or a log marker. +- **Precedent FRs.** STE refs (orchestrator + falsification chain) that established the contract or hardened it after a documented breach. + +The catalog is intentionally short. An entry earns its place only after a contract has been broken in practice and the fix required prose-level reinforcement on top of any mechanical guard. + +## /implement → /tdd + +**Mandate.** Inside `/implement` Phase 2, the build loop MUST delegate every FR's RED → GREEN → REFACTOR cycle to the `/dev-process-toolkit:tdd` multi-agent orchestrator. 
The parent `/implement` context does not write tests, write implementation code, or run the refactor pass inline — those passes belong to the three forked TDD subagents (test-writer / implementer / refactorer) under the orchestrator's bounded-retry budget. + +**Violation name.** Inline TDD Antipattern — `/implement` performing TDD in its own context instead of forking `/dev-process-toolkit:tdd` once per FR. + +**Auditable evidence.** N `Skill(/dev-process-toolkit:tdd )` `tool_use` entries in the `/implement` run transcript, where N = FR count in the milestone scope. Zero such entries with non-zero FRs implemented is the canonical signature of the violation. + +**Precedent FRs.** STE-225 (multi-agent orchestrator that made the fork the mechanic), STE-220 (anchor of the prose-falsification chain), STE-226, STE-237, STE-251, STE-262, STE-270 (six prose hardenings that compounded after the contract was breached despite the mechanic existing). + +## /spec-write → spec-research + +**Mandate.** `/spec-write` MUST fork the internal `spec-research` subagent to gather related FRs before drafting a new spec. The parent context does not perform related-FR retrieval inline; the forked subagent returns a bounded (≤ 25-line) related-FR block that the spec writer cites. + +**Violation name.** Inline Spec Research — `/spec-write` searching the FR archive itself instead of forking `spec-research` for the related-FR retrieval pass. + +**Auditable evidence.** At least one `Skill(/dev-process-toolkit:spec-research ...)` `tool_use` entry in the `/spec-write` run transcript before the new FR file is drafted, plus a related-FRs block of ≤ 25 lines in the resulting spec. + +**Precedent FRs.** STE-230 (introduced the `spec-research` fork as the related-FR retrieval mechanic for `/spec-write` and `/brainstorm`). 
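The auditable-evidence shape for `/implement → /tdd` above reduces to a count over the run transcript, which a reviewer could sketch as follows. This is an illustrative sketch only: the flattened `ToolUse` shape and the `auditTddContract` helper are hypothetical, and strict equality with the FR count is one reading of "N = FR count" that ignores any repeat invocations.

```typescript
// Hypothetical flattened view of one transcript entry.
interface ToolUse {
  name: string;
  skill?: string;
}

interface ContractAudit {
  invocations: number;
  honored: boolean;
  inlineTddSignature: boolean;
}

const TDD_SKILL = "dev-process-toolkit:tdd";

function auditTddContract(transcript: ToolUse[], frCount: number): ContractAudit {
  // Count Skill tool_use entries for the orchestrator, tolerating a leading slash.
  const invocations = transcript.filter(
    (entry) =>
      entry.name === "Skill" &&
      (entry.skill ?? "").replace(/^\//, "") === TDD_SKILL,
  ).length;
  return {
    invocations,
    // One orchestrator invocation per FR in milestone scope.
    honored: invocations === frCount,
    // Zero invocations with non-zero FRs is the Inline TDD Antipattern signature.
    inlineTddSignature: invocations === 0 && frCount > 0,
  };
}
```

The same counting approach would cover the `/spec-write → spec-research` entry by swapping the skill name and requiring at least one match.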
+ +## /brainstorm → AskUserQuestion-first + +**Mandate.** `/brainstorm` MUST drive its Socratic clarification loop with the `AskUserQuestion` tool one question at a time, before proposing any solution sketches. Free-form narrative questions in the parent prose channel do not count — the tool call is the contract. + +**Violation name.** Narrative Clarification — `/brainstorm` asking clarifying questions in prose instead of via `AskUserQuestion` tool calls. + +**Auditable evidence.** A run of `AskUserQuestion` `tool_use` entries in the `/brainstorm` transcript prior to the first solution-sketch turn. Zero such entries with a delivered solution sketch is the canonical signature of the violation. + +**Precedent FRs.** STE-237 (mandated `AskUserQuestion`-first as a hard mechanic on `/brainstorm` after a documented inline-prose-questions regression). diff --git a/plugins/dev-process-toolkit/skills/implement/SKILL.md b/plugins/dev-process-toolkit/skills/implement/SKILL.md index 1d67ecd..3450d8c 100644 --- a/plugins/dev-process-toolkit/skills/implement/SKILL.md +++ b/plugins/dev-process-toolkit/skills/implement/SKILL.md @@ -77,7 +77,19 @@ If a multi-milestone worktree run partially succeeds, list completed work (miles ## Phase 2: Build (TDD) -8. **Execute in TDD order via the multi-agent orchestrator** — invoke `/dev-process-toolkit:tdd ` inline (no separate opt-in path). Per STE-225, the orchestrator runs three forked-subagent stages (test-writer once per FR with the full AC list batched; implementer once per AC; refactorer once at end after all GREEN) with `context: fork` isolation, a strict `tdd-result` fenced-block hand-off contract, and bounded retry (max 2 per role per AC for semantic failures A/B/C/E; single targeted retry for format violation D). Per-stage isolation enforces the test-writer-cannot-see-implementation guarantee deterministically. 
The orchestrator's halt path **does** pause for the operator — that's intentional, not a pacing violation: halt fires only after the bounded-retry cap is exhausted, so it surfaces a real failure. Routine cycles (no retries) run end-to-end without operator interaction. Follow project patterns from CLAUDE.md. +8. **Execute in TDD order via the multi-agent orchestrator** — + + > **TDD Orchestrator Contract.** Violation name: **Inline TDD Antipattern** (writing tests + code in the parent `/implement` context instead of forking the orchestrator). Auditable evidence shape: **N `Skill(/dev-process-toolkit:tdd )` `tool_use` entries where N = FR count in milestone scope** — one orchestrator invocation per FR, no inlined RED→GREEN→REFACTOR in the parent transcript. Residual-risk note: the STE-220→STE-270 prose-falsification chain shows prose alone is falsifiable; the documented escalation path on repeat violation is an **evidence-based gate** (STE-262 / STE-270 pattern) or a **hard mechanic** (STE-225 pattern). Catalog: `docs/honored-contracts.md`. + + **Rationalization Prevention.** The following rationalizations are documented antipatterns — each is preempted here so they cannot be invoked as waivers: + + | Excuse | Reality | + |--------|---------| + | Milestone spans N FRs / many ACs — orchestrator cost is too high | Cost is not a contract waiver; orchestrator-per-FR IS the milestone-scope pattern (STE-225). | + | `/implement M` milestone-scope has no clear "use the orchestrator N times" pattern | N-times IS the pattern: one `Skill(/dev-process-toolkit:tdd )` `tool_use` per FR in scope — N invocations where N = FR count. | + | Prioritized shipping over process fidelity | Process fidelity IS the ship gate, not its competitor; an FR shipped via inline TDD has not shipped through the contract. | + + invoke `/dev-process-toolkit:tdd ` inline (no separate opt-in path). 
Per STE-225, the orchestrator runs three forked-subagent stages (test-writer once per FR with the full AC list batched; implementer once per AC; refactorer once at end after all GREEN) with `context: fork` isolation, a strict `tdd-result` fenced-block hand-off contract, and bounded retry (max 2 per role per AC for semantic failures A/B/C/E; single targeted retry for format violation D). Per-stage isolation enforces the test-writer-cannot-see-implementation guarantee deterministically. The orchestrator's halt path **does** pause for the operator — that's intentional, not a pacing violation: halt fires only after the bounded-retry cap is exhausted, so it surfaces a real failure. Routine cycles (no retries) run end-to-end without operator interaction. Follow project patterns from CLAUDE.md. 9. **Spec deviation check** — If reality contradicts the spec, STOP coding forward and classify: `underspecified` (backfill + test + continue), `ambiguous` (provisional decision + user confirm at Phase 4), `contradicts` (wait for user decision), `infeasible` (wait). Always backfill edge cases to `specs/requirements.md` / `specs/technical-spec.md` plus a test. Full playbook: `docs/implement-reference.md` § Spec Deviation Check. diff --git a/plugins/dev-process-toolkit/tests/honored-contracts-implement-tdd.test.ts b/plugins/dev-process-toolkit/tests/honored-contracts-implement-tdd.test.ts new file mode 100644 index 0000000..77a9714 --- /dev/null +++ b/plugins/dev-process-toolkit/tests/honored-contracts-implement-tdd.test.ts @@ -0,0 +1,239 @@ +// Doc-conformance tests for STE-283. +// +// "Honored Contracts enforcement — /implement → /tdd Contract block + +// Rationalization table + catalog". Three layered prose edits reinforce each +// other: +// +// 1. 
A labeled **TDD Orchestrator Contract** callout at /implement Phase 2 +// step 8 (named violation: "Inline TDD Antipattern", auditable evidence +// shape, residual-risk note citing STE-220→STE-270 + escalation path, +// cross-reference to the catalog file). +// 2. An inline **Rationalization Prevention table** (Excuse | Reality) with +// ≥ 3 rows preempting cost / no-N-times-pattern / shipping-over-fidelity. +// 3. A new `plugins/dev-process-toolkit/docs/honored-contracts.md` catalog +// with ≥ 3 seeded entries (`/implement → /tdd` primary, +// `/spec-write → spec-research` precedent, `/brainstorm → +// AskUserQuestion-first` precedent) under a uniform four-label shape +// (Mandate / Violation name / Auditable evidence / Precedent FRs). +// +// One test per AC verify line. Assertions are over file content (prose-only +// FR — no runtime behavior to assert). + +import { describe, expect, test } from "bun:test"; +import { existsSync, readFileSync } from "node:fs"; +import { join } from "node:path"; + +const PLUGIN_ROOT = join(import.meta.dir, ".."); +const IMPLEMENT_SKILL_PATH = join(PLUGIN_ROOT, "skills", "implement", "SKILL.md"); +const CATALOG_PATH = join(PLUGIN_ROOT, "docs", "honored-contracts.md"); + +function readImplementSkill(): string { + if (!existsSync(IMPLEMENT_SKILL_PATH)) { + throw new Error(`/implement SKILL.md not found at ${IMPLEMENT_SKILL_PATH}`); + } + return readFileSync(IMPLEMENT_SKILL_PATH, "utf-8"); +} + +function readCatalog(): string { + if (!existsSync(CATALOG_PATH)) { + throw new Error(`honored-contracts.md not found at ${CATALOG_PATH}`); + } + return readFileSync(CATALOG_PATH, "utf-8"); +} + +/** + * Extract the prose region for /implement Phase 2 step 8 — the labeled + * "Execute in TDD order via the multi-agent orchestrator" item under + * `## Phase 2: Build (TDD)`. The Contract block lives at the start of this + * region per the FR's Technical Design. 
+ * + * Region runs from the `## Phase 2: Build (TDD)` header through the start of + * step 9 (`Spec deviation check`). This window is the AC's "Phase 2 step 8" + * scope; assertions narrower than the SKILL file as a whole guard against + * stray matches elsewhere in the file. + */ +function step8Region(body: string): string { + const phase2Idx = body.indexOf("## Phase 2: Build (TDD)"); + if (phase2Idx === -1) { + throw new Error("`## Phase 2: Build (TDD)` heading not found in /implement SKILL.md"); + } + // Step 9 begins with "9. **Spec deviation check**". Fall back to the next + // `### Spec Breakout` / `## ` heading if the step numbering ever shifts. + const step9Idx = body.indexOf("9. **Spec deviation check**", phase2Idx); + const fallbackIdx = (() => { + const a = body.indexOf("### Spec Breakout", phase2Idx); + const b = body.indexOf("\n## ", phase2Idx + 1); + const candidates = [step9Idx, a, b].filter((i) => i > phase2Idx); + if (candidates.length === 0) { + return body.length; + } + return Math.min(...candidates); + })(); + return body.slice(phase2Idx, fallbackIdx); +} + +describe("STE-283 AC.1 — /implement Phase 2 step 8 carries the TDD Orchestrator Contract callout", () => { + test("Contract block names the contract verbatim", () => { + const body = readImplementSkill(); + expect(body).toContain("TDD Orchestrator Contract"); + }); + + test('Contract block names the violation: "Inline TDD Antipattern"', () => { + const body = readImplementSkill(); + expect(body).toContain("Inline TDD Antipattern"); + }); + + test("Contract block + violation name appear inside Phase 2 step 8 (not elsewhere)", () => { + const body = readImplementSkill(); + const region = step8Region(body); + expect(region).toContain("TDD Orchestrator Contract"); + expect(region).toContain("Inline TDD Antipattern"); + }); + + test("Contract block states the auditable evidence shape (N tool_use entries = FR count in milestone scope)", () => { + const region = step8Region(readImplementSkill()); 
+ // The FR's AC.1 spells out the canonical evidence shape: + // "N Skill(/dev-process-toolkit:tdd ) tool_use entries where + // N = FR count in milestone scope". + // We assert the canonical-name fragments rather than the exact prose so + // the wording can absorb minor phrasing adjustments without churn — the + // *substance* (orchestrator call, tool_use evidence, FR-count rule) is + // load-bearing. + expect(region).toMatch(/\/dev-process-toolkit:tdd/); + expect(region).toMatch(/tool_use/i); + expect(region).toMatch(/FR count|per FR|N\s*=\s*FR/i); + }); +}); + +describe("STE-283 AC.2 — Rationalization Prevention table (≥ 3 rows)", () => { + test("Excuse | Reality table header appears in /implement SKILL.md", () => { + const body = readImplementSkill(); + expect(body).toContain("| Excuse | Reality |"); + }); + + test("Excuse | Reality table sits inside or immediately after the Contract block in Phase 2 step 8", () => { + const region = step8Region(readImplementSkill()); + expect(region).toContain("| Excuse | Reality |"); + }); + + test("Table has ≥ 3 body rows preempting documented rationalizations (cost / N-times / shipping-over-fidelity)", () => { + const region = step8Region(readImplementSkill()); + const headerIdx = region.indexOf("| Excuse | Reality |"); + expect(headerIdx).toBeGreaterThan(-1); + // Walk past the header row + separator row, then count consecutive body + // rows (lines beginning with `|`). Stop at the first non-pipe line. + const afterHeader = region.slice(headerIdx).split("\n"); + // afterHeader[0] is the header row, afterHeader[1] is the separator + // (`|--------|---------|`). Body rows start at index 2. + let bodyRows = 0; + for (let i = 2; i < afterHeader.length; i++) { + const line = afterHeader[i]!; + if (line.trim().startsWith("|")) { + bodyRows++; + } else { + break; + } + } + expect(bodyRows).toBeGreaterThanOrEqual(3); + + // The three canonical rationalizations are explicitly named in the FR. 
+ // Each must be addressed somewhere in the table body. We match generous + // substrings — the rebuttal wording is the operator's call. + const tableBody = afterHeader.slice(2, 2 + bodyRows).join("\n"); + expect(tableBody.toLowerCase()).toMatch(/cost/); + expect(tableBody.toLowerCase()).toMatch(/n[- ]times/); + expect(tableBody.toLowerCase()).toMatch(/ship|fidelity/); + }); +}); + +describe("STE-283 AC.3 — honored-contracts.md catalog exists with uniform four-label shape", () => { + test("plugins/dev-process-toolkit/docs/honored-contracts.md exists", () => { + expect(existsSync(CATALOG_PATH)).toBe(true); + }); + + test("catalog has ≥ 12 four-label section headers (3 entries × 4 labels: Mandate / Violation name / Auditable evidence / Precedent FRs)", () => { + const body = readCatalog(); + // AC verify line uses the regex: ^\*\*(Mandate|Violation name|Auditable evidence|Precedent FRs). + // We replicate it line-by-line for clarity. + const lines = body.split("\n"); + const labelRe = /^\*\*(Mandate|Violation name|Auditable evidence|Precedent FRs)/; + const matches = lines.filter((l) => labelRe.test(l)); + expect(matches.length).toBeGreaterThanOrEqual(12); + }); + + test("catalog contains an entry for /implement → /tdd", () => { + const body = readCatalog(); + // The entry's heading or body must name the contract pair. We accept + // either ASCII arrow `->` or the unicode arrow `→` used elsewhere in the + // repo prose, paired with `/implement` and `/tdd`.
+ expect(body).toMatch(/\/implement[\s\S]{0,80}(→|->)[\s\S]{0,80}\/tdd/); + }); +}); + +describe("STE-283 AC.4 — /implement → /tdd catalog entry cites the STE-225 + STE-220→STE-270 chain", () => { + test("catalog cites STE-225 (orchestrator) and all six prose-falsification FRs (STE-220, 226, 237, 251, 262, 270)", () => { + const body = readCatalog(); + const required = [ + "STE-225", + "STE-220", + "STE-226", + "STE-237", + "STE-251", + "STE-262", + "STE-270", + ]; + // AC verify line: grep -E "STE-225|STE-220|STE-226|STE-237|STE-251|STE-262|STE-270" >= 7 distinct refs. + const distinctHits = new Set(); + for (const ref of required) { + if (body.includes(ref)) { + distinctHits.add(ref); + } + } + expect(distinctHits.size).toBeGreaterThanOrEqual(7); + }); +}); + +describe("STE-283 AC.5 — Contract block references the catalog file path", () => { + test("`docs/honored-contracts.md` is cited inside Phase 2 step 8", () => { + const region = step8Region(readImplementSkill()); + expect(region).toContain("docs/honored-contracts.md"); + }); +}); + +describe("STE-283 AC.6 — catalog seeds ≥ 3 contract entries under `## ` headings", () => { + test("catalog has ≥ 3 `## ` (level-2) headings", () => { + const body = readCatalog(); + const headings = body.split("\n").filter((l) => /^## /.test(l)); + expect(headings.length).toBeGreaterThanOrEqual(3); + }); + + test("catalog names the three seeded entries: /implement → /tdd, /spec-write → spec-research, /brainstorm → AskUserQuestion-first", () => { + const body = readCatalog(); + // /implement → /tdd + expect(body).toMatch(/\/implement[\s\S]{0,80}(→|->)[\s\S]{0,80}\/tdd/); + // /spec-write → spec-research (precedent — STE-230) + expect(body).toMatch(/\/spec-write[\s\S]{0,80}(→|->)[\s\S]{0,80}spec-research/); + // /brainstorm → AskUserQuestion-first (precedent — STE-237) + expect(body).toMatch(/\/brainstorm[\s\S]{0,80}(→|->)[\s\S]{0,80}AskUserQuestion/); + }); +}); + +describe("STE-283 AC.7 — Contract block carries a residual-risk 
note citing STE-220→STE-270 + escalation path", () => { + test("Phase 2 step 8 mentions falsification or escalation (residual-risk framing)", () => { + const region = step8Region(readImplementSkill()); + // AC verify line: grep -E "falsif|escalation" ... referencing STE-220. + expect(region).toMatch(/falsif|escalation/i); + }); + + test("residual-risk note references STE-220 (anchor of the prose-falsification chain)", () => { + const region = step8Region(readImplementSkill()); + expect(region).toContain("STE-220"); + }); + + test("residual-risk note names the documented escalation path (evidence-based gate / hard mechanic)", () => { + const region = step8Region(readImplementSkill()); + // The FR notes the escalation path as "evidence-based gate (STE-262/STE-270 + // pattern) or hard mechanic (STE-225 pattern)". Either phrase counts. + expect(region).toMatch(/evidence-based gate|hard mechanic/i); + }); +}); From 40a555d7211d9604042382eb23b0c5453f38d22d Mon Sep 17 00:00:00 2001 From: nesquikm Date: Wed, 13 May 2026 18:34:20 +0400 Subject: [PATCH 05/37] feat(skills/setup): opt-in byte-checkable contract hooks via /setup Byte-checkable layer of M71 Honored Contracts enforcement bundle. Adds: - /setup multi-select menu prompt step (all hooks default off) - --hooks flag for re-running only the hooks step (idempotent, pre-checked) - 4 seeded Process-category hooks under templates/hooks/process/ (pre-commit-gate-check, pre-pr-spec-review, pre-spec-write-brainstorm-reminder, pre-commit-tdd-orchestrator) - Shared _lib/session.sh helper (atomic single-line JSONL grep, NFR-10 emit) - install_hooks.ts settings.json merge helper (key-level merge, idempotent on same-matcher-same-command, conflict-surfaced on diff) - docs/hooks-reference.md user manual Per-project opt-in, bounded blast radius; supersedes STE-262/STE-270/STE-276 session-wide bundling rejections. 
Refs: STE-285 --- .../docs/hooks-reference.md | 100 ++++++ .../dev-process-toolkit/skills/setup/SKILL.md | 30 ++ .../setup/__tests__/hooks_menu_prompt.test.ts | 118 +++++++ .../__tests__/hooks_merge_settings.test.ts | 182 +++++++++++ .../setup/__tests__/setup_hooks_flag.test.ts | 144 +++++++++ .../skills/setup/install_hooks.ts | 292 ++++++++++++++++++ .../__tests__/pre-commit-gate-check.test.ts | 109 +++++++ .../pre-commit-tdd-orchestrator.test.ts | 133 ++++++++ .../__tests__/pre-pr-spec-review.test.ts | 104 +++++++ ...pre-spec-write-brainstorm-reminder.test.ts | 132 ++++++++ .../templates/hooks/_lib/session.sh | 92 ++++++ .../hooks/process/pre-commit-gate-check.sh | 14 + .../process/pre-commit-tdd-orchestrator.sh | 64 ++++ .../hooks/process/pre-pr-spec-review.sh | 14 + .../pre-spec-write-brainstorm-reminder.sh | 60 ++++ ...red-contracts-byte-checkable-layer.test.ts | 111 +++++++ .../tests/hooks-capability-rows.test.ts | 53 ++++ 17 files changed, 1752 insertions(+) create mode 100644 plugins/dev-process-toolkit/docs/hooks-reference.md create mode 100644 plugins/dev-process-toolkit/skills/setup/__tests__/hooks_menu_prompt.test.ts create mode 100644 plugins/dev-process-toolkit/skills/setup/__tests__/hooks_merge_settings.test.ts create mode 100644 plugins/dev-process-toolkit/skills/setup/__tests__/setup_hooks_flag.test.ts create mode 100644 plugins/dev-process-toolkit/skills/setup/install_hooks.ts create mode 100644 plugins/dev-process-toolkit/templates/hooks/__tests__/pre-commit-gate-check.test.ts create mode 100644 plugins/dev-process-toolkit/templates/hooks/__tests__/pre-commit-tdd-orchestrator.test.ts create mode 100644 plugins/dev-process-toolkit/templates/hooks/__tests__/pre-pr-spec-review.test.ts create mode 100644 plugins/dev-process-toolkit/templates/hooks/__tests__/pre-spec-write-brainstorm-reminder.test.ts create mode 100755 plugins/dev-process-toolkit/templates/hooks/_lib/session.sh create mode 100755 
plugins/dev-process-toolkit/templates/hooks/process/pre-commit-gate-check.sh create mode 100755 plugins/dev-process-toolkit/templates/hooks/process/pre-commit-tdd-orchestrator.sh create mode 100755 plugins/dev-process-toolkit/templates/hooks/process/pre-pr-spec-review.sh create mode 100755 plugins/dev-process-toolkit/templates/hooks/process/pre-spec-write-brainstorm-reminder.sh create mode 100644 plugins/dev-process-toolkit/tests/honored-contracts-byte-checkable-layer.test.ts create mode 100644 plugins/dev-process-toolkit/tests/hooks-capability-rows.test.ts diff --git a/plugins/dev-process-toolkit/docs/hooks-reference.md b/plugins/dev-process-toolkit/docs/hooks-reference.md new file mode 100644 index 0000000..f73d0af --- /dev/null +++ b/plugins/dev-process-toolkit/docs/hooks-reference.md @@ -0,0 +1,100 @@ +# Hooks Reference — Process-Category Toolkit-Contract Enforcement + +This is the user manual for the **opt-in, per-project toolkit-contract enforcement hooks** seeded by `/setup` (STE-285, M71). The hooks are the byte-checkable layer of the Honored Contracts enforcement stack — the prose layer ships separately at `docs/honored-contracts.md` (STE-283). + +**Scope.** All hooks in this catalog are **Process** category — they enforce contracts between skills (e.g., "run `/gate-check` before `git commit`"). Quality hooks (format-on-write, lint) and Safety hooks (destructive-op blocks) are explicitly out of scope per the STE-285 `/brainstorm` decision (2026-05-13). + +**Opt-in by design.** Every hook defaults to **off**. `/setup` prompts via a single multi-select `AskUserQuestion` after stack detection. Re-run `/setup --hooks` at any time to add, remove, or toggle hooks without re-running stack detection. + +**Install shape.** Each selected hook lands in your project's `.claude/settings.json` as a `bash` exec-form command referencing `${CLAUDE_PLUGIN_ROOT}/templates/hooks/process/.sh`. 
The plugin owns the script body; updates propagate automatically when the plugin updates (no `/setup` re-run needed). + +**NFR-10 refusal shape.** On a contract miss, hooks exit non-zero and write a 3-line structured refusal to stderr in the canonical NFR-10 shape emitted by `templates/hooks/_lib/session.sh`: + +``` +Refusing: +Remedy: +Context: mode=hook, ticket=unbound, skill=, hook= +``` + +Advisory (non-blocking) hooks substitute `Reminder:` for `Refusing:` and exit 0. + +The Claude Code harness surfaces this stderr block back to the model, which then either runs the missing skill or asks the operator to confirm a deliberate override. + +**Override pattern.** To customize a seeded hook's behavior on your project, snapshot-copy and edit: + +1. Copy `${CLAUDE_PLUGIN_ROOT}/templates/hooks/process/.sh` to a project-local path, e.g., `.claude/hooks/.sh`. +2. Edit `.claude/settings.json` to replace the `${CLAUDE_PLUGIN_ROOT}/...` reference with the project-local path. +3. Plugin updates no longer touch your fork. Re-snapshot manually if you want upstream changes. + +To disable a hook entirely, delete its block from `.claude/settings.json` (or run `/setup --hooks` and uncheck it). + +**Fail-open on missing session log.** Every hook reads `$CLAUDE_SESSION_FILE` to detect required `Skill` `tool_use` entries. If `$CLAUDE_SESSION_FILE` is unset (e.g., commit made outside a Claude Code session, or a fresh session with no log yet), the hook exits 0 — non-Claude commits are never blocked. The fail-open trade-off is explicitly accepted (see STE-285 Risks table). + +--- + +### pre-commit-gate-check + +- **Name:** `pre-commit-gate-check` +- **Event:** `PreToolUse` +- **Matcher:** `Bash` (with command-pattern guard for `git commit*`) +- **Requirement:** A `Skill(/dev-process-toolkit:gate-check)` `tool_use` MUST appear in the current session log before any `git commit` invocation. Enforces the "gate-check before commit" contract at the byte layer. 
+- **NFR-10 refusal shape on miss:** + ``` + Refusing: required dev-process-toolkit:gate-check Skill tool_use not found in current session. + Remedy: run /dev-process-toolkit:gate-check before retrying this action. + Context: mode=hook, ticket=unbound, skill=dev-process-toolkit:gate-check, hook=pre-commit-gate-check + ``` +- **Override pattern:** Snapshot-copy `${CLAUDE_PLUGIN_ROOT}/templates/hooks/process/pre-commit-gate-check.sh` to `.claude/hooks/pre-commit-gate-check.sh`, edit (e.g., relax the matcher or whitelist `--amend`), and repoint the `args` entry in `.claude/settings.json`. + +### pre-pr-spec-review + +- **Name:** `pre-pr-spec-review` +- **Event:** `PreToolUse` +- **Matcher:** `Bash` (with command-pattern guard for `gh pr create*`) +- **Requirement:** A `Skill(/dev-process-toolkit:spec-review)` `tool_use` MUST appear in the current session log before any `gh pr create` invocation. Enforces the "spec-review before PR" contract at the byte layer. +- **NFR-10 refusal shape on miss:** + ``` + Refusing: required dev-process-toolkit:spec-review Skill tool_use not found in current session. + Remedy: run /dev-process-toolkit:spec-review before retrying this action. + Context: mode=hook, ticket=unbound, skill=dev-process-toolkit:spec-review, hook=pre-pr-spec-review + ``` +- **Override pattern:** Snapshot-copy `${CLAUDE_PLUGIN_ROOT}/templates/hooks/process/pre-pr-spec-review.sh` to `.claude/hooks/pre-pr-spec-review.sh`, edit (e.g., scope to specific repos or skip on docs-only branches), and repoint the `args` entry in `.claude/settings.json`. + +### pre-spec-write-brainstorm-reminder + +- **Name:** `pre-spec-write-brainstorm-reminder` +- **Event:** `UserPromptSubmit` +- **Matcher:** `*` (filters internally on `/dev-process-toolkit:spec-write` invocation) +- **Requirement:** When the user invokes `/dev-process-toolkit:spec-write`, the hook checks for a prior `Skill(/dev-process-toolkit:brainstorm)` `tool_use` in the current session. 
If absent AND the FR appears greenfield (heuristic: no resolved tracker ID arg passed), the hook injects a **stderr reminder** to consider `/brainstorm` first. This is a soft nudge — the hook does NOT block. +- **NFR-10 refusal shape on miss:** This hook does **not** refuse; it only emits a reminder. The reminder text uses the NFR-10 shape for consistency but exits 0: + ``` + Reminder: no dev-process-toolkit:brainstorm Skill tool_use in this session and the FR looks greenfield (no tracker ID). + Remedy: consider running /dev-process-toolkit:brainstorm first to explore the design space before drafting the spec. + Context: mode=hook, ticket=unbound, skill=dev-process-toolkit:brainstorm, hook=pre-spec-write-brainstorm-reminder + ``` +- **Override pattern:** Snapshot-copy `${CLAUDE_PLUGIN_ROOT}/templates/hooks/process/pre-spec-write-brainstorm-reminder.sh` to `.claude/hooks/pre-spec-write-brainstorm-reminder.sh`, edit (e.g., tune the greenfield heuristic or change the reminder threshold), and repoint the `args` entry in `.claude/settings.json`. Convert the exit code to non-zero if you want the reminder to hard-block. + +### pre-commit-tdd-orchestrator + +- **Name:** `pre-commit-tdd-orchestrator` +- **Event:** `PreToolUse` +- **Matcher:** `Bash` (with command-pattern guard for `git commit*`) +- **Requirement:** If FR-related files are staged (heuristic: `specs/frs/.md` or matching test files under `tests/` referencing an FR ID), a `Skill(/dev-process-toolkit:tdd)` `tool_use` MUST appear in the current session log. Byte-checkable continuation of STE-283's TDD Orchestrator Contract: prevents the "Inline TDD Antipattern" where `/implement` writes tests + code itself instead of forking `/dev-process-toolkit:tdd`. +- **NFR-10 refusal shape on miss:** + ``` + Refusing: required dev-process-toolkit:tdd Skill tool_use not found in current session. + Remedy: run /dev-process-toolkit:tdd before retrying this action. 
+ Context: mode=hook, ticket=unbound, skill=dev-process-toolkit:tdd, hook=pre-commit-tdd-orchestrator + ``` +- **Override pattern:** Snapshot-copy `${CLAUDE_PLUGIN_ROOT}/templates/hooks/process/pre-commit-tdd-orchestrator.sh` to `.claude/hooks/pre-commit-tdd-orchestrator.sh`, edit (e.g., tighten or loosen the "FR-related staged" heuristic, allow-list certain commit types like `docs:` or `chore:`), and repoint the `args` entry in `.claude/settings.json`. + +--- + +## Related references + +- `docs/honored-contracts.md` — prose-layer catalog of the same contracts these hooks enforce. +- `docs/skill-anatomy.md` — `${CLAUDE_PLUGIN_ROOT}` substitution pattern used by the hook install entries. +- STE-285 (M71) — FR that seeded this catalog. +- STE-283 (M71) — prose-layer FR; the TDD Orchestrator Contract callout in `/implement` Phase 2 step 8. +- STE-262, STE-270, STE-276 — cancellation chain that explicitly opened up the `/setup`-opt-in layer (rejected bundling these hooks in the plugin's auto-active `hooks/hooks.json`). +- STE-133 — `${CLAUDE_PLUGIN_ROOT}` commit-msg hook precedent (same install pattern). diff --git a/plugins/dev-process-toolkit/skills/setup/SKILL.md b/plugins/dev-process-toolkit/skills/setup/SKILL.md index b06134c..bde7b97 100644 --- a/plugins/dev-process-toolkit/skills/setup/SKILL.md +++ b/plugins/dev-process-toolkit/skills/setup/SKILL.md @@ -48,6 +48,18 @@ When `$ARGUMENTS` contains `--migrate` or `--migrate-dry-run`, skip steps 1–8 Detailed tracker-mode switching procedures live in `docs/setup-tracker-mode.md`. +### 0c. Contract-enforcement re-invocation flag + +When `$ARGUMENTS` contains `--hooks`, skip steps 1–6 and 7–8 entirely — this flag **re-runs only the hooks step** (skips stack detection + CLAUDE.md generation) and routes directly into step 6a's toolkit-contract enforcement hooks menu. 
Use this when re-running `/setup` against an already-bootstrapped project to add, remove, or reconsider opt-in hooks without regenerating the rest of the scaffold (STE-285 AC-STE-285.5). + +The re-run is **idempotent**: the menu is rendered with options **pre-checked** for currently-installed plugin hooks (detected via `readInstalledHookNames(settingsPath, pluginRoot)` from `install_hooks.ts`, which scans `.claude/settings.json` for entries whose `args[0]` points under `${CLAUDE_PLUGIN_ROOT}/templates/hooks/process/.sh`). The operator may: + +- **toggle off** an already-installed hook (uncheck it → the merge step removes it from `.claude/settings.json`); +- **add** a new hook (check a previously-unchecked option → the merge step installs it); +- **leave the selection unchanged** (re-confirming the current state is a no-op write). + +Because the menu reflects on-disk state, running `/setup --hooks` twice in a row against the same project surfaces an identical pre-checked menu — re-running on a project with already-installed hooks shows the menu pre-checked for installed hooks, allowing toggle off or addition. After the menu is answered, control jumps to step 11 (Report) — no bootstrap commit is produced; the hooks-only delta lands as a regular subsequent commit (see § After /setup). + ### 1. Detect the project Check for project files (`package.json`, `pubspec.yaml`, `pyproject.toml`, `go.mod`, `Cargo.toml`, etc.) and source directories. @@ -140,6 +152,24 @@ Create `CLAUDE.md` based on the template, filling in: project name and descripti 4. If it exists: read + parse it, merge via `mergeAllowList(existing, canonical)`, write back. Preserves user additions; never strips entries. Malformed pre-existing JSON → abort with NFR-10 canonical shape. 5. On any write failure (sandbox refusal, ENOSPC, etc.), emit NFR-10-shape error and exit non-zero. Never continue with a partial scaffold. +### 6a. 
Toolkit-contract enforcement hooks menu + +`default: none-selected` — fire a single `AskUserQuestion` (multi-select: true) listing the seeded toolkit-contract enforcement hooks. All options default to off; the user picks zero or more. Skip nothing — the menu fires unconditionally after stack detection and before the final Report step (STE-285 AC-STE-285.1). + +The four seeded options (one per honored-contract enforcement hook): + +- `pre-commit-gate-check` — block `git commit` unless a `/dev-process-toolkit:gate-check` `Skill` `tool_use` appears in the current session log. +- `pre-pr-spec-review` — block `gh pr create` unless a `/dev-process-toolkit:spec-review` `Skill` `tool_use` appears in the current session log. +- `pre-spec-write-brainstorm-reminder` — remind the operator to run `/brainstorm` before `/spec-write` for greenfield FRs (advisory; never blocks). +- `pre-commit-tdd-orchestrator` — block `git commit` when FR-related files are staged and no `/dev-process-toolkit:tdd` `Skill` `tool_use` appears in the current session log (TDD-orchestrator guard). + +The prompt is **multi-select** — the operator may pick any subset, including the empty set. **All options default to off** so re-running `/setup` against a hook-free project is byte-identical to the previous flow. Selected hooks are written into the user's `.claude/settings.json` via key-level merge by `install_hooks.ts` (`installHooks(settingsPath, hookNames, pluginRoot)`); each entry uses exec form `"command": "bash", "args": ["${CLAUDE_PLUGIN_ROOT}/templates/hooks/process/.sh"]` per AC-STE-285.3. Unselected hooks are no-ops; existing unrelated settings entries are preserved verbatim. + +**Report capability rows (STE-285 AC-STE-285.4).** Step 11's final summary surfaces exactly one of the following keys per the static plain-language map (mirrors the `docs_default_applied` precedent at step 7d): + +- `hooks_installed` — `Installed N opt-in toolkit-contract enforcement hook(s): — toggle off any hook by editing .claude/settings.json` (emitted when the operator selected one or more hooks; `N` and `` are interpolated from the picked subset).
+- `hooks_skipped` — `User declined opt-in hooks during /setup — run /setup --hooks to reconsider, or edit .claude/settings.json manually` (emitted when the operator picked the empty set, including the default-applied autonomous-mode path). + ### 6b. Install commit-msg hook `default: shell` — install the POSIX-shell Conventional-Commits hook (zero-dep). `$ARGUMENTS` contains `--commitlint` ⇒ install the commitlint-delegating variant. Skip entirely when `.git/hooks/` is absent (log `setup: skipping commit-msg hook (not a git repo)`). diff --git a/plugins/dev-process-toolkit/skills/setup/__tests__/hooks_menu_prompt.test.ts b/plugins/dev-process-toolkit/skills/setup/__tests__/hooks_menu_prompt.test.ts new file mode 100644 index 0000000..6989bb0 --- /dev/null +++ b/plugins/dev-process-toolkit/skills/setup/__tests__/hooks_menu_prompt.test.ts @@ -0,0 +1,118 @@ +import { describe, expect, test } from "bun:test"; +import { readFileSync } from "node:fs"; +import { join } from "node:path"; + +// STE-285 AC-STE-285.1 — `/setup` hooks-menu prompt step. +// +// `/setup` adds a new step (after stack detection, before the final summary +// report) that fires a single `AskUserQuestion` with multi-select options. +// Each option is a named toolkit-contract enforcement hook. All options +// default to off. User picks zero or more. +// +// This is doc-conformance: the SKILL.md prose must document the new step +// with (a) an `AskUserQuestion` directive, (b) the multi-select shape, +// (c) ≥ 4 named hook options, (d) all-off default. + +const SKILL_PATH = join( + import.meta.dir, + "..", + "SKILL.md", +); + +const SEEDED_HOOKS = [ + "pre-commit-gate-check", + "pre-pr-spec-review", + "pre-spec-write-brainstorm-reminder", + "pre-commit-tdd-orchestrator", +]; + +function read(): string { + return readFileSync(SKILL_PATH, "utf-8"); +} + +/** + * Locate the hooks-menu step. 
The AC says it lands "after stack detection, + * before the final summary report", which puts it after step 1 (stack detection) and + * before step 11 (Report). We anchor on the first `### ` heading that mentions "hook"; + * the tests then assert `AskUserQuestion` and the seeded hook names within that region. + */ +function hooksStepRegion(body: string): string { + // Search for the "hooks" step anchor. Accept any `### ` heading that mentions + // "hook", e.g. "### 6a. Toolkit-contract enforcement hooks menu" (the slot + // SKILL.md uses) or "### 7f. Hooks menu"; the sub-step number is not pinned. + // The tests, not this helper, check the required predicates: (1) `AskUserQuestion` + // text and (2) the seeded hook names appearing in the same region. + const lines = body.split("\n"); + let startIdx = -1; + for (let i = 0; i < lines.length; i++) { + const line = lines[i] ?? ""; + if (/^###\s/.test(line) && /hook/i.test(line)) { + startIdx = i; + break; + } + } + if (startIdx === -1) { + return ""; + } + // Region extends to the next `### ` or `## ` heading. + let endIdx = lines.length; + for (let i = startIdx + 1; i < lines.length; i++) { + const line = lines[i] ??
""; + if (/^###?\s/.test(line)) { + endIdx = i; + break; + } + } + return lines.slice(startIdx, endIdx).join("\n"); +} + +describe("AC-STE-285.1 — hooks-menu step is documented in /setup SKILL.md", () => { + test("SKILL.md carries a hooks step heading", () => { + const region = hooksStepRegion(read()); + expect(region.length).toBeGreaterThan(0); + }); + + test("hooks step body directs `AskUserQuestion` as the prompt mechanism", () => { + const region = hooksStepRegion(read()); + expect(region).toContain("AskUserQuestion"); + }); + + test("hooks step body declares multi-select shape", () => { + const region = hooksStepRegion(read()); + expect(region).toMatch(/multi-select|multi select|multiselect/i); + }); + + test("hooks step body documents the all-off default", () => { + const region = hooksStepRegion(read()); + // Accept any of: "all options default to off", "default: off", + // "all defaulted off", "defaulted off". + expect(region).toMatch(/default(ed| to|s|:)?\s*(to\s*)?off|all\s+off|all\s+defaulted\s+off|all\s+default(ed)?\s+(to\s+)?off/i); + }); + + test("hooks step body names ≥ 4 seeded hooks as options", () => { + const region = hooksStepRegion(read()); + const named = SEEDED_HOOKS.filter((h) => region.includes(h)); + expect(named.length).toBeGreaterThanOrEqual(4); + }); + + test("hooks step is placed after stack detection and before the final Report step", () => { + const body = read(); + const stackDetectIdx = body.indexOf("### 1. Detect the project"); + expect(stackDetectIdx).toBeGreaterThan(-1); + const reportIdx = body.indexOf("### 11. Report"); + expect(reportIdx).toBeGreaterThan(-1); + // Find the hooks step header by name. 
+ const lines = body.split("\n"); + let hooksHeadingOffset = -1; + let runningOffset = 0; + for (const line of lines) { + if (/^###\s/.test(line) && /hook/i.test(line)) { + hooksHeadingOffset = runningOffset; + break; + } + runningOffset += line.length + 1; + } + expect(hooksHeadingOffset).toBeGreaterThan(stackDetectIdx); + expect(hooksHeadingOffset).toBeLessThan(reportIdx); + }); +}); diff --git a/plugins/dev-process-toolkit/skills/setup/__tests__/hooks_merge_settings.test.ts b/plugins/dev-process-toolkit/skills/setup/__tests__/hooks_merge_settings.test.ts new file mode 100644 index 0000000..6ddad3c --- /dev/null +++ b/plugins/dev-process-toolkit/skills/setup/__tests__/hooks_merge_settings.test.ts @@ -0,0 +1,182 @@ +import { describe, expect, test } from "bun:test"; +import { mkdtempSync, readFileSync, rmSync, writeFileSync } from "node:fs"; +import { tmpdir } from "node:os"; +import { join } from "node:path"; +import { + installHooks, + mergeHooksIntoSettings, +} from "../install_hooks"; + +// STE-285 AC-STE-285.3 — install_hooks.ts merge helper. +// +// Selected hooks write into the user's `.claude/settings.json` via key-level +// merge (existing entries preserved). Each entry uses exec form: +// +// { "command": "bash", +// "args": ["/templates/hooks/process/.sh"], +// "timeout": 5000 } +// +// Conflict resolution: +// - same matcher + identical command → no-op (idempotent re-run) +// - same matcher + different command → diff + prompt (per STE-133) +// +// Fixtures: (a) empty settings.json, (b) settings with unrelated hooks +// preserved, (c) same matcher + identical command (no-op), +// (d) same matcher + different command (conflict surfaced). 
+ +const SAMPLE_PLUGIN_ROOT = "/plugin/dev-process-toolkit"; + +function hookEntry(name: string, matcher: string, event: string): { + event: string; + matcher: string; + hook: { + type: "command"; + command: string; + args: string[]; + timeout?: number; + }; +} { + return { + event, + matcher, + hook: { + type: "command", + command: "bash", + args: [`${SAMPLE_PLUGIN_ROOT}/templates/hooks/process/${name}.sh`], + timeout: 5000, + }, + }; +} + +describe("AC-STE-285.3 — mergeHooksIntoSettings: empty settings.json (case a)", () => { + test("merges into an empty object", () => { + const additions = [hookEntry("pre-commit-gate-check", "Bash", "PreToolUse")]; + const { merged, conflicts } = mergeHooksIntoSettings({}, additions); + expect(conflicts).toEqual([]); + expect(merged.hooks?.PreToolUse).toBeDefined(); + expect(merged.hooks!.PreToolUse!.length).toBeGreaterThanOrEqual(1); + const installed = merged.hooks!.PreToolUse![0]!; + expect(installed.matcher).toBe("Bash"); + expect(installed.hooks[0]!.command).toBe("bash"); + expect(installed.hooks[0]!.args).toContain( + `${SAMPLE_PLUGIN_ROOT}/templates/hooks/process/pre-commit-gate-check.sh`, + ); + }); +}); + +describe("AC-STE-285.3 — mergeHooksIntoSettings: preserves unrelated existing entries (case b)", () => { + test("user's custom hook entries are not stripped", () => { + const existing = { + hooks: { + PreToolUse: [ + { + matcher: "Edit", + hooks: [ + { + type: "command", + command: "bash", + args: ["/Users/me/custom-hook.sh"], + }, + ], + }, + ], + }, + }; + const additions = [hookEntry("pre-commit-gate-check", "Bash", "PreToolUse")]; + const { merged, conflicts } = mergeHooksIntoSettings(existing, additions); + expect(conflicts).toEqual([]); + // Both the pre-existing Edit matcher and the new Bash matcher must survive. 
+ const matchers = merged.hooks!.PreToolUse!.map((e) => e.matcher); + expect(matchers).toContain("Edit"); + expect(matchers).toContain("Bash"); + // The user's custom-hook command must still be present. + const flatCommands = merged + .hooks!.PreToolUse!.flatMap((e) => e.hooks.map((h) => h.args?.[0] ?? "")) + .filter(Boolean); + expect(flatCommands.some((c) => c.includes("custom-hook.sh"))).toBe(true); + }); +}); + +describe("AC-STE-285.3 — mergeHooksIntoSettings: idempotent re-run (case c)", () => { + test("same matcher + identical command → no duplicate entry", () => { + const additions = [hookEntry("pre-commit-gate-check", "Bash", "PreToolUse")]; + const first = mergeHooksIntoSettings({}, additions); + const second = mergeHooksIntoSettings(first.merged, additions); + expect(second.conflicts).toEqual([]); + // The args list for the Bash matcher should NOT carry the same script + // twice. Count occurrences of the script path across all entries. + const allArgs = second + .merged.hooks!.PreToolUse!.flatMap((e) => e.hooks.map((h) => h.args?.[0] ?? "")); + const hits = allArgs.filter((a) => + a.includes("pre-commit-gate-check.sh"), + ); + expect(hits.length).toBe(1); + }); +}); + +describe("AC-STE-285.3 — mergeHooksIntoSettings: conflict on differing command (case d)", () => { + test("same matcher + different command → conflict reported (no silent overwrite)", () => { + const existing = { + hooks: { + PreToolUse: [ + { + matcher: "Bash", + hooks: [ + { + type: "command", + command: "bash", + args: [ + // A hook at the SAME script path-suffix but with a DIFFERENT + // pinned plugin root (user manually pointed at a vendored + // copy). Conflict: same intent, different command. 
+ "/user/vendored/templates/hooks/process/pre-commit-gate-check.sh", + ], + }, + ], + }, + ], + }, + }; + const additions = [hookEntry("pre-commit-gate-check", "Bash", "PreToolUse")]; + const { merged: _merged, conflicts } = mergeHooksIntoSettings( + existing, + additions, + ); + // Conflict surfaced — caller (skill prose) handles diff+prompt per STE-133. + expect(conflicts.length).toBeGreaterThan(0); + const first = conflicts[0]!; + expect(first.matcher).toBe("Bash"); + // Conflict carries both the existing and proposed commands for the prompt UX. + expect(typeof first.existingCommand).toBe("string"); + expect(typeof first.proposedCommand).toBe("string"); + expect(first.existingCommand).not.toBe(first.proposedCommand); + }); +}); + +describe("AC-STE-285.3 — installHooks: writes selected hooks to settings.json on disk", () => { + test("installHooks materializes a settings file when given a path + hook names", () => { + const dir = mkdtempSync(join(tmpdir(), "ste-285-merge-")); + const settingsPath = join(dir, "settings.json"); + writeFileSync(settingsPath, "{}"); + try { + const result = installHooks( + settingsPath, + ["pre-commit-gate-check"], + SAMPLE_PLUGIN_ROOT, + ); + // installHooks returns a MergeResult with `conflicts` field. + expect(Array.isArray(result.conflicts)).toBe(true); + const written = JSON.parse(readFileSync(settingsPath, "utf-8")); + // The hook entry landed on disk. + const allArgs: string[] = (written.hooks?.PreToolUse ?? []).flatMap( + (e: { hooks: Array<{ args?: string[] }> }) => + e.hooks.flatMap((h) => h.args ?? 
[]), + ); + expect(allArgs.some((a) => a.includes("pre-commit-gate-check.sh"))).toBe( + true, + ); + } finally { + rmSync(dir, { recursive: true, force: true }); + } + }); +}); diff --git a/plugins/dev-process-toolkit/skills/setup/__tests__/setup_hooks_flag.test.ts b/plugins/dev-process-toolkit/skills/setup/__tests__/setup_hooks_flag.test.ts new file mode 100644 index 0000000..6ce2492 --- /dev/null +++ b/plugins/dev-process-toolkit/skills/setup/__tests__/setup_hooks_flag.test.ts @@ -0,0 +1,144 @@ +import { describe, expect, test } from "bun:test"; +import { mkdtempSync, readFileSync, rmSync, writeFileSync } from "node:fs"; +import { tmpdir } from "node:os"; +import { join } from "node:path"; +import { readInstalledHookNames } from "../install_hooks"; + +// STE-285 AC-STE-285.5 — `/setup --hooks` flag re-runs only the hooks step. +// +// Two-part test: +// (a) Doc conformance — `/setup` SKILL.md mentions `--hooks` flag with +// "re-runs only the hooks step" semantics. +// (b) Helper invocation — `install_hooks.ts` exports +// `readInstalledHookNames(settingsPath, pluginRoot)` returning the +// array of installed hook names (used for pre-checking the menu). + +const SKILL_PATH = join(import.meta.dir, "..", "SKILL.md"); + +function read(): string { + return readFileSync(SKILL_PATH, "utf-8"); +} + +describe("AC-STE-285.5 (a) — /setup SKILL.md documents --hooks flag", () => { + test("SKILL.md mentions --hooks flag verbatim", () => { + expect(read()).toContain("--hooks"); + }); + + test("SKILL.md states the flag re-runs only the hooks step (skips stack detection + CLAUDE.md generation)", () => { + const body = read(); + // Either "re-runs only the hooks step" verbatim, or load-bearing + // fragments separately. 
+ expect(body).toMatch(/--hooks/); + expect(body).toMatch(/re-runs?\s+only.*hooks?\s+step|skip.*(stack detection|CLAUDE\.md generation)/i); + }); + + test("SKILL.md states the re-run is idempotent with pre-checked menu", () => { + const body = read(); + // Idempotency: menu pre-checks already-installed hooks. + expect(body).toMatch(/idempotent|pre-check(ed)?|already.installed/i); + }); +}); + +describe("AC-STE-285.5 (b) — readInstalledHookNames helper", () => { + const PLUGIN_ROOT = "/plugin/dev-process-toolkit"; + + function makeSettings(hookNames: string[]): string { + return JSON.stringify({ + hooks: { + PreToolUse: hookNames.map((name) => ({ + matcher: "Bash", + hooks: [ + { + type: "command", + command: "bash", + args: [`${PLUGIN_ROOT}/templates/hooks/process/${name}.sh`], + timeout: 5000, + }, + ], + })), + }, + }); + } + + test("returns empty array when settings.json is missing", () => { + const dir = mkdtempSync(join(tmpdir(), "ste-285-flag-")); + const settingsPath = join(dir, "missing.json"); + try { + const names = readInstalledHookNames(settingsPath, PLUGIN_ROOT); + expect(names).toEqual([]); + } finally { + rmSync(dir, { recursive: true, force: true }); + } + }); + + test("returns empty array when settings.json has no hook entries", () => { + const dir = mkdtempSync(join(tmpdir(), "ste-285-flag-")); + const settingsPath = join(dir, "settings.json"); + writeFileSync(settingsPath, "{}"); + try { + const names = readInstalledHookNames(settingsPath, PLUGIN_ROOT); + expect(names).toEqual([]); + } finally { + rmSync(dir, { recursive: true, force: true }); + } + }); + + test("returns names of currently installed plugin hooks", () => { + const dir = mkdtempSync(join(tmpdir(), "ste-285-flag-")); + const settingsPath = join(dir, "settings.json"); + writeFileSync( + settingsPath, + makeSettings(["pre-commit-gate-check", "pre-pr-spec-review"]), + ); + try { + const names = readInstalledHookNames(settingsPath, PLUGIN_ROOT); + expect(names.sort()).toEqual( + 
["pre-commit-gate-check", "pre-pr-spec-review"].sort(), + ); + } finally { + rmSync(dir, { recursive: true, force: true }); + } + }); + + test("ignores non-plugin hooks (user's own custom-hook.sh)", () => { + const dir = mkdtempSync(join(tmpdir(), "ste-285-flag-")); + const settingsPath = join(dir, "settings.json"); + writeFileSync( + settingsPath, + JSON.stringify({ + hooks: { + PreToolUse: [ + { + matcher: "Edit", + hooks: [ + { + type: "command", + command: "bash", + args: ["/Users/me/custom-hook.sh"], + }, + ], + }, + { + matcher: "Bash", + hooks: [ + { + type: "command", + command: "bash", + args: [ + `${PLUGIN_ROOT}/templates/hooks/process/pre-commit-gate-check.sh`, + ], + }, + ], + }, + ], + }, + }), + ); + try { + const names = readInstalledHookNames(settingsPath, PLUGIN_ROOT); + expect(names).toEqual(["pre-commit-gate-check"]); + } finally { + rmSync(dir, { recursive: true, force: true }); + } + }); +}); diff --git a/plugins/dev-process-toolkit/skills/setup/install_hooks.ts b/plugins/dev-process-toolkit/skills/setup/install_hooks.ts new file mode 100644 index 0000000..099f68c --- /dev/null +++ b/plugins/dev-process-toolkit/skills/setup/install_hooks.ts @@ -0,0 +1,292 @@ +// `/setup` hook installer + merge helper (STE-285 AC-STE-285.3). +// +// Selected hooks are written into the user's `.claude/settings.json` via +// key-level merge — existing entries are preserved. Each plugin hook +// uses exec form: +// +// { "type": "command", "command": "bash", +// "args": ["/templates/hooks/process/.sh"], +// "timeout": 5000 } +// +// Conflict resolution (per STE-133): +// - same matcher + identical command → no-op (idempotent re-run) +// - same matcher + different command → conflict surfaced to caller +// (the skill prose handles the diff + prompt UX) +// +// Settings shape (Claude Code hooks v1): +// +// { "hooks": { "": [ +// { "matcher": "", +// "hooks": [{ "type": "command", "command": "bash", +// "args": [...], "timeout"?: number }, ...] }, +// ... 
+// ] } } + +import { readFileSync, writeFileSync } from "node:fs"; + +// ----- Types ----- + +export type HookEntryCommand = { + type: "command"; + command: string; + args?: string[]; + timeout?: number; +}; + +export type MatcherEntry = { + matcher: string; + hooks: HookEntryCommand[]; +}; + +export type HooksByEvent = Record<string, MatcherEntry[]>; + +export type Settings = { + hooks?: HooksByEvent; + // Other settings keys are preserved as-is. + [extra: string]: unknown; +}; + +/** A single addition: event + matcher + the hook descriptor to install. */ +export type HookAddition = { + event: string; + matcher: string; + hook: HookEntryCommand; +}; + +/** Conflict surfaced when same matcher carries a different command. */ +export type HookConflict = { + event: string; + matcher: string; + existingCommand: string; + proposedCommand: string; +}; + +export type MergeResult = { + merged: Settings; + conflicts: HookConflict[]; +}; + +// ----- Internals ----- + +/** + * Extract the script basename (e.g. `pre-commit-gate-check.sh`) from a + * hook command. The basename is the conflict-identity for "same hook" + * detection — a user-vendored copy at a different path still represents + * the same logical hook. + */ +function scriptBasename(hook: HookEntryCommand): string | null { + const first = hook.args?.[0]; + if (!first) return null; + const slash = first.lastIndexOf("/"); + return slash === -1 ? first : first.slice(slash + 1); +} + +/** + * Render a hook command as a single string for diff display in the + * conflict prompt. Format mirrors what the user sees in settings.json. + */ +function renderCommand(hook: HookEntryCommand): string { + const cmd = hook.command; + const args = (hook.args ?? []).join(" "); + return args ? `${cmd} ${args}` : cmd; +} + +/** Deep clone via JSON round-trip — settings are pure data. 
*/ +function cloneSettings(s: Settings): Settings { + return JSON.parse(JSON.stringify(s)) as Settings; +} + +// ----- Public API ----- + +/** + * Key-level merge of `additions` into `existing`. Existing entries + * (unrelated event keys, unrelated matchers, unrelated commands under + * the same matcher) are preserved verbatim. + * + * For each addition: + * - if the matcher does not yet exist for that event → create it, + * append the addition's hook; + * - if the matcher exists and already carries a hook with the SAME + * script basename and SAME args[0] → no-op (idempotent); + * - if the matcher exists and carries a hook with the SAME script + * basename but a DIFFERENT args[0] → conflict surfaced (no write); + * - otherwise → append the addition's hook alongside. + */ +export function mergeHooksIntoSettings( + existing: Settings, + additions: HookAddition[], +): MergeResult { + const merged = cloneSettings(existing); + if (!merged.hooks) merged.hooks = {}; + const conflicts: HookConflict[] = []; + + for (const addition of additions) { + const { event, matcher, hook } = addition; + const proposedBase = scriptBasename(hook); + const proposedFirstArg = hook.args?.[0] ?? ""; + + const list: MatcherEntry[] = merged.hooks[event] ?? []; + if (!merged.hooks[event]) merged.hooks[event] = list; + + // Locate any matcher entry that matches this addition's matcher. + const matcherEntry = list.find((e) => e.matcher === matcher); + + if (!matcherEntry) { + // New matcher → create with the proposed hook. + list.push({ matcher, hooks: [hook] }); + continue; + } + + // Same matcher already exists. Check for a hook with the same + // script basename. + const twin = matcherEntry.hooks.find( + (h) => scriptBasename(h) === proposedBase && proposedBase !== null, + ); + + if (!twin) { + // Same matcher, different basename → append alongside. + matcherEntry.hooks.push(hook); + continue; + } + + const twinFirstArg = twin.args?.[0] ?? 
""; + + if (twinFirstArg === proposedFirstArg) { + // Idempotent — already installed, no-op. + continue; + } + + // Same basename, different args[0] → conflict. + conflicts.push({ + event, + matcher, + existingCommand: renderCommand(twin), + proposedCommand: renderCommand(hook), + }); + } + + return { merged, conflicts }; +} + +/** + * Per-hook event + matcher mapping for seeded plugin hooks. Hooks fire on + * different Claude Code events — three on PreToolUse Bash (gating + * `git commit` / `gh pr create`), one on UserPromptSubmit `*` (the + * spec-write greenfield reminder). Misregistering a hook under the wrong + * event silently no-ops it because Claude Code never dispatches it. + * + * Add a row here when seeding a new hook script under + * `templates/hooks/process/`; unmapped names fall back to PreToolUse Bash. + */ +const HOOK_REGISTRATIONS: Record = { + "pre-commit-gate-check": { event: "PreToolUse", matcher: "Bash" }, + "pre-pr-spec-review": { event: "PreToolUse", matcher: "Bash" }, + "pre-spec-write-brainstorm-reminder": { + event: "UserPromptSubmit", + matcher: "*", + }, + "pre-commit-tdd-orchestrator": { event: "PreToolUse", matcher: "Bash" }, +}; + +/** + * Build a HookAddition for one named plugin hook. Event + matcher come + * from `HOOK_REGISTRATIONS`; the command is always exec form + * `bash /templates/hooks/process/.sh` with a 5000 ms timeout. + */ +function additionFor(name: string, pluginRoot: string): HookAddition { + const reg = HOOK_REGISTRATIONS[name] ?? { event: "PreToolUse", matcher: "Bash" }; + return { + event: reg.event, + matcher: reg.matcher, + hook: { + type: "command", + command: "bash", + args: [`${pluginRoot}/templates/hooks/process/${name}.sh`], + timeout: 5000, + }, + }; +} + +/** + * Read `settingsPath`, merge the named plugin hooks via + * `mergeHooksIntoSettings`, and write the result back. Returns the + * MergeResult so the caller can surface any conflicts. 
+ * + * Partial-write semantics: when `result.conflicts` is non-empty, the + * conflicting hook entries are NEVER added to the merged tree (the merge + * loop excludes them). The on-disk write reflects the non-conflicting + * subset only — no entry the caller has not already approved is silently + * overwritten. The SKILL.md prose is expected to surface the conflicts + * (diff + prompt per STE-133) and re-invoke with a resolution. + * + * If `settingsPath` is missing, treats existing settings as `{}`. A + * malformed (non-JSON) existing file aborts the call by throwing a + * SyntaxError — never silently overwritten. + */ +export function installHooks( + settingsPath: string, + hookNames: string[], + pluginRoot: string, +): MergeResult { + let existing: Settings = {}; + let raw: string | null = null; + try { + raw = readFileSync(settingsPath, "utf-8"); + } catch { + // Missing file → treat as empty. The caller wrote the file path + // explicitly; we don't need to materialize directories here. + raw = null; + } + if (raw !== null && raw.trim().length > 0) { + // Malformed JSON surfaces as SyntaxError — caller maps to NFR-10. + existing = JSON.parse(raw) as Settings; + } + + const additions = hookNames.map((n) => additionFor(n, pluginRoot)); + const result = mergeHooksIntoSettings(existing, additions); + + writeFileSync(settingsPath, JSON.stringify(result.merged, null, 2)); + return result; +} + +/** + * Return the list of plugin hook names currently installed in + * `settingsPath` (i.e. entries whose `args[0]` points under + * `/templates/hooks/process/.sh`). + * + * Used by `/setup --hooks` to pre-check the menu options on re-run. + * Missing settings.json → empty array. 
+ */ +export function readInstalledHookNames( + settingsPath: string, + pluginRoot: string, +): string[] { + let raw: string; + try { + raw = readFileSync(settingsPath, "utf-8"); + } catch { + return []; + } + let parsed: Settings; + try { + parsed = JSON.parse(raw) as Settings; + } catch { + return []; + } + const events = parsed.hooks ?? {}; + const prefix = `${pluginRoot}/templates/hooks/process/`; + const names = new Set<string>(); + for (const matcherList of Object.values(events)) { + for (const entry of matcherList ?? []) { + for (const hook of entry.hooks ?? []) { + const first = hook.args?.[0]; + if (!first) continue; + if (!first.startsWith(prefix)) continue; + const tail = first.slice(prefix.length); + if (!tail.endsWith(".sh")) continue; + names.add(tail.slice(0, -3)); + } + } + } + return [...names]; +} diff --git a/plugins/dev-process-toolkit/templates/hooks/__tests__/pre-commit-gate-check.test.ts b/plugins/dev-process-toolkit/templates/hooks/__tests__/pre-commit-gate-check.test.ts new file mode 100644 index 0000000..487350e --- /dev/null +++ b/plugins/dev-process-toolkit/templates/hooks/__tests__/pre-commit-gate-check.test.ts @@ -0,0 +1,109 @@ +import { describe, expect, test } from "bun:test"; +import { existsSync, mkdtempSync, readFileSync, rmSync, writeFileSync } from "node:fs"; +import { tmpdir } from "node:os"; +import { join } from "node:path"; + +// STE-285 AC-STE-285.2 — `pre-commit-gate-check.sh` Process-category hook. +// +// PreToolUse Bash:`git commit*` → require a `Skill(/dev-process-toolkit:gate-check)` +// tool_use in current session; refuse with NFR-10 shape on miss. 
+ +const HOOKS_DIR = join( + import.meta.dir, + "..", + "process", +); +const HOOK_PATH = join(HOOKS_DIR, "pre-commit-gate-check.sh"); + +interface RunResult { + exitCode: number; + stdout: string; + stderr: string; +} + +async function runHook(env: Record<string, string>): Promise<RunResult> { + const proc = Bun.spawn(["bash", HOOK_PATH], { + env: { ...process.env, ...env }, + stdout: "pipe", + stderr: "pipe", + }); + const [stdout, stderr] = await Promise.all([ + new Response(proc.stdout).text(), + new Response(proc.stderr).text(), + ]); + const exitCode = await proc.exited; + return { exitCode, stdout, stderr }; +} + +function writeSessionJsonl(entries: unknown[]): string { + const dir = mkdtempSync(join(tmpdir(), "ste-285-gc-")); + const file = join(dir, "session.jsonl"); + writeFileSync(file, entries.map((e) => JSON.stringify(e)).join("\n") + "\n"); + return file; +} + +describe("AC-STE-285.2 — pre-commit-gate-check.sh: file exists with shebang", () => { + test("hook script exists at the documented path", () => { + expect(existsSync(HOOK_PATH)).toBe(true); + }); + + test("script starts with a shebang line", () => { + const content = readFileSync(HOOK_PATH, "utf-8"); + const firstLine = content.split("\n")[0] ?? 
""; + expect(firstLine.startsWith("#!")).toBe(true); + }); +}); + +describe("AC-STE-285.2 — pre-commit-gate-check.sh: happy path (gate-check Skill tool_use present)", () => { + test("exit 0 when session log contains a /dev-process-toolkit:gate-check Skill tool_use", async () => { + const sessionFile = writeSessionJsonl([ + { + type: "tool_use", + name: "Skill", + input: { skill: "dev-process-toolkit:gate-check" }, + }, + ]); + try { + const r = await runHook({ CLAUDE_SESSION_FILE: sessionFile }); + expect(r.exitCode).toBe(0); + } finally { + rmSync(sessionFile, { force: true }); + } + }); +}); + +describe("AC-STE-285.2 — pre-commit-gate-check.sh: miss path (no gate-check Skill tool_use)", () => { + test("exit non-zero + NFR-10-shape stderr when session has no gate-check tool_use", async () => { + const sessionFile = writeSessionJsonl([ + { type: "tool_use", name: "Bash", input: { command: "ls" } }, + ]); + try { + const r = await runHook({ CLAUDE_SESSION_FILE: sessionFile }); + expect(r.exitCode).not.toBe(0); + // NFR-10 shape: verdict line + "Remedy:" + "Context:". + expect(r.stderr).toContain("Remedy:"); + expect(r.stderr).toContain("Context:"); + // The contract is documented — the hook must name what's required. + expect(r.stderr).toMatch(/gate-check/); + } finally { + rmSync(sessionFile, { force: true }); + } + }); +}); + +describe("AC-STE-285.2 — pre-commit-gate-check.sh: fail-open when CLAUDE_SESSION_FILE unset", () => { + test("exit 0 when CLAUDE_SESSION_FILE env var is unset (non-Claude commit)", async () => { + // Spawn without CLAUDE_SESSION_FILE; mimics a bare `git commit` outside + // a Claude Code session. Hook must not block legitimate non-Claude + // commits per the FR's "fail-open" design. 
+ const cleanEnv = { ...process.env }; + delete cleanEnv.CLAUDE_SESSION_FILE; + const proc = Bun.spawn(["bash", HOOK_PATH], { + env: cleanEnv, + stdout: "pipe", + stderr: "pipe", + }); + const exitCode = await proc.exited; + expect(exitCode).toBe(0); + }); +}); diff --git a/plugins/dev-process-toolkit/templates/hooks/__tests__/pre-commit-tdd-orchestrator.test.ts b/plugins/dev-process-toolkit/templates/hooks/__tests__/pre-commit-tdd-orchestrator.test.ts new file mode 100644 index 0000000..5f4351b --- /dev/null +++ b/plugins/dev-process-toolkit/templates/hooks/__tests__/pre-commit-tdd-orchestrator.test.ts @@ -0,0 +1,133 @@ +import { describe, expect, test } from "bun:test"; +import { existsSync, mkdtempSync, readFileSync, rmSync, writeFileSync } from "node:fs"; +import { tmpdir } from "node:os"; +import { join } from "node:path"; + +// STE-285 AC-STE-285.2 — `pre-commit-tdd-orchestrator.sh` Process hook. +// +// PreToolUse Bash:`git commit*` → if FR-related files staged +// (specs/frs/.md or matching test files), require a +// `Skill(/dev-process-toolkit:tdd)` tool_use in current session; refuse with +// NFR-10 shape on miss. 
**Byte-checkable continuation of STE-283's TDD +// Orchestrator Contract.** + +const HOOKS_DIR = join( + import.meta.dir, + "..", + "process", +); +const HOOK_PATH = join(HOOKS_DIR, "pre-commit-tdd-orchestrator.sh"); + +interface RunResult { + exitCode: number; + stdout: string; + stderr: string; +} + +async function runHook(env: Record<string, string>): Promise<RunResult> { + const proc = Bun.spawn(["bash", HOOK_PATH], { + env: { ...process.env, ...env }, + stdout: "pipe", + stderr: "pipe", + }); + const [stdout, stderr] = await Promise.all([ + new Response(proc.stdout).text(), + new Response(proc.stderr).text(), + ]); + const exitCode = await proc.exited; + return { exitCode, stdout, stderr }; +} + +function writeSessionJsonl(entries: unknown[]): string { + const dir = mkdtempSync(join(tmpdir(), "ste-285-tdd-")); + const file = join(dir, "session.jsonl"); + writeFileSync(file, entries.map((e) => JSON.stringify(e)).join("\n") + "\n"); + return file; +} + +describe("AC-STE-285.2 — pre-commit-tdd-orchestrator.sh: file exists with shebang", () => { + test("hook script exists at the documented path", () => { + expect(existsSync(HOOK_PATH)).toBe(true); + }); + + test("script starts with a shebang line", () => { + const content = readFileSync(HOOK_PATH, "utf-8"); + const firstLine = content.split("\n")[0] ?? ""; + expect(firstLine.startsWith("#!")).toBe(true); + }); +}); + +describe("AC-STE-285.2 — pre-commit-tdd-orchestrator.sh: happy path (tdd Skill tool_use present)", () => { + test("FR file staged AND /tdd Skill tool_use in session → exit 0", async () => { + const sessionFile = writeSessionJsonl([ + { + type: "tool_use", + name: "Skill", + input: { skill: "dev-process-toolkit:tdd" }, + }, + ]); + try { + // Simulate staged FR file via env var the hook can read. 
+ const r = await runHook({ + CLAUDE_SESSION_FILE: sessionFile, + CLAUDE_STAGED_FILES: "specs/frs/STE-285.md\nplugins/dev-process-toolkit/skills/setup/install_hooks.ts", + }); + expect(r.exitCode).toBe(0); + } finally { + rmSync(sessionFile, { force: true }); + } + }); +}); + +describe("AC-STE-285.2 — pre-commit-tdd-orchestrator.sh: miss path (FR file staged but no /tdd tool_use)", () => { + test("FR file staged + no /tdd Skill tool_use → exit non-zero + NFR-10 stderr", async () => { + const sessionFile = writeSessionJsonl([ + { type: "tool_use", name: "Bash", input: { command: "ls" } }, + ]); + try { + const r = await runHook({ + CLAUDE_SESSION_FILE: sessionFile, + CLAUDE_STAGED_FILES: "specs/frs/STE-285.md", + }); + expect(r.exitCode).not.toBe(0); + expect(r.stderr).toContain("Remedy:"); + expect(r.stderr).toContain("Context:"); + // The contract names /tdd as the required orchestrator. + expect(r.stderr).toMatch(/tdd/i); + } finally { + rmSync(sessionFile, { force: true }); + } + }); +}); + +describe("AC-STE-285.2 — pre-commit-tdd-orchestrator.sh: skip path (no FR-related files staged)", () => { + test("only non-FR files staged → exit 0 even with no /tdd tool_use", async () => { + const sessionFile = writeSessionJsonl([ + { type: "tool_use", name: "Bash", input: { command: "ls" } }, + ]); + try { + // CHANGELOG / docs-only / config-only commits don't need /tdd. 
+ const r = await runHook({ + CLAUDE_SESSION_FILE: sessionFile, + CLAUDE_STAGED_FILES: "CHANGELOG.md\nREADME.md", + }); + expect(r.exitCode).toBe(0); + } finally { + rmSync(sessionFile, { force: true }); + } + }); +}); + +describe("AC-STE-285.2 — pre-commit-tdd-orchestrator.sh: fail-open when CLAUDE_SESSION_FILE unset", () => { + test("exit 0 when CLAUDE_SESSION_FILE env var is unset", async () => { + const cleanEnv = { ...process.env }; + delete cleanEnv.CLAUDE_SESSION_FILE; + const proc = Bun.spawn(["bash", HOOK_PATH], { + env: cleanEnv, + stdout: "pipe", + stderr: "pipe", + }); + const exitCode = await proc.exited; + expect(exitCode).toBe(0); + }); +}); diff --git a/plugins/dev-process-toolkit/templates/hooks/__tests__/pre-pr-spec-review.test.ts b/plugins/dev-process-toolkit/templates/hooks/__tests__/pre-pr-spec-review.test.ts new file mode 100644 index 0000000..b8320af --- /dev/null +++ b/plugins/dev-process-toolkit/templates/hooks/__tests__/pre-pr-spec-review.test.ts @@ -0,0 +1,104 @@ +import { describe, expect, test } from "bun:test"; +import { existsSync, mkdtempSync, readFileSync, rmSync, writeFileSync } from "node:fs"; +import { tmpdir } from "node:os"; +import { join } from "node:path"; + +// STE-285 AC-STE-285.2 — `pre-pr-spec-review.sh` Process-category hook. +// +// PreToolUse Bash:`gh pr create*` → require a `Skill(/dev-process-toolkit:spec-review)` +// tool_use in current session; refuse with NFR-10 shape on miss. 
+ +const HOOKS_DIR = join( + import.meta.dir, + "..", + "process", +); +const HOOK_PATH = join(HOOKS_DIR, "pre-pr-spec-review.sh"); + +interface RunResult { + exitCode: number; + stdout: string; + stderr: string; +} + +async function runHook(env: Record<string, string>): Promise<RunResult> { + const proc = Bun.spawn(["bash", HOOK_PATH], { + env: { ...process.env, ...env }, + stdout: "pipe", + stderr: "pipe", + }); + const [stdout, stderr] = await Promise.all([ + new Response(proc.stdout).text(), + new Response(proc.stderr).text(), + ]); + const exitCode = await proc.exited; + return { exitCode, stdout, stderr }; +} + +function writeSessionJsonl(entries: unknown[]): string { + const dir = mkdtempSync(join(tmpdir(), "ste-285-spec-review-")); + const file = join(dir, "session.jsonl"); + writeFileSync(file, entries.map((e) => JSON.stringify(e)).join("\n") + "\n"); + return file; +} + +describe("AC-STE-285.2 — pre-pr-spec-review.sh: file exists with shebang", () => { + test("hook script exists at the documented path", () => { + expect(existsSync(HOOK_PATH)).toBe(true); + }); + + test("script starts with a shebang line", () => { + const content = readFileSync(HOOK_PATH, "utf-8"); + const firstLine = content.split("\n")[0] ?? 
""; + expect(firstLine.startsWith("#!")).toBe(true); + }); +}); + +describe("AC-STE-285.2 — pre-pr-spec-review.sh: happy path (spec-review Skill tool_use present)", () => { + test("exit 0 when session has a /dev-process-toolkit:spec-review Skill tool_use", async () => { + const sessionFile = writeSessionJsonl([ + { + type: "tool_use", + name: "Skill", + input: { skill: "dev-process-toolkit:spec-review" }, + }, + ]); + try { + const r = await runHook({ CLAUDE_SESSION_FILE: sessionFile }); + expect(r.exitCode).toBe(0); + } finally { + rmSync(sessionFile, { force: true }); + } + }); +}); + +describe("AC-STE-285.2 — pre-pr-spec-review.sh: miss path (no spec-review Skill tool_use)", () => { + test("exit non-zero + NFR-10-shape stderr when session has no spec-review tool_use", async () => { + const sessionFile = writeSessionJsonl([ + { type: "tool_use", name: "Bash", input: { command: "ls" } }, + ]); + try { + const r = await runHook({ CLAUDE_SESSION_FILE: sessionFile }); + expect(r.exitCode).not.toBe(0); + expect(r.stderr).toContain("Remedy:"); + expect(r.stderr).toContain("Context:"); + expect(r.stderr).toMatch(/spec-review/); + } finally { + rmSync(sessionFile, { force: true }); + } + }); +}); + +describe("AC-STE-285.2 — pre-pr-spec-review.sh: fail-open when CLAUDE_SESSION_FILE unset", () => { + test("exit 0 when CLAUDE_SESSION_FILE env var is unset", async () => { + const cleanEnv = { ...process.env }; + delete cleanEnv.CLAUDE_SESSION_FILE; + const proc = Bun.spawn(["bash", HOOK_PATH], { + env: cleanEnv, + stdout: "pipe", + stderr: "pipe", + }); + const exitCode = await proc.exited; + expect(exitCode).toBe(0); + }); +}); diff --git a/plugins/dev-process-toolkit/templates/hooks/__tests__/pre-spec-write-brainstorm-reminder.test.ts b/plugins/dev-process-toolkit/templates/hooks/__tests__/pre-spec-write-brainstorm-reminder.test.ts new file mode 100644 index 0000000..cee1110 --- /dev/null +++ 
b/plugins/dev-process-toolkit/templates/hooks/__tests__/pre-spec-write-brainstorm-reminder.test.ts @@ -0,0 +1,132 @@ +import { describe, expect, test } from "bun:test"; +import { existsSync, mkdtempSync, readFileSync, rmSync, writeFileSync } from "node:fs"; +import { tmpdir } from "node:os"; +import { join } from "node:path"; + +// STE-285 AC-STE-285.2 — `pre-spec-write-brainstorm-reminder.sh` Process hook. +// +// UserPromptSubmit on `/dev-process-toolkit:spec-write` → if no +// `Skill(/dev-process-toolkit:brainstorm)` tool_use in current session AND +// no resolved tracker ID arg, inject a stderr reminder to consider +// `/brainstorm` first. +// +// This is a reminder hook (not a refusal hook). It MAY exit non-zero to +// signal "advisory", or exit 0 with stderr output, depending on the hook +// type. The contract: when triggered, stderr must contain a brainstorm +// reminder. When not triggered (brainstorm already fired OR tracker ID +// supplied), no reminder appears. + +const HOOKS_DIR = join( + import.meta.dir, + "..", + "process", +); +const HOOK_PATH = join(HOOKS_DIR, "pre-spec-write-brainstorm-reminder.sh"); + +interface RunResult { + exitCode: number; + stdout: string; + stderr: string; +} + +async function runHook( + env: Record<string, string>, + stdinPayload: string = "", +): Promise<RunResult> { + const proc = Bun.spawn(["bash", HOOK_PATH], { + env: { ...process.env, ...env }, + stdin: stdinPayload ? 
new Response(stdinPayload).body : "ignore", + stdout: "pipe", + stderr: "pipe", + }); + const [stdout, stderr] = await Promise.all([ + new Response(proc.stdout).text(), + new Response(proc.stderr).text(), + ]); + const exitCode = await proc.exited; + return { exitCode, stdout, stderr }; +} + +function writeSessionJsonl(entries: unknown[]): string { + const dir = mkdtempSync(join(tmpdir(), "ste-285-bs-")); + const file = join(dir, "session.jsonl"); + writeFileSync(file, entries.map((e) => JSON.stringify(e)).join("\n") + "\n"); + return file; +} + +describe("AC-STE-285.2 — pre-spec-write-brainstorm-reminder.sh: file exists with shebang", () => { + test("hook script exists at the documented path", () => { + expect(existsSync(HOOK_PATH)).toBe(true); + }); + + test("script starts with a shebang line", () => { + const content = readFileSync(HOOK_PATH, "utf-8"); + const firstLine = content.split("\n")[0] ?? ""; + expect(firstLine.startsWith("#!")).toBe(true); + }); +}); + +describe("AC-STE-285.2 — pre-spec-write-brainstorm-reminder.sh: miss-path triggers reminder", () => { + test("no brainstorm tool_use AND no tracker ID arg → stderr carries a brainstorm reminder", async () => { + const sessionFile = writeSessionJsonl([ + { type: "tool_use", name: "Bash", input: { command: "ls" } }, + ]); + try { + // Greenfield invocation: bare `/dev-process-toolkit:spec-write` with + // no tracker arg. Surface the user prompt to the hook via env or + // stdin — both shapes are accepted; we test env-var form. + const r = await runHook({ + CLAUDE_SESSION_FILE: sessionFile, + CLAUDE_USER_PROMPT: "/dev-process-toolkit:spec-write", + }); + // The hook must NOT block (UserPromptSubmit hooks may inject a + // reminder via stderr but should not refuse the prompt outright). + // Reminder content: must mention brainstorm. 
+ expect(r.stderr).toMatch(/brainstorm/i); + } finally { + rmSync(sessionFile, { force: true }); + } + }); +}); + +describe("AC-STE-285.2 — pre-spec-write-brainstorm-reminder.sh: happy path (brainstorm already fired)", () => { + test("brainstorm Skill tool_use in session → no reminder injected", async () => { + const sessionFile = writeSessionJsonl([ + { + type: "tool_use", + name: "Skill", + input: { skill: "dev-process-toolkit:brainstorm" }, + }, + ]); + try { + const r = await runHook({ + CLAUDE_SESSION_FILE: sessionFile, + CLAUDE_USER_PROMPT: "/dev-process-toolkit:spec-write", + }); + // Brainstorm fired ⇒ no reminder ⇒ empty stderr. + expect(r.stderr).not.toMatch(/brainstorm.*reminder|consider.*brainstorm|run.*brainstorm/i); + } finally { + rmSync(sessionFile, { force: true }); + } + }); +}); + +describe("AC-STE-285.2 — pre-spec-write-brainstorm-reminder.sh: happy path (tracker ID arg supplied)", () => { + test("tracker ID arg in user prompt → no reminder injected (not greenfield)", async () => { + const sessionFile = writeSessionJsonl([ + { type: "tool_use", name: "Bash", input: { command: "ls" } }, + ]); + try { + const r = await runHook({ + CLAUDE_SESSION_FILE: sessionFile, + // Tracker-mode invocation with explicit ticket reference. Heuristic + // per FR: presence of a tracker-style ID (e.g., STE-123, PROJ-456) + // marks the FR as non-greenfield, so no reminder is needed. 
+ CLAUDE_USER_PROMPT: "/dev-process-toolkit:spec-write STE-285", + }); + expect(r.stderr).not.toMatch(/brainstorm.*reminder|consider.*brainstorm|run.*brainstorm/i); + } finally { + rmSync(sessionFile, { force: true }); + } + }); +}); diff --git a/plugins/dev-process-toolkit/templates/hooks/_lib/session.sh b/plugins/dev-process-toolkit/templates/hooks/_lib/session.sh new file mode 100755 index 0000000..f9dff44 --- /dev/null +++ b/plugins/dev-process-toolkit/templates/hooks/_lib/session.sh @@ -0,0 +1,92 @@ +#!/usr/bin/env bash +# Shared helper for Process-category enforcement hooks (STE-285). +# +# Reads the current Claude Code session log (JSONL stream at +# $CLAUDE_SESSION_FILE) and looks for a `Skill` tool_use entry naming a +# specific skill. Fail-open when $CLAUDE_SESSION_FILE is unset (hook invoked +# outside a Claude Code session, e.g. a bare `git commit`). +# +# Public API: +# require_skill_tool_use +# - exits 0 on hit (skill tool_use found in session log). +# - exits 0 if $CLAUDE_SESSION_FILE is unset (fail-open). +# - exits 1 with NFR-10-shape stderr on miss. +# +# has_skill_tool_use +# - returns 0 on hit, 1 on miss. Does not emit anything. Fail-open +# (returns 0) when $CLAUDE_SESSION_FILE is unset. + +# Return 0 if a Skill tool_use for $1 exists in $CLAUDE_SESSION_FILE. +# Fail-open (return 0) if $CLAUDE_SESSION_FILE is unset or missing. +has_skill_tool_use() { + local skill="$1" + if [ -z "${CLAUDE_SESSION_FILE:-}" ]; then + return 0 + fi + if [ ! -f "$CLAUDE_SESSION_FILE" ]; then + return 0 + fi + # JSONL: each line is a JSON object. Match a tool_use entry where both + # name=="Skill" AND input.skill=="$1" appear on the SAME line — two + # separate greps would false-positive on unrelated Skill tool_uses for a + # different skill plus an unrelated mention of $1 on another line. + # `-F` is fixed-string (no regex), neutralising any metacharacter content + # an attacker-influenced skill name might carry.
+ local needle_name='"name":"Skill"' + local needle_skill="\"skill\":\"${skill}\"" + if grep -F "$needle_name" "$CLAUDE_SESSION_FILE" 2>/dev/null \ + | grep -qF "$needle_skill"; then + return 0 + fi + return 1 +} + +# Emit a 3-line NFR-10-shape block to stderr. +# $1 = verdict prefix word (e.g. "Refusing", "Reminder") +# $2 = WHY — one-line reason +# $3 = HOW — one-line remediation +# $4 = skill name (for the Context tail) +# $5 = hook name (for the Context tail) +emit_nfr10_block() { + local verdict="$1" + local why="$2" + local how="$3" + local skill="$4" + local hook="$5" + { + echo "${verdict}: ${why}" + echo "Remedy: ${how}" + echo "Context: mode=hook, ticket=unbound, skill=${skill}, hook=${hook}" + } >&2 +} + +# Emit NFR-10-shape refusal to stderr (does not exit). +# $1 = skill name (e.g. dev-process-toolkit:gate-check) +# $2 = hook name (e.g. pre-commit-gate-check) +emit_nfr10_refusal() { + local skill="$1" + local hook="$2" + emit_nfr10_block \ + "Refusing" \ + "required ${skill} Skill tool_use not found in current session." \ + "run /${skill} before retrying this action." \ + "${skill}" \ + "${hook}" +} + +# require_skill_tool_use +# - Fail-open on unset $CLAUDE_SESSION_FILE. +# - Hit ⇒ exit 0. +# - Miss ⇒ emit NFR-10 stderr + exit 1. +require_skill_tool_use() { + local skill="$1" + local hook="$2" + if [ -z "${CLAUDE_SESSION_FILE:-}" ]; then + exit 0 + fi + if has_skill_tool_use "$skill"; then + exit 0 + fi + emit_nfr10_refusal "$skill" "$hook" + exit 1 +} diff --git a/plugins/dev-process-toolkit/templates/hooks/process/pre-commit-gate-check.sh b/plugins/dev-process-toolkit/templates/hooks/process/pre-commit-gate-check.sh new file mode 100755 index 0000000..41127ec --- /dev/null +++ b/plugins/dev-process-toolkit/templates/hooks/process/pre-commit-gate-check.sh @@ -0,0 +1,14 @@ +#!/usr/bin/env bash +# STE-285 AC-STE-285.2 — PreToolUse Bash:`git commit*` hook. 
+# +# Require a Skill(/dev-process-toolkit:gate-check) tool_use in the current +# Claude Code session log. Refuse with NFR-10-shape stderr on miss. +# Fail-open when $CLAUDE_SESSION_FILE is unset (non-Claude commits). + +set -u + +SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" +# shellcheck source=../_lib/session.sh +. "${SCRIPT_DIR}/../_lib/session.sh" + +require_skill_tool_use "dev-process-toolkit:gate-check" "pre-commit-gate-check" diff --git a/plugins/dev-process-toolkit/templates/hooks/process/pre-commit-tdd-orchestrator.sh b/plugins/dev-process-toolkit/templates/hooks/process/pre-commit-tdd-orchestrator.sh new file mode 100755 index 0000000..ae3fd55 --- /dev/null +++ b/plugins/dev-process-toolkit/templates/hooks/process/pre-commit-tdd-orchestrator.sh @@ -0,0 +1,64 @@ +#!/usr/bin/env bash +# STE-285 AC-STE-285.2 — PreToolUse Bash:`git commit*` hook. +# +# If FR-related files are staged (specs/frs/.md or test files), require +# a Skill(/dev-process-toolkit:tdd) tool_use in the current session. +# Refuse with NFR-10-shape stderr on miss. Byte-checkable continuation of +# STE-283's TDD Orchestrator Contract. +# +# Fail-open when $CLAUDE_SESSION_FILE is unset. + +set -u + +SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" +# shellcheck source=../_lib/session.sh +. "${SCRIPT_DIR}/../_lib/session.sh" + +# Fail-open: no Claude session ⇒ don't block the commit. +if [ -z "${CLAUDE_SESSION_FILE:-}" ]; then + exit 0 +fi + +# Resolve staged-files list. Prefer the explicit env var (test hook + future +# orchestrator integration). Fall back to `git diff --cached --name-only` +# inside a real repo. +STAGED_FILES="${CLAUDE_STAGED_FILES:-}" +if [ -z "$STAGED_FILES" ]; then + if command -v git >/dev/null 2>&1; then + # `|| true` is intentional fail-open: an unborn HEAD or detached repo + # state makes `git diff --cached` exit non-zero — we'd rather skip + # enforcement than refuse a legitimate commit on a fresh repo. 
The + # false-negative trade-off is documented in specs/frs/STE-285.md § Risks. + STAGED_FILES="$(git diff --cached --name-only 2>/dev/null || true)" + fi +fi + +# Heuristic for "FR-related staged": +# - specs/frs/.md (FR spec file itself) +# - any test file (path contains /__tests__/ OR ends in .test.ts/.spec.ts/.test.tsx) +fr_related=0 +while IFS= read -r path; do + [ -z "$path" ] && continue + case "$path" in + specs/frs/*.md) + fr_related=1 + break + ;; + *__tests__*|*.test.ts|*.test.tsx|*.spec.ts|*.spec.tsx|*.test.js|*.spec.js) + fr_related=1 + break + ;; + esac +done <<< "$STAGED_FILES" + +if [ "$fr_related" -eq 0 ]; then + # No FR-related files staged ⇒ commit is docs/config-only, skip enforcement. + exit 0 +fi + +if has_skill_tool_use "dev-process-toolkit:tdd"; then + exit 0 +fi + +emit_nfr10_refusal "dev-process-toolkit:tdd" "pre-commit-tdd-orchestrator" +exit 1 diff --git a/plugins/dev-process-toolkit/templates/hooks/process/pre-pr-spec-review.sh b/plugins/dev-process-toolkit/templates/hooks/process/pre-pr-spec-review.sh new file mode 100755 index 0000000..944bdf4 --- /dev/null +++ b/plugins/dev-process-toolkit/templates/hooks/process/pre-pr-spec-review.sh @@ -0,0 +1,14 @@ +#!/usr/bin/env bash +# STE-285 AC-STE-285.2 — PreToolUse Bash:`gh pr create*` hook. +# +# Require a Skill(/dev-process-toolkit:spec-review) tool_use in the current +# Claude Code session log. Refuse with NFR-10-shape stderr on miss. +# Fail-open when $CLAUDE_SESSION_FILE is unset. + +set -u + +SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" +# shellcheck source=../_lib/session.sh +. 
"${SCRIPT_DIR}/../_lib/session.sh" + +require_skill_tool_use "dev-process-toolkit:spec-review" "pre-pr-spec-review" diff --git a/plugins/dev-process-toolkit/templates/hooks/process/pre-spec-write-brainstorm-reminder.sh b/plugins/dev-process-toolkit/templates/hooks/process/pre-spec-write-brainstorm-reminder.sh new file mode 100755 index 0000000..317a947 --- /dev/null +++ b/plugins/dev-process-toolkit/templates/hooks/process/pre-spec-write-brainstorm-reminder.sh @@ -0,0 +1,60 @@ +#!/usr/bin/env bash +# STE-285 AC-STE-285.2 — UserPromptSubmit hook on /dev-process-toolkit:spec-write. +# +# If no Skill(/dev-process-toolkit:brainstorm) tool_use is in the current +# session AND the user prompt has no resolved tracker ID arg (greenfield +# heuristic), inject a stderr reminder to consider /brainstorm first. +# +# This is an advisory hook (UserPromptSubmit). It never refuses the prompt; +# it only writes a reminder to stderr when triggered. + +set -u + +SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" +# shellcheck source=../_lib/session.sh +. "${SCRIPT_DIR}/../_lib/session.sh" + +# Fail-open when there's no session context — we can't tell if brainstorm +# already ran. +if [ -z "${CLAUDE_SESSION_FILE:-}" ]; then + exit 0 +fi + +# Brainstorm already fired in this session ⇒ no reminder needed. +if has_skill_tool_use "dev-process-toolkit:brainstorm"; then + exit 0 +fi + +# Inspect the user prompt: env var first, fall back to stdin. +USER_PROMPT="${CLAUDE_USER_PROMPT:-}" +if [ -z "$USER_PROMPT" ] && [ ! -t 0 ]; then + USER_PROMPT="$(cat || true)" +fi + +# Only trigger on /dev-process-toolkit:spec-write invocations. +case "$USER_PROMPT" in + */dev-process-toolkit:spec-write*|/dev-process-toolkit:spec-write*) + : # matched, fall through + ;; + *) + exit 0 + ;; +esac + +# Tracker-ID heuristic: PROJECT-123 style token ⇒ not greenfield, skip. +# Match against the portion after the skill name. 
+if echo "$USER_PROMPT" | grep -Eq '[A-Z][A-Z0-9]+-[0-9]+'; then + exit 0 +fi + +# Greenfield invocation: emit advisory reminder (NFR-10 shape with +# Reminder: verdict, advisory only — does not block). +emit_nfr10_block \ + "Reminder" \ + "consider running /dev-process-toolkit:brainstorm before /spec-write for greenfield FRs." \ + "run /dev-process-toolkit:brainstorm to explore approach + tradeoffs, then re-invoke /spec-write." \ + "dev-process-toolkit:spec-write" \ + "pre-spec-write-brainstorm-reminder" + +# Advisory only — do not block. +exit 0 diff --git a/plugins/dev-process-toolkit/tests/honored-contracts-byte-checkable-layer.test.ts b/plugins/dev-process-toolkit/tests/honored-contracts-byte-checkable-layer.test.ts new file mode 100644 index 0000000..224835e --- /dev/null +++ b/plugins/dev-process-toolkit/tests/honored-contracts-byte-checkable-layer.test.ts @@ -0,0 +1,111 @@ +import { describe, expect, test } from "bun:test"; +import { existsSync, readFileSync } from "node:fs"; +import { join } from "node:path"; + +// STE-285 AC-STE-285.6 — Cancellation chain citation. +// +// STE-262/STE-270/STE-276 cancellation chain explicitly cited in this FR's +// Notes section under "Why not session-wide bundled". Preserves the decision +// chain so future agents reading this FR understand the layer choice. +// +// AC verify line: `grep -E "STE-262|STE-270|STE-276" specs/frs/STE-285.md` +// returns ≥ 3 distinct refs in Notes section. +// +// STE-285 AC-STE-285.7 — `plugins/dev-process-toolkit/docs/hooks-reference.md` +// enumerates each seeded hook. AC verify: file exists; ≥ 4 level-3 headings. 
+ +const REPO_ROOT = join(import.meta.dir, "..", "..", ".."); +const FR_PATH = join(REPO_ROOT, "specs", "frs", "STE-285.md"); +const HOOKS_REFERENCE_PATH = join( + REPO_ROOT, + "plugins", + "dev-process-toolkit", + "docs", + "hooks-reference.md", +); + +const SEEDED_HOOKS = [ + "pre-commit-gate-check", + "pre-pr-spec-review", + "pre-spec-write-brainstorm-reminder", + "pre-commit-tdd-orchestrator", +]; + +function readFr(): string { + return readFileSync(FR_PATH, "utf-8"); +} + +/** + * Slice the Notes section from the FR. Heading is `## Notes`; section + * extends to EOF or the next `## ` heading. + */ +function notesSection(body: string): string { + const notesIdx = body.indexOf("## Notes"); + if (notesIdx === -1) { + return ""; + } + const next = body.indexOf("\n## ", notesIdx + 1); + return next === -1 ? body.slice(notesIdx) : body.slice(notesIdx, next); +} + +describe("AC-STE-285.6 — cancellation chain cited in Notes section", () => { + test("Notes section exists in specs/frs/STE-285.md", () => { + const notes = notesSection(readFr()); + expect(notes.length).toBeGreaterThan(0); + }); + + test("Notes section contains ≥ 3 distinct refs to STE-262, STE-270, STE-276", () => { + const notes = notesSection(readFr()); + const refs = ["STE-262", "STE-270", "STE-276"]; + const hits = refs.filter((r) => notes.includes(r)); + expect(hits.length).toBeGreaterThanOrEqual(3); + }); + + test('Notes section carries the "Why not session-wide bundled" framing', () => { + const notes = notesSection(readFr()); + // The FR's Notes carries a "Why not session-wide bundled" subhead per + // the AC text. Accept exact phrase or close paraphrases that preserve + // the load-bearing fragments ("session-wide" + "bundled"). 
+ expect(notes).toMatch(/session-wide.*bundled|bundled.*session-wide|not\s+session-wide\s+bundled/i); + }); +}); + +describe("AC-STE-285.7 — hooks-reference.md catalog enumerates each seeded hook", () => { + test("plugins/dev-process-toolkit/docs/hooks-reference.md exists", () => { + expect(existsSync(HOOKS_REFERENCE_PATH)).toBe(true); + }); + + test("hooks-reference.md has ≥ 4 level-3 (### ) headings (one section per seeded hook)", () => { + const body = readFileSync(HOOKS_REFERENCE_PATH, "utf-8"); + const h3Count = body + .split("\n") + .filter((line) => /^### /.test(line)).length; + expect(h3Count).toBeGreaterThanOrEqual(4); + }); + + test("hooks-reference.md names each of the four seeded hooks", () => { + const body = readFileSync(HOOKS_REFERENCE_PATH, "utf-8"); + for (const hook of SEEDED_HOOKS) { + expect(body).toContain(hook); + } + }); + + test("each seeded hook section documents event, matcher, requirement, refusal shape, override", () => { + const body = readFileSync(HOOKS_REFERENCE_PATH, "utf-8"); + // The catalog shape: each `### ` entry must carry the five canonical + // labels per AC.7's enumeration. We grep for each label across the file + // and require ≥ 4 hits (one per seeded hook). 
+ const requiredLabels = [ + /event[:\s]/i, + /matcher[:\s]/i, + /requirement[:\s]/i, + /(refusal|NFR-10|miss)/i, + /override/i, + ]; + for (const label of requiredLabels) { + const lines = body.split("\n"); + const hits = lines.filter((l) => label.test(l)).length; + expect(hits).toBeGreaterThanOrEqual(4); + } + }); +}); diff --git a/plugins/dev-process-toolkit/tests/hooks-capability-rows.test.ts b/plugins/dev-process-toolkit/tests/hooks-capability-rows.test.ts new file mode 100644 index 0000000..304e666 --- /dev/null +++ b/plugins/dev-process-toolkit/tests/hooks-capability-rows.test.ts @@ -0,0 +1,53 @@ +import { describe, expect, test } from "bun:test"; +import { readFileSync } from "node:fs"; +import { join } from "node:path"; + +// STE-285 AC-STE-285.4 — `/setup` final summary report carries +// `hooks_installed` (or `hooks_skipped`) capability rows with the static +// plain-language map verbatim. Doc-conformance grep. + +const SKILL_PATH = join( + import.meta.dir, + "..", + "skills", + "setup", + "SKILL.md", +); + +function read(): string { + return readFileSync(SKILL_PATH, "utf-8"); +} + +describe("AC-STE-285.4 — hooks_installed capability row documented in /setup SKILL.md", () => { + test("SKILL.md mentions `hooks_installed` key at least once", () => { + const body = read(); + expect(body).toContain("hooks_installed"); + }); + + test("hooks_installed row text names the toggle-off remedy via .claude/settings.json", () => { + const body = read(); + // The static plain-language map text: + // "Installed N opt-in toolkit-contract enforcement hook(s): + // — toggle off any hook by editing .claude/settings.json" + // We check the load-bearing fragments rather than the exact string so + // minor wording adjustments don't churn the test. 
+ expect(body).toMatch(/Installed.*opt-in.*toolkit-contract.*enforcement.*hook/i); + expect(body).toMatch(/toggle off.*\.claude\/settings\.json/i); + }); +}); + +describe("AC-STE-285.4 — hooks_skipped capability row documented in /setup SKILL.md", () => { + test("SKILL.md mentions `hooks_skipped` key at least once", () => { + const body = read(); + expect(body).toContain("hooks_skipped"); + }); + + test("hooks_skipped row text names the /setup --hooks remedy", () => { + const body = read(); + // The static plain-language map text: + // "User declined opt-in hooks during /setup — run /setup --hooks to + // reconsider, or edit .claude/settings.json manually" + expect(body).toMatch(/declined.*opt-in.*hook/i); + expect(body).toMatch(/\/setup\s+--hooks/); + }); +}); From 68ea88acb14906e2f061e7dc0908fa608da9e6bc Mon Sep 17 00:00:00 2001 From: nesquikm Date: Wed, 13 May 2026 18:36:46 +0400 Subject: [PATCH 06/37] chore(specs): archive M71 (STE-283 + STE-285) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Atomic archival: move STE-283 + STE-285 FRs and the M71 plan file into archive/, flip status: active → archived + set archived_at timestamps, mark plan ACs [x], append shipped-AC traceability rows for both FRs to specs/requirements.md § 6, attach Implementation notes (advisory: STE-285 session.sh JSONL needle assumes compact serialization — documented in the FR's Risks table). 
Refs: M71 --- specs/frs/{ => archive}/STE-283.md | 8 ++++++-- specs/frs/{ => archive}/STE-285.md | 8 ++++++-- specs/plan/{ => archive}/M71.md | 8 ++++---- specs/requirements.md | 2 ++ 4 files changed, 18 insertions(+), 8 deletions(-) rename specs/frs/{ => archive}/STE-283.md (99%) rename specs/frs/{ => archive}/STE-285.md (96%) rename specs/plan/{ => archive}/M71.md (96%) diff --git a/specs/frs/STE-283.md b/specs/frs/archive/STE-283.md similarity index 99% rename from specs/frs/STE-283.md rename to specs/frs/archive/STE-283.md index f322f67..d19267a 100644 --- a/specs/frs/STE-283.md +++ b/specs/frs/archive/STE-283.md @@ -1,8 +1,8 @@ --- title: Honored Contracts enforcement — /implement → /tdd Contract block + Rationalization table + catalog milestone: M71 -status: active -archived_at: null +status: archived +archived_at: 2026-05-13T14:35:15Z tracker: linear: STE-283 created_at: 2026-05-13T10:28:02Z @@ -108,3 +108,7 @@ Deliverables are prose edits. No runtime behavior to assert. Test strategy: `bun - Escalation path if falsified: evidence-based gate (`tdd_orchestrator_invocation_evidence` /gate-check probe) → hard mechanic (STE-225-style first-action contract for Phase 2 step 8). - Parked follow-up: tracker ↔ local FR/milestone sync (a separate future-allocated FR after this one ships, candidate for M70 since theme fits runtime-emission) — the meta-bug behind the 2026-05-13 partial-scan trap. - Brainstorm session: 2026-05-13 (maximalist 1+2+3 design selected with explicit acknowledgement of STE-220→STE-270 precedent risk). + +## Implementation notes + +No advisory notes. 
diff --git a/specs/frs/STE-285.md b/specs/frs/archive/STE-285.md similarity index 96% rename from specs/frs/STE-285.md rename to specs/frs/archive/STE-285.md index 44cd540..6a1bd57 100644 --- a/specs/frs/STE-285.md +++ b/specs/frs/archive/STE-285.md @@ -1,8 +1,8 @@ --- title: Honored Contracts byte-checkable layer — opt-in toolkit-contract enforcement hooks via /setup milestone: M71 -status: active -archived_at: null +status: archived +archived_at: 2026-05-13T14:35:15Z tracker: linear: STE-285 created_at: 2026-05-13T13:29:22Z @@ -140,3 +140,7 @@ Test strategy: unit tests per hook script (mocked `$CLAUDE_SESSION_FILE`) + inte - **Update propagation.** Plugin updates the hook scripts → user's next session picks them up (no `/setup` re-run needed). Plugin-owned via `${CLAUDE_PLUGIN_ROOT}` substitution; same pattern as STE-133's commit-msg hook. - **STE-283 → STE-285 relationship.** STE-283 prose + STE-285 hooks together form the M71 Honored Contracts enforcement bundle. Independent ship gates — STE-285 doesn't depend on STE-283 shipping first, but the `docs/honored-contracts.md` catalog from STE-283 cross-references the hook IDs from STE-285. If both ship in M71's release commit, the cross-reference is byte-clean. - **Brainstorm session:** 2026-05-13 (Approach 1 = single-prompt all-off menu; Process-only after refinement from "all three categories"; M71 milestone fit). + +## Implementation notes + +- session.sh:35-36 — JSONL grep needle assumes compact JSON serialization (no space after colon). Reason: Claude Code emits compact JSONL via JSON.stringify; the test fixtures match. The broader format-fragility risk is already documented severity:medium in this FR's Risks table. A defensive regex would require dropping `-F` fixed-string mode without addressing the broader format-fragility class. 
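Reviewer sketch (not part of the patch): the compact-serialization assumption in the implementation note above can be illustrated in TypeScript, the language of the test suite. The helper name `lineHasSkillToolUse` and the sample entry are hypothetical; the two needles mirror the fixed strings that `session.sh` passes to `grep -F`, and the per-line check mirrors the `grep -F | grep -qF` pipeline only in spirit.

```typescript
// Hypothetical illustration of the fixed-string needles used by session.sh.
// Both needles must appear on the same JSONL line for a hit.
const NEEDLE_NAME = '"name":"Skill"';

function lineHasSkillToolUse(line: string, skill: string): boolean {
  const needleSkill = `"skill":"${skill}"`;
  return line.includes(NEEDLE_NAME) && line.includes(needleSkill);
}

// Sample entry shaped like the fixtures in the hook tests.
const entry = {
  type: "tool_use",
  name: "Skill",
  input: { skill: "dev-process-toolkit:tdd" },
};

// Compact form: no whitespace after colons, so both needles match.
const compact: string = JSON.stringify(entry);

// Spaced form: same data on one line, but with a space after each colon,
// which the fixed-string needles do not tolerate.
const spaced: string = JSON.stringify(entry, null, 1).replace(/\n\s*/g, " ");

console.log(lineHasSkillToolUse(compact, "dev-process-toolkit:tdd"));
console.log(lineHasSkillToolUse(spaced, "dev-process-toolkit:tdd"));
```

Under this sketch the compact line is a hit and the spaced line is a miss, so a serializer that started emitting spaced JSON would make the refusal hooks report a miss even when the skill had run. That is the format-fragility risk the FR already records.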
diff --git a/specs/plan/M71.md b/specs/plan/archive/M71.md similarity index 96% rename from specs/plan/M71.md rename to specs/plan/archive/M71.md index c4c4fa8..9b74e4a 100644 --- a/specs/plan/M71.md +++ b/specs/plan/archive/M71.md @@ -1,6 +1,6 @@ --- -status: active -archived_at: null +status: archived +archived_at: 2026-05-13T14:35:15Z kind: feature --- @@ -21,9 +21,9 @@ Intentionally separated from M70's runtime-emission theme (STE-280/281/282). Tri ## In Scope -- [ ] STE-283 — Honored Contracts enforcement: /implement → /tdd Contract block + Rationalization table + catalog +- [x] STE-283 — Honored Contracts enforcement: /implement → /tdd Contract block + Rationalization table + catalog - verify: 7 ACs covering Contract block presence, Rationalization table, catalog file existence + entries, precedent FR citations, cross-references, residual-risk acknowledgement; tests in `plugins/dev-process-toolkit/tests/honored-contracts-implement-tdd.test.ts` -- [ ] STE-285 — Honored Contracts byte-checkable layer: opt-in toolkit-contract enforcement hooks via /setup +- [x] STE-285 — Honored Contracts byte-checkable layer: opt-in toolkit-contract enforcement hooks via /setup - verify: 7 ACs covering /setup menu prompt, ≥ 4 seeded process hooks (pre-commit-gate-check, pre-pr-spec-review, pre-spec-write-brainstorm-reminder, pre-commit-tdd-orchestrator), settings.json merge semantics (STE-133-style idempotent + diff-and-prompt), capability rows, `/setup --hooks` flag, STE-262/270/276 supersession framing, new `docs/hooks-reference.md`; tests in `plugins/dev-process-toolkit/{skills/setup,templates/hooks}/__tests__/`. 
## Out of Scope (deferred) diff --git a/specs/requirements.md b/specs/requirements.md index edfcb66..29e3cef 100644 --- a/specs/requirements.md +++ b/specs/requirements.md @@ -327,4 +327,6 @@ Violations are review-blocking: a canonical reference doc that includes LLM-inve | AC-STE-238.1..8 | plugins/dev-process-toolkit/adapters/_shared/src/closing_summary_capability_keys.ts, plugins/dev-process-toolkit/adapters/_shared/src/tracker_probe_skip_reason.ts, plugins/dev-process-toolkit/skills/spec-write/SKILL.md, plugins/dev-process-toolkit/skills/gate-check/SKILL.md, .claude/skills/smoke-test/SKILL.md | plugins/dev-process-toolkit/adapters/_shared/src/tracker_probe_skip_reason.test.ts, plugins/dev-process-toolkit/tests/gate-check-closing-summary-capability-keys.test.ts | | AC-STE-251.1..5 | plugins/dev-process-toolkit/adapters/_shared/src/requires_input.ts, plugins/dev-process-toolkit/adapters/_shared/src/socratic_first_turn_post_hoc_drift.ts, plugins/dev-process-toolkit/skills/setup/SKILL.md, plugins/dev-process-toolkit/skills/brainstorm/SKILL.md, plugins/dev-process-toolkit/skills/spec-write/SKILL.md, plugins/dev-process-toolkit/skills/report-issue/SKILL.md, plugins/dev-process-toolkit/skills/gate-check/SKILL.md | plugins/dev-process-toolkit/adapters/_shared/src/requires_input.test.ts, plugins/dev-process-toolkit/adapters/_shared/src/socratic_first_turn_post_hoc_drift.test.ts, plugins/dev-process-toolkit/adapters/_shared/src/socratic_first_turn_replay.test.ts, plugins/dev-process-toolkit/tests/gate-check-socratic-first-turn-post-hoc-drift.test.ts | | AC-STE-252.1..6 | .claude/settings.json, .claude/skills/conformance-loop/SKILL.md, .claude/skills/smoke-test/SKILL.md, plugins/dev-process-toolkit/adapters/_shared/src/conformance_loop_bypass_removed.ts, plugins/dev-process-toolkit/adapters/_shared/src/markdown_fences.ts, plugins/dev-process-toolkit/adapters/_shared/src/auto_approve_marker.ts, plugins/dev-process-toolkit/skills/gate-check/SKILL.md | 
plugins/dev-process-toolkit/tests/gate-check-conformance-loop-bypass-removed.test.ts, plugins/dev-process-toolkit/tests/permissions-allow-tracked.test.ts, plugins/dev-process-toolkit/tests/conformance-loop-permissions-pre-flight.test.ts (AC-STE-252.5 deferred manual smoke per feedback_smoke_post_ship_retroactive) | +| AC-STE-283.1..7 | plugins/dev-process-toolkit/skills/implement/SKILL.md, plugins/dev-process-toolkit/docs/honored-contracts.md | plugins/dev-process-toolkit/tests/honored-contracts-implement-tdd.test.ts | +| AC-STE-285.1..7 | plugins/dev-process-toolkit/skills/setup/SKILL.md, plugins/dev-process-toolkit/skills/setup/install_hooks.ts, plugins/dev-process-toolkit/docs/hooks-reference.md, plugins/dev-process-toolkit/templates/hooks/_lib/session.sh, plugins/dev-process-toolkit/templates/hooks/process/pre-commit-gate-check.sh, plugins/dev-process-toolkit/templates/hooks/process/pre-pr-spec-review.sh, plugins/dev-process-toolkit/templates/hooks/process/pre-spec-write-brainstorm-reminder.sh, plugins/dev-process-toolkit/templates/hooks/process/pre-commit-tdd-orchestrator.sh | plugins/dev-process-toolkit/skills/setup/__tests__/hooks_menu_prompt.test.ts, plugins/dev-process-toolkit/skills/setup/__tests__/hooks_merge_settings.test.ts, plugins/dev-process-toolkit/skills/setup/__tests__/setup_hooks_flag.test.ts, plugins/dev-process-toolkit/templates/hooks/__tests__/pre-commit-gate-check.test.ts, plugins/dev-process-toolkit/templates/hooks/__tests__/pre-pr-spec-review.test.ts, plugins/dev-process-toolkit/templates/hooks/__tests__/pre-spec-write-brainstorm-reminder.test.ts, plugins/dev-process-toolkit/templates/hooks/__tests__/pre-commit-tdd-orchestrator.test.ts, plugins/dev-process-toolkit/tests/honored-contracts-byte-checkable-layer.test.ts, plugins/dev-process-toolkit/tests/hooks-capability-rows.test.ts | From b01757f28b1e75487436ab92052e24833ce90c6e Mon Sep 17 00:00:00 2001 From: nesquikm Date: Wed, 13 May 2026 18:38:15 +0400 Subject: [PATCH 07/37] 
test(byte-checkable): resolve FR file at active OR archive path
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

AC-STE-285.6 doc-conformance test hardcoded `specs/frs/STE-285.md` — breaks after Phase 4 archival relocates the FR. Fall back to `specs/frs/archive/STE-285.md` via existsSync so the assertion holds for the same spec content across the active → archived transition.

Refs: STE-285
---
 .../tests/honored-contracts-byte-checkable-layer.test.ts | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/plugins/dev-process-toolkit/tests/honored-contracts-byte-checkable-layer.test.ts b/plugins/dev-process-toolkit/tests/honored-contracts-byte-checkable-layer.test.ts
index 224835e..7100e10 100644
--- a/plugins/dev-process-toolkit/tests/honored-contracts-byte-checkable-layer.test.ts
+++ b/plugins/dev-process-toolkit/tests/honored-contracts-byte-checkable-layer.test.ts
@@ -15,7 +15,12 @@ import { join } from "node:path";
 // enumerates each seeded hook. AC verify: file exists; ≥ 4 level-3 headings.
 
 const REPO_ROOT = join(import.meta.dir, "..", "..", "..");
-const FR_PATH = join(REPO_ROOT, "specs", "frs", "STE-285.md");
+// Resolve the FR file at either the active or archive path — the doc
+// conformance assertion holds for the same spec content after Phase 4
+// milestone archival relocates the file.
+const FR_ACTIVE = join(REPO_ROOT, "specs", "frs", "STE-285.md");
+const FR_ARCHIVE = join(REPO_ROOT, "specs", "frs", "archive", "STE-285.md");
+const FR_PATH = existsSync(FR_ACTIVE) ? FR_ACTIVE : FR_ARCHIVE;
 const HOOKS_REFERENCE_PATH = join(
   REPO_ROOT,
   "plugins",
From 97dfed1c54954a47d37d2b1ed58fed803d07058b Mon Sep 17 00:00:00 2001
From: nesquikm
Date: Wed, 13 May 2026 18:43:57 +0400
Subject: [PATCH 08/37] chore(release): v2.21.0
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

M71 — Honored Contracts enforcement: prose layer (STE-283) + opt-in byte-checkable hook layer (STE-285).

Release: v2.21.0 "Honored"
Refs: M71
---
 .claude-plugin/marketplace.json | 2 +-
 CHANGELOG.md | 11 +++++++++++
 README.md | 2 +-
 .../dev-process-toolkit/.claude-plugin/plugin.json | 2 +-
 specs/requirements.md | 2 +-
 5 files changed, 15 insertions(+), 4 deletions(-)

diff --git a/.claude-plugin/marketplace.json b/.claude-plugin/marketplace.json
index 4e9cf75..09315d5 100644
--- a/.claude-plugin/marketplace.json
+++ b/.claude-plugin/marketplace.json
@@ -9,7 +9,7 @@
   "plugins": [
     {
       "name": "dev-process-toolkit",
-      "version": "2.20.0",
+      "version": "2.21.0",
       "source": "./plugins/dev-process-toolkit",
       "description": "Portable skills, agents, and templates that add SDD and TDD workflows to any Claude Code project.",
       "category": "development-workflow",
diff --git a/CHANGELOG.md b/CHANGELOG.md
index 571fb89..441dd18 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -6,6 +6,17 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
 
 > **Update discipline:** this file must be updated on every version bump. See the Release Checklist in `CLAUDE.md` for the required steps.
 
+## [2.21.0] — 2026-05-13 — "Honored"
+
+M71 ships Honored Contracts enforcement — the toolkit's first explicit anti-falsification design after the STE-220→STE-270 6-FR chain. Two layers (prose + byte-checkable) together form a complete enforcement bundle: STE-283 makes the `/implement → /tdd` orchestrator contract maximally byte-checkable-by-a-human-reviewer (Contract callout + Rationalization Prevention table + new `docs/honored-contracts.md` catalog); STE-285 adds the per-project opt-in hook layer (`/setup --hooks` menu, 4 seeded Process-category hooks under `templates/hooks/process/`, settings.json merge helper, user manual). Per-project scope supersedes the session-wide-bundling rejections (STE-262/STE-270/STE-276).
+
+### Added
+
+- **STE-283 — TDD Orchestrator Contract prose layer.** Labeled `TDD Orchestrator Contract` callout at `/implement` Phase 2 step 8 names the violation (`Inline TDD Antipattern`) and the auditable evidence shape (`N Skill(/dev-process-toolkit:tdd ) tool_use entries where N = FR count in milestone scope`). Inline Rationalization Prevention table preempts three documented excuses (cost, no-N-times-pattern, shipping-over-fidelity). New `plugins/dev-process-toolkit/docs/honored-contracts.md` catalog enumerates cross-skill mandates under a uniform shape (Mandate / Violation name / Auditable evidence / Precedent FRs) with three seeded entries: `/implement → /tdd` (primary), `/spec-write → spec-research`, `/brainstorm → AskUserQuestion-first`. Operator-acknowledged residual risk per the STE-220→STE-270 chain; escalation path on falsification = evidence-based gate / hard mechanic. (STE-283)
+- **STE-285 — Opt-in byte-checkable contract hooks via /setup.** `/setup` adds a multi-select `AskUserQuestion` step (all hooks default off) after stack detection; `/setup --hooks` re-runs only the hooks step (idempotent, pre-checked menu via `readInstalledHookNames`). Four seeded Process-category hook scripts under `plugins/dev-process-toolkit/templates/hooks/process/`: `pre-commit-gate-check.sh`, `pre-pr-spec-review.sh`, `pre-spec-write-brainstorm-reminder.sh`, `pre-commit-tdd-orchestrator.sh` (the last one is the byte-checkable continuation of STE-283's TDD Orchestrator Contract). Shared `_lib/session.sh` helper uses a single-line atomic `grep -F | grep -F` pipeline against `$CLAUDE_SESSION_FILE` and emits a 3-line NFR-10 stderr block (`Refusing: / Remedy: / Context:`). New `install_hooks.ts` helper handles `.claude/settings.json` key-level merge with idempotency on same-matcher-same-command and conflict-surfacing on differing command (per STE-133); per-hook event/matcher mapping via `HOOK_REGISTRATIONS` (PreToolUse Bash for three hooks, UserPromptSubmit `*` for the brainstorm reminder); malformed JSON surfaces as `SyntaxError` rather than silent overwrite. New `docs/hooks-reference.md` user manual enumerates each hook (name, event, matcher, requirement, NFR-10 refusal shape, override pattern). Per-project opt-in design supersedes the STE-262/STE-270/STE-276 session-wide-bundling rejections — bounded blast radius, user-consented per project. (STE-285)
+
+Total test count at release: 2293 tests, 0 failures, 0 errors.
+
 ## [2.20.0] — 2026-05-11 — "Byte-Strict"
 
 M69 ships the third attempt at the `/spec-write` marker gate (STE-213/M55 prose-only, STE-220/M56 prose-only, STE-226/M59 marker-mechanism). Each prior attempt was falsified by subsequent smoke runs: the LLM imputed the marker from the harness's autonomous-mode reminder ("work without stopping") or from `claude -p` non-interactive inference, conflating prose context with the byte-checkable token. STE-262 introduces a runtime byte-grep helper that turns the gate decision into a deterministic Bash precheck — the SKILL.md gate sites invoke a CLI shim, branch strictly on its stdout (`PRESENT` / `ABSENT`), and never infer at the LLM layer. STE-270 hardens the parallel STE-251 first-turn contract for `/spec-write` (where `/conformance-loop` iter-1 on 2026-05-10 caught a new violation despite the prose mandate). Both FRs apply the recurring lesson: byte-checkable / structural enforcement at the source-level + runtime layer, not prose-only mandates.
diff --git a/README.md b/README.md
index ecfddf8..301f8d6 100644
--- a/README.md
+++ b/README.md
@@ -155,7 +155,7 @@ dev-process-toolkit/
 
 ## Release Notes
 
-See [`CHANGELOG.md`](./CHANGELOG.md) for the full release history. Latest: **v2.20.0 — "Byte-Strict"** (M69 — /spec-write marker-strict gate enforcement. STE-262 turns the marker detection from LLM-inferred to byte-checkable via a runtime Bash precheck — `/spec-write` gate sites now invoke `check_marker_runtime.ts` and branch strictly on its stdout (`PRESENT` / `ABSENT`); marker-PRESENT path byte-identical, marker-ABSENT + non-tty changes from silent commit (the BUG) to loud `RequiresInputRefusedError`. STE-270 locks the parallel first-turn contract drift surface with a regression fixture (2026-05-10 violation transcript) + source-level probe `spec_write_first_turn_drift_scan`. Third attempt at this gate — STE-213/M55 + STE-220/M56 were prose-only; STE-226/M59 added the marker mechanism but inference paths remained. Byte-grep at runtime closes the loop. Test count: 2235.
+See [`CHANGELOG.md`](./CHANGELOG.md) for the full release history. Latest: **v2.21.0 — "Honored"** (M71 — Honored Contracts enforcement. STE-283 ships the prose layer: a `TDD Orchestrator Contract` callout at `/implement` Phase 2 step 8 naming the violation (`Inline TDD Antipattern`) and the auditable evidence shape, plus a Rationalization Prevention table and a new `docs/honored-contracts.md` catalog enumerating cross-skill mandates under a uniform shape. STE-285 ships the byte-checkable opt-in layer via `/setup --hooks` — a multi-select menu of Process-category enforcement hooks (all default off), 4 seeded scripts (`pre-commit-gate-check`, `pre-pr-spec-review`, `pre-spec-write-brainstorm-reminder`, `pre-commit-tdd-orchestrator`), an `install_hooks.ts` settings.json merge helper (idempotent on same-matcher-same-command, conflict-surfacing on diff, malformed-JSON surfaces as SyntaxError), and a `docs/hooks-reference.md` user manual. Per-project scope supersedes the session-wide-bundling rejections (STE-262/STE-270/STE-276) by bounding the blast radius to projects that opted in. Test count: 2293.
 
 ## Core Philosophy
 
diff --git a/plugins/dev-process-toolkit/.claude-plugin/plugin.json b/plugins/dev-process-toolkit/.claude-plugin/plugin.json
index 07e2ba5..29944d2 100644
--- a/plugins/dev-process-toolkit/.claude-plugin/plugin.json
+++ b/plugins/dev-process-toolkit/.claude-plugin/plugin.json
@@ -1,6 +1,6 @@
 {
   "name": "dev-process-toolkit",
-  "version": "2.20.0",
+  "version": "2.21.0",
   "description": "Portable skills, agents, and templates that add Spec-Driven Development and TDD workflows to any Claude Code project.",
   "author": {
     "name": "nesquikm",
diff --git a/specs/requirements.md b/specs/requirements.md
index 29e3cef..d72c31a 100644
--- a/specs/requirements.md
+++ b/specs/requirements.md
@@ -4,7 +4,7 @@
 
 **Project:** Dev Process Toolkit — a Claude Code plugin that bootstraps Spec-Driven Development (SDD) + TDD workflows into any project.
 **Users:** Claude Code plugin users running SDD/TDD workflows.
-**Latest shipped release:** **v2.20.0 ("Byte-Strict")**.
+**Latest shipped release:** **v2.21.0 ("Honored")**.
 
 ### Shipped milestones
 
From eb433ace49f19fa527b1bf4f9c6047dcd639b489 Mon Sep 17 00:00:00 2001
From: nesquikm
Date: Wed, 13 May 2026 19:30:24 +0400
Subject: [PATCH 09/37] chore(specs): write FR STE-286
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

M72 (STE-285 hooks E2E smoke coverage) — first and only FR. Adds /smoke-test fixture group 8 (STE-285 hooks runtime regression) covering install verification + runtime refusal probe for all 4 seeded hooks via one shared marker-stripped claude -p child spawn with 4 sequenced scenarios + $CLAUDE_SESSION_FILE rotation. Bundles a small STE-285 scope expansion: /setup --hooks non-interactive preselect flag (--hooks=all | --hooks=). Symmetric to STE-225 fixture group 7 (TDD orchestrator forks runtime, per STE-231).

Refs: STE-286
---
 specs/frs/STE-286.md | 113 +++++++++++++++++++++++++++++++++++++++++++
 specs/plan/M72.md | 39 +++++++++++++++
 2 files changed, 152 insertions(+)
 create mode 100644 specs/frs/STE-286.md
 create mode 100644 specs/plan/M72.md

diff --git a/specs/frs/STE-286.md b/specs/frs/STE-286.md
new file mode 100644
index 0000000..4660f8f
--- /dev/null
+++ b/specs/frs/STE-286.md
@@ -0,0 +1,113 @@
+---
+title: Smoke fixture group 8 — STE-285 hooks runtime regression
+milestone: M72
+status: active
+archived_at: null
+tracker:
+  linear: STE-286
+created_at: 2026-05-13T15:26:43Z
+---
+
+# STE-286: Smoke fixture group 8 — STE-285 hooks runtime regression {#STE-286}
+
+## Requirement
+
+STE-285 (M71) shipped 4 opt-in Process-category hooks (`pre-commit-gate-check`, `pre-pr-spec-review`, `pre-spec-write-brainstorm-reminder`, `pre-commit-tdd-orchestrator`) with per-hook unit tests under `templates/hooks/__tests__/`. Per-hook unit tests cover script-level behavior, but the FULL pipeline — `settings.json` write via `installHooks` → Claude Code harness reads `.claude/settings.json` → PreToolUse hook fires on Bash → script reads `$CLAUDE_SESSION_FILE` → NFR-10 refusal emitted to stderr → harness surfaces refusal to the model — is **not** exercised at E2E level. STE-283's symmetric contract (TDD Orchestrator) already has runtime evidence via `/smoke-test` fixture group 7 (`STE-225 TDD orchestrator forks runtime`, per STE-231). STE-285 lacks the symmetric probe.
+
+This FR adds **`/smoke-test` Phase 2.X fixture group 8** — *STE-285 hooks runtime regression* — that exercises install + runtime refusal for all 4 hooks via one shared marker-stripped `claude -p` child spawn with 4 sequenced scenarios + `$CLAUDE_SESSION_FILE` rotation. Bundles a small STE-285 scope expansion: `/setup --hooks` accepts a non-interactive preselect flag (`--hooks=all` and `--hooks=`) so the smoke driver can install hooks without hitting the multi-select default-off.
+
+Operator-acknowledged residual risk: same as STE-285 itself — Claude Code's `$CLAUDE_SESSION_FILE` format is internal and a format change breaks every hook. The shared `_lib/session.sh` helper is the single insulation point.
+
+## Acceptance Criteria
+
+- AC-STE-286.1: `/setup --hooks` accepts a non-interactive preselect flag — `--hooks=all` selects all 4 seeded hooks; `--hooks=` selects the listed subset (validates each name against `HOOK_REGISTRATIONS`, refuses unknown names with NFR-10 shape). Both forms bypass `AskUserQuestion` and call `installHooks(...)` directly with the resolved list. Idempotent re-run preserves the existing preselected entries (no-op).
+  - verify: `bun test plugins/dev-process-toolkit/skills/setup/__tests__/setup_hooks_preselect.test.ts` PASSES — covers `--hooks=all`, `--hooks=pre-commit-gate-check,pre-pr-spec-review`, `--hooks=unknown-hook` (refusal), and idempotent re-run.
+
+- AC-STE-286.2: `.claude/skills/smoke-test/SKILL.md` carries a new Phase 2.X fixture group 8 (`STE-285 hooks runtime regression`) positioned after fixture group 7 (`STE-225 TDD orchestrator forks runtime`). Block names the diagnostic shape `STE-285 runtime regression: ` and references this FR.
+  - verify: `grep -F "STE-285 hooks runtime regression" .claude/skills/smoke-test/SKILL.md` ≥ 1; `grep -F "STE-285 runtime regression:" .claude/skills/smoke-test/SKILL.md` ≥ 1; `grep -F "STE-286" .claude/skills/smoke-test/SKILL.md` ≥ 1.
+
+- AC-STE-286.3: Install verification — after `claude -p /dev-process-toolkit:setup --hooks=all` runs in the test project, fixture group 8 reads `/.claude/settings.json` and asserts 4 hook entries land with the correct event/matcher pairs per `HOOK_REGISTRATIONS` (3 entries under `PreToolUse: Bash`, 1 under `UserPromptSubmit: *`).
+  - verify: per-scenario assertion shapes documented in fixture group 8's SKILL.md prose; `grep -F '"matcher": "Bash"' /.claude/settings.json` ≥ 1 AND `grep -F '"matcher": "*"' /.claude/settings.json` ≥ 1.
+
+- AC-STE-286.4: Runtime refusal probe — fixture group 8 spawns one marker-stripped `claude -p` child against the test project, with `$CLAUDE_SESSION_FILE` rotated to a fresh empty-JSONL fixture before each of 4 scenarios. The child runs a tight, single-action prompt per scenario (commit / commit-with-FR / `gh pr create` / `/spec-write` invocation). Stdout/stderr captured to `/tmp/dpt-smoke--hooks-runtime-.log`.
+  - verify: prose check on `.claude/skills/smoke-test/SKILL.md` fixture-group-8 block + driver invocation pattern; smoke run produces 4 log files with the documented names.
+
+- AC-STE-286.5: Per-scenario NFR-10 assertions:
+  - Scenario 1 (pre-commit-gate-check): `git commit` attempt with no Skill(/gate-check) ⇒ exit ≠ 0 + stderr contains `Refusing:` + `dev-process-toolkit:gate-check` + `hook=pre-commit-gate-check`.
+  - Scenario 2 (pre-commit-tdd-orchestrator): stage `specs/frs/.md` + `git commit` with no Skill(/tdd) ⇒ exit ≠ 0 + stderr contains `Refusing:` + `dev-process-toolkit:tdd` + `hook=pre-commit-tdd-orchestrator`.
+  - Scenario 3 (pre-pr-spec-review): `gh pr create` attempt with no Skill(/spec-review) ⇒ exit ≠ 0 + stderr contains `Refusing:` + `dev-process-toolkit:spec-review` + `hook=pre-pr-spec-review`.
+  - Scenario 4 (pre-spec-write-brainstorm-reminder): `/dev-process-toolkit:spec-write` invocation with no Skill(/brainstorm) and no tracker arg ⇒ exit = 0 (advisory) + stderr contains `Reminder:` + `dev-process-toolkit:brainstorm` + `hook=pre-spec-write-brainstorm-reminder`.
+  - verify: 4 `grep -F` calls on the per-scenario log files, one per scenario.
+
+- AC-STE-286.6: On any scenario regression, fixture group 8 appends to `/tmp/dpt-smoke--findings-.md` a line of the canonical shape `STE-285 runtime regression: ` (per STE-231 precedent). Clean run appends `STE-285 runtime check: PASS` to the per-tracker summary block.
+  - verify: prose check on SKILL.md; grep on findings file after a forced-fail dry-run (e.g., uninstall one hook before group 8 runs and confirm `regression:` line lands).
+
+- AC-STE-286.7: Driver-side `$CLAUDE_SESSION_FILE` rotation — between each of the 4 scenarios, the smoke driver writes a fresh empty-JSONL fixture (`/tmp/dpt-smoke-empty-session-.jsonl` containing one harmless non-Skill entry to satisfy the `[ -f $file ]` fail-open guard) and points the child's env at it via a Bash heredoc + `CLAUDE_SESSION_FILE= claude -p ...`. **Open implementation risk:** if the Claude Code harness overrides `$CLAUDE_SESSION_FILE` to its real session log path when spawning hooks, the env-injection won't work — implementation must verify this empirically and fall back to a different isolation mechanism (e.g., a Bash wrapper that intercepts `$CLAUDE_SESSION_FILE` before the hook script reads it, or invoke the hook scripts directly via `bash