feat(call-pipeline): auto-chain Stage A to B to C to D#47

Open
jsmninvest wants to merge 3 commits into earlyaidopters:main from jsmninvest:builder/call-pipeline-chain-hook

@jsmninvest

Summary

  • Auto-chains the 4-stage call pipeline so Stages B, C, D fire automatically after Stage A completes [no more manual nudges from main after every call].
  • Adds maybeAdvanceCallPipeline hook that parses STAGE_X_DONE from a finishing mission result, looks up the pipeline run row, and calls the existing orchestrator onStageAccepted / onStageFailed.
  • Wires the hook into scheduler.ts next to notifyMissionCompletion at all four terminal sites [success, acceptance fail, timeout, runtime error]. Hook is a silent no-op for non-pipeline missions.
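A minimal sketch of the hook described in the bullets above, to make the control flow concrete. Only the names `maybeAdvanceCallPipeline`, `onStageAccepted`, `onStageFailed`, and the `STAGE_X_DONE` marker come from this PR; the types, the `findRunByMissionId` lookup, and the return values are assumptions for illustration, not the code in the diff.

```typescript
// Hypothetical shapes: the real mission and pipeline-run types are not shown here.
type Stage = "A" | "B" | "C" | "D";

interface FinishedMission {
  id: string;
  status: "completed" | "failed";
  result: string | null;
}

interface PipelineRun {
  callMsgId: string;
  stage: Stage;
}

interface Orchestrator {
  onStageAccepted(run: PipelineRun): void;
  onStageFailed(run: PipelineRun, reason: string): void;
}

// Extract the stage letter from a STAGE_X_DONE marker anywhere in the result text.
function parseStageDone(result: string | null): Stage | null {
  if (!result) return null;
  const match = /\bSTAGE_([ABCD])_DONE\b/.exec(result);
  return match ? (match[1] as Stage) : null;
}

// Assumed lookup: the run row is found by mission id. Missions without a row
// are not pipeline missions, so the hook does nothing for them.
function maybeAdvanceCallPipeline(
  mission: FinishedMission,
  findRunByMissionId: (missionId: string) => PipelineRun | undefined,
  orchestrator: Orchestrator,
): "advanced" | "failed" | "noop" {
  const run = findRunByMissionId(mission.id);
  if (!run) return "noop";

  const stage = parseStageDone(mission.result);
  if (mission.status === "completed" && stage === run.stage) {
    orchestrator.onStageAccepted(run);
    return "advanced";
  }
  orchestrator.onStageFailed(run, `mission ${mission.id} ended as ${mission.status}`);
  return "failed";
}
```

Because the sketch returns `"noop"` before touching the orchestrator, it can be called at every terminal site in the scheduler without first checking whether the mission belongs to a pipeline.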

Why

Stage A has been spawning s2l missions since mission 3b81d301 shipped the worker hook, but nothing was calling onStageAccepted when those missions finished. Every real call stalled at Stage A and needed a hand-dispatched Stage B from @main [see auto-triage missions 2da336bc, ae439289 for Ashley and the current f607b7db for Sunil Khanna]. This closes the loop in the scheduler so the pipeline runs without operator babysitting.

Test plan

  • 9 new unit tests in src/call-pipeline/chain-hook.test.ts [parser robustness, A to B to C to D chain, idempotency, failure halts chain, no-op for non-pipeline missions]
  • Live-DB canary scripts/call-pipeline-canary.mjs passed [synthetic Stage A completes -> Stage B queued on s2l at priority 7 with correct acceptance, then stage B mission cancelled + synthetic rows cleaned]
  • npm run typecheck clean
  • Orchestrator tests (5) + chain-hook tests (9) + stage-a integration test all pass
  • Observed on next real incoming call post-deploy [acceptance criterion]

robkloti pushed a commit to robkloti/claudeclaw that referenced this pull request May 11, 2026
…aidopters#47)

The model dropdown for the main agent always rendered as "Opus 4.6"
regardless of the user's selection. /api/agents was returning a
hardcoded 'claude-opus-4-6' for the main entry, ignoring the in-memory
override set by /model and the dashboard PATCH endpoint.

Add getMainModelOverride() in bot.ts and read it in /api/agents so the
dropdown reflects the active selection. Reported by Isabelle Bordji.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
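The fix described in this commit message can be sketched roughly as follows. `getMainModelOverride` and the `'claude-opus-4-6'` default are taken from the message; `setMainModelOverride`, `listAgents`, and the `AgentSummary` shape are hypothetical stand-ins for whatever `/model`, the PATCH endpoint, and `/api/agents` actually use.

```typescript
// Default previously hardcoded in the /api/agents response, per the commit message.
const DEFAULT_MAIN_MODEL = "claude-opus-4-6";

// In-memory override; undefined means "no selection made yet".
let mainModelOverride: string | undefined;

// Hypothetical setter, standing in for the /model command and dashboard PATCH handler.
function setMainModelOverride(model: string | undefined): void {
  mainModelOverride = model;
}

function getMainModelOverride(): string | undefined {
  return mainModelOverride;
}

interface AgentSummary {
  id: string;
  model: string;
}

// Hypothetical body of the /api/agents handler: read the override, fall back to the default.
function listAgents(): AgentSummary[] {
  return [{ id: "main", model: getMainModelOverride() ?? DEFAULT_MAIN_MODEL }];
}
```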

@rubening rubening left a comment

Review (Overnight Agent, Night #97)

The call-pipeline chain hook is well-implemented -- maybeAdvanceCallPipeline() is clean and is covered by 9 unit tests plus a live canary script. The A->B->C->D auto-chaining concept is correct.

Same issue as #46: This PR inherits #46's branch (63 shared files of unrelated changes) plus 3 new files. The actual chain-hook feature is tiny relative to the 66-file diff.

Request: Once #46 is rebased onto current main (see review there), rebase this PR on top of the cleaned #46. The merge order matters: #46 first, then #47.

Both features are good and we want to merge them -- the branches just need to be cleaned up so the diffs only show the actual feature code.

Aditya_office_AI_assistant and others added 3 commits May 19, 2026 07:18
Restores the mission-autopush module (dropped during rebuild in 111d22a)
and wires notifyMissionCompletion into scheduler.ts so every completeMissionTask()
call in runDueMissionTasks also fires the autopush hook.

Why: missions created by Rudy (created_by='main') would complete silently.
Aditya had no visibility unless he polled the CLI. This closes the loop.

Exactly-once delivery is guaranteed by db.markMissionAutopushed(), which CAS-stamps
the autopushed_at column iff currently NULL. 4 call sites added in scheduler.ts
(timeout, acceptance-pass, acceptance-fail, catch-block) — the CAS absorbs any
double-invocation.
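The compare-and-set idea in the paragraph above can be illustrated with an in-memory stand-in for the database. `markMissionAutopushed`, `notifyMissionCompletion`, and `autopushed_at` are named in this commit message; the `MissionStore` class, the `send` callback, and the boolean return values are assumptions made for the sketch.

```typescript
interface MissionRow {
  id: string;
  autopushedAt: number | null;
}

// Hypothetical in-memory store standing in for db.ts.
class MissionStore {
  private rows = new Map<string, MissionRow>();

  add(id: string): void {
    this.rows.set(id, { id, autopushedAt: null });
  }

  // Compare-and-set: stamp autopushed_at only if it is currently NULL.
  // In SQL this would be a single UPDATE guarded by "AND autopushed_at IS NULL".
  // Returns true only for the caller that actually set the stamp.
  markMissionAutopushed(id: string, now: number = Date.now()): boolean {
    const row = this.rows.get(id);
    if (!row || row.autopushedAt !== null) return false;
    row.autopushedAt = now;
    return true;
  }
}

// Safe to call from several completion sites: only the CAS winner sends.
function notifyMissionCompletion(
  store: MissionStore,
  missionId: string,
  send: (missionId: string) => void,
): boolean {
  if (!store.markMissionAutopushed(missionId)) return false;
  send(missionId);
  return true;
}
```

Note that this sketch stamps before sending, so a `send` that throws would not be retried; whether the real module handles that case differently is not stated here.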

Restored from 1906b01:
- src/mission-autopush.ts + src/mission-autopush.test.ts (13 tests, all pass)
- db.ts: markMissionAutopushed, RetryReason, autopushed_at on MissionTask,
  autopushed_at column + idempotent migration

Canary mission 878dee40 (priority 5, ops) completed with autopushed_at set.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replaces the monolithic call-processing mission that kept hitting the
30-min wall-clock and 60-turn ceiling with a staged pipeline.

Stages (all Sonnet):
  A. Extract — parse GHL transcript, write STAGE_A_FACTS note + fields
  B. RAG    — Pinecone shortlist + rule-based rerank, STAGE_B_RECOMMENDATION
  C. Borrower draft — Gmail draft, STAGE_C_BORROWER with draftId
  D. AE draft       — C21 Outlook draft, STAGE_D_AE with draftId

Each stage is a separate mission_task with acceptance criteria. Stage N
creates stage N+1 when acceptance passes. Watchdog v2 can retry any
stage independently. No transcribe stage; GHL already handles that.

Durability: call_pipeline_runs table, (call_msg_id, stage) PRIMARY KEY
enforces idempotency across retries.
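As a rough illustration of how a composite `(call_msg_id, stage)` key makes "stage N creates stage N+1" idempotent, the sketch below uses a `Set` in place of the `call_pipeline_runs` table. The `PipelineRuns` class, `nextPipelineStage`, and `advanceOnAccept` are hypothetical names, not the orchestrator's actual API.

```typescript
type PipelineStage = "A" | "B" | "C" | "D";

const STAGE_ORDER: readonly PipelineStage[] = ["A", "B", "C", "D"];

// Stage that follows the given one, or null after the last stage.
function nextPipelineStage(stage: PipelineStage): PipelineStage | null {
  return STAGE_ORDER[STAGE_ORDER.indexOf(stage) + 1] ?? null;
}

// Stand-in for the call_pipeline_runs table with its composite primary key.
class PipelineRuns {
  private keys = new Set<string>();

  // Mimics an INSERT that the primary key rejects on a duplicate (call_msg_id, stage).
  tryInsert(callMsgId: string, stage: PipelineStage): boolean {
    const key = `${callMsgId}:${stage}`;
    if (this.keys.has(key)) return false;
    this.keys.add(key);
    return true;
  }
}

// When a stage is accepted, queue the next one at most once per call.
function advanceOnAccept(
  runs: PipelineRuns,
  callMsgId: string,
  accepted: PipelineStage,
  queue: (stage: PipelineStage) => void,
): PipelineStage | null {
  const next = nextPipelineStage(accepted);
  if (next === null) return null; // Stage D accepted: pipeline finished
  if (!runs.tryInsert(callMsgId, next)) return null; // retry or duplicate: already queued
  queue(next);
  return next;
}
```

A retried Stage A that is accepted a second time hits the duplicate key and queues nothing, which is the behaviour the durability note describes.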

Artifacts:
- src/call-pipeline/orchestrator.ts
- src/call-pipeline/stage-prompts.ts
- src/call-pipeline/orchestrator.test.ts (5/5 pass)
- src/call-pipeline/stage-a.integration.test.ts
- scripts/start-call-pipeline.mjs
- src/db.ts migration (legacy + fresh install paths)

Builder mission 640e22d6 hit 60-turn ceiling before wiring the
/clawd call-worker hook and before running Stage A integration against
Ashley. Follow-up mission dispatched for those two gaps.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…pletion

Problem: call-transcript-worker spawns Stage A s2l mission, Stage A writes
STAGE_A_FACTS note and returns STAGE_A_DONE, then nothing. Stages B, C, D
never fire without a manual nudge from main. Blocking real revenue work
(Sunil Khanna 1M commercial call stopped at A today).

Root cause: the 4-stage orchestrator has onStageAccepted wired to mark the
stage completed and queue the next stage mission, but no caller invokes it
when a Stage X mission actually finishes. The scheduler terminates the
mission and moves on.

Fix: new chain-hook maybeAdvanceCallPipeline that parses the STAGE_X_DONE
marker from a freshly-completed mission result, looks up the matching
call_pipeline_runs row, and calls the orchestrator onStageAccepted /
onStageFailed. Wired into scheduler.ts next to notifyMissionCompletion at
all four completion sites [success, acceptance fail, timeout, runtime
error]. Hook is a silent no-op for any non-pipeline mission so it is safe
to run on every completion.

Canary (scripts/call-pipeline-canary.mjs) runs against the live DB with a
synthetic call_msg_id, confirms Stage B is queued on s2l at priority 7
with the correct acceptance string, then cancels the Stage B mission and
deletes the synthetic rows.

Tests: 9 unit tests cover the hook in isolation [parser robustness,
idempotency, failure path halts chain, end-to-end B to C to D chain, no-op
for non-pipeline missions].

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

@jsmninvest jsmninvest force-pushed the builder/call-pipeline-chain-hook branch from 8431a2c to 0abbc5f Compare May 19, 2026 14:21
mm-consult-gautam pushed a commit to morphingmachines/claudeclaw-os-mm-public-claudeclaw-backup-2026-05-25 that referenced this pull request May 25, 2026
…ToAgent (#67)

## Bug

`src/orchestrator.ts:209` calls `runAgent()` with `undefined` for the
`model` parameter inside `delegateToAgent`. When a sync delegation
(`@research: ...` from the chat parser) fires, the SDK falls through to
its default model regardless of the `model:` field declared in the
target specialist's `agent.yaml`.

Same theme as earlyaidopters#57 (`fix(scheduler): pass agentDefaultModel to runAgent
for mission + scheduled tasks`), but the orchestrator path is a separate
callsite that earlyaidopters#57 did not touch.

## Impact

For any user with non-default `model:` values across agents (e.g. main
on `claude-opus-4-7`, specialists on `claude-sonnet-4-6`), sync
delegations from the orchestrator have been silently running on the SDK
default. The interactive DM path to each specialist's own bot already
honoured the override (it goes through `bot.ts` which passes
`agentDefaultModel`), but every `@<id>:` delegation from the
orchestrator landed on the wrong model.

Empirically: in a deploy where specialists were configured as
`claude-sonnet-4-6` while main remained on Opus default, a delegate
roundtrip took ~68s pre-fix vs ~10s post-fix — consistent with the
Sonnet vs Opus latency ratio, confirming the target's model was being
ignored.

## Why `agentConfig.model`, not `agentDefaultModel`

`delegateToAgent` runs **in-process** inside the orchestrator (main).
The module-level `agentDefaultModel` exported from `config.ts` is set
once per process at startup, from the *caller* agent's `agent.yaml`.
Inside main's process that's main's model (typically `undefined` since
main doesn't have an agent.yaml in the default layout), NOT the
target's.

The target's config is already loaded on the line above the runAgent
call (`const agentConfig = loadAgentConfig(agentId)` at line 174), so
`agentConfig.model` is the correct source. Inline comment in the diff
documents the distinction so a future contributor doesn't "simplify" it
to `agentDefaultModel` and silently break delegations.

## Fix

One field in `src/orchestrator.ts`: pass `agentConfig.model` as the
positional `model` arg (instead of `undefined`) to the `runAgent()`
call inside `delegateToAgent`. No new imports (`loadAgentConfig` is
already imported and used a few lines above).

No behaviour change for users with no `model:` field set in their
agents — `agentConfig.model` is `string | undefined`, so the
fallthrough to SDK default still applies. Test changes not required;
existing tests don't exercise this parameter.
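
For readers without the diff open, the change can be sketched as below. `delegateToAgent`, `loadAgentConfig`, `runAgent`, and `agentConfig.model` are named in this description; the two-argument `runAgent` signature, the stubbed config table, and the `"sdk-default"` placeholder are simplifications assumed for the example, not the real signatures.

```typescript
interface AgentConfig {
  model?: string;
}

// Stubbed per-agent configs; the real loader reads each agent's agent.yaml.
const AGENT_CONFIGS: Record<string, AgentConfig> = {
  research: { model: "claude-sonnet-4-6" },
  ops: {}, // no model: field set
};

function loadAgentConfig(agentId: string): AgentConfig {
  return AGENT_CONFIGS[agentId] ?? {};
}

// Stub that reports which model would run. "sdk-default" is a placeholder
// for the SDK's own fallthrough when no model is passed.
function runAgent(prompt: string, model: string | undefined): string {
  return model ?? "sdk-default";
}

function delegateToAgent(agentId: string, prompt: string): string {
  const agentConfig = loadAgentConfig(agentId);
  // Pass the target's model. The module-level agentDefaultModel would be the
  // caller's (main's) value in this process, not the target's.
  return runAgent(prompt, agentConfig.model);
}
```

With these stubs, `delegateToAgent("research", "...")` resolves to `claude-sonnet-4-6`, while an agent with no `model:` field still falls through to the default.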

## Relation to prior work

- earlyaidopters#57 (merged) fixed the same theme in `src/scheduler.ts` for
  `runAgent` call sites that run inside the *target* process (where
  `agentDefaultModel` is correctly the target's). The orchestrator
  callsite needs `agentConfig.model` instead because it runs
  in-process from a different agent.
- Conceptually adjacent to earlyaidopters#47 — model overrides should be honoured
  consistently across surfaces.