Support streaming tool output and deduplication #7
Merged
timvisher-dd added a commit that referenced this pull request, Apr 30, 2026
Reconciled conflicts:
- README.org: keep both feature lists (streaming-dedup PR #7 entries plus queue-via-temp-gfm-compose-buffer entry).
- bin/test: keep streaming-dedup's ci.yml-driven runner and add markdown-mode dependency resolution from the queue branch (markdown-mode is now in the merged ci.yml). Drop set -euo pipefail to match project bash conventions; replace with explicit || exit 1 checks.
- tests/agent-shell-tests.el: keep both new test sections.
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
GitHub Actions workflow with four jobs: readme-updated (PR-only,
guards the soft-fork features list), agent-symlinks (verifies the
multi-IDE plumbing), dependency-dag (require graph must be acyclic),
and test (byte-compile + ERT under Emacs 29.4 with acp.el and
shell-maker as checkout deps).
bin/test parses ci.yml with yq and dispatches each step locally, so
CI changes are picked up automatically. adapt_for_local rewrites
GitHub PR sha context to a single @{u}... three-dot range that the
git wrapper accepts. CONTRIBUTING.org documents the runner and the
acp_root / shell_maker_root overrides.
.claude / .codex / .gemini and CODEX.md are symlinks pointing at
.agents and AGENTS.md so the same config works across Claude Code,
Codex, and Gemini CLI. .agents/commands/live-validate.md describes
the live rendering-validation workflow.
README.org gets a "Features on top of agent-shell" section
enumerating the streaming-dedup work and follow-on polish.
agent-shell-devcontainer.el declares agent-shell-text-file-capabilities
to suppress a byte-compile warning for the cross-file reference.
Three send-command tests in tests/agent-shell-tests.el get :title
and :last-activity-time pre-seeded on their hand-rolled state alists
so the local runner produces a clean baseline. Without the
placeholders, agent-shell--set-session-title's map-put! call fails
with map-not-inplace because the alists lack the keys.
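
The map-not-inplace failure described above can be reproduced outside agent-shell. The following is a minimal sketch that uses only map.el as shipped with Emacs; the my/ helper and the alist contents are hypothetical and are not taken from the test suite.

```elisp
;;; -*- lexical-binding: t; -*-
(require 'map)

(defun my/try-set-title (state title)
  "Try to set :title in the alist STATE to TITLE in place.
Return `ok' on success, or `map-not-inplace' when STATE lacks the key.
Hypothetical helper for illustration, not part of agent-shell."
  (condition-case nil
      (progn (map-put! state :title title) 'ok)
    (map-not-inplace 'map-not-inplace)))

;; A hand-rolled state without the key: map-put! cannot grow the alist
;; in place, so it signals `map-not-inplace'.
(defvar my/bare-state (list (cons :buffer nil)))

;; Pre-seeding the key gives map-put! an existing cell to mutate.
(defvar my/seeded-state (list (cons :buffer nil) (cons :title nil)))

(my/try-set-title my/bare-state "demo")   ; signals internally, returns the symbol
(my/try-set-title my/seeded-state "demo") ; mutates the existing :title cell
```

This is why a nil placeholder is enough: the key only has to exist so that the update can reuse its cell.
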
Quote the keymap argument to shell-maker-define-major-mode. Under
shell-maker 0.91.2 the macro expects a symbol it can resolve at mode
activation; passing the unquoted variable evaluates to the keymap
value before the macro can use it, and agent-shell-mode signals
(void-function keymap) when any test creates a fresh buffer.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
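
The (void-function keymap) error from the keymap paragraph of this commit message can be illustrated without shell-maker. The sketch below assumes the failure comes from a keymap value being spliced into a form that is evaluated again later; my/activate-with is a hypothetical stand-in for macro-generated code, not shell-maker's actual expansion.

```elisp
;;; -*- lexical-binding: t; -*-

(defvar my/demo-map (make-sparse-keymap)
  "Hypothetical keymap standing in for `agent-shell-mode-map'.")

(defun my/activate-with (keymap-arg)
  "Build and evaluate a `use-local-map' form around KEYMAP-ARG.
Return `ok', or the name of the void function that was called.
Hypothetical stand-in for code generated by a mode-defining macro."
  (with-temp-buffer
    (condition-case err
        (progn (eval `(use-local-map ,keymap-arg) t) 'ok)
      (void-function (cadr err)))))

;; Passing the keymap value splices the list (keymap) into the form,
;; which the evaluator then treats as a call to a function named `keymap'.
(my/activate-with my/demo-map)

;; Passing the symbol leaves a plain variable reference to evaluate.
(my/activate-with 'my/demo-map)
```
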
Three tests cover the markdown table overlay pipeline end-to-end: overlay structure for a static buffer, mid-stream cleanup so stale overlays disappear when a row is rewritten, and a regression that guards against table rows being split across visual lines. The helpers inject ACP traffic via agent-shell--on-notification and fire pending debounce timers when present, so the tests reflect the real streaming path rather than direct markdown-overlays-put calls.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…lpers
Three new modules and the agent-shell.el integration that wires them together:
- agent-shell-meta.el extracts toolResponse and terminal_output from the ACP meta envelope so streaming code can fold mixed-source tool output (terminal stream + final meta.toolResponse) without showing duplicate content.
- agent-shell-invariants.el is a runtime tracing and assertion library: a ring of recent ACP/UI events, process-mark and fragment-update guard wrappers, and a long-buffer head/tail snapshot included in violation reports.
- agent-shell-streaming.el is the streaming tool_call_update handler with dedup, including a label cache cleared on completion and the generalized title upgrade path that survives buffer-kill races.
agent-shell.el wires these in (require, on-notification dispatch, process-mark/fragment guards, insert-cursor reset, defcustom for the markdown-overlay debounce delay, and dropping session/update handlers when the shell buffer has been killed). agent-shell-ui.el gains the invariants require and the UI plumbing the streaming handler relies on.
Tests cover the dedup logic across mixed sources, the invariants library's event ring and guard wrappers, and additional regression coverage in agent-shell-tests.el (cancel with nil transcript-file, markdown-overlay debounce buffer-kill race, "Thinking" label restoration on agent_thought_chunk).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
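
To make the meta extraction concrete, here is a minimal sketch of a lookup that accepts both symbol and string keys, plus a reader for _meta.terminal_output.data. The my/ functions are hypothetical and the nested payload shape is an assumption for illustration; this is not the code in agent-shell-meta.el.

```elisp
;;; -*- lexical-binding: t; -*-

(defun my/meta-lookup (alist key)
  "Return the value for KEY in ALIST, matching symbol or string keys.
KEY is a symbol.  Hypothetical sketch, not the agent-shell code."
  (let ((cell (or (assq key alist)
                  (assoc (symbol-name key) alist))))
    (cdr cell)))

(defun my/terminal-output-data (update)
  "Return the _meta.terminal_output.data value in UPDATE, or nil.
The nesting used here is an assumed payload shape."
  (let* ((meta (my/meta-lookup update '_meta))
         (term (my/meta-lookup meta 'terminal_output)))
    (my/meta-lookup term 'data)))

;; Symbol-keyed payload, as a JSON parser configured for symbols might return.
(my/terminal-output-data
 '((_meta . ((terminal_output . ((data . "line 1\n")))))))

;; String-keyed payload with the same structure.
(my/terminal-output-data
 '(("_meta" . (("terminal_output" . (("data" . "line 1\n")))))))
```

Returning nil for any missing level lets a caller fall through to another source, such as a toolResponse value, without extra checks.
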
Closes xenodium#342
Closes xenodium#343
Checklist
M-x checkdoc and M-x byte-compile-file.

Problem
The two most popular ACP agents, codex-acp and claude-agent-acp, perform very poorly in agent-shell when tool executions emit a lot of text:
- codex-acp: O(n²) rendering and massive data transfer. A 35k-line bash command takes ~60s and transfers ~890 MB of JSON, a ~3,000× amplification of ~280 KB of actual output. Each tool_call_update carries the full accumulated output; agent-shell replaces the entire fragment body and reruns markdown-overlays-put on every update.
- claude-agent-acp: output is silently lost. The same command truncates to 241 of 35,001 lines. The user sees raw <persisted-output> XML tags rendered verbatim in the shell buffer.

Cause
agent-shell does not advertise _meta.terminal_output in clientCapabilities during the ACP initialize handshake. Without this capability:
- … tool_call_update (O(n²) content growth).

Fix
Advertise _meta.terminal_output during initialize and handle the resulting streaming behavior:
- … acp.el to accept :meta-capabilities on acp-make-initialize-request.
- … :meta-capabilities '((terminal_output . t)) from agent-shell.el during initialize.
- … _meta.terminal_output.data chunks (codex-acp) and batch _meta.terminal_output results (claude-agent-acp) in a new streaming handler with deduplication.
- … <persisted-output> tags and render previews cleanly.

Implementation
New files
agent-shell-meta.el: extractors for ACP _meta payloads:
- agent-shell--meta-lookup: key lookup handling both symbol and string keys in alists.
- agent-shell--meta-find-tool-response: walks any _meta namespace to find a toolResponse value.
- agent-shell--tool-call-meta-response-text: extracts stdout text from _meta.*.toolResponse in its various shapes (string, alist with stdout key, vector of content blocks).
- agent-shell--tool-call-terminal-output-data: extracts _meta.terminal_output.data.
agent-shell-streaming.el: streaming tool call update handler:
- agent-shell--tool-call-normalize-output: strips markdown fences, strips <persisted-output> XML tags (rendering the preview with font-lock-comment-face), and ensures trailing newlines.
- agent-shell--append-tool-call-output: accumulates streamed output in the state's :tool-calls hash under an :accumulated key per tool call ID.
- agent-shell--handle-tool-call-update-streaming: the main handler, replacing the inline tool_call_update block in agent-shell.el. Three branches:
  - … (_meta.terminal_output.data): normalize the chunk, accumulate it, and immediately append it to the fragment body for live streaming.
  - … (_meta.*.toolResponse): normalize and accumulate silently (rendered only on final update to avoid duplication).
  - … ("completed" or "failed"): render accumulated output (or fall back to content text), log to transcript, clean up permission dialogs, and apply title/label updates.
- agent-shell--mark-tool-calls-cancelled: marks all in-progress tool calls as cancelled (called from agent-shell-interrupt).

Changes to agent-shell.el
- (require 'agent-shell-streaming) added.
- The tool_call_update rendering block is replaced by a single call to agent-shell--handle-tool-call-update-streaming. The metadata save (title/description/command/raw-input/diff) remains inline before the handler call.
- The initialize request now passes :meta-capabilities '((terminal_output . t)) to acp-make-initialize-request.
- agent-shell-interrupt calls agent-shell--mark-tool-calls-cancelled after sending the cancel notification.
- The shell-maker-define-major-mode call passes 'agent-shell-mode-map (quoted symbol) instead of the bare variable. This is a bug fix: the bare-variable form errors with void-function keymap every time agent-shell-mode is invoked, because shell-maker-define-major-mode splices the keymap value into a backquoted form that re-evaluates (keymap ...) as a function call. Quoting passes the symbol so use-local-map resolves it correctly. Reproducible upstream; should be filed there separately.

Tests
Three new test files cover the streaming/dedup work and the runtime invariants library:
- tests/agent-shell-streaming-tests.el (40 tests): streaming dispatcher, dedup math (agent-shell--thought-chunk-delta four-case proof), label transitions, empty-chunk paragraph break, post-turn-end render, ID-split helpers, mixed-source dedup (terminal_output mid-flight + _meta.toolResponse on final), append-boundary newline cap, and the streaming tool_call_update general title upgrade.
- tests/agent-shell-invariants-tests.el (11 tests): event ring, mutation hooks, violation handler bundle output (head + tail snapshot for long buffers).
- tests/agent-shell-table-tests.el (3 tests): markdown table overlay rendering after streamed updates.
Plus regression tests added to tests/agent-shell-tests.el for: … _claude/sdkMessage log surfacing, killed-buffer notification drop, and the markdown-overlay debounce buffer-kill race.

Perf measurements
Test: for x in {0..35000}; do printf 'line %d\n' "$x"; done (35,001 lines)

codex-acp
~8× faster. Content drops from ~900 MB to ~3 KB.
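
The size of that drop follows from how cumulative updates scale. As a rough model (an assumption, not a measurement from this PR): if each of n updates resends everything emitted so far in equal chunks of c bytes, the total transferred is c·n(n+1)/2, while delta streaming sends only c·n. The sketch below uses made-up numbers and hypothetical my/ helpers to show the arithmetic.

```elisp
;;; -*- lexical-binding: t; -*-

(defun my/cumulative-bytes (updates chunk-bytes)
  "Bytes sent when each of UPDATES updates resends all prior output.
Every update adds CHUNK-BYTES of new output.  Hypothetical model."
  (/ (* chunk-bytes updates (1+ updates)) 2))

(defun my/delta-bytes (updates chunk-bytes)
  "Bytes sent when each of UPDATES updates carries only its new chunk."
  (* chunk-bytes updates))

(defun my/amplification (updates chunk-bytes)
  "Ratio of cumulative to delta transfer, which reduces to (n + 1) / 2."
  (/ (float (my/cumulative-bytes updates chunk-bytes))
     (my/delta-bytes updates chunk-bytes)))

;; Illustrative values only: 1000 updates of 280 bytes each.
(my/cumulative-bytes 1000 280) ; → 140140000
(my/delta-bytes 1000 280)      ; → 280000
(my/amplification 1000 280)    ; → 500.5
```

Under this model the amplification depends only on the number of updates, which is why long-running commands suffer the most.
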
claude-agent-acp
No timing improvement (execution is server-side), but <persisted-output> tags are handled cleanly.

Prerequisite: acp.el changes
acp.el needs to accept the :meta-capabilities keyword argument on acp-make-initialize-request so the meta capability map can be advertised during the initialize handshake. See xenodium/acp.el#15.

Other changes bundled in this branch
The streaming/dedup work is the main motivation, but this branch also lands several tightly-coupled changes that share file boundaries with the streaming integration. They're noted here so reviewers see the full scope rather than discovering them as drift.
New libraries / files
- agent-shell-meta.el: _meta payload extractors (described above; called out here because it's a new top-level module).
- agent-shell-streaming.el: streaming handler (described above).
- agent-shell-invariants.el: runtime buffer invariant checks with event tracing and a violation debug bundle (agent-shell-invariants-enabled, default off; gated for the live-validate workflow).

Behavior changes outside the streaming handler
- … agent-shell--with-preserved-process-mark macro and :insert-cursor state slot).
- agent-shell-ui.el append-in-place rewrite: chunks append to existing fragments without rebuilding the body, with boundary-newline normalization so paragraph-break chunks don't compound newlines on consecutive appends.
- … agent_message_chunk mid-stream is rewritten to \n\n so two content blocks in the same turn don't run together.
- … invisible property from trailing-whitespace hiding.
- tool_call_update notifications arriving after session/prompt resolves still render; this covers Claude Code's Stop-hook bounce-and-regen cycle that streams chunks for ~50s after end_turn.
- _claude/sdkMessage notifications are pretty-printed into the per-shell debug log when agent-shell-logging-enabled is set, surfacing hook lifecycle events that the ACP layer otherwise drops.
- defcustom agent-shell-markdown-overlay-debounce-delay (default 0.15s) for tuning idle-overlay cadence on slow terminals.

Build / CI / docs
- … .github/workflows/ci.yml) with byte-compile, ERT, dependency-DAG, agent-symlinks, and PR-only README-update jobs.
- bin/test driver that parses ci.yml via yq so the local invocations stay in lockstep with CI; it bails fast on any untranslated ${{ }} expressions.
- … .claude, .codex, .gemini symlinks → .agents; CLAUDE.md / CODEX.md / GEMINI.md → AGENTS.md).
- CONTRIBUTING.org and AGENTS.md document the yq prerequisite and the dev-deps env-var override (acp_root, shell_maker_root).
- .agents/commands/live-validate.md describes the live-batch validation workflow used for rendering changes.

Bug fixes that ride along
- … default-directory.
- … agent-shell-interrupt).