feat(ui): non-technical Runs dashboard with live SSE flow graph (RFC, #292) #356

Draft
call-me-ram wants to merge 1 commit into awslabs:main from call-me-ram:feat/non-technical-runs-ui

@call-me-ram

Contributor

Draft / RFC — opened to start a direction conversation, not to merge as-is. See Open question below.

What this is

Implements the non-technical "Runs" dashboard from #292: a plain-language front-door where a non-developer can start a run, watch an animated agent-to-agent flow graph, and answer / instruct / retry / end agents inline — no tmux or terminal jargon.

  • Runs board — each CAO session is a "run"; plain-English status ("The planner is working…", "Worker needs your answer"). Started via a 3-step wizard (goal → planner profile → launch).
  • Animated flow graph — agent-to-agent messages travel the edges as colored pulses (amber = delegation, emerald = report), pure SVG + SMIL, no graph library.
  • Live, without per-terminal polling — one EventSource('/events/runs') carries status + flow frames; a 10s REST reconcile is the fallback.

Design: additive & coexisting

Since #292 was filed, upstream chose its own UI direction (the sandboxed host-rendered mcp-apps fleet UI, #332 / #347). This PR is deliberately built so the two coexist — nothing existing is overwritten:

  • Backend rides the existing topic event_bus. A new /events/runs SSE stream (sse-starlette) multiplexes status frames (StatusMonitor transitions — already published upstream) and flow frames (flow.message, published from the inbox endpoint and terminal_service.send_input). Upstream's /events (the mcp-apps SseBus stream) is left completely untouched.
  • Frontend is a self-contained web/src/runs/ cluster + one new "Runs" tab. It carries its own copies of the few leaf components it needs, so it never overwrites AgentPanel / StatusBadge / etc. The existing Home / Agents / Flows / Settings / Memory tabs and the mcp-apps surface are unchanged.
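The multiplexing described above can be sketched roughly as follows. This is a minimal illustration, not the PR's implementation: the queue of (topic, payload) pairs stands in for one subscriber on the event bus, the `to_frame` and `runs_stream` names are hypothetical, and the status payload shape is an assumption. The `flow.message` fields are the ones quoted in the review hunk further down. An async generator yielding dicts with `event` and `data` keys is the kind of input an SSE response helper is typically handed.

```python
import asyncio
import json
from typing import Any, AsyncIterator, Optional


def to_frame(topic: str, payload: dict[str, Any]) -> Optional[dict[str, str]]:
    """Map a bus topic to an SSE frame, or None if the topic is not relayed."""
    if topic == "flow.message":
        return {"event": "flow", "data": json.dumps(payload)}
    parts = topic.split(".")
    if len(parts) == 3 and parts[0] == "terminal" and parts[2] == "status":
        # Carry the terminal id from the topic so the client can route the frame.
        body = {"terminal_id": parts[1], **payload}
        return {"event": "status", "data": json.dumps(body)}
    return None


async def runs_stream(queue: asyncio.Queue) -> AsyncIterator[dict[str, str]]:
    """Yield status and flow frames from a single subscriber queue.

    Queue items are (topic, payload) pairs; None is a hypothetical
    sentinel meaning the stream should end.
    """
    while True:
        item = await queue.get()
        if item is None:
            return
        frame = to_frame(*item)
        if frame is not None:
            yield frame


async def demo() -> list[dict[str, str]]:
    queue: asyncio.Queue = asyncio.Queue()
    # Assumed status payload shape, for illustration only.
    queue.put_nowait(("terminal.t1.status", {"status": "processing"}))
    queue.put_nowait(
        ("flow.message", {"sender_id": "t1", "receiver_id": "t2", "kind": "task"})
    )
    queue.put_nowait(("settings.changed", {}))  # unrelated topic, dropped
    queue.put_nowait(None)
    return [frame async for frame in runs_stream(queue)]


frames = asyncio.run(demo())
```

Keeping the topic-to-frame mapping in a pure function makes the relay rule testable without opening a stream.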

Also included (small, additive): GET /fs/dirs (in-app folder browser, #282), Windows/WSL working_directory normalization (clear 400s instead of 500s), a friendly per-run label (POST /sessions/{name}/label), and an X-Server-Time response header for browser/server clock-skew correction.
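As a rough illustration of the Windows/WSL normalization, the sketch below translates a drive-letter path into its WSL mount path and rejects relative input. It is an assumption-laden simplification: `windows_to_wsl` and `mount_root` are hypothetical names, and unlike the PR's utils/paths.py it does not expand `~`, check that the drive is actually mounted, or create the directory. Raising `ValueError` is meant to be mapped to an HTTP 400 by the caller.

```python
import re
from pathlib import PurePosixPath, PureWindowsPath

# A drive letter, a colon, then either kind of slash: C:\ or C:/
_DRIVE_PATH = re.compile(r"^[A-Za-z]:[\\/]")


def windows_to_wsl(working_directory: str, mount_root: str = "/mnt") -> str:
    r"""Translate C:\Users\me\proj to /mnt/c/Users/me/proj.

    Linux paths are returned unchanged after an absolute-path check.
    """
    cleaned = working_directory.strip().strip('"')
    if not cleaned:
        raise ValueError("Working directory must not be empty")
    if _DRIVE_PATH.match(cleaned):
        win = PureWindowsPath(cleaned)
        drive = win.drive[0].lower()  # 'C:' -> 'c'
        # parts[0] is the anchor (drive plus root); the rest are components.
        return str(PurePosixPath(mount_root, drive, *win.parts[1:]))
    if not PurePosixPath(cleaned).is_absolute():
        raise ValueError(
            f"Working directory must be an absolute path, got '{working_directory}'"
        )
    return str(PurePosixPath(cleaned))
```

Using the pure path classes keeps the translation independent of the host OS, so the same function behaves identically in tests on Linux, macOS, or Windows.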

Open question for maintainers 🧭

The Runs UI and the mcp-apps fleet UI are now two front-ends living in the same web/ app. How would you like them to relate?

  1. Coexist as-is — "Runs" is the non-technical front-door tab, mcp-apps is the fleet/developer surface; or
  2. Fold the Runs board + flow graph into the mcp-apps host-rendered surface.

This PR keeps them decoupled so that decision stays fully open — happy to take it either way.

Verification

  • Python: 3326 passed, 1 failed. The one failure is a pre-existing, order-dependent TestGetServerSettings flake that also fails on clean main (checked with this change stashed) and passes in isolation; it exercises the server-tuning cache, which this PR doesn't touch. New tests: /events/runs status+flow multiplexer, flow.message producers (inbox + send_input), path normalization + /fs/dirs, session-label round-trip, and the create-session 400 path.
  • Frontend: tsc --noEmit clean; 81 vitest specs pass (incl. orchestration phase-mapping and the SSE-consumer store logic); npm run build emits the wheel bundle.
  • Live smoke: ran cao-server — dashboard served, /events/runs returns text/event-stream, /fs/dirs lists folders, and upstream /events still returns 404 (untouched).
  • black / isort clean.
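For anyone repeating the live smoke check by hand, the following sketch shows how a text/event-stream body breaks down into frames. It is a simplified reader of the SSE wire format, not code from this PR: `parse_sse` is a hypothetical helper, the `status` and `flow` event names are assumed to match the frame kinds named above, and the status payload is invented for illustration.

```python
from typing import Iterable, Iterator


def parse_sse(lines: Iterable[str]) -> Iterator[dict[str, str]]:
    """Group text/event-stream lines into {'event': ..., 'data': ...} frames."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line dispatches the frame, but only if it carried data.
            if data:
                yield {"event": event, "data": "\n".join(data)}
            event, data = "message", []
        elif line.startswith(":"):
            continue  # comment line, commonly used as a keep-alive ping
        else:
            field, _, value = line.partition(":")
            # The format strips a single leading space after the colon.
            value = value[1:] if value.startswith(" ") else value
            if field == "event":
                event = value
            elif field == "data":
                data.append(value)
            # 'id' and 'retry' fields are ignored in this sketch.


sample = [
    ": ping\n",
    "event: status\n",
    'data: {"terminal_id": "t1", "status": "idle"}\n',
    "\n",
    "event: flow\n",
    'data: {"sender_id": "t1", "receiver_id": "t2", "kind": "task"}\n',
    "\n",
]
parsed = list(parse_sse(sample))
```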

Notes

  • The built UI (src/cli_agent_orchestrator/web_ui/) is gitignored and produced by npm run build; CI's web-build job regenerates it — there is no committed bundle.
  • Playwright e2e (web/e2e/*) needs a live server + a real provider CLI, so it stays out of CI (matches the existing convention); the wiring is covered by the unit + live-smoke path above.

Refs #292 (implements), #282 (in-app folder browser).

…abs#292)

Add a plain-language "Runs" front-door for non-developers, as an additive
surface that coexists with the existing fleet UI.

- A Runs board where a run is started via a 3-step wizard (goal -> planner
  profile -> launch), agents are answered/instructed/retried/ended inline, and
  an animated agent-to-agent flow graph (amber delegation / emerald report
  pulses, pure SVG + SMIL) shows what is happening -- no tmux jargon.
- Backend rides the existing topic event_bus: a new /events/runs SSE stream
  (sse-starlette) multiplexes `status` frames (StatusMonitor transitions,
  already published) and `flow` frames (flow.message, published from the inbox
  endpoint and terminal_service.send_input). Upstream's mcp-apps /events stream
  is left untouched so the two coexist.
- Frontend is a self-contained web/src/runs/ cluster plus one new "Runs" tab;
  it never overwrites existing components. The board is real-time over one
  EventSource with a 10s REST reconcile as the fallback.
- Also adds GET /fs/dirs (folder browser, awslabs#282), working_directory
  normalization for Windows/WSL paths (clear 400s instead of 500s), a friendly
  per-run label (POST /sessions/{name}/label), and an X-Server-Time header for
  browser/server clock-skew correction.

Tests: SSE status+flow multiplexer, flow.message producers, path normalization
+ /fs/dirs, label round-trip, orchestration phase mapping, and the SSE consumer
(vitest). Python suite and 81 vitest specs green; frontend builds into the wheel
via `npm run build`.
@haofeif

haofeif commented Jul 1, 2026

Collaborator

@plauzy can you please help respond to @call-me-ram's questions?

@codecov-commenter

codecov-commenter commented Jul 1, 2026


Codecov Report

❌ Patch coverage is 93.12977% with 9 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (main@5dcf319). Learn more about missing BASE report.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/cli_agent_orchestrator/api/main.py | 91.54% | 6 Missing ⚠️ |
| src/cli_agent_orchestrator/utils/paths.py | 94.28% | 2 Missing ⚠️ |
| ...li_agent_orchestrator/services/settings_service.py | 93.33% | 1 Missing ⚠️ |
Additional details and impacted files
@@           Coverage Diff           @@
##             main     #356   +/-   ##
=======================================
  Coverage        ?   87.66%           
=======================================
  Files           ?      113           
  Lines           ?    13188           
  Branches        ?        0           
=======================================
  Hits            ?    11561           
  Misses          ?     1627           
  Partials        ?        0           
| Flag | Coverage Δ |
|---|---|
| unittests | 87.66% <93.12%> (?) |


Copilot AI left a comment

Contributor


Pull request overview

Implements an additive “Runs” dashboard UX in the existing web/ app, backed by a new /events/runs SSE stream that multiplexes terminal status transitions and agent-to-agent “flow” pulses for a live flow graph. It also adds small backend enablers (server-side folder browser, session labels, working-directory normalization, server-time header) intended to coexist with the upstream MCP Apps fleet UI.

Changes:

  • Add a new Runs tab UI cluster (web/src/runs/*) including start-a-run wizard, run board, output viewer, and animated SVG flow graph driven by SSE pulses.
  • Add backend SSE endpoint /events/runs (sse-starlette) that relays terminal.*.status and flow.message, plus producers for flow.message and a heartbeat interval.
  • Add operator-facing ergonomics: /fs/dirs server-side folder listing, working_directory normalization (Windows/WSL), per-session friendly labels, and X-Server-Time header for clock-skew correction.
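The heartbeat mentioned in the second bullet can be pictured with the sketch below: when no frame arrives within the interval, the generator emits a keep-alive so idle connections are not dropped by proxies. This is an assumed shape, not the PR's code. `with_heartbeat` is a hypothetical name, the interval is deliberately tiny for demonstration (the value of the PR's SSE_HEARTBEAT_INTERVAL is not shown here), and the `{"comment": ...}` dict is a placeholder that the transport would have to render as an SSE comment line.

```python
import asyncio
from typing import Any, AsyncIterator

HEARTBEAT_SECONDS = 0.05  # illustrative only; not the PR's constant value


async def with_heartbeat(
    queue: asyncio.Queue, interval: float
) -> AsyncIterator[dict[str, Any]]:
    """Yield queued frames, emitting a keep-alive whenever the queue is idle."""
    while True:
        try:
            frame = await asyncio.wait_for(queue.get(), timeout=interval)
        except asyncio.TimeoutError:
            yield {"comment": "keep-alive"}
            continue
        if frame is None:  # hypothetical sentinel: producer closed the stream
            return
        yield frame


async def demo() -> list[dict[str, Any]]:
    queue: asyncio.Queue = asyncio.Queue()

    async def producer() -> None:
        # Stay silent long enough for at least one heartbeat to fire.
        await asyncio.sleep(HEARTBEAT_SECONDS * 4)
        queue.put_nowait({"event": "flow", "data": "{}"})
        queue.put_nowait(None)

    task = asyncio.create_task(producer())
    out = [frame async for frame in with_heartbeat(queue, HEARTBEAT_SECONDS)]
    await task
    return out


out = asyncio.run(demo())
```

Timing out the `queue.get()` call, rather than running a separate timer task, keeps heartbeats and real frames on one code path and avoids interleaving issues.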

Reviewed changes

Copilot reviewed 29 out of 30 changed files in this pull request and generated 6 comments.

Show a summary per file
| File | Description |
|---|---|
| web/vite.config.ts | Proxies /events and /fs to the backend during dev. |
| web/src/test/stripAnsi.test.ts | Unit tests for stripping ANSI/OSC escape sequences in output viewer. |
| web/src/test/store.sse.test.ts | Unit tests for SSE consumer logic in the zustand store. |
| web/src/test/orchestration.test.ts | Unit tests for run derivation and plain-language status/phase mapping. |
| web/src/store.ts | Adds flow pulse state and SSE connection logic for status/flow frames. |
| web/src/runs/StatusBadge.tsx | Shared status styling config and Runs-friendly status pill UI. |
| web/src/runs/StartRunWizard.tsx | 3-step “start a run” wizard (goal → planner → launch) + folder picker. |
| web/src/runs/SessionName.tsx | Inline session label editor (friendly display alias). |
| web/src/runs/RunBoard.tsx | Main Runs board UI (cards, inline Q&A panel, instruct/retry/end actions). |
| web/src/runs/OutputViewer.tsx | Modal output viewer with ANSI/OSC stripping and copy/refresh. |
| web/src/runs/FolderBrowser.tsx | In-app backend folder picker powered by /fs/dirs. |
| web/src/runs/FlowGraph.tsx | SVG+SMIL flow graph with animated message pulses. |
| web/src/runs/ConfirmModal.tsx | Generic confirm modal used for “end run” flows. |
| web/src/runs/AgentAvatar.tsx | Role/status avatar tokenization used by Runs UI and flow graph nodes. |
| web/src/orchestration.ts | Run model + derivation logic mapping sessions/terminals/status to phases. |
| web/src/index.css | Adds design tokens + motion primitives (pulse-dot, ring-pulse) and reduced-motion handling. |
| web/src/App.tsx | Adds “Runs” tab and wires up SSE connection at app startup. |
| web/src/api.ts | Adds server clock-skew correction, improved error surfacing, /fs/dirs, and session label API. |
| uv.lock | Adds sse-starlette dependency lock entry. |
| test/utils/test_paths.py | Tests for working-directory normalization and /fs/dirs behavior. |
| test/api/test_terminals.py | Updates session creation tests for working_directory validation; adds session label tests. |
| test/api/test_flow_producers.py | Tests for flow.message producers (inbox endpoint + send_input delegation). |
| test/api/test_events_runs.py | Tests /events/runs generator behavior and allowlist gating. |
| src/cli_agent_orchestrator/utils/paths.py | Implements working-directory normalization (Windows→WSL mount, validation, mkdir). |
| src/cli_agent_orchestrator/services/terminal_service.py | Publishes flow pulses on agent-to-agent sends (used by Runs flow graph). |
| src/cli_agent_orchestrator/services/settings_service.py | Persists and retrieves per-session friendly labels. |
| src/cli_agent_orchestrator/services/session_service.py | Attaches stored labels to listed sessions. |
| src/cli_agent_orchestrator/constants.py | Adds SSE_HEARTBEAT_INTERVAL constant. |
| src/cli_agent_orchestrator/api/main.py | Adds X-Server-Time header middleware, /events/runs SSE endpoint, /fs/dirs, session label endpoint, and working_directory normalization on session create. |
| pyproject.toml | Adds sse-starlette dependency with rationale comment. |


Comment on lines +516 to +529
if sender_id:
    # Announce the agent-to-agent send (handoff/assign task delivery) on
    # the bus so the Runs flow graph can animate sender -> receiver
    # (supervisor -> worker delegation pulses). (GH #292)
    from cli_agent_orchestrator.services.event_bus import bus as _bus

    _bus.publish(
        "flow.message",
        {
            "sender_id": sender_id,
            "receiver_id": terminal_id,
            "kind": orchestration_value or "task",
        },
    )
Comment thread web/src/store.ts
Comment on lines +134 to +140
const { sender_id, receiver_id, kind } = JSON.parse(e.data)
if (sender_id && receiver_id) {
  get().pushFlowPulse({ sender: sender_id, receiver: receiver_id, kind: kind || 'message' })
  // A flow event often means the roster just changed (a handoff
  // spawned a worker) - refresh sooner than the slow reconcile.
  get().fetchSessions()
}
Comment thread web/src/runs/RunBoard.tsx
Comment on lines +303 to +305
useEffect(() => {
  if (flowPulses.length) fetchAll()
}, [flowPulses.length])
unaffected. (GH #292)
"""
response = await call_next(request)
response.headers["X-Server-Time"] = datetime.now().isoformat()
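One point worth noting about the quoted line: `datetime.now()` without a timezone produces an ISO string with no UTC offset, and a browser that parses such a string with `Date` treats it as local time, which could distort the skew estimate when server and browser sit in different timezones. The sketch below shows an offset-carrying alternative and the matching skew arithmetic. It is only an illustration under that assumption; `server_time_header` and `clock_skew` are hypothetical names, and how the PR's client actually consumes the header is not shown here.

```python
from datetime import datetime, timedelta, timezone


def server_time_header() -> str:
    """UTC timestamp with an explicit offset, e.g. '2026-07-01T12:00:00+00:00'."""
    return datetime.now(timezone.utc).isoformat()


def clock_skew(header_value: str, client_now: datetime) -> timedelta:
    """Return server time minus client time.

    client_now must be timezone-aware; adding the result to a client
    timestamp gives the approximate server-side time.
    """
    server = datetime.fromisoformat(header_value)
    if server.tzinfo is None:
        raise ValueError("X-Server-Time must carry a UTC offset")
    return server - client_now


client = datetime(2026, 7, 1, 12, 0, 0, tzinfo=timezone.utc)
skew = clock_skew("2026-07-01T14:00:05+02:00", client)
```

Because both sides carry an offset, the same instant expressed in different zones yields the same skew.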
Comment on lines +50 to +53
    raise ValueError(
        f"'{working_directory}' is a Windows path, but drive {drive.upper()}: "
        f"is not mounted at {drive_mount}. Use the Linux path instead."
    )

path = Path(cleaned).expanduser()
if not path.is_absolute():
    raise ValueError(f"Working directory must be an absolute path, got '{working_directory}'")
