diff --git a/packages/multiagent/README.md b/packages/multiagent/README.md new file mode 100644 index 000000000..4075de1c0 --- /dev/null +++ b/packages/multiagent/README.md @@ -0,0 +1,622 @@ +# `@browserbasehq/multiagent` + +`@browserbasehq/multiagent` is a thin driver for combining three moving parts behind one CLI or library entrypoint: + +- a shared browser session +- one or more browser-control toolsets +- one or more agent harness loops + +The package is structured so browser state, MCP/tool state, and conversation state stay isolated instead of collapsing into one large orchestrator object. + +## Architecture + +### Codebase map + +The package is split by actor/service boundary: + +- `lib/runtime/driver.ts` + - top-level orchestration for one `MultiAgentDriver.run()` +- `lib/browser/*` + - shared browser session abstraction +- `lib/mcp/*` + - toolset/MCP abstraction and adapter registry +- `lib/agents/*` + - agent session abstraction and harness registry +- `lib/cli.ts` + - CLI entrypoint for `multiagent run` and internal stdio MCP server commands +- `lib/utils/*` + - process launching, runtime path resolution, and error helpers +- `lib/types.ts` + - shared public and internal option/result types + +### Service boundaries + +The package intentionally keeps state in a few small actors instead of one global mutable runtime: + +- `MultiAgentDriver` + - owns one run request + - creates and wires the other actors + - owns the top-level cleanup order +- `BrowserSession` + - owns browser process/connection state + - knows whether the browser was launched locally or attached externally +- `MCPServer` + - owns one toolset adapter and, when started in-process, one MCP client transport + - can either provide launch config to an external harness or act as an in-process MCP client +- `AgentSession` + - owns one harness instance + - owns one message history + - owns the list of attached tool servers for that harness +- harness implementation + - owns harness-specific 
resume/session state + - converts `AgentRunInput` into one external CLI call or one in-process agent execution + +### State ownership + +The core design goal is that each actor owns one category of state: + +- Browser state lives in `BrowserSession` + - local browser process handle or attached connection + - derived `cdpUrl` and `browserUrl` +- Tool server state lives in `MCPServer` + - adapter selection + - stdio client transport when `start()` is used +- Conversation state lives in `AgentSession` + - ordered `messages[]` + - attached `MCPServer[]` +- Harness-native session state lives inside each harness + - for example Claude session id, Codex thread id, Gemini session id, OpenCode session id + +That separation is why the driver can share one browser across multiple harnesses without collapsing their conversation history or tool wiring together. + +### Run lifecycle + +One `multiagent run` call currently follows this lifecycle: + +1. `MultiAgentDriver` builds a `BrowserSession` from `options.browser` +2. `BrowserSession.start()` launches local Chrome or attaches to an existing CDP target +3. `MultiAgentDriver` creates one `MCPServer` instance per requested toolset +4. `MultiAgentDriver` creates one `AgentSession` per requested harness +5. Each `AgentSession` attaches all selected `MCPServer` instances +6. Each `AgentSession.start()` calls harness-specific startup + - no-op for most CLI harnesses + - Stagehand allocates its in-process V3 runtime here +7. Each `AgentSession.addUserMessage(task)`: + - records the user message locally + - asks each attached `MCPServer` for a stdio launch config + - calls `harness.runTurn(...)` + - records the assistant message locally +8. `MultiAgentDriver` collects all per-agent results + - individual harness failures are captured per agent instead of aborting the whole run +9. 
Cleanup runs in `finally` + - stop all `AgentSession`s + - stop all started `MCPServer`s + - stop the shared `BrowserSession` + +### Turn lifecycle + +Inside one `AgentSession.addUserMessage(...)` call: + +1. A user message object is appended to the session history +2. Attached MCP servers are converted into named stdio launch configs +3. The harness receives: + - `prompt` + - `mcpServers` + - `cwd` +4. The harness executes one turn + - external CLI harnesses spawn a child process + - Stagehand runs in-process + - browser-use launches an inline Python agent via `uvx` +5. The returned content/raw/usage is wrapped into an assistant message +6. The assistant message is appended to the session history +7. The turn result is returned to the driver + +### MCP lifecycle + +`MCPServer` supports two distinct lifecycles: + +- launch-config only + - used when an external harness such as Claude Code or OpenCode consumes the tool server itself + - `getLaunchConfig()` is enough +- in-process MCP client + - used when this package wants to introspect or call tools directly + - `start()` creates a stdio transport and MCP client + - `listTools()` and `callTool()` operate through that client + - `stop()` closes the client and transport + +This split is important because most harnesses do not want the driver to proxy tool calls. They want raw stdio server definitions and talk to the MCP servers themselves. 
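The two `MCPServer` lifecycles described above can be sketched roughly as follows. This is a minimal illustration, not the package's actual implementation: `SketchMcpServer`, `StdioLaunchConfig`, `ToolClient`, and the injected `connect` function are all hypothetical names, and the real class works with an MCP client and stdio transport rather than the stand-in interface used here.

```typescript
// Hypothetical sketch: every name below is an illustrative assumption,
// not the package's real MCPServer implementation.
interface StdioLaunchConfig {
  command: string;
  args: string[];
  env: Record<string, string>;
}

// Stand-in for an MCP client plus its stdio transport.
interface ToolClient {
  listTools(): Promise<string[]>;
  close(): Promise<void>;
}

type Connect = (config: StdioLaunchConfig) => Promise<ToolClient>;

class SketchMcpServer {
  private readonly name: string;
  private readonly config: StdioLaunchConfig;
  private readonly connect: Connect;
  private client?: ToolClient;

  constructor(name: string, config: StdioLaunchConfig, connect: Connect) {
    this.name = name;
    this.config = config;
    this.connect = connect;
  }

  // Lifecycle 1: launch-config only. Nothing is spawned here; an external
  // harness receives the definition and starts the server itself.
  getLaunchConfig(): { name: string; config: StdioLaunchConfig } {
    return {
      name: this.name,
      config: {
        command: this.config.command,
        args: [...this.config.args],
        env: { ...this.config.env },
      },
    };
  }

  // Lifecycle 2: in-process client. start() opens the connection once.
  async start(): Promise<void> {
    if (!this.client) {
      this.client = await this.connect(this.config);
    }
  }

  async listTools(): Promise<string[]> {
    if (!this.client) {
      throw new Error(`MCP server "${this.name}" has not been started`);
    }
    return this.client.listTools();
  }

  // stop() is safe to call even when start() was never used.
  async stop(): Promise<void> {
    const client = this.client;
    this.client = undefined;
    await client?.close();
  }
}
```

In this sketch, a harness that consumes MCP servers itself only ever calls `getLaunchConfig()`, so no client or transport is created for it. Only callers that need direct tool access pay for `start()` and `stop()`.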
+ +### Browser lifecycle + +`BrowserSession` has two modes with different shutdown semantics: + +- `local` + - launches a browser process through Puppeteer + - shutdown closes the browser process +- `cdp` + - attaches to an existing browser target + - shutdown disconnects only and leaves the external browser running + +Both modes normalize metadata into the same shape so adapters and harnesses can depend on: + +- `getCdpUrl()` +- `getBrowserUrl()` +- `getMetadata()` + +### Why this split exists + +The main tradeoff in this package is isolation over convenience: + +- browser ownership is centralized so all harnesses can share one target +- harness state is isolated so resume ids, prompts, and histories do not bleed together +- tool adapters are isolated so each server can define its own launch/config semantics +- the driver remains small because it mostly wires actors together instead of containing business logic + +That makes it easier to add new harnesses and toolsets without rewriting the orchestration layer. 
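To make the centralized browser ownership concrete, here is a small sketch of how shutdown could be planned for one run. It combines the cleanup order from the run lifecycle with the per-mode shutdown semantics from the browser lifecycle. The function and type names (`shutdownActionFor`, `cleanupPlan`, `BrowserMode`, `ShutdownAction`) are hypothetical and exist only for illustration; the real driver performs these steps on live objects instead of returning a list of strings.

```typescript
// Hypothetical sketch; names are illustrative, not the package's real API.
type BrowserMode = "local" | "cdp";
type ShutdownAction = "close" | "disconnect";

// A launched browser is owned by the session and closed with it; an
// attached browser is externally owned and only disconnected.
function shutdownActionFor(mode: BrowserMode): ShutdownAction {
  return mode === "local" ? "close" : "disconnect";
}

// Cleanup order from the run lifecycle: agents first, then tool servers,
// then the shared browser that both depend on.
function cleanupPlan(
  agents: string[],
  toolServers: string[],
  mode: BrowserMode,
): string[] {
  return [
    ...agents.map((name) => `stop agent ${name}`),
    ...toolServers.map((name) => `stop mcp ${name}`),
    `${shutdownActionFor(mode)} browser`,
  ];
}

// Two harnesses sharing one attached browser:
console.log(cleanupPlan(["claude-code", "opencode"], ["playwright"], "cdp"));
// → ["stop agent claude-code", "stop agent opencode", "stop mcp playwright", "disconnect browser"]
```

Stopping the browser last matters because agents and tool servers may still hold connections to it while they shut down.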
+ +## Support Matrix + +### Agent harnesses + +| Harness | Status | Notes | +| --- | --- | --- | +| `claude-code` | implemented, manually verified | Verified with Playwright MCP + local browser | +| `codex` | implemented, partially verified | Session wiring works, but MCP tool calls were cancelled in this environment | +| `gemini-cli` | implemented | Uses isolated `GEMINI_CLI_HOME` + `.gemini/settings.json`; live run here stops on missing Gemini auth | +| `opencode` | implemented, manually verified | Verified with Playwright MCP + local browser via the native OpenCode binary | +| `browser-use` | implemented, manually verified | Verified with shared local browser via CDP; currently uses browser-use native tools instead of external MCP bridging | +| `stagehand` | implemented | In-process Stagehand V3 harness; supports `dom`, `hybrid`, and `cua` modes | + +### MCP servers / toolsets + +| Toolset | Status | Notes | +| --- | --- | --- | +| `playwright` | implemented, manually verified | Supports `--cdp-endpoint` and shared browser ownership | +| `chrome-devtools` | implemented | Adapter implemented; not manually re-verified in this pass | +| `agent-browser` | implemented | MCP/tool adapter only | +| `browser-use` | implemented | Uses `uvx browser-use[cli] --mcp` by default | +| `stagehand-agent` | implemented, smoke-tested | Internal stdio MCP server enumerates Stagehand agent tools against a shared CDP browser | +| `understudy` | implemented, smoke-tested | Internal stdio MCP server enumerates Understudy page tools against a shared CDP browser | + +### Browser modes + +| Browser mode | Status | Notes | +| --- | --- | --- | +| `local` | implemented, manually verified | Launches local Chrome/Chromium through Puppeteer | +| `cdp` | implemented | Attaches to an existing CDP target | + +## Harness reference + +### `claude-code` + +- Binary: `claude` +- Invocation shape: `claude -p --output-format json ...` +- Session behavior: resumes with `--resume ` when a prior turn 
exists +- Tool integration: writes a temporary MCP config JSON and passes `--mcp-config ... --strict-mcp-config` +- Model support: forwarded through `--model` +- Permission mode: forwarded through `--permission-mode`, defaulting to `bypassPermissions` +- Auth/setup: requires a working Claude Code install and auth on the machine +- Best fit: external agent loop that should consume MCP servers directly + +### `codex` + +- Binary: `codex` +- Invocation shape: `codex exec --json ...` +- Session behavior: resumes with `codex exec resume --json ` +- Tool integration: injects MCP server definitions through `-c mcp_servers..*=...` +- Model support: forwarded through `--model` +- Permission/sandbox behavior: forces `approval_policy`, `sandbox_mode=workspace-write`, and network access on +- Auth/setup: requires a working Codex CLI install and auth on the machine +- Known limitation: in this environment, Codex session wiring works but MCP tool calls were cancelled at runtime + +### `gemini-cli` + +- Binary: `gemini` +- Invocation shape: `gemini --prompt ... 
--output-format json ...` +- Session behavior: resumes with `--resume ` +- Tool integration: creates an isolated temporary `GEMINI_CLI_HOME`, writes `.gemini/settings.json`, and injects MCP servers under `mcpServers` +- MCP allow-listing: passes `--allowed-mcp-server-names ...` for the attached servers +- Model support: forwarded through `--model` +- Permission mode: mapped to Gemini approval modes; `default`, `auto_edit`, `yolo`, and `plan` pass through unchanged, while `never`, `bypassPermissions`, and any other or unset value become `yolo` +- Auth/setup: requires Gemini auth or `GEMINI_API_KEY` +- Current verification: only the error path through the driver has been verified, because this environment does not have Gemini auth configured + +### `opencode` + +- Binary: resolves the native OpenCode binary directly, bypassing the broken Homebrew wrapper when necessary +- Invocation shape: `opencode run --format json ...` +- Session behavior: resumes with `--session ` +- Tool integration: injects an isolated `OPENCODE_CONFIG_CONTENT` JSON payload with local MCP server entries +- Model support: forwarded through `--model` +- Auth/setup: requires a working OpenCode install and auth on the machine +- Current verification: verified end-to-end with Playwright MCP + local browser + +### `browser-use` + +- Runtime: `uvx --from browser-use[...] 
python -c ...` +- Execution model: runs a small inline Python program that creates a `browser_use.Agent` +- Browser integration: connects to the shared browser via the session CDP URL +- Model support: provider is inferred from model prefix or available env vars + - `anthropic/...` + - `google/...` + - `browser-use/...` +- Auth/setup: requires `uv` plus provider credentials such as `ANTHROPIC_API_KEY` +- Tool integration: currently uses browser-use native tools only; attached MCP servers are intentionally rejected +- Current verification: verified end-to-end with a shared local browser and Anthropic-backed browser-use + +### `stagehand` + +- Runtime: in-process via `@browserbasehq/stagehand` +- Browser integration: connects Stagehand V3 to the shared browser CDP URL +- Tool integration: starts temporary MCP stdio clients and passes them as Stagehand integrations for the current turn +- Mode support: `dom`, `hybrid`, `cua` +- Model support: forwarded into `V3` and `agent(...)` +- Auth/setup: depends on whatever model/provider config Stagehand uses in your environment +- Best fit: native Stagehand agent loop over the shared browser + +## Toolset reference + +### `playwright` + +- Adapter binary: `@playwright/mcp` +- Launch shape: runs the package bin with Node +- Browser wiring: + - shared browser present: passes `--cdp-endpoint ` + - no shared browser: lets Playwright MCP launch its own browser +- Viewport support: passes `--viewport-size x` when configured +- Headless behavior: passes `--headless` only when not attaching to an existing browser +- Best fit: Claude Code, OpenCode, Codex, or Gemini harnesses that should drive the browser through MCP + +### `chrome-devtools` + +- Adapter binary: `chrome-devtools-mcp` +- Launch shape: runs the package bin with Node +- Browser wiring: + - shared browser present: passes `--browser-url=` + - no shared browser and headless requested: passes `--headless=true --isolated=true` +- Viewport support: passes `--viewport x` +- 
Extra behavior: always passes `--no-usage-statistics` +- Best fit: DevTools-oriented browser inspection/control through MCP + +### `agent-browser` + +- Adapter binary: `agent-browser-mcp` +- Additional runtime dependency: resolves the `agent-browser` CLI and exports it as `AGENT_BROWSER_PATH` +- Browser wiring: delegated to the upstream MCP server/runtime +- Best fit: exposing agent-browser capabilities as tools to another harness +- Note: this exists as a tool adapter, not as a first-class harness + +### `browser-use` + +- Adapter command: defaults to `uvx browser-use[cli] --mcp` +- Override support: `command` and `args` can replace the default launcher +- Browser wiring: delegated to the upstream browser-use MCP server +- Best fit: exposing browser-use MCP tools to another external harness +- Note: separate from the native `browser-use` harness described above + +### `stagehand-agent` + +- Adapter command: loops back into `multiagent mcp-server stagehand-agent` +- Runtime choice: + - built package: uses `dist/cli.js` + - source tree without build output: falls back to `node --import tsx lib/cli.ts` +- Browser wiring: passes `--cdp-url ` from the shared browser session +- Best fit: exposing Stagehand agent tools over stdio MCP to another harness + +### `understudy` + +- Adapter command: loops back into `multiagent mcp-server understudy` +- Runtime choice: + - built package: uses `dist/cli.js` + - source tree without build output: falls back to `node --import tsx lib/cli.ts` +- Browser wiring: passes `--cdp-url ` from the shared browser session +- Best fit: exposing Understudy page tools over stdio MCP to another harness + +## Browser mode reference + +### `local` + +- Launches Chrome/Chromium via `puppeteer-core` +- Default channel: `chrome` +- Default headless mode: `true` +- Exposes: + - a CDP WebSocket URL for harnesses and MCP servers + - a derived browser HTTP URL for adapters that expect `browserURL` +- Supports: + - `channel` + - `executablePath` + - 
`userDataDir` + - `viewport` + - `args` + - `ignoreHTTPSErrors` + - `connectTimeoutMs` + +### `cdp` + +- Attaches to an existing CDP browser via either: + - `browserURL` when the configured URL is `http(s)://...` + - `browserWSEndpoint` when the configured URL is `ws(s)://...` +- Treats the browser as externally owned +- On shutdown: disconnects instead of closing the browser +- Requires: `browser.cdpUrl` + +## Known limitations + +- `agent-browser` is supported as a tool/runtime adapter, not as an agent harness. +- `browser-use` currently does not bridge external MCP toolsets into the browser-use agent. It uses browser-use native tools only. +- `gemini-cli` requires Gemini auth. In this environment, the harness wiring works, but no `GEMINI_API_KEY` or Gemini auth profile is configured. + +## CLI usage + +Install the package: + +```bash +npm install @browserbasehq/multiagent +``` + +Run a single verified agent + toolset + browser combination: + +```bash +multiagent run \ + --task "Use the Playwright browser tools to open https://example.com and reply with only the page title." \ + --agent opencode \ + --mcp playwright \ + --browser local \ + --headless \ + --json +``` + +Run multiple harnesses against the same browser/tool combination: + +```bash +multiagent run \ + --task "Open https://example.com and summarize what is on the page." \ + --agent claude-code \ + --agent opencode \ + --mcp playwright \ + --browser local \ + --headless +``` + +Attach to an existing browser instead of launching one: + +```bash +multiagent run \ + --task "Inspect the current tab and return its title." \ + --agent claude-code \ + --mcp chrome-devtools \ + --browser cdp \ + --cdp-url ws://127.0.0.1:9222/devtools/browser/... +``` + +Use the browser-use harness with its native tool stack: + +```bash +multiagent run \ + --task "Open https://example.com and reply with only the page title." 
\ + --agent browser-use \ + --browser local \ + --headless \ + --model anthropic/claude-sonnet-4-20250514 \ + --json +``` + +Run Gemini with isolated settings: + +```bash +multiagent run \ + --task "Inspect the current page and summarize it." \ + --agent gemini-cli \ + --mcp playwright \ + --browser local \ + --headless \ + --json +``` + +Serve the internal MCP servers directly: + +```bash +multiagent mcp-server stagehand-agent --cdp-url ws://127.0.0.1:9222/devtools/browser/... +multiagent mcp-server understudy --cdp-url ws://127.0.0.1:9222/devtools/browser/... +``` + +## JSON config + +Use `--config` when you need per-agent options such as distinct models or Stagehand modes. + +```json +{ + "task": "Open example.com and summarize the page.", + "cwd": "/path/to/project", + "browser": { + "type": "local", + "headless": true, + "viewport": { + "width": 1440, + "height": 900 + } + }, + "mcpServers": [ + { "type": "playwright" }, + { "type": "understudy" } + ], + "agents": [ + { "type": "claude-code" }, + { "type": "opencode" }, + { + "type": "stagehand", + "stagehandMode": "hybrid", + "model": "google/gemini-2.0-flash" + } + ] +} +``` + +Run it: + +```bash +multiagent run --config ./multiagent.json --json +``` + +## Config reference + +### `browser` + +Supported keys come from `BrowserSessionOptions`: + +- `type`: `local` or `cdp` +- `cdpUrl`: required for `type: "cdp"` +- `headless` +- `executablePath` +- `channel` +- `userDataDir` +- `viewport` +- `args` +- `ignoreHTTPSErrors` +- `connectTimeoutMs` + +### `mcpServers[]` + +Supported keys come from `MCPServerOptions`: + +- `type` + - `playwright` + - `chrome-devtools` + - `agent-browser` + - `browser-use` + - `stagehand-agent` + - `understudy` +- `name` +- `enabled` +- `env` +- `args` +- `command` +- `browser` +- `transport` + +### `agents[]` + +Supported keys come from `AgentHarnessOptions`: + +- `type` + - `claude-code` + - `codex` + - `gemini-cli` + - `opencode` + - `browser-use` + - `stagehand` +- `model` +- 
`cwd` +- `env` +- `args` +- `permissionMode` +- `stagehandMode` + +## Library surface + +The package currently exports: + +- `BrowserSession` +- `MCPServer` +- `AgentSession` +- `MultiAgentDriver` + +That is enough to either use the bundled CLI or build a higher-level scheduler that decides which agents, toolsets, and browser sessions to compose for a task. + +## Development setup + +When you add or change a harness, toolset, browser mode, config field, auth prerequisite, or verification status, update this README in the same change. + +### Prerequisites + +- Node `^20.19.0 || >=22.12.0` +- `pnpm` +- local Chrome/Chromium if you want `browser.type = "local"` +- `uv` if you want the `browser-use` harness or `browser-use` MCP server +- external agent CLIs on your `PATH` for the harnesses you plan to use + - `claude` + - `codex` + - `gemini` + - `opencode` + +### Install + +From the repo root: + +```bash +pnpm install --ignore-scripts +``` + +### Build and test + +```bash +pnpm --filter @browserbasehq/multiagent run typecheck +pnpm --filter @browserbasehq/multiagent run test +pnpm --filter @browserbasehq/multiagent run build +``` + +### Useful smoke tests + +OpenCode + Playwright + local browser: + +```bash +node packages/multiagent/dist/cli.js run \ + --task "Use the Playwright browser tools to open https://example.com and reply with only the page title." \ + --agent opencode \ + --mcp playwright \ + --browser local \ + --headless \ + --json +``` + +browser-use + local browser: + +```bash +node packages/multiagent/dist/cli.js run \ + --task "Open https://example.com and reply with only the page title." \ + --agent browser-use \ + --browser local \ + --headless \ + --model anthropic/claude-sonnet-4-20250514 \ + --json +``` + +Gemini auth-path verification: + +```bash +node packages/multiagent/dist/cli.js run \ + --task "Open https://example.com and reply with only the page title." 
\ + --agent gemini-cli \ + --browser local \ + --headless \ + --json +``` + +### Credentials + +Each harness uses its native auth flow: + +- `claude-code`: Anthropic / Claude Code auth +- `codex`: Codex auth +- `gemini-cli`: Gemini auth or `GEMINI_API_KEY` +- `opencode`: OpenCode auth +- `browser-use`: whichever provider matches the selected model + - for example `ANTHROPIC_API_KEY` with `anthropic/...` + +## Verification + +Manual verification completed for: + +- `claude-code` harness +- `opencode` harness with Playwright MCP + local browser +- `browser-use` harness with shared local browser via CDP +- `gemini-cli` harness error path through the driver +- `playwright` MCP adapter +- local browser launch owned by `BrowserSession` +- internal `stagehand-agent` MCP server boot + tool enumeration +- internal `understudy` MCP server boot + tool enumeration + +Successful end-to-end runs returned `Example Domain` from `https://example.com` through the built `multiagent` CLI for both: + +- OpenCode + Playwright + local browser +- browser-use + local browser diff --git a/packages/multiagent/lib/agents/harnesses/base.ts b/packages/multiagent/lib/agents/harnesses/base.ts new file mode 100644 index 000000000..b6e9f0224 --- /dev/null +++ b/packages/multiagent/lib/agents/harnesses/base.ts @@ -0,0 +1,73 @@ +import { randomUUID } from "node:crypto"; +import fs from "node:fs/promises"; +import os from "node:os"; +import path from "node:path"; +import type { + AgentHarnessOptions, + AgentHarnessRunResult, + AgentRunInput, + NamedStdioLaunchConfig, +} from "../../types.js"; + +export interface AgentHarness { + readonly name: AgentHarnessOptions["type"]; + start(): Promise<void>; + stop(): Promise<void>; + runTurn(input: AgentRunInput): Promise<AgentHarnessRunResult>; +} + +export abstract class BaseHarness implements AgentHarness { + protected sessionId?: string; + private readonly tempDirs = new Set<string>(); + + constructor(protected readonly options: AgentHarnessOptions) {} + + abstract readonly name: 
AgentHarnessOptions["type"]; + + async start(): Promise<void> { + // default no-op + } + + async stop(): Promise<void> { + await Promise.all( + [...this.tempDirs].map(async (tempDir) => { + await fs.rm(tempDir, { recursive: true, force: true }); + }), + ); + this.tempDirs.clear(); + } + + abstract runTurn(input: AgentRunInput): Promise<AgentHarnessRunResult>; + + protected async writeTempFile( + baseName: string, + contents: string, + ): Promise<string> { + const tempDir = await this.createTempDir(); + const filePath = path.join(tempDir, baseName); + await fs.writeFile(filePath, contents, "utf8"); + return filePath; + } + + protected async createTempDir(prefix = "multiagent"): Promise<string> { + const tempDir = await fs.mkdtemp( + path.join(os.tmpdir(), `${prefix}-${randomUUID()}-`), + ); + this.tempDirs.add(tempDir); + return tempDir; + } + + protected normalizeMcpServers( + servers: NamedStdioLaunchConfig[], + ): NamedStdioLaunchConfig[] { + return servers.map((server) => ({ + name: server.name, + config: { + command: server.config.command, + args: server.config.args ?? [], + env: server.config.env ?? 
{}, + cwd: server.config.cwd, + }, + })); + } +} diff --git a/packages/multiagent/lib/agents/harnesses/browserUse.ts b/packages/multiagent/lib/agents/harnesses/browserUse.ts new file mode 100644 index 000000000..7d2debd9f --- /dev/null +++ b/packages/multiagent/lib/agents/harnesses/browserUse.ts @@ -0,0 +1,240 @@ +import type { + AgentHarnessOptions, + AgentHarnessRunResult, + AgentRunInput, +} from "../../types.js"; +import { BrowserSession } from "../../browser/session.js"; +import { + CommandExecutionError, + MultiagentError, +} from "../../utils/errors.js"; +import { runCommand } from "../../utils/process.js"; +import { BaseHarness } from "./base.js"; + +export interface BrowserUseProviderConfig { + packageSpec: string; + importStatement: string; + llmFactory: string; +} + +export interface BrowserUseScriptPayload { + task: string; + cdpUrl: string; + model?: string; +} + +const BROWSER_USE_PROVIDER_CONFIG: Record = { + anthropic: { + packageSpec: "browser-use[anthropic]", + importStatement: "from browser_use import ChatAnthropic", + llmFactory: "ChatAnthropic(model=model_name)", + }, + google: { + packageSpec: "browser-use[google]", + importStatement: "from browser_use import ChatGoogle", + llmFactory: "ChatGoogle(model=model_name)", + }, + "browser-use": { + packageSpec: "browser-use", + importStatement: "from browser_use import ChatBrowserUse", + llmFactory: "ChatBrowserUse()", + }, +}; + +export function resolveBrowserUseProvider( + model?: string, + env: NodeJS.ProcessEnv = process.env, +): { + provider: keyof typeof BROWSER_USE_PROVIDER_CONFIG; + modelName?: string; +} { + if (model?.startsWith("anthropic/")) { + return { + provider: "anthropic", + modelName: model.slice("anthropic/".length), + }; + } + + if (model?.startsWith("google/")) { + return { + provider: "google", + modelName: model.slice("google/".length), + }; + } + + if (model?.startsWith("browser-use/")) { + return { + provider: "browser-use", + modelName: model.slice("browser-use/".length), 
+ }; + } + + if (env.ANTHROPIC_API_KEY) { + return { + provider: "anthropic", + modelName: model ?? "claude-sonnet-4-20250514", + }; + } + + if (env.GOOGLE_API_KEY || env.GEMINI_API_KEY) { + return { + provider: "google", + modelName: model ?? "gemini-2.5-flash", + }; + } + + if (env.BROWSER_USE_API_KEY) { + return { + provider: "browser-use", + modelName: model, + }; + } + + throw new MultiagentError( + "Browser Use requires a supported model provider. Set ANTHROPIC_API_KEY, GOOGLE_API_KEY/GEMINI_API_KEY, or BROWSER_USE_API_KEY, or pass an explicit model prefix such as anthropic/... or google/....", + ); +} + +export function buildBrowserUseScript( + providerConfig: BrowserUseProviderConfig, +): string { + return ` +import asyncio +import json +import sys + +from browser_use import Agent, Browser +${providerConfig.importStatement} + + +async def main() -> None: + payload = json.loads(sys.stdin.read()) + browser = Browser(cdp_url=payload["cdpUrl"]) + model_name = payload.get("model") + llm = ${providerConfig.llmFactory} + agent = Agent( + task=payload["task"], + llm=llm, + browser=browser, + ) + try: + history = await agent.run(max_steps=20) + result = { + "finalResult": history.final_result(), + "errors": history.errors(), + "urls": history.urls(), + "raw": history.model_dump(mode="json"), + } + print(json.dumps(result)) + finally: + await browser.stop() + + +asyncio.run(main()) +`.trim(); +} + +export function parseBrowserUseResult(stdout: string): AgentHarnessRunResult { + const parsed = JSON.parse(stdout.trim()) as { + finalResult?: string | null; + errors?: Array; + urls?: string[]; + raw?: unknown; + }; + const errors = (parsed.errors ?? []).filter( + (value): value is string => typeof value === "string" && value.length > 0, + ); + + return { + content: + parsed.finalResult ?? + errors.join("\n") ?? 
+ "", + raw: parsed, + }; +} + +export class BrowserUseHarness extends BaseHarness { + readonly name = "browser-use" as const; + + constructor( + options: AgentHarnessOptions, + private readonly browserSession: BrowserSession, + ) { + super(options); + } + + async runTurn(input: AgentRunInput): Promise { + const cdpUrl = this.browserSession.getCdpUrl(); + if (!cdpUrl) { + throw new MultiagentError( + "Browser Use requires a BrowserSession with an active CDP URL.", + ); + } + + if (input.mcpServers.length > 0) { + throw new MultiagentError( + "Browser Use is implemented with its native tool stack, but external MCP server bridging is not implemented yet for this harness.", + ); + } + + const provider = resolveBrowserUseProvider(this.options.model, { + ...process.env, + ...(this.options.env ?? {}), + }); + const providerConfig = BROWSER_USE_PROVIDER_CONFIG[provider.provider]; + const script = buildBrowserUseScript(providerConfig); + const payload: BrowserUseScriptPayload = { + task: input.prompt, + cdpUrl, + model: provider.modelName, + }; + + try { + const { stdout } = await runCommand({ + command: "uvx", + args: [ + "--python", + "3.11", + "--from", + providerConfig.packageSpec, + "python", + "-c", + script, + ], + cwd: this.options.cwd ?? input.cwd, + env: this.options.env, + input: JSON.stringify(payload), + }); + + return parseBrowserUseResult(stdout); + } catch (error) { + if (error instanceof CommandExecutionError && error.details.stdout.trim()) { + try { + const raw = JSON.parse(error.details.stdout.trim()) as { + finalResult?: string | null; + errors?: Array; + }; + const errors = (raw.errors ?? 
[]).filter( + (value): value is string => + typeof value === "string" && value.length > 0, + ); + + if (typeof raw.finalResult === "string" && raw.finalResult.length > 0) { + return parseBrowserUseResult(error.details.stdout); + } + + if (errors.length > 0) { + throw new MultiagentError(errors.join("\n")); + } + } catch (parseError) { + if (parseError instanceof MultiagentError) { + throw parseError; + } + } + } + + throw error; + } + } +} diff --git a/packages/multiagent/lib/agents/harnesses/claudeCode.ts b/packages/multiagent/lib/agents/harnesses/claudeCode.ts new file mode 100644 index 000000000..6d448a6a1 --- /dev/null +++ b/packages/multiagent/lib/agents/harnesses/claudeCode.ts @@ -0,0 +1,101 @@ +import type { + AgentHarnessRunResult, + AgentHarnessOptions, + AgentRunInput, + NamedStdioLaunchConfig, +} from "../../types.js"; +import { runCommand } from "../../utils/process.js"; +import { BaseHarness } from "./base.js"; + +type ClaudeJsonResult = { + result?: string; + session_id?: string; + usage?: { + input_tokens?: number; + output_tokens?: number; + cache_read_input_tokens?: number; + cache_creation_input_tokens?: number; + }; +}; + +function buildClaudeMcpConfig(mcpServers: NamedStdioLaunchConfig[]): string { + const mcpServersRecord = Object.fromEntries( + mcpServers.map((server) => [ + server.name, + { + command: server.config.command, + args: server.config.args ?? [], + env: server.config.env ?? {}, + }, + ]), + ); + + return JSON.stringify({ mcpServers: mcpServersRecord }, null, 2); +} + +export class ClaudeCodeHarness extends BaseHarness { + readonly name = "claude-code" as const; + + constructor(options: AgentHarnessOptions) { + super(options); + } + + async runTurn(input: AgentRunInput): Promise { + const args = [ + "-p", + "--output-format", + "json", + "--permission-mode", + this.options.permissionMode ?? 
"bypassPermissions", + ]; + + if (this.options.model) { + args.push("--model", this.options.model); + } + + const normalizedServers = this.normalizeMcpServers(input.mcpServers); + if (normalizedServers.length > 0) { + const configPath = await this.writeTempFile( + "claude-mcp.json", + buildClaudeMcpConfig(normalizedServers), + ); + args.push("--mcp-config", configPath, "--strict-mcp-config"); + } + + if (this.sessionId) { + args.push("--resume", this.sessionId); + } + + if (this.options.args?.length) { + args.push(...this.options.args); + } + + args.push(input.prompt); + + const { stdout } = await runCommand({ + command: "claude", + args, + cwd: this.options.cwd ?? input.cwd, + env: this.options.env, + }); + + const parsed = JSON.parse(stdout.trim()) as ClaudeJsonResult; + this.sessionId = parsed.session_id ?? this.sessionId; + + return { + sessionId: this.sessionId, + content: String(parsed.result ?? ""), + raw: parsed, + usage: parsed.usage + ? { + inputTokens: parsed.usage.input_tokens, + outputTokens: parsed.usage.output_tokens, + cachedInputTokens: + (parsed.usage.cache_creation_input_tokens ?? 0) + + (parsed.usage.cache_read_input_tokens ?? 
0), + raw: parsed.usage, + } + : undefined, + }; + } +} diff --git a/packages/multiagent/lib/agents/harnesses/codex.ts b/packages/multiagent/lib/agents/harnesses/codex.ts new file mode 100644 index 000000000..55b9b01d9 --- /dev/null +++ b/packages/multiagent/lib/agents/harnesses/codex.ts @@ -0,0 +1,146 @@ +import type { + AgentHarnessOptions, + AgentHarnessRunResult, + AgentRunInput, + NamedStdioLaunchConfig, +} from "../../types.js"; +import { runCommand } from "../../utils/process.js"; +import { BaseHarness } from "./base.js"; + +type CodexEvent = + | { type: "thread.started"; thread_id: string } + | { type: "item.completed"; item: { type: string; text?: string } } + | { + type: "turn.completed"; + usage?: { + input_tokens?: number; + output_tokens?: number; + cached_input_tokens?: number; + }; + }; + +function sanitizeServerName(name: string): string { + return name.replace(/[^a-zA-Z0-9_-]/g, "_"); +} + +function tomlString(value: string): string { + return JSON.stringify(value); +} + +function tomlArray(values: string[]): string { + return `[${values.map(tomlString).join(", ")}]`; +} + +function tomlInlineTable(values: Record<string, string>): string { + const entries = Object.entries(values).map( + ([key, value]) => `${key}=${tomlString(value)}`, + ); + return `{${entries.join(", ")}}`; +} + +function buildCodexMcpArgs(mcpServers: NamedStdioLaunchConfig[]): string[] { + const args: string[] = []; + + for (const server of mcpServers) { + const name = sanitizeServerName(server.name); + args.push("-c", `mcp_servers.${name}.command=${tomlString(server.config.command)}`); + args.push( + "-c", + `mcp_servers.${name}.args=${tomlArray(server.config.args ?? 
[])}`, + ); + if (server.config.env && Object.keys(server.config.env).length > 0) { + args.push( + "-c", + `mcp_servers.${name}.env=${tomlInlineTable(server.config.env)}`, + ); + } + } + + return args; +} + +function parseCodexJsonl(stdout: string): AgentHarnessRunResult { + const events = stdout + .split("\n") + .map((line) => line.trim()) + .filter(Boolean) + .map((line) => JSON.parse(line) as CodexEvent); + + const threadStarted = events.find( + (event): event is Extract<CodexEvent, { type: "thread.started" }> => + event.type === "thread.started", + ); + const agentMessages = events.filter( + (event): event is Extract<CodexEvent, { type: "item.completed" }> => + event.type === "item.completed" && event.item.type === "agent_message", + ); + const turnCompleted = events.find( + (event): event is Extract<CodexEvent, { type: "turn.completed" }> => + event.type === "turn.completed", + ); + + return { + sessionId: threadStarted?.thread_id, + content: agentMessages.at(-1)?.item.text ?? "", + raw: events, + usage: turnCompleted?.usage + ? { + inputTokens: turnCompleted.usage.input_tokens, + outputTokens: turnCompleted.usage.output_tokens, + cachedInputTokens: turnCompleted.usage.cached_input_tokens, + raw: turnCompleted.usage, + } + : undefined, + }; +} + +export class CodexHarness extends BaseHarness { + readonly name = "codex" as const; + + constructor(options: AgentHarnessOptions) { + super(options); + } + + async runTurn(input: AgentRunInput): Promise<AgentHarnessRunResult> { + const baseArgs = this.sessionId + ? ["exec", "resume", "--json", this.sessionId] + : ["exec", "--json"]; + + const args = [...baseArgs]; + + if (this.options.model) { + args.push("--model", this.options.model); + } + + args.push( + "-c", + `approval_policy=${tomlString(this.options.permissionMode ?? 
"never")}`, + "-c", + `sandbox_mode=${tomlString("workspace-write")}`, + "-c", + "sandbox_workspace_write.network_access=true", + ); + + args.push(...buildCodexMcpArgs(this.normalizeMcpServers(input.mcpServers))); + + if (this.options.args?.length) { + args.push(...this.options.args); + } + + args.push(input.prompt); + + const { stdout } = await runCommand({ + command: "codex", + args, + cwd: this.options.cwd ?? input.cwd, + env: this.options.env, + }); + + const result = parseCodexJsonl(stdout); + this.sessionId = result.sessionId ?? this.sessionId; + return { + ...result, + sessionId: this.sessionId, + }; + } +} diff --git a/packages/multiagent/lib/agents/harnesses/geminiCli.ts b/packages/multiagent/lib/agents/harnesses/geminiCli.ts new file mode 100644 index 000000000..fbef3e735 --- /dev/null +++ b/packages/multiagent/lib/agents/harnesses/geminiCli.ts @@ -0,0 +1,192 @@ +import fs from "node:fs/promises"; +import path from "node:path"; +import type { + AgentHarnessOptions, + AgentHarnessRunResult, + AgentRunInput, + NamedStdioLaunchConfig, +} from "../../types.js"; +import { + CommandExecutionError, + MultiagentError, +} from "../../utils/errors.js"; +import { runCommand } from "../../utils/process.js"; +import { BaseHarness } from "./base.js"; + +export interface GeminiJsonResult { + session_id?: string; + response?: string; + stats?: Record; + error?: { + type?: string; + message?: string; + code?: number | string; + }; +} + +function normalizeApprovalMode(permissionMode?: string): string { + if ( + permissionMode === "default" || + permissionMode === "auto_edit" || + permissionMode === "yolo" || + permissionMode === "plan" + ) { + return permissionMode; + } + + if ( + permissionMode === "bypassPermissions" || + permissionMode === "never" + ) { + return "yolo"; + } + + return "yolo"; +} + +export function buildGeminiSettings( + mcpServers: NamedStdioLaunchConfig[], +): Record { + const settings: Record = {}; + + if (mcpServers.length > 0) { + 
settings.mcpServers = Object.fromEntries( + mcpServers.map((server) => [ + server.name, + { + type: "stdio", + command: server.config.command, + args: server.config.args ?? [], + env: server.config.env ?? {}, + cwd: server.config.cwd, + }, + ]), + ); + settings.mcp = { + allowed: mcpServers.map((server) => server.name), + }; + } + + return settings; +} + +export function parseGeminiJsonResult(stdout: string): GeminiJsonResult { + const trimmed = stdout.trim(); + const jsonStart = trimmed.indexOf("{"); + + if (jsonStart === -1) { + throw new Error("Gemini CLI did not emit JSON output."); + } + + return JSON.parse(trimmed.slice(jsonStart)) as GeminiJsonResult; +} + +function buildGeminiError( + parsed: GeminiJsonResult, + fallback: string, +): MultiagentError { + return new MultiagentError(parsed.error?.message ?? fallback); +} + +export class GeminiCliHarness extends BaseHarness { + readonly name = "gemini-cli" as const; + + constructor(options: AgentHarnessOptions) { + super(options); + } + + async runTurn(input: AgentRunInput): Promise { + const tempHome = await this.createTempDir("multiagent-gemini"); + const geminiDir = path.join(tempHome, ".gemini"); + await fs.mkdir(geminiDir, { recursive: true }); + + const normalizedServers = this.normalizeMcpServers(input.mcpServers); + const settingsPath = path.join(geminiDir, "settings.json"); + await fs.writeFile( + settingsPath, + JSON.stringify(buildGeminiSettings(normalizedServers), null, 2), + "utf8", + ); + + const args = [ + "--prompt", + input.prompt, + "--output-format", + "json", + "--approval-mode", + normalizeApprovalMode(this.options.permissionMode), + ]; + + if (this.options.model) { + args.push("--model", this.options.model); + } + + if (this.sessionId) { + args.push("--resume", this.sessionId); + } + + if (normalizedServers.length > 0) { + args.push( + "--allowed-mcp-server-names", + ...normalizedServers.map((server) => server.name), + ); + } + + if (this.options.args?.length) { + 
args.push(...this.options.args); + } + + try { + const { stdout } = await runCommand({ + command: "gemini", + args, + cwd: this.options.cwd ?? input.cwd, + env: { + GEMINI_CLI_HOME: tempHome, + ...(this.options.env ?? {}), + }, + }); + + const parsed = parseGeminiJsonResult(stdout); + if (parsed.error) { + throw buildGeminiError(parsed, "Gemini CLI returned an error."); + } + + this.sessionId = parsed.session_id ?? this.sessionId; + return { + sessionId: this.sessionId, + content: parsed.response ?? "", + raw: parsed, + usage: parsed.stats + ? { + raw: parsed.stats, + } + : undefined, + }; + } catch (error) { + if (error instanceof CommandExecutionError) { + for (const candidate of [error.details.stdout, error.details.stderr]) { + if (!candidate.trim()) { + continue; + } + + try { + const parsed = parseGeminiJsonResult(candidate); + if (parsed.error) { + throw buildGeminiError( + parsed, + `Gemini CLI exited with code ${error.details.exitCode ?? "unknown"}.`, + ); + } + } catch (parseError) { + if (parseError instanceof MultiagentError) { + throw parseError; + } + } + } + } + + throw error; + } + } +} diff --git a/packages/multiagent/lib/agents/harnesses/opencode.ts b/packages/multiagent/lib/agents/harnesses/opencode.ts new file mode 100644 index 000000000..0d9831eee --- /dev/null +++ b/packages/multiagent/lib/agents/harnesses/opencode.ts @@ -0,0 +1,281 @@ +import fs from "node:fs"; +import type { + AgentHarnessOptions, + AgentHarnessRunResult, + AgentRunInput, + NamedStdioLaunchConfig, +} from "../../types.js"; +import { + CommandExecutionError, + MultiagentError, +} from "../../utils/errors.js"; +import { runCommand } from "../../utils/process.js"; +import { BaseHarness } from "./base.js"; + +export type OpencodeEvent = + | { type: "step_start"; sessionID?: string } + | { + type: "text"; + sessionID?: string; + part?: { text?: string }; + } + | { + type: "step_finish"; + sessionID?: string; + part?: { + cost?: number; + tokens?: { + input?: number; + output?: 
number; + reasoning?: number; + cache?: { + read?: number; + write?: number; + }; + }; + }; + } + | { + type: "error"; + sessionID?: string; + error?: { + name?: string; + data?: { + message?: string; + }; + message?: string; + }; + }; + +function getOpencodePlatformPrefix( + platform: string = process.platform, + arch: string = process.arch, +): string | null { + const platformName = + platform === "darwin" + ? "darwin" + : platform === "linux" + ? "linux" + : platform === "win32" + ? "windows" + : null; + const archName = + arch === "arm64" ? "arm64" : arch === "x64" ? "x64" : arch === "arm" ? "arm" : null; + + if (!platformName || !archName) { + return null; + } + + return `opencode-${platformName}-${archName}`; +} + +function getHomebrewCellars(): string[] { + return [ + "/opt/homebrew/Cellar/opencode", + "/usr/local/Cellar/opencode", + "/home/linuxbrew/.linuxbrew/Cellar/opencode", + ]; +} + +export function resolveOpencodeBinaryPath(runtime?: { + env?: NodeJS.ProcessEnv; + existsSync?: typeof fs.existsSync; + readdirSync?: typeof fs.readdirSync; + platform?: string; + arch?: string; +}): string { + const env = runtime?.env ?? process.env; + const existsSync = runtime?.existsSync ?? fs.existsSync; + const readdirSync = runtime?.readdirSync ?? fs.readdirSync; + + const envOverride = + env.MULTIAGENT_OPENCODE_BIN ?? 
env.OPENCODE_BIN_PATH; + if (envOverride && existsSync(envOverride)) { + return envOverride; + } + + const packagePrefix = getOpencodePlatformPrefix( + runtime?.platform, + runtime?.arch, + ); + if (packagePrefix) { + for (const cellar of getHomebrewCellars()) { + if (!existsSync(cellar)) { + continue; + } + + const versions = readdirSync(cellar).sort().reverse(); + for (const version of versions) { + const modulesDir = `${cellar}/${version}/libexec/lib/node_modules/opencode-ai/node_modules`; + if (!existsSync(modulesDir)) { + continue; + } + + const packageDirs = readdirSync(modulesDir) + .filter((entry) => entry.startsWith(packagePrefix)) + .sort((left, right) => { + const leftPenalty = left.includes("baseline") ? 1 : 0; + const rightPenalty = right.includes("baseline") ? 1 : 0; + return leftPenalty - rightPenalty; + }); + + for (const packageDir of packageDirs) { + const binaryPath = `${modulesDir}/${packageDir}/bin/opencode`; + if (existsSync(binaryPath)) { + return binaryPath; + } + } + } + } + } + + return "opencode"; +} + +export function buildOpencodeConfig( + mcpServers: NamedStdioLaunchConfig[], +): Record { + return { + mcp: Object.fromEntries( + mcpServers.map((server) => [ + server.name, + { + type: "local", + enabled: true, + command: [ + server.config.command, + ...(server.config.args ?? []), + ], + environment: server.config.env ?? {}, + }, + ]), + ), + }; +} + +export function parseOpencodeJsonl(stdout: string): AgentHarnessRunResult { + const events = stdout + .split("\n") + .map((line) => line.trim()) + .filter(Boolean) + .map((line) => JSON.parse(line) as OpencodeEvent); + + const sessionId = + [...events] + .reverse() + .find((event) => typeof event.sessionID === "string")?.sessionID ?? + undefined; + const content = events + .filter( + (event): event is Extract => + event.type === "text", + ) + .map((event) => event.part?.text ?? 
"") + .join(""); + const lastStepFinish = [...events] + .reverse() + .find( + (event): event is Extract => + event.type === "step_finish", + ); + + return { + sessionId, + content, + raw: events, + usage: lastStepFinish?.part?.tokens + ? { + inputTokens: lastStepFinish.part.tokens.input, + outputTokens: lastStepFinish.part.tokens.output, + cachedInputTokens: + (lastStepFinish.part.tokens.cache?.read ?? 0) + + (lastStepFinish.part.tokens.cache?.write ?? 0), + raw: lastStepFinish.part.tokens, + } + : undefined, + }; +} + +function parseOpencodeError(stdout: string, fallback: string): MultiagentError | null { + try { + const lastEvent = stdout + .split("\n") + .map((line) => line.trim()) + .filter(Boolean) + .map((line) => JSON.parse(line) as OpencodeEvent) + .at(-1); + + if (lastEvent?.type === "error") { + return new MultiagentError( + lastEvent.error?.data?.message ?? + lastEvent.error?.message ?? + fallback, + ); + } + } catch { + // ignore JSON parsing fallback errors + } + + return null; +} + +export class OpencodeHarness extends BaseHarness { + readonly name = "opencode" as const; + + constructor(options: AgentHarnessOptions) { + super(options); + } + + async runTurn(input: AgentRunInput): Promise { + const binaryPath = resolveOpencodeBinaryPath(); + const normalizedServers = this.normalizeMcpServers(input.mcpServers); + const args = ["run", "--format", "json"]; + + if (this.options.model) { + args.push("--model", this.options.model); + } + + if (this.sessionId) { + args.push("--session", this.sessionId); + } + + if (this.options.args?.length) { + args.push(...this.options.args); + } + + args.push(input.prompt); + + try { + const { stdout } = await runCommand({ + command: binaryPath, + args, + cwd: this.options.cwd ?? input.cwd, + env: { + OPENCODE_CONFIG_CONTENT: JSON.stringify( + buildOpencodeConfig(normalizedServers), + ), + ...(this.options.env ?? {}), + }, + }); + + const result = parseOpencodeJsonl(stdout); + this.sessionId = result.sessionId ?? 
this.sessionId; + return { + ...result, + sessionId: this.sessionId, + }; + } catch (error) { + if (error instanceof CommandExecutionError) { + const parsedError = parseOpencodeError( + error.details.stdout, + `OpenCode exited with code ${error.details.exitCode ?? "unknown"}.`, + ); + if (parsedError) { + throw parsedError; + } + } + + throw error; + } + } +} diff --git a/packages/multiagent/lib/agents/harnesses/stagehand.ts b/packages/multiagent/lib/agents/harnesses/stagehand.ts new file mode 100644 index 000000000..4f8b9781d --- /dev/null +++ b/packages/multiagent/lib/agents/harnesses/stagehand.ts @@ -0,0 +1,123 @@ +import { Client } from "@modelcontextprotocol/sdk/client/index.js"; +import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js"; +import { V3 } from "@browserbasehq/stagehand"; +import type { + AgentHarnessOptions, + AgentHarnessRunResult, + AgentRunInput, +} from "../../types.js"; +import { BrowserSession } from "../../browser/session.js"; +import { BaseHarness } from "./base.js"; + +function isRecord(value: unknown): value is Record { + return !!value && typeof value === "object"; +} + +export class StagehandHarness extends BaseHarness { + readonly name = "stagehand" as const; + private v3: V3 | null = null; + private readonly browserSession: BrowserSession; + + constructor(options: AgentHarnessOptions, browserSession: BrowserSession) { + super(options); + this.browserSession = browserSession; + } + + async start(): Promise { + if (this.v3) { + return; + } + + this.v3 = new V3({ + env: "LOCAL", + localBrowserLaunchOptions: { + cdpUrl: this.browserSession.getCdpUrl(), + }, + model: this.options.model, + experimental: true, + verbose: 0, + }); + await this.v3.init(); + } + + async stop(): Promise { + await this.v3?.close(); + this.v3 = null; + await super.stop(); + } + + async runTurn(input: AgentRunInput): Promise { + await this.start(); + + const integrations: Client[] = []; + const clients: Client[] = []; + const transports: 
StdioClientTransport[] = []; + + try { + for (const serverConfig of input.mcpServers) { + const transport = new StdioClientTransport({ + command: serverConfig.config.command, + args: serverConfig.config.args, + env: { + ...process.env, + ...(serverConfig.config.env ?? {}), + }, + cwd: serverConfig.config.cwd, + }); + const client = new Client({ + name: `multiagent-stagehand-${serverConfig.name}`, + version: "0.1.0", + }); + await client.connect(transport); + await client.ping(); + integrations.push(client); + clients.push(client); + transports.push(transport); + } + + const mode = this.options.stagehandMode ?? "dom"; + const agent = this.v3!.agent({ + mode, + model: this.options.model, + integrations, + }); + const result = await agent.execute({ + instruction: input.prompt, + maxSteps: 20, + }); + + const usage = + isRecord(result) && isRecord(result.usage) + ? { + inputTokens: + typeof result.usage.input_tokens === "number" + ? result.usage.input_tokens + : undefined, + outputTokens: + typeof result.usage.output_tokens === "number" + ? result.usage.output_tokens + : undefined, + cachedInputTokens: + typeof result.usage.cached_input_tokens === "number" + ? result.usage.cached_input_tokens + : undefined, + raw: result.usage, + } + : undefined; + + return { + content: + isRecord(result) && typeof result.message === "string" + ? 
result.message + : JSON.stringify(result), + raw: result, + usage, + }; + } finally { + await Promise.all(clients.map(async (client) => await client.close())); + await Promise.all( + transports.map(async (transport) => await transport.close()), + ); + } + } +} diff --git a/packages/multiagent/lib/agents/harnesses/stub.ts b/packages/multiagent/lib/agents/harnesses/stub.ts new file mode 100644 index 000000000..1257e29cf --- /dev/null +++ b/packages/multiagent/lib/agents/harnesses/stub.ts @@ -0,0 +1,23 @@ +import type { + AgentHarnessOptions, + AgentHarnessRunResult, + AgentRunInput, +} from "../../types.js"; +import { UnsupportedAdapterError } from "../../utils/errors.js"; +import { BaseHarness } from "./base.js"; + +export class StubHarness extends BaseHarness { + constructor( + readonly name: + | "gemini-cli" + | "opencode" + | "browser-use", + options: AgentHarnessOptions, + ) { + super(options); + } + + async runTurn(_input: AgentRunInput): Promise { + throw new UnsupportedAdapterError("Agent harness", this.name); + } +} diff --git a/packages/multiagent/lib/agents/index.ts b/packages/multiagent/lib/agents/index.ts new file mode 100644 index 000000000..4ca9526af --- /dev/null +++ b/packages/multiagent/lib/agents/index.ts @@ -0,0 +1 @@ +export { AgentSession } from "./session.js"; diff --git a/packages/multiagent/lib/agents/registry.ts b/packages/multiagent/lib/agents/registry.ts new file mode 100644 index 000000000..72914d1f7 --- /dev/null +++ b/packages/multiagent/lib/agents/registry.ts @@ -0,0 +1,32 @@ +import type { BrowserSession } from "../browser/session.js"; +import type { AgentHarnessOptions } from "../types.js"; +import { UnsupportedAdapterError } from "../utils/errors.js"; +import { BrowserUseHarness } from "./harnesses/browserUse.js"; +import type { AgentHarness } from "./harnesses/base.js"; +import { ClaudeCodeHarness } from "./harnesses/claudeCode.js"; +import { CodexHarness } from "./harnesses/codex.js"; +import { GeminiCliHarness } from 
"./harnesses/geminiCli.js"; +import { OpencodeHarness } from "./harnesses/opencode.js"; +import { StagehandHarness } from "./harnesses/stagehand.js"; + +export function createAgentHarness( + options: AgentHarnessOptions, + browserSession: BrowserSession, +): AgentHarness { + switch (options.type) { + case "claude-code": + return new ClaudeCodeHarness(options); + case "codex": + return new CodexHarness(options); + case "gemini-cli": + return new GeminiCliHarness(options); + case "opencode": + return new OpencodeHarness(options); + case "browser-use": + return new BrowserUseHarness(options, browserSession); + case "stagehand": + return new StagehandHarness(options, browserSession); + default: + throw new UnsupportedAdapterError("Agent harness", String(options.type)); + } +} diff --git a/packages/multiagent/lib/agents/session.ts b/packages/multiagent/lib/agents/session.ts new file mode 100644 index 000000000..7a56401b9 --- /dev/null +++ b/packages/multiagent/lib/agents/session.ts @@ -0,0 +1,87 @@ +import { randomUUID } from "node:crypto"; +import type { BrowserSession } from "../browser/session.js"; +import type { + AgentHarnessOptions, + AgentMessage, + AgentTurnResult, + NamedStdioLaunchConfig, +} from "../types.js"; +import { MCPServer } from "../mcp/server.js"; +import { createAgentHarness } from "./registry.js"; + +export interface AgentSessionOptions { + harness: AgentHarnessOptions; + browserSession: BrowserSession; + cwd: string; +} + +export class AgentSession { + readonly id = randomUUID(); + private readonly harness; + private readonly messages: AgentMessage[] = []; + private readonly mcpServers: MCPServer[] = []; + + constructor(private readonly options: AgentSessionOptions) { + this.harness = createAgentHarness( + options.harness, + options.browserSession, + ); + } + + async start(): Promise { + await this.harness.start(); + } + + async stop(): Promise { + await this.harness.stop(); + } + + attachMCPServer(server: MCPServer): void { + 
this.mcpServers.push(server); + } + + getMessages(): AgentMessage[] { + return [...this.messages]; + } + + private async getNamedLaunchConfigs(): Promise<NamedStdioLaunchConfig[]> { + return await Promise.all( + this.mcpServers.map(async (server) => ({ + name: server.getName(), + config: await server.getLaunchConfig(), + })), + ); + } + + async addUserMessage(content: string): Promise<AgentTurnResult> { + const userMessage: AgentMessage = { + id: randomUUID(), + role: "user", + content, + createdAt: new Date().toISOString(), + }; + this.messages.push(userMessage); + + const result = await this.harness.runTurn({ + prompt: content, + mcpServers: await this.getNamedLaunchConfigs(), + cwd: this.options.cwd, + }); + + const assistantMessage: AgentMessage = { + id: randomUUID(), + role: "assistant", + content: result.content, + createdAt: new Date().toISOString(), + raw: result.raw, + }; + this.messages.push(assistantMessage); + + return { + sessionId: result.sessionId, + message: assistantMessage, + raw: result.raw, + usage: result.usage, + }; + } +} diff --git a/packages/multiagent/lib/browser/index.ts b/packages/multiagent/lib/browser/index.ts new file mode 100644 index 000000000..374a4a0af --- /dev/null +++ b/packages/multiagent/lib/browser/index.ts @@ -0,0 +1 @@ +export { BrowserSession } from "./session.js"; diff --git a/packages/multiagent/lib/browser/session.ts b/packages/multiagent/lib/browser/session.ts new file mode 100644 index 000000000..61d28e074 --- /dev/null +++ b/packages/multiagent/lib/browser/session.ts @@ -0,0 +1,140 @@ +import { randomUUID } from "node:crypto"; +import puppeteer, { type Browser } from "puppeteer-core"; +import type { + BrowserSessionMetadata, + BrowserSessionOptions, + BrowserTargetName, +} from "../types.js"; +import { MultiagentError } from "../utils/errors.js"; + +function deriveBrowserUrl(cdpUrl?: string): string | undefined { + if (!cdpUrl) { + return undefined; + } + + if (cdpUrl.startsWith("http://") || cdpUrl.startsWith("https://")) { + return cdpUrl; + } + + try { + 
const url = new URL(cdpUrl); + if (url.protocol === "ws:" || url.protocol === "wss:") { + url.protocol = url.protocol === "ws:" ? "http:" : "https:"; + url.pathname = ""; + url.search = ""; + url.hash = ""; + return url.toString().replace(/\/$/, ""); + } + } catch { + // best-effort normalization only + } + + return undefined; +} + +function normalizeType(options: BrowserSessionOptions): BrowserTargetName { + return options.type ?? (options.cdpUrl ? "cdp" : "local"); +} + +export class BrowserSession { + private browser: Browser | null = null; + private connected = false; + private readonly id = randomUUID(); + private cdpUrl?: string; + private browserUrl?: string; + private readonly type: BrowserTargetName; + private readonly launched: boolean; + + constructor(private readonly options: BrowserSessionOptions = {}) { + this.type = normalizeType(options); + this.launched = this.type === "local"; + } + + async start(): Promise { + if (this.connected) { + return; + } + + if (this.type === "cdp") { + const cdpUrl = this.options.cdpUrl?.trim(); + if (!cdpUrl) { + throw new MultiagentError( + "BrowserSession configured for CDP mode without a cdpUrl.", + ); + } + + this.browser = + cdpUrl.startsWith("http://") || cdpUrl.startsWith("https://") + ? await puppeteer.connect({ + browserURL: cdpUrl, + protocolTimeout: this.options.connectTimeoutMs, + }) + : await puppeteer.connect({ + browserWSEndpoint: cdpUrl, + protocolTimeout: this.options.connectTimeoutMs, + }); + this.cdpUrl = this.browser.wsEndpoint(); + this.browserUrl = deriveBrowserUrl(this.cdpUrl) ?? deriveBrowserUrl(cdpUrl); + this.connected = true; + return; + } + + this.browser = await puppeteer.launch({ + channel: this.options.executablePath ? undefined : this.options.channel ?? "chrome", + executablePath: this.options.executablePath, + headless: this.options.headless ?? 
true, + userDataDir: this.options.userDataDir, + args: [ + "--remote-allow-origins=*", + "--no-first-run", + "--no-default-browser-check", + ...(this.options.ignoreHTTPSErrors + ? ["--ignore-certificate-errors"] + : []), + ...(this.options.args ?? []), + ], + defaultViewport: this.options.viewport ?? null, + protocolTimeout: this.options.connectTimeoutMs, + }); + + this.cdpUrl = this.browser.wsEndpoint(); + this.browserUrl = deriveBrowserUrl(this.cdpUrl); + this.connected = true; + } + + async stop(): Promise { + if (!this.browser) { + return; + } + + if (this.launched) { + await this.browser.close(); + } else { + await this.browser.disconnect(); + } + + this.browser = null; + this.connected = false; + } + + getMetadata(): BrowserSessionMetadata { + return { + id: this.id, + type: this.type, + cdpUrl: this.cdpUrl, + browserUrl: this.browserUrl, + launched: this.launched, + headless: this.options.headless ?? true, + userDataDir: this.options.userDataDir, + viewport: this.options.viewport, + }; + } + + getCdpUrl(): string | undefined { + return this.cdpUrl; + } + + getBrowserUrl(): string | undefined { + return this.browserUrl; + } +} diff --git a/packages/multiagent/lib/cli.ts b/packages/multiagent/lib/cli.ts new file mode 100644 index 000000000..d41d94b48 --- /dev/null +++ b/packages/multiagent/lib/cli.ts @@ -0,0 +1,205 @@ +#!/usr/bin/env node + +import fs from "node:fs/promises"; +import { parseArgs } from "node:util"; +import type { + AgentHarnessName, + MCPServerName, + MultiAgentRunOptions, +} from "./types.js"; +import { + startStagehandAgentMCPServer, + startUnderstudyMcpServer, +} from "./mcp/internal/index.js"; +import { MultiAgentDriver } from "./runtime/driver.js"; +import { UnsupportedAdapterError } from "./utils/errors.js"; + +function readList(value: string | string[] | undefined): string[] { + if (!value) { + return []; + } + return Array.isArray(value) ? 
value : [value]; +} + +function printUsage(): void { + console.error(`Usage: + multiagent run --task "..." --agent claude-code --mcp playwright + multiagent mcp-server --cdp-url ws://127.0.0.1:9222/devtools/browser/... +`); +} + +async function runCommand(argv: string[]): Promise { + const { values } = parseArgs({ + args: argv, + options: { + config: { type: "string" }, + task: { type: "string" }, + agent: { type: "string", multiple: true }, + mcp: { type: "string", multiple: true }, + browser: { type: "string" }, + "cdp-url": { type: "string" }, + headless: { type: "boolean" }, + cwd: { type: "string" }, + json: { type: "boolean" }, + model: { type: "string" }, + }, + allowPositionals: true, + }); + + if (values.config) { + const configContents = await fs.readFile(values.config, "utf8"); + const config = JSON.parse(configContents) as MultiAgentRunOptions; + const driver = new MultiAgentDriver(config); + const result = await driver.run(); + + if (values.json) { + process.stdout.write(`${JSON.stringify(result, null, 2)}\n`); + return; + } + + process.stdout.write(`Browser ${result.browser.id} (${result.browser.type})\n`); + for (const agent of result.agents) { + process.stdout.write(`\n[${agent.harness}]\n`); + if (agent.error) { + process.stdout.write(`error: ${agent.error}\n`); + continue; + } + process.stdout.write(`${agent.content}\n`); + } + return; + } + + const agents = readList(values.agent) as AgentHarnessName[]; + const mcpServers = readList(values.mcp) as MCPServerName[]; + const task = + values.task ?? + (argv.length > 0 && !argv[0]?.startsWith("-") ? argv[0] : undefined); + + if (!task || agents.length === 0) { + printUsage(); + process.exitCode = 1; + return; + } + + const options: MultiAgentRunOptions = { + task, + cwd: values.cwd, + browser: { + type: + values.browser === "cdp" || values["cdp-url"] + ? 
"cdp" + : "local", + cdpUrl: values["cdp-url"], + headless: values.headless, + }, + agents: agents.map((agent) => ({ + type: agent, + model: values.model, + })), + mcpServers: mcpServers.map((server) => ({ + type: server, + })), + }; + + const driver = new MultiAgentDriver(options); + const result = await driver.run(); + + if (values.json) { + process.stdout.write(`${JSON.stringify(result, null, 2)}\n`); + return; + } + + process.stdout.write(`Browser ${result.browser.id} (${result.browser.type})\n`); + for (const agent of result.agents) { + process.stdout.write(`\n[${agent.harness}]\n`); + if (agent.error) { + process.stdout.write(`error: ${agent.error}\n`); + continue; + } + process.stdout.write(`${agent.content}\n`); + } +} + +function assertMcpServerType(value: string): asserts value is "stagehand-agent" | "understudy" { + if (value !== "stagehand-agent" && value !== "understudy") { + throw new UnsupportedAdapterError("Internal MCP server", value); + } +} + +async function runInternalMcpServer(argv: string[]): Promise { + const [serverType, ...rest] = argv; + if (!serverType) { + printUsage(); + process.exitCode = 1; + return; + } + + assertMcpServerType(serverType); + const { values } = parseArgs({ + args: rest, + options: { + "cdp-url": { type: "string" }, + model: { type: "string" }, + mode: { type: "string" }, + provider: { type: "string" }, + "execution-model": { type: "string" }, + "exclude-tool": { type: "string", multiple: true }, + "tool-timeout": { type: "string" }, + }, + }); + + if (!values["cdp-url"]) { + throw new Error(`Internal MCP server "${serverType}" requires --cdp-url.`); + } + + if (serverType === "stagehand-agent") { + await startStagehandAgentMCPServer({ + cdpUrl: values["cdp-url"], + model: values.model, + mode: + values.mode === "dom" || + values.mode === "hybrid" || + values.mode === "cua" + ? 
values.mode + : undefined, + provider: values.provider, + executionModel: values["execution-model"], + excludeTools: readList(values["exclude-tool"]), + toolTimeout: values["tool-timeout"] + ? Number(values["tool-timeout"]) + : undefined, + }); + return; + } + + await startUnderstudyMcpServer({ + cdpUrl: values["cdp-url"], + }); +} + +async function main(): Promise { + const [command = "run", ...rest] = process.argv.slice(2); + + if (command === "run") { + await runCommand(rest); + return; + } + + if (command === "mcp-server") { + await runInternalMcpServer(rest); + return; + } + + // Allow `multiagent "task"` as shorthand. + if (!command.startsWith("-")) { + await runCommand([command, ...rest]); + return; + } + + await runCommand([command, ...rest]); +} + +main().catch((error) => { + console.error(error instanceof Error ? error.message : String(error)); + process.exit(1); +}); diff --git a/packages/multiagent/lib/index.ts b/packages/multiagent/lib/index.ts new file mode 100644 index 000000000..9f348bc4a --- /dev/null +++ b/packages/multiagent/lib/index.ts @@ -0,0 +1,5 @@ +export * from "./types.js"; +export * from "./browser/index.js"; +export * from "./mcp/index.js"; +export * from "./agents/index.js"; +export { MultiAgentDriver } from "./runtime/driver.js"; diff --git a/packages/multiagent/lib/mcp/adapters/agentBrowser.ts b/packages/multiagent/lib/mcp/adapters/agentBrowser.ts new file mode 100644 index 000000000..884374865 --- /dev/null +++ b/packages/multiagent/lib/mcp/adapters/agentBrowser.ts @@ -0,0 +1,23 @@ +import type { StdioLaunchConfig } from "../../types.js"; +import { resolvePackageBin } from "../../utils/process.js"; +import type { MCPServerAdapter, MCPServerAdapterContext } from "./base.js"; + +export class AgentBrowserMCPAdapter implements MCPServerAdapter { + readonly type = "agent-browser" as const; + + async getLaunchConfig( + context: MCPServerAdapterContext, + ): Promise { + const serverEntry = resolvePackageBin("agent-browser-mcp", 
"agent-browser-mcp"); + const agentBrowserPath = resolvePackageBin("agent-browser", "agent-browser"); + + return { + command: process.execPath, + args: [serverEntry, ...(context.options.args ?? [])], + env: { + AGENT_BROWSER_PATH: agentBrowserPath, + ...(context.options.env ?? {}), + }, + }; + } +} diff --git a/packages/multiagent/lib/mcp/adapters/base.ts b/packages/multiagent/lib/mcp/adapters/base.ts new file mode 100644 index 000000000..5d2fe7ac0 --- /dev/null +++ b/packages/multiagent/lib/mcp/adapters/base.ts @@ -0,0 +1,12 @@ +import type { BrowserSession } from "../../browser/session.js"; +import type { MCPServerOptions, StdioLaunchConfig } from "../../types.js"; + +export interface MCPServerAdapterContext { + browserSession?: BrowserSession; + options: MCPServerOptions; +} + +export interface MCPServerAdapter { + readonly type: MCPServerOptions["type"]; + getLaunchConfig(context: MCPServerAdapterContext): Promise; +} diff --git a/packages/multiagent/lib/mcp/adapters/browserUse.ts b/packages/multiagent/lib/mcp/adapters/browserUse.ts new file mode 100644 index 000000000..60ed72c7f --- /dev/null +++ b/packages/multiagent/lib/mcp/adapters/browserUse.ts @@ -0,0 +1,18 @@ +import type { StdioLaunchConfig } from "../../types.js"; +import type { MCPServerAdapter, MCPServerAdapterContext } from "./base.js"; + +export class BrowserUseMCPAdapter implements MCPServerAdapter { + readonly type = "browser-use" as const; + + async getLaunchConfig( + context: MCPServerAdapterContext, + ): Promise { + return { + command: context.options.command ?? "uvx", + args: context.options.args?.length + ? 
[...context.options.args] + : ["browser-use[cli]", "--mcp"], + env: context.options.env, + }; + } +} diff --git a/packages/multiagent/lib/mcp/adapters/chromeDevtools.ts b/packages/multiagent/lib/mcp/adapters/chromeDevtools.ts new file mode 100644 index 000000000..9d03b937b --- /dev/null +++ b/packages/multiagent/lib/mcp/adapters/chromeDevtools.ts @@ -0,0 +1,39 @@ +import type { StdioLaunchConfig } from "../../types.js"; +import { resolvePackageBin } from "../../utils/process.js"; +import type { MCPServerAdapter, MCPServerAdapterContext } from "./base.js"; + +export class ChromeDevtoolsMCPAdapter implements MCPServerAdapter { + readonly type = "chrome-devtools" as const; + + async getLaunchConfig( + context: MCPServerAdapterContext, + ): Promise<StdioLaunchConfig> { + const entry = resolvePackageBin("chrome-devtools-mcp", "chrome-devtools-mcp"); + const args = [entry, "--no-usage-statistics"]; + const browserUrl = context.browserSession?.getBrowserUrl(); + + if (browserUrl) { + args.push(`--browser-url=${browserUrl}`); + } else if (context.options.browser?.headless ?? 
true) { + args.push("--headless=true"); + args.push("--isolated=true"); + } + + if (context.options.browser?.viewport) { + args.push( + "--viewport", + `${context.options.browser.viewport.width}x${context.options.browser.viewport.height}`, + ); + } + + if (context.options.args?.length) { + args.push(...context.options.args); + } + + return { + command: process.execPath, + args, + env: context.options.env, + }; + } +} diff --git a/packages/multiagent/lib/mcp/adapters/playwright.ts b/packages/multiagent/lib/mcp/adapters/playwright.ts new file mode 100644 index 000000000..c94c174fc --- /dev/null +++ b/packages/multiagent/lib/mcp/adapters/playwright.ts @@ -0,0 +1,38 @@ +import type { StdioLaunchConfig } from "../../types.js"; +import { resolvePackageBin } from "../../utils/process.js"; +import type { MCPServerAdapter, MCPServerAdapterContext } from "./base.js"; + +export class PlaywrightMCPAdapter implements MCPServerAdapter { + readonly type = "playwright" as const; + + async getLaunchConfig( + context: MCPServerAdapterContext, + ): Promise<StdioLaunchConfig> { + const entry = resolvePackageBin("@playwright/mcp", "playwright-mcp"); + const args = [entry]; + const cdpUrl = context.browserSession?.getCdpUrl(); + + if (cdpUrl) { + args.push("--cdp-endpoint", cdpUrl); + } else if (context.options.browser?.headless ?? 
true) { + args.push("--headless"); + } + + if (context.options.browser?.viewport) { + args.push( + "--viewport-size", + `${context.options.browser.viewport.width}x${context.options.browser.viewport.height}`, + ); + } + + if (context.options.args?.length) { + args.push(...context.options.args); + } + + return { + command: process.execPath, + args, + env: context.options.env, + }; + } +} diff --git a/packages/multiagent/lib/mcp/adapters/stagehandAgent.ts b/packages/multiagent/lib/mcp/adapters/stagehandAgent.ts new file mode 100644 index 000000000..94c8f14f2 --- /dev/null +++ b/packages/multiagent/lib/mcp/adapters/stagehandAgent.ts @@ -0,0 +1,46 @@ +import fs from "node:fs"; +import type { StdioLaunchConfig } from "../../types.js"; +import { getDistCliPath, getSourceCliPath } from "../../utils/runtimePaths.js"; +import type { MCPServerAdapter, MCPServerAdapterContext } from "./base.js"; + +function getSelfCommand(): StdioLaunchConfig { + const distCli = getDistCliPath(); + const sourceCli = getSourceCliPath(); + + if (!fs.existsSync(distCli)) { + return { + command: process.execPath, + args: ["--import", "tsx", sourceCli], + }; + } + + return { + command: process.execPath, + args: [distCli], + }; +} + +export class StagehandAgentMCPAdapter implements MCPServerAdapter { + readonly type = "stagehand-agent" as const; + + async getLaunchConfig( + context: MCPServerAdapterContext, + ): Promise<StdioLaunchConfig> { + const self = getSelfCommand(); + const args = [ + ...(self.args ?? []), + "mcp-server", + "stagehand-agent", + ...(context.browserSession?.getCdpUrl() + ? ["--cdp-url", context.browserSession.getCdpUrl() as string] + : []), + ...(context.options.args ?? 
[]), + ]; + + return { + command: self.command, + args, + env: context.options.env, + }; + } +} diff --git a/packages/multiagent/lib/mcp/adapters/understudy.ts b/packages/multiagent/lib/mcp/adapters/understudy.ts new file mode 100644 index 000000000..5e96ed5a0 --- /dev/null +++ b/packages/multiagent/lib/mcp/adapters/understudy.ts @@ -0,0 +1,46 @@ +import fs from "node:fs"; +import type { StdioLaunchConfig } from "../../types.js"; +import { getDistCliPath, getSourceCliPath } from "../../utils/runtimePaths.js"; +import type { MCPServerAdapter, MCPServerAdapterContext } from "./base.js"; + +function getSelfCommand(): StdioLaunchConfig { + const distCli = getDistCliPath(); + const sourceCli = getSourceCliPath(); + + if (!fs.existsSync(distCli)) { + return { + command: process.execPath, + args: ["--import", "tsx", sourceCli], + }; + } + + return { + command: process.execPath, + args: [distCli], + }; +} + +export class UnderstudyMCPAdapter implements MCPServerAdapter { + readonly type = "understudy" as const; + + async getLaunchConfig( + context: MCPServerAdapterContext, + ): Promise<StdioLaunchConfig> { + const self = getSelfCommand(); + const args = [ + ...(self.args ?? []), + "mcp-server", + "understudy", + ...(context.browserSession?.getCdpUrl() + ? ["--cdp-url", context.browserSession.getCdpUrl() as string] + : []), + ...(context.options.args ?? 
[]), + ]; + + return { + command: self.command, + args, + env: context.options.env, + }; + } +} diff --git a/packages/multiagent/lib/mcp/index.ts b/packages/multiagent/lib/mcp/index.ts new file mode 100644 index 000000000..b62f130a3 --- /dev/null +++ b/packages/multiagent/lib/mcp/index.ts @@ -0,0 +1,12 @@ +import type { BrowserSession } from "../browser/session.js"; +import type { MCPServerOptions } from "../types.js"; +import { MCPServer } from "./server.js"; + +export function createMCPServer( + options: MCPServerOptions, + browserSession?: BrowserSession, +): MCPServer { + return new MCPServer(options, browserSession); +} + +export { MCPServer } from "./server.js"; diff --git a/packages/multiagent/lib/mcp/internal/index.ts b/packages/multiagent/lib/mcp/internal/index.ts new file mode 100644 index 000000000..6418f28c2 --- /dev/null +++ b/packages/multiagent/lib/mcp/internal/index.ts @@ -0,0 +1,8 @@ +export { + startStagehandAgentMCPServer, + type StagehandAgentMCPServerOptions, +} from "./stagehandAgentServer.js"; +export { + startUnderstudyMcpServer, + type StartUnderstudyMcpServerOptions, +} from "./understudyServer.js"; diff --git a/packages/multiagent/lib/mcp/internal/stagehand-agent-utils.ts b/packages/multiagent/lib/mcp/internal/stagehand-agent-utils.ts new file mode 100644 index 000000000..f306b879f --- /dev/null +++ b/packages/multiagent/lib/mcp/internal/stagehand-agent-utils.ts @@ -0,0 +1,64 @@ +import { z, type ZodRawShape, type ZodTypeAny } from "zod"; + +function isRecord(value: unknown): value is Record<string, unknown> { + return !!value && typeof value === "object" && !Array.isArray(value); +} + +function jsonReplacer(_key: string, value: unknown): unknown { + if (typeof value === "bigint") { + return value.toString(); + } + + if (value instanceof Uint8Array) { + return Buffer.from(value).toString("base64"); + } + + if (value instanceof Error) { + return { + name: value.name, + message: value.message, + stack: value.stack, + }; + } + + return value; +} + +export 
function safeJson(value: unknown): string { + try { + return JSON.stringify(value, jsonReplacer, 2); + } catch (error) { + return JSON.stringify( + { + error: "Failed to serialize tool result", + message: error instanceof Error ? error.message : String(error), + }, + null, + 2, + ); + } +} + +export function extractToolShape(schema: ZodTypeAny | undefined): ZodRawShape | undefined { + if (!schema) { + return undefined; + } + + if (schema instanceof z.ZodObject) { + return schema.shape; + } + + return undefined; +} + +export function inferIsError(result: unknown): boolean { + if (!isRecord(result)) { + return false; + } + + if (result.success === false) { + return true; + } + + return typeof result.error === "string" && result.error.length > 0; +} diff --git a/packages/multiagent/lib/mcp/internal/stagehandAgentServer.ts b/packages/multiagent/lib/mcp/internal/stagehandAgentServer.ts new file mode 100644 index 000000000..b8659d891 --- /dev/null +++ b/packages/multiagent/lib/mcp/internal/stagehandAgentServer.ts @@ -0,0 +1,115 @@ +import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js"; +import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js"; +import { V3 } from "@browserbasehq/stagehand/lib/v3/index.js"; +import { + createAgentTools, + type V3AgentToolOptions, +} from "@browserbasehq/stagehand/lib/v3/agent/tools/index.js"; +import { extractToolShape, inferIsError, safeJson } from "./stagehand-agent-utils.js"; + +export interface StagehandAgentMCPServerOptions { + cdpUrl: string; + model?: string; + executionModel?: string; + provider?: string; + mode?: NonNullable; + excludeTools?: string[]; + toolTimeout?: number; + variables?: V3AgentToolOptions["variables"]; + serverName?: string; + serverVersion?: string; +} + +interface StagehandToolLike { + description?: string; + inputSchema?: unknown; + execute: (args: Record) => Promise | unknown; +} + +export async function startStagehandAgentMCPServer( + options: 
StagehandAgentMCPServerOptions, +): Promise { + const v3 = new V3({ + env: "LOCAL", + model: options.model, + verbose: 0, + localBrowserLaunchOptions: { + cdpUrl: options.cdpUrl, + }, + }); + await v3.init(); + + const server = new McpServer({ + name: options.serverName ?? "multiagent-stagehand-agent", + version: options.serverVersion ?? "0.1.0", + }); + + registerStagehandAgentTools( + server, + createAgentTools(v3, { + executionModel: options.executionModel, + provider: options.provider, + mode: options.mode ?? "dom", + excludeTools: options.excludeTools, + variables: options.variables, + toolTimeout: options.toolTimeout, + }) as unknown as Record, + ); + + const cleanup = async () => { + await server.close().catch(() => {}); + await v3.close().catch(() => {}); + }; + + process.once("SIGINT", () => { + void cleanup().finally(() => process.exit(0)); + }); + process.once("SIGTERM", () => { + void cleanup().finally(() => process.exit(0)); + }); + process.once("exit", () => { + void cleanup(); + }); + + const transport = new StdioServerTransport(); + await server.connect(transport); +} + +function registerStagehandAgentTools( + server: McpServer, + tools: Record, +): void { + for (const [name, tool] of Object.entries(tools)) { + const schema = extractToolShape(tool.inputSchema as never); + const description = tool.description ?? 
name; + + if (!schema) { + server.tool(name, description, async () => { + const result = await tool.execute({}); + return { + content: [ + { + type: "text", + text: safeJson(result), + }, + ], + isError: inferIsError(result), + }; + }); + continue; + } + + server.tool(name, description, schema, async (args) => { + const result = await tool.execute(args); + return { + content: [ + { + type: "text", + text: safeJson(result), + }, + ], + isError: inferIsError(result), + }; + }); + } +} diff --git a/packages/multiagent/lib/mcp/internal/understudy-runtime.ts b/packages/multiagent/lib/mcp/internal/understudy-runtime.ts new file mode 100644 index 000000000..993249c3a --- /dev/null +++ b/packages/multiagent/lib/mcp/internal/understudy-runtime.ts @@ -0,0 +1,206 @@ +import { V3 } from "@browserbasehq/stagehand"; +import type { LoadState, PageSnapshotOptions } from "@browserbasehq/stagehand"; +import type { Page } from "@browserbasehq/stagehand"; + +export interface UnderstudyRuntimeOptions { + cdpUrl: string; +} + +export interface UnderstudyGotoInput { + url: string; + waitUntil?: LoadState; + timeoutMs?: number; +} + +export interface UnderstudyScreenshotInput { + type?: "png" | "jpeg"; + fullPage?: boolean; + quality?: number; + path?: string; + omitBackground?: boolean; + timeoutMs?: number; +} + +export interface UnderstudySnapshotInput extends PageSnapshotOptions {} + +export interface UnderstudyClickInput { + x: number; + y: number; + button?: "left" | "right" | "middle"; + clickCount?: number; + returnXpath?: boolean; +} + +export interface UnderstudyTypeInput { + text: string; + delay?: number; + withMistakes?: boolean; +} + +export interface UnderstudyKeyPressInput { + key: string; + delay?: number; +} + +export interface UnderstudyWaitForSelectorInput { + selector: string; + state?: "attached" | "detached" | "visible" | "hidden"; + timeout?: number; + pierceShadow?: boolean; +} + +export interface UnderstudyScrollInput { + x: number; + y: number; + deltaX: number; 
+ deltaY: number; + returnXpath?: boolean; +} + +export class UnderstudyRuntime { + private v3: V3 | null = null; + + constructor(private readonly options: UnderstudyRuntimeOptions) {} + + async start(): Promise { + if (this.v3) { + return; + } + + this.v3 = new V3({ + env: "LOCAL", + verbose: 0, + localBrowserLaunchOptions: { + cdpUrl: this.options.cdpUrl, + }, + }); + await this.v3.init(); + } + + async stop(): Promise { + await this.v3?.close(); + this.v3 = null; + } + + async getPage(): Promise { + await this.start(); + const page = await this.v3!.context.awaitActivePage(); + this.v3!.context.setActivePage(page); + return page; + } + + async newPage(url = "about:blank"): Promise { + await this.start(); + const page = await this.v3!.context.newPage(url); + this.v3!.context.setActivePage(page); + return page; + } + + async goto(input: UnderstudyGotoInput): Promise<{ + url: string; + title: string; + }> { + const page = await this.getPage(); + await page.goto(input.url, { + waitUntil: input.waitUntil, + timeoutMs: input.timeoutMs, + }); + return { + url: page.url(), + title: await page.title(), + }; + } + + async getUrl(): Promise { + return (await this.getPage()).url(); + } + + async getTitle(): Promise { + return await (await this.getPage()).title(); + } + + async screenshot(input: UnderstudyScreenshotInput): Promise<{ + mimeType: string; + base64: string; + path?: string; + }> { + const page = await this.getPage(); + const type = input.type ?? "png"; + const buffer = await page.screenshot({ + type, + fullPage: input.fullPage, + quality: input.quality, + path: input.path, + omitBackground: input.omitBackground, + timeout: input.timeoutMs, + }); + + return { + mimeType: type === "jpeg" ? 
"image/jpeg" : "image/png", + base64: buffer.toString("base64"), + path: input.path, + }; + } + + async snapshot(input: UnderstudySnapshotInput): Promise<{ + formattedTree: string; + xpathMap: Record; + urlMap: Record; + }> { + const snapshot = await (await this.getPage()).snapshot(input); + return { + formattedTree: snapshot.formattedTree, + xpathMap: snapshot.xpathMap, + urlMap: snapshot.urlMap, + }; + } + + async click(input: UnderstudyClickInput): Promise<{ xpath?: string }> { + const xpath = await (await this.getPage()).click(input.x, input.y, { + button: input.button, + clickCount: input.clickCount, + returnXpath: input.returnXpath, + }); + return xpath ? { xpath } : {}; + } + + async type(input: UnderstudyTypeInput): Promise { + await (await this.getPage()).type(input.text, { + delay: input.delay, + withMistakes: input.withMistakes, + }); + } + + async keyPress(input: UnderstudyKeyPressInput): Promise { + await (await this.getPage()).keyPress(input.key, { + delay: input.delay, + }); + } + + async waitForSelector( + input: UnderstudyWaitForSelectorInput, + ): Promise { + return await (await this.getPage()).waitForSelector(input.selector, { + state: input.state, + timeout: input.timeout, + pierceShadow: input.pierceShadow, + }); + } + + async waitForTimeout(ms: number): Promise { + await (await this.getPage()).waitForTimeout(ms); + } + + async scroll(input: UnderstudyScrollInput): Promise<{ xpath?: string }> { + const xpath = await (await this.getPage()).scroll( + input.x, + input.y, + input.deltaX, + input.deltaY, + { + returnXpath: input.returnXpath, + }, + ); + return xpath ? 
{ xpath } : {}; + } +} diff --git a/packages/multiagent/lib/mcp/internal/understudyServer.ts b/packages/multiagent/lib/mcp/internal/understudyServer.ts new file mode 100644 index 000000000..963a1b404 --- /dev/null +++ b/packages/multiagent/lib/mcp/internal/understudyServer.ts @@ -0,0 +1,337 @@ +import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js"; +import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js"; +import { z } from "zod"; +import { + UnderstudyRuntime, + type UnderstudyRuntimeOptions, +} from "./understudy-runtime.js"; + +function formatText(value: unknown): string { + if (typeof value === "string") { + return value; + } + + return JSON.stringify(value, null, 2); +} + +function createTextResult(value: unknown) { + return { + content: [{ type: "text" as const, text: formatText(value) }], + structuredContent: + value && typeof value === "object" + ? (value as Record) + : undefined, + }; +} + +export interface StartUnderstudyMcpServerOptions + extends UnderstudyRuntimeOptions {} + +export async function startUnderstudyMcpServer( + options: StartUnderstudyMcpServerOptions, +): Promise { + const runtime = new UnderstudyRuntime(options); + const server = new McpServer({ + name: "multiagent-understudy", + version: "0.1.0", + }); + + const shutdown = async () => { + await runtime.stop(); + process.exit(0); + }; + + process.once("SIGINT", () => { + void shutdown(); + }); + process.once("SIGTERM", () => { + void shutdown(); + }); + + server.tool( + "understudy_new_page", + "Create a new top-level page and make it the active page.", + { + url: z + .string() + .optional() + .describe("Optional URL to open in the new page."), + }, + async ({ url }) => { + const page = await runtime.newPage(url); + return createTextResult({ + url: page.url(), + title: await page.title(), + }); + }, + ); + + server.tool( + "understudy_goto", + "Navigate the active page to a URL using Understudy.", + { + url: z.string().url().describe("The 
destination URL."), + waitUntil: z + .enum(["load", "domcontentloaded", "networkidle"]) + .optional() + .describe("Lifecycle state to wait for before returning."), + timeoutMs: z + .number() + .int() + .positive() + .optional() + .describe("Navigation timeout in milliseconds."), + }, + async ({ url, waitUntil, timeoutMs }) => { + return createTextResult( + await runtime.goto({ + url, + waitUntil, + timeoutMs, + }), + ); + }, + ); + + server.tool( + "understudy_get_url", + "Get the active page URL.", + {}, + async () => createTextResult({ url: await runtime.getUrl() }), + ); + + server.tool( + "understudy_get_title", + "Get the active page title.", + {}, + async () => createTextResult({ title: await runtime.getTitle() }), + ); + + server.tool( + "understudy_screenshot", + "Capture a screenshot of the active page.", + { + type: z + .enum(["png", "jpeg"]) + .optional() + .describe("Image format for the screenshot."), + fullPage: z + .boolean() + .optional() + .describe("Capture the full scrollable page instead of the viewport."), + quality: z + .number() + .int() + .min(0) + .max(100) + .optional() + .describe("JPEG quality from 0-100. 
Only applies to jpeg."), + path: z + .string() + .optional() + .describe("Optional file path to save the screenshot to."), + omitBackground: z + .boolean() + .optional() + .describe("Use a transparent background when supported."), + timeoutMs: z + .number() + .int() + .positive() + .optional() + .describe("Screenshot timeout in milliseconds."), + }, + async ({ type, fullPage, quality, path, omitBackground, timeoutMs }) => { + return createTextResult( + await runtime.screenshot({ + type, + fullPage, + quality, + path, + omitBackground, + timeoutMs, + }), + ); + }, + ); + + server.tool( + "understudy_snapshot", + "Capture an Understudy snapshot of the active page for DOM-aware browsing.", + { + includeIframes: z + .boolean() + .optional() + .describe("Whether to include iframe contents in the snapshot."), + }, + async ({ includeIframes }) => { + return createTextResult( + await runtime.snapshot({ + includeIframes, + }), + ); + }, + ); + + server.tool( + "understudy_click", + "Click at viewport coordinates on the active page.", + { + x: z.number().describe("Viewport x coordinate in CSS pixels."), + y: z.number().describe("Viewport y coordinate in CSS pixels."), + button: z + .enum(["left", "right", "middle"]) + .optional() + .describe("Mouse button to use."), + clickCount: z + .number() + .int() + .positive() + .optional() + .describe("Number of sequential clicks to dispatch."), + returnXpath: z + .boolean() + .optional() + .describe("Return the resolved XPath for the hit target when possible."), + }, + async ({ x, y, button, clickCount, returnXpath }) => { + return createTextResult( + await runtime.click({ + x, + y, + button, + clickCount, + returnXpath, + }), + ); + }, + ); + + server.tool( + "understudy_type", + "Type text into the currently focused element on the active page.", + { + text: z.string().describe("Text to type."), + delay: z + .number() + .int() + .min(0) + .optional() + .describe("Optional delay between keystrokes in milliseconds."), + withMistakes: 
z + .boolean() + .optional() + .describe("Whether to simulate occasional typos and corrections."), + }, + async ({ text, delay, withMistakes }) => { + await runtime.type({ + text, + delay, + withMistakes, + }); + return createTextResult({ ok: true }); + }, + ); + + server.tool( + "understudy_key_press", + "Press a single key or key combination on the active page.", + { + key: z + .string() + .describe("Key or combination like Enter, Tab, Cmd+A, Ctrl+C."), + delay: z + .number() + .int() + .min(0) + .optional() + .describe("Optional delay between key down and key up."), + }, + async ({ key, delay }) => { + await runtime.keyPress({ + key, + delay, + }); + return createTextResult({ ok: true }); + }, + ); + + server.tool( + "understudy_wait_for_selector", + "Wait for a selector to reach a target state on the active page.", + { + selector: z + .string() + .describe("Selector to wait for. Supports iframe hop notation."), + state: z + .enum(["attached", "detached", "visible", "hidden"]) + .optional() + .describe("Desired selector state."), + timeout: z + .number() + .int() + .positive() + .optional() + .describe("Maximum time to wait in milliseconds."), + pierceShadow: z + .boolean() + .optional() + .describe("Whether to pierce shadow DOM while resolving the selector."), + }, + async ({ selector, state, timeout, pierceShadow }) => { + return createTextResult({ + matched: await runtime.waitForSelector({ + selector, + state, + timeout, + pierceShadow, + }), + }); + }, + ); + + server.tool( + "understudy_wait_for_timeout", + "Sleep for a fixed amount of time on the active page session.", + { + ms: z + .number() + .int() + .min(0) + .describe("Milliseconds to wait."), + }, + async ({ ms }) => { + await runtime.waitForTimeout(ms); + return createTextResult({ ok: true, waitedMs: ms }); + }, + ); + + server.tool( + "understudy_scroll", + "Dispatch a wheel scroll gesture at viewport coordinates.", + { + x: z.number().describe("Viewport x coordinate in CSS pixels."), + y: 
z.number().describe("Viewport y coordinate in CSS pixels."), + deltaX: z.number().describe("Horizontal wheel delta."), + deltaY: z.number().describe("Vertical wheel delta."), + returnXpath: z + .boolean() + .optional() + .describe("Return the resolved XPath for the hit target when possible."), + }, + async ({ x, y, deltaX, deltaY, returnXpath }) => { + return createTextResult( + await runtime.scroll({ + x, + y, + deltaX, + deltaY, + returnXpath, + }), + ); + }, + ); + + const transport = new StdioServerTransport(); + await server.connect(transport); +} diff --git a/packages/multiagent/lib/mcp/registry.ts b/packages/multiagent/lib/mcp/registry.ts new file mode 100644 index 000000000..7a9129d95 --- /dev/null +++ b/packages/multiagent/lib/mcp/registry.ts @@ -0,0 +1,28 @@ +import type { MCPServerOptions } from "../types.js"; +import { UnsupportedAdapterError } from "../utils/errors.js"; +import { AgentBrowserMCPAdapter } from "./adapters/agentBrowser.js"; +import type { MCPServerAdapter } from "./adapters/base.js"; +import { BrowserUseMCPAdapter } from "./adapters/browserUse.js"; +import { ChromeDevtoolsMCPAdapter } from "./adapters/chromeDevtools.js"; +import { PlaywrightMCPAdapter } from "./adapters/playwright.js"; +import { StagehandAgentMCPAdapter } from "./adapters/stagehandAgent.js"; +import { UnderstudyMCPAdapter } from "./adapters/understudy.js"; + +const adapters: Record = { + playwright: new PlaywrightMCPAdapter(), + "chrome-devtools": new ChromeDevtoolsMCPAdapter(), + "agent-browser": new AgentBrowserMCPAdapter(), + "browser-use": new BrowserUseMCPAdapter(), + "stagehand-agent": new StagehandAgentMCPAdapter(), + understudy: new UnderstudyMCPAdapter(), +}; + +export function getMCPServerAdapter( + type: MCPServerOptions["type"], +): MCPServerAdapter { + const adapter = adapters[type]; + if (!adapter) { + throw new UnsupportedAdapterError("MCP server", type); + } + return adapter; +} diff --git a/packages/multiagent/lib/mcp/server.ts 
b/packages/multiagent/lib/mcp/server.ts new file mode 100644 index 000000000..363d918c2 --- /dev/null +++ b/packages/multiagent/lib/mcp/server.ts @@ -0,0 +1,120 @@ +import { randomUUID } from "node:crypto"; +import { + Client, + type ClientOptions, +} from "@modelcontextprotocol/sdk/client/index.js"; +import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js"; +import type { BrowserSession } from "../browser/session.js"; +import type { + MCPServerOptions, + StdioLaunchConfig, +} from "../types.js"; +import { MultiagentError } from "../utils/errors.js"; +import { getMCPServerAdapter } from "./registry.js"; + +export class MCPServer { + readonly id = randomUUID(); + private client: Client | null = null; + private transport: StdioClientTransport | null = null; + + constructor( + private readonly options: MCPServerOptions, + private readonly browserSession?: BrowserSession, + ) {} + + getName(): string { + return this.options.name ?? this.options.type; + } + + async getLaunchConfig(): Promise<StdioLaunchConfig> { + const adapter = getMCPServerAdapter(this.options.type); + return await adapter.getLaunchConfig({ + browserSession: this.browserSession, + options: this.options, + }); + } + + async start(clientOptions?: ClientOptions): Promise<void> { + if (this.client) { + return; + } + + const launchConfig = await this.getLaunchConfig(); + this.transport = new StdioClientTransport({ + command: launchConfig.command, + args: launchConfig.args, + env: { + ...process.env, + ...(launchConfig.env ?? 
{}), + }, + cwd: launchConfig.cwd, + }); + + this.client = new Client({ + name: "multiagent", + version: "0.1.0", + ...clientOptions, + }); + await this.client.connect(this.transport); + await this.client.ping(); + } + + async stop(): Promise { + await this.client?.close(); + await this.transport?.close(); + this.client = null; + this.transport = null; + } + + getClient(): Client | null { + return this.client; + } + + async listTools(): Promise< + Array<{ + name: string; + description?: string; + inputSchema?: unknown; + }> + > { + if (!this.client) { + throw new MultiagentError( + `MCP server ${this.getName()} is not started yet.`, + ); + } + + const tools: Array<{ + name: string; + description?: string; + inputSchema?: unknown; + }> = []; + let cursor: string | undefined; + + do { + const page = await this.client.listTools({ cursor }); + tools.push( + ...page.tools.map((tool) => ({ + name: tool.name ?? "unknown", + description: tool.description, + inputSchema: tool.inputSchema, + })), + ); + cursor = page.nextCursor; + } while (cursor); + + return tools; + } + + async callTool(name: string, args: Record = {}): Promise { + if (!this.client) { + throw new MultiagentError( + `MCP server ${this.getName()} is not started yet.`, + ); + } + + return await this.client.callTool({ + name, + arguments: args, + }); + } +} diff --git a/packages/multiagent/lib/runtime/driver.ts b/packages/multiagent/lib/runtime/driver.ts new file mode 100644 index 000000000..ba6127d67 --- /dev/null +++ b/packages/multiagent/lib/runtime/driver.ts @@ -0,0 +1,70 @@ +import { AgentSession } from "../agents/session.js"; +import { BrowserSession } from "../browser/session.js"; +import { createMCPServer } from "../mcp/index.js"; +import type { + MultiAgentRunOptions, + MultiAgentRunResult, +} from "../types.js"; + +export class MultiAgentDriver { + constructor(private readonly options: MultiAgentRunOptions) {} + + async run(): Promise { + const cwd = this.options.cwd ?? 
process.cwd(); + const browserSession = new BrowserSession(this.options.browser); + await browserSession.start(); + + const mcpServers = (this.options.mcpServers ?? []) + .filter((server) => server.enabled !== false) + .map((server) => createMCPServer(server, browserSession)); + + const agentSessions = this.options.agents.map((harness) => { + const session = new AgentSession({ + harness, + browserSession, + cwd, + }); + for (const server of mcpServers) { + session.attachMCPServer(server); + } + return session; + }); + + try { + await Promise.all(agentSessions.map(async (session) => await session.start())); + + const agentResults = await Promise.all( + agentSessions.map(async (session, index) => { + const harness = this.options.agents[index]; + + try { + const turn = await session.addUserMessage(this.options.task); + return { + harness: harness.type, + sessionId: turn.sessionId, + content: turn.message.content, + raw: turn.raw, + usage: turn.usage, + }; + } catch (error) { + return { + harness: harness.type, + content: "", + error: error instanceof Error ? 
error.message : String(error), + raw: error, + }; + } + }), + ); + + return { + browser: browserSession.getMetadata(), + agents: agentResults, + }; + } finally { + await Promise.all(agentSessions.map(async (session) => await session.stop())); + await Promise.all(mcpServers.map(async (server) => await server.stop())); + await browserSession.stop(); + } + } +} diff --git a/packages/multiagent/lib/types.ts b/packages/multiagent/lib/types.ts new file mode 100644 index 000000000..095544412 --- /dev/null +++ b/packages/multiagent/lib/types.ts @@ -0,0 +1,134 @@ +export type AgentHarnessName = + | "claude-code" + | "codex" + | "gemini-cli" + | "opencode" + | "browser-use" + | "stagehand"; + +export type MCPServerName = + | "playwright" + | "chrome-devtools" + | "agent-browser" + | "browser-use" + | "stagehand-agent" + | "understudy"; + +export type BrowserTargetName = "local" | "cdp"; + +export interface BrowserViewport { + width: number; + height: number; +} + +export interface BrowserSessionOptions { + type?: BrowserTargetName; + cdpUrl?: string | null; + executablePath?: string; + channel?: "chrome" | "chrome-beta" | "chrome-dev" | "chrome-canary"; + headless?: boolean; + userDataDir?: string; + viewport?: BrowserViewport; + args?: string[]; + ignoreHTTPSErrors?: boolean; + connectTimeoutMs?: number; +} + +export interface BrowserSessionMetadata { + id: string; + type: BrowserTargetName; + cdpUrl?: string; + browserUrl?: string; + launched: boolean; + headless: boolean; + userDataDir?: string; + viewport?: BrowserViewport; +} + +export interface StdioLaunchConfig { + command: string; + args?: string[]; + env?: Record; + cwd?: string; +} + +export interface NamedStdioLaunchConfig { + name: string; + config: StdioLaunchConfig; +} + +export interface MCPServerOptions { + type: MCPServerName; + name?: string; + enabled?: boolean; + env?: Record; + args?: string[]; + command?: string; + browser?: Partial; + transport?: "stdio"; +} + +export interface AgentHarnessOptions { + 
type: AgentHarnessName; + model?: string; + cwd?: string; + env?: Record; + args?: string[]; + permissionMode?: string; + stagehandMode?: "dom" | "hybrid" | "cua"; +} + +export interface AgentMessage { + id: string; + role: "system" | "user" | "assistant"; + content: string; + createdAt: string; + raw?: unknown; +} + +export interface AgentTurnUsage { + inputTokens?: number; + outputTokens?: number; + cachedInputTokens?: number; + raw?: unknown; +} + +export interface AgentTurnResult { + sessionId?: string; + message: AgentMessage; + raw?: unknown; + usage?: AgentTurnUsage; +} + +export interface AgentRunInput { + prompt: string; + mcpServers: NamedStdioLaunchConfig[]; + cwd: string; +} + +export interface AgentHarnessRunResult { + sessionId?: string; + content: string; + raw?: unknown; + usage?: AgentTurnUsage; +} + +export interface MultiAgentRunOptions { + task: string; + cwd?: string; + browser?: BrowserSessionOptions; + mcpServers?: MCPServerOptions[]; + agents: AgentHarnessOptions[]; +} + +export interface MultiAgentRunResult { + browser: BrowserSessionMetadata; + agents: Array<{ + harness: AgentHarnessName; + sessionId?: string; + content: string; + error?: string; + raw?: unknown; + usage?: AgentTurnUsage; + }>; +} diff --git a/packages/multiagent/lib/utils/errors.ts b/packages/multiagent/lib/utils/errors.ts new file mode 100644 index 000000000..4681e1987 --- /dev/null +++ b/packages/multiagent/lib/utils/errors.ts @@ -0,0 +1,32 @@ +export class MultiagentError extends Error { + constructor(message: string, options?: { cause?: unknown }) { + super(message); + this.name = "MultiagentError"; + if (options?.cause !== undefined) { + (this as Error & { cause?: unknown }).cause = options.cause; + } + } +} + +export class UnsupportedAdapterError extends MultiagentError { + constructor(kind: string, name: string) { + super(`${kind} "${name}" is not implemented yet.`); + this.name = "UnsupportedAdapterError"; + } +} + +export class CommandExecutionError extends 
MultiagentError { + constructor( + message: string, + readonly details: { + command: string; + args: string[]; + exitCode?: number | null; + stdout: string; + stderr: string; + }, + ) { + super(message); + this.name = "CommandExecutionError"; + } +} diff --git a/packages/multiagent/lib/utils/process.ts b/packages/multiagent/lib/utils/process.ts new file mode 100644 index 000000000..072fe9346 --- /dev/null +++ b/packages/multiagent/lib/utils/process.ts @@ -0,0 +1,118 @@ +import { spawn } from "node:child_process"; +import { createRequire } from "node:module"; +import path from "node:path"; +import { CommandExecutionError } from "./errors.js"; + +const require = createRequire(import.meta.url); + +export interface CommandSpec { + command: string; + args?: string[]; + cwd?: string; + env?: Record<string, string>; + input?: string; +} + +export interface CommandResult { + stdout: string; + stderr: string; + exitCode: number | null; +} + +export async function runCommand(spec: CommandSpec): Promise<CommandResult> { + const args = spec.args ?? []; + + return await new Promise((resolve, reject) => { + const child = spawn(spec.command, args, { + cwd: spec.cwd, + env: { + ...process.env, + ...(spec.env ?? {}), + }, + stdio: "pipe", + }); + + let stdout = ""; + let stderr = ""; + + child.stdout.on("data", (chunk) => { + stdout += String(chunk); + }); + + child.stderr.on("data", (chunk) => { + stderr += String(chunk); + }); + + child.on("error", (error) => { + reject( + new CommandExecutionError( + `Failed to start command: ${spec.command}`, + { + command: spec.command, + args, + exitCode: null, + stdout, + stderr: error instanceof Error ?
`${stderr}${error.message}` : stderr, + }, + ), + ); + }); + + child.on("close", (exitCode) => { + if (exitCode !== 0) { + reject( + new CommandExecutionError( + `Command exited with code ${exitCode}: ${spec.command}`, + { + command: spec.command, + args, + exitCode, + stdout, + stderr, + }, + ), + ); + return; + } + + resolve({ + stdout, + stderr, + exitCode, + }); + }); + + if (spec.input) { + child.stdin.write(spec.input); + } + child.stdin.end(); + }); +} + +export function resolvePackageBin( + packageName: string, + binName?: string, +): string { + const packageJsonPath = require.resolve(`${packageName}/package.json`); + const packageDir = path.dirname(packageJsonPath); + const packageJson = require(packageJsonPath) as { + bin?: string | Record<string, string>; + }; + + if (!packageJson.bin) { + throw new Error(`Package ${packageName} does not expose a binary.`); + } + + if (typeof packageJson.bin === "string") { + return path.join(packageDir, packageJson.bin); + } + + const resolvedBinName = binName ?? Object.keys(packageJson.bin)[0]; + if (!resolvedBinName || !packageJson.bin[resolvedBinName]) { + throw new Error( + `Package ${packageName} does not expose the binary ${binName ??
""}.`, + ); + } + + return path.join(packageDir, packageJson.bin[resolvedBinName]); +} diff --git a/packages/multiagent/lib/utils/runtimePaths.ts b/packages/multiagent/lib/utils/runtimePaths.ts new file mode 100644 index 000000000..cbce937ae --- /dev/null +++ b/packages/multiagent/lib/utils/runtimePaths.ts @@ -0,0 +1,20 @@ +import path from "node:path"; +import { fileURLToPath } from "node:url"; + +const packageRoot = path.resolve( + path.dirname(fileURLToPath(import.meta.url)), + "..", + "..", +); + +export function getPackageRootDir(): string { + return packageRoot; +} + +export function getDistCliPath(): string { + return path.join(packageRoot, "dist", "cli.js"); +} + +export function getSourceCliPath(): string { + return path.join(packageRoot, "lib", "cli.ts"); +} diff --git a/packages/multiagent/package.json b/packages/multiagent/package.json new file mode 100644 index 000000000..44d99af8c --- /dev/null +++ b/packages/multiagent/package.json @@ -0,0 +1,53 @@ +{ + "name": "@browserbasehq/multiagent", + "version": "0.1.0", + "description": "Multi-agent browser automation driver for combining agents, MCP servers, and shared browser sessions.", + "type": "module", + "main": "./dist/index.js", + "types": "./dist/index.d.ts", + "bin": { + "multiagent": "./dist/cli.js" + }, + "exports": { + ".": { + "types": "./dist/index.d.ts", + "import": "./dist/index.js" + }, + "./package.json": "./package.json" + }, + "scripts": { + "build": "pnpm -w --dir ../.. exec tsc -p packages/multiagent/tsconfig.json", + "typecheck": "pnpm -w --dir ../.. exec tsc -p packages/multiagent/tsconfig.json --noEmit", + "lint": "cd ../.. && prettier --check packages/multiagent && cd packages/multiagent && eslint . && pnpm run typecheck", + "test": "pnpm -w --dir ../.. 
exec vitest run packages/multiagent/tests/unit" + }, + "dependencies": { + "@browserbasehq/stagehand": "workspace:*", + "@modelcontextprotocol/sdk": "^1.17.2", + "@playwright/mcp": "0.0.68", + "agent-browser": "^0.20.6", + "agent-browser-mcp": "^0.1.3", + "chrome-devtools-mcp": "^0.20.0", + "puppeteer-core": "^24.39.1", + "zod": "^4.2.1" + }, + "devDependencies": { + "@types/node": "22.13.1", + "eslint": "10.0.2", + "prettier": "^3.2.5", + "tsx": "*", + "vitest": "^4.0.8" + }, + "engines": { + "node": "^20.19.0 || >=22.12.0" + }, + "repository": { + "type": "git", + "url": "git+https://github.com/browserbase/stagehand.git", + "directory": "packages/multiagent" + }, + "bugs": { + "url": "https://github.com/browserbase/stagehand/issues" + }, + "homepage": "https://stagehand.dev" +} diff --git a/packages/multiagent/tests/unit/browser-session.test.ts b/packages/multiagent/tests/unit/browser-session.test.ts new file mode 100644 index 000000000..ced98b487 --- /dev/null +++ b/packages/multiagent/tests/unit/browser-session.test.ts @@ -0,0 +1,91 @@ +import { beforeEach, describe, expect, it, vi } from "vitest"; + +const { connectMock, launchMock } = vi.hoisted(() => ({ + connectMock: vi.fn(), + launchMock: vi.fn(), +})); + +vi.mock("puppeteer-core", () => ({ + default: { + connect: connectMock, + launch: launchMock, + }, +})); + +import { BrowserSession } from "../../lib/browser/session.js"; + +describe("BrowserSession", () => { + beforeEach(() => { + connectMock.mockReset(); + launchMock.mockReset(); + }); + + it("derives browser metadata from a launched local browser", async () => { + const close = vi.fn(); + launchMock.mockResolvedValue({ + wsEndpoint: () => "ws://127.0.0.1:9333/devtools/browser/local-session-id", + close, + }); + + const session = new BrowserSession({ + type: "local", + headless: true, + }); + + await session.start(); + + expect(launchMock).toHaveBeenCalledTimes(1); + expect(session.getMetadata()).toMatchObject({ + type: "local", + launched: true, + 
headless: true, + cdpUrl: "ws://127.0.0.1:9333/devtools/browser/local-session-id", + browserUrl: "http://127.0.0.1:9333", + }); + + await session.stop(); + expect(close).toHaveBeenCalledTimes(1); + }); + + it("connects to an existing CDP target and disconnects instead of closing", async () => { + const disconnect = vi.fn(); + connectMock.mockResolvedValue({ + wsEndpoint: () => + "ws://127.0.0.1:9222/devtools/browser/existing-session-id", + disconnect, + }); + + const session = new BrowserSession({ + type: "cdp", + cdpUrl: "http://127.0.0.1:9222", + connectTimeoutMs: 1234, + }); + + await session.start(); + + expect(connectMock).toHaveBeenCalledWith({ + browserURL: "http://127.0.0.1:9222", + protocolTimeout: 1234, + }); + expect(session.getMetadata()).toMatchObject({ + type: "cdp", + launched: false, + cdpUrl: "ws://127.0.0.1:9222/devtools/browser/existing-session-id", + browserUrl: "http://127.0.0.1:9222", + }); + + await session.stop(); + expect(disconnect).toHaveBeenCalledTimes(1); + }); + + it("rejects CDP mode without a target URL", async () => { + const session = new BrowserSession({ + type: "cdp", + cdpUrl: " ", + }); + + await expect(session.start()).rejects.toThrow( + "BrowserSession configured for CDP mode without a cdpUrl.", + ); + }); +}); diff --git a/packages/multiagent/tests/unit/browser-use-harness.test.ts b/packages/multiagent/tests/unit/browser-use-harness.test.ts new file mode 100644 index 000000000..ddcd55b2d --- /dev/null +++ b/packages/multiagent/tests/unit/browser-use-harness.test.ts @@ -0,0 +1,51 @@ +import { describe, expect, it } from "vitest"; +import { + buildBrowserUseScript, + parseBrowserUseResult, + resolveBrowserUseProvider, +} from "../../lib/agents/harnesses/browserUse.js"; + +describe("BrowserUseHarness helpers", () => { + it("maps Anthropic-prefixed models to the Anthropic provider", () => { + expect(resolveBrowserUseProvider("anthropic/claude-3-7-sonnet-latest")).toEqual({ + provider: "anthropic", + modelName: 
"claude-3-7-sonnet-latest", + }); + }); + + it("builds an inline browser-use script for the selected provider", () => { + const script = buildBrowserUseScript({ + packageSpec: "browser-use[anthropic]", + importStatement: "from browser_use import ChatAnthropic", + llmFactory: "ChatAnthropic(model=model_name)", + }); + + expect(script).toContain("from browser_use import Agent, Browser"); + expect(script).toContain("from browser_use import ChatAnthropic"); + expect(script).toContain("history.final_result()"); + expect(script).toContain('Browser(cdp_url=payload["cdpUrl"])'); + }); + + it("parses browser-use JSON output", () => { + const result = parseBrowserUseResult( + JSON.stringify({ + finalResult: "Example Domain", + errors: [], + raw: { + history: [], + }, + }), + ); + + expect(result).toEqual({ + content: "Example Domain", + raw: { + finalResult: "Example Domain", + errors: [], + raw: { + history: [], + }, + }, + }); + }); +}); diff --git a/packages/multiagent/tests/unit/driver.test.ts b/packages/multiagent/tests/unit/driver.test.ts new file mode 100644 index 000000000..af49fc06f --- /dev/null +++ b/packages/multiagent/tests/unit/driver.test.ts @@ -0,0 +1,136 @@ +import { beforeEach, describe, expect, it, vi } from "vitest"; + +const browserStartMock = vi.fn(); +const browserStopMock = vi.fn(); +const browserMetadata = { + id: "browser-1", + type: "local" as const, + cdpUrl: "ws://127.0.0.1:9222/devtools/browser/test", + browserUrl: "http://127.0.0.1:9222", + launched: true, + headless: true, +}; + +const createdServers: Array<{ + stop: ReturnType; + getName: ReturnType; +}> = []; + +const agentSessionInstances: Array<{ + harnessType: string; + start: ReturnType; + stop: ReturnType; + attachMCPServer: ReturnType; + addUserMessage: ReturnType; +}> = []; + +vi.mock("../../lib/browser/session.js", () => ({ + BrowserSession: class { + async start() { + await browserStartMock(); + } + + async stop() { + await browserStopMock(); + } + + getMetadata() { + return 
browserMetadata; + } + }, +})); + +vi.mock("../../lib/mcp/index.js", () => ({ + createMCPServer: vi.fn((options: { type: string }) => { + const server = { + stop: vi.fn(async () => {}), + getName: vi.fn(() => options.type), + }; + createdServers.push(server); + return server; + }), +})); + +vi.mock("../../lib/agents/session.js", () => ({ + AgentSession: class { + readonly harnessType: string; + readonly start = vi.fn(async () => {}); + readonly stop = vi.fn(async () => {}); + readonly attachMCPServer = vi.fn(); + readonly addUserMessage = vi.fn(async (task: string) => { + if (this.harnessType === "codex") { + throw new Error("codex failed"); + } + + return { + sessionId: `${this.harnessType}-session`, + message: { + content: `${this.harnessType}:${task}`, + }, + usage: { + inputTokens: 11, + }, + }; + }); + + constructor(options: { harness: { type: string } }) { + this.harnessType = options.harness.type; + agentSessionInstances.push(this); + } + }, +})); + +import { MultiAgentDriver } from "../../lib/runtime/driver.js"; + +describe("MultiAgentDriver", () => { + beforeEach(() => { + browserStartMock.mockReset(); + browserStopMock.mockReset(); + createdServers.length = 0; + agentSessionInstances.length = 0; + }); + + it("fans out one task across agent sessions and cleans up shared resources", async () => { + const driver = new MultiAgentDriver({ + task: "open example.com", + agents: [{ type: "claude-code" }, { type: "codex" }], + mcpServers: [{ type: "playwright" }, { type: "chrome-devtools" }], + }); + + const result = await driver.run(); + + expect(browserStartMock).toHaveBeenCalledTimes(1); + expect(browserStopMock).toHaveBeenCalledTimes(1); + expect(result.browser).toEqual(browserMetadata); + expect(result.agents).toEqual([ + { + harness: "claude-code", + sessionId: "claude-code-session", + content: "claude-code:open example.com", + raw: undefined, + usage: { + inputTokens: 11, + }, + }, + { + harness: "codex", + content: "", + error: "codex failed", + raw: 
expect.any(Error), + }, + ]); + + expect(createdServers).toHaveLength(2); + for (const server of createdServers) { + expect(server.stop).toHaveBeenCalledTimes(1); + } + + expect(agentSessionInstances).toHaveLength(2); + for (const session of agentSessionInstances) { + expect(session.start).toHaveBeenCalledTimes(1); + expect(session.stop).toHaveBeenCalledTimes(1); + expect(session.attachMCPServer).toHaveBeenCalledTimes(2); + expect(session.addUserMessage).toHaveBeenCalledWith("open example.com"); + } + }); +}); diff --git a/packages/multiagent/tests/unit/gemini-opencode-harness.test.ts b/packages/multiagent/tests/unit/gemini-opencode-harness.test.ts new file mode 100644 index 000000000..a3407dded --- /dev/null +++ b/packages/multiagent/tests/unit/gemini-opencode-harness.test.ts @@ -0,0 +1,143 @@ +import { describe, expect, it } from "vitest"; +import { + buildGeminiSettings, + parseGeminiJsonResult, +} from "../../lib/agents/harnesses/geminiCli.js"; +import { + buildOpencodeConfig, + parseOpencodeJsonl, + resolveOpencodeBinaryPath, +} from "../../lib/agents/harnesses/opencode.js"; + +describe("GeminiCliHarness helpers", () => { + it("builds isolated Gemini MCP settings from named stdio servers", () => { + expect( + buildGeminiSettings([ + { + name: "playwright", + config: { + command: "node", + args: ["playwright-mcp.js"], + env: { FOO: "bar" }, + cwd: "/tmp/project", + }, + }, + ]), + ).toEqual({ + mcpServers: { + playwright: { + type: "stdio", + command: "node", + args: ["playwright-mcp.js"], + env: { FOO: "bar" }, + cwd: "/tmp/project", + }, + }, + mcp: { + allowed: ["playwright"], + }, + }); + }); + + it("parses Gemini JSON output", () => { + expect( + parseGeminiJsonResult( + JSON.stringify({ + session_id: "session-1", + response: "Example Domain", + stats: { latencyMs: 10 }, + }), + ), + ).toEqual({ + session_id: "session-1", + response: "Example Domain", + stats: { latencyMs: 10 }, + }); + }); +}); + +describe("OpencodeHarness helpers", () => { + it("builds 
isolated OpenCode MCP config", () => { + expect( + buildOpencodeConfig([ + { + name: "chrome-devtools", + config: { + command: "node", + args: ["chrome-devtools-mcp.js", "--headless"], + env: { DEBUG: "0" }, + }, + }, + ]), + ).toEqual({ + mcp: { + "chrome-devtools": { + type: "local", + enabled: true, + command: ["node", "chrome-devtools-mcp.js", "--headless"], + environment: { DEBUG: "0" }, + }, + }, + }); + }); + + it("parses OpenCode JSONL output into content and usage", () => { + const result = parseOpencodeJsonl( + [ + JSON.stringify({ + type: "step_start", + sessionID: "ses_123", + }), + JSON.stringify({ + type: "text", + sessionID: "ses_123", + part: { + text: "Example", + }, + }), + JSON.stringify({ + type: "text", + sessionID: "ses_123", + part: { + text: " Domain", + }, + }), + JSON.stringify({ + type: "step_finish", + sessionID: "ses_123", + part: { + tokens: { + input: 11, + output: 7, + cache: { + read: 3, + write: 5, + }, + }, + }, + }), + ].join("\n"), + ); + + expect(result).toMatchObject({ + sessionId: "ses_123", + content: "Example Domain", + usage: { + inputTokens: 11, + outputTokens: 7, + cachedInputTokens: 8, + }, + }); + }); + + it("prefers an explicit OpenCode binary override", () => { + expect( + resolveOpencodeBinaryPath({ + env: { + MULTIAGENT_OPENCODE_BIN: "/tmp/opencode-native", + }, + existsSync: (value) => value === "/tmp/opencode-native", + }), + ).toBe("/tmp/opencode-native"); + }); +}); diff --git a/packages/multiagent/tsconfig.json b/packages/multiagent/tsconfig.json new file mode 100644 index 000000000..66c0e7bf9 --- /dev/null +++ b/packages/multiagent/tsconfig.json @@ -0,0 +1,9 @@ +{ + "extends": "../../tsconfig.base.json", + "compilerOptions": { + "outDir": "./dist", + "rootDir": "./lib" + }, + "include": ["lib/**/*.ts"], + "exclude": ["node_modules", "dist"] +} diff --git a/pnpm-lock.yaml b/pnpm-lock.yaml index 998e383a6..32cf97ad7 100644 --- a/pnpm-lock.yaml +++ b/pnpm-lock.yaml @@ -283,7 +283,7 @@ importers: version: 17.3.1 
openai: specifier: ^4.87.1 - version: 4.87.1(ws@8.18.3(bufferutil@4.0.9))(zod@4.2.1) + version: 4.87.1(ws@8.19.0(bufferutil@4.0.9))(zod@4.2.1) sharp: specifier: ^0.34.5 version: 0.34.5 @@ -304,6 +304,49 @@ importers: specifier: '*' version: 4.19.4 + packages/multiagent: + dependencies: + '@browserbasehq/stagehand': + specifier: workspace:* + version: link:../core + '@modelcontextprotocol/sdk': + specifier: ^1.17.2 + version: 1.17.2 + '@playwright/mcp': + specifier: 0.0.68 + version: 0.0.68 + agent-browser: + specifier: ^0.20.6 + version: 0.20.6 + agent-browser-mcp: + specifier: ^0.1.3 + version: 0.1.3 + chrome-devtools-mcp: + specifier: ^0.20.0 + version: 0.20.0 + puppeteer-core: + specifier: ^24.39.1 + version: 24.39.1(bufferutil@4.0.9) + zod: + specifier: ^4.2.1 + version: 4.2.1 + devDependencies: + '@types/node': + specifier: 22.13.1 + version: 22.13.1 + eslint: + specifier: 10.0.2 + version: 10.0.2(jiti@1.21.7) + prettier: + specifier: ^3.2.5 + version: 3.5.3 + tsx: + specifier: '*' + version: 4.19.4 + vitest: + specifier: ^4.0.8 + version: 4.0.8(@types/debug@4.1.12)(@types/node@22.13.1)(jiti@1.21.7)(jsdom@24.1.3(bufferutil@4.0.9))(tsx@4.19.4)(yaml@2.7.1) + packages/server-v3: dependencies: '@browserbasehq/sdk': @@ -366,7 +409,7 @@ importers: version: 3.0.1 openai: specifier: 4.87.1 - version: 4.87.1(ws@8.18.3(bufferutil@4.0.9))(zod@4.2.1) + version: 4.87.1(ws@8.19.0(bufferutil@4.0.9))(zod@4.2.1) postject: specifier: 1.0.0-alpha.6 version: 1.0.0-alpha.6 @@ -448,7 +491,7 @@ importers: version: 3.0.1 openai: specifier: 4.87.1 - version: 4.87.1(ws@8.18.3(bufferutil@4.0.9))(zod@4.2.1) + version: 4.87.1(ws@8.19.0(bufferutil@4.0.9))(zod@4.2.1) postject: specifier: 1.0.0-alpha.6 version: 1.0.0-alpha.6 @@ -619,6 +662,10 @@ packages: '@anthropic-ai/sdk@0.39.0': resolution: {integrity: sha512-eMyDIPRZbt1CCLErRCi3exlAvNkBtRe+kW5vvJyef93PmNr/clstYgHhtvmkxN82nlKgzyGPCyGxrm0JQ1ZIdg==} + '@anthropic-ai/sdk@0.52.0': + resolution: {integrity: 
sha512-d4c+fg+xy9e46c8+YnrrgIQR45CZlAi7PwdzIfDXDM6ACxEZli1/fxhURsq30ZpMZy6LvSkr41jGq5aF5TD7rQ==} + hasBin: true + '@ark/schema@0.46.0': resolution: {integrity: sha512-c2UQdKgP2eqqDArfBqQIJppxJHvNNXuQPeuSPlDML4rjw+f1cu0qAlzOG4b8ujgm9ctIDWwhpyw6gjG5ledIVQ==} @@ -1716,11 +1763,21 @@ packages: resolution: {integrity: sha512-+1VkjdD0QBLPodGrJUeqarH8VAIvQODIbwh9XpP5Syisf7YoQgsJKPNFoqqLQlu+VQ/tVSshMR6loPMn8U+dPg==} engines: {node: '>=14'} + '@playwright/mcp@0.0.68': + resolution: {integrity: sha512-oP9I9ghXKuQEBo4xaC7HgsS2gRTxyMzlBm3UEhYj4VqqrqbPQUX2shATPaNA/am9joBzq9v0OXISzeIgP+zmHA==} + engines: {node: '>=18'} + hasBin: true + '@playwright/test@1.54.2': resolution: {integrity: sha512-A+znathYxPf+72riFd1r1ovOLqsIIB0jKIoPjyK2kqEIe30/6jF6BC7QNluHuwUmsD2tv1XZVugN8GqfTMOxsA==} engines: {node: '>=18'} hasBin: true + '@puppeteer/browsers@2.13.0': + resolution: {integrity: sha512-46BZJYJjc/WwmKjsvDFykHtXrtomsCIrwYQPOP7VfMJoZY2bsDF9oROBABR3paDjDcmkUye1Pb1BqdcdiipaWA==} + engines: {node: '>=18'} + hasBin: true + '@puppeteer/browsers@2.3.0': resolution: {integrity: sha512-ioXoq9gPxkss4MYhD+SFaU9p1IHFUX0ILAWFPyjGaBdjLsYAlZw6j1iLA0N/m12uVHLFDfSYNF7EQccjinIMDA==} engines: {node: '>=18'} @@ -2267,6 +2324,15 @@ packages: resolution: {integrity: sha512-jRR5wdylq8CkOe6hei19GGZnxM6rBGwFl3Bg0YItGDimvjGtAvdZk4Pu6Cl4u4Igsws4a1fd1Vq3ezrhn4KmFw==} engines: {node: '>= 14'} + agent-browser-mcp@0.1.3: + resolution: {integrity: sha512-Es4ERBKeY74vldv1Z5jRV1bkhE9jUBp0mcFQ5XQIrcuYqDX24Z1GeTjkAx9T5bk1X7TeaFX3t1fCuhlA7m83eg==} + engines: {node: '>=18.0.0'} + hasBin: true + + agent-browser@0.20.6: + resolution: {integrity: sha512-n5C/txjzKD/5K2Mpw6UPXd++sAaBhPr3xLkVK11PDfTeCD+uo7fzhDIaLvFiccAA8HR57nHxZP+YI8jW3Kdadg==} + hasBin: true + agentkeepalive@4.6.0: resolution: {integrity: sha512-kja8j7PjmncONqaTsB8fQ+wE2mSU2DJ9D4XKoJ5PFWIdRMa6SLSN1ff4mOr4jCbfRSsxR4keIiySJU0N9T5hIQ==} engines: {node: '>= 8.0.0'} @@ -2650,6 +2716,11 @@ packages: resolution: {integrity: 
sha512-bIomtDF5KGpdogkLd9VspvFzk9KfpyyGlS8YFVZl7TGPBHL5snIOnxeshwVgPteQ9b4Eydl+pVbIyE1DcvCWgQ==} engines: {node: '>=10'} + chrome-devtools-mcp@0.20.0: + resolution: {integrity: sha512-wBnt8901lAXdac3AB7WdONYTAXGW+YqqIVVg7PztxYVNPs3VVgM2UZnZT/ICYPIofKTuRBOkRdEE/VYm90ZgYA==} + engines: {node: ^20.19.0 || ^22.12.0 || >=23} + hasBin: true + chrome-launcher@1.2.0: resolution: {integrity: sha512-JbuGuBNss258bvGil7FT4HKdC3SC2K7UAEUqiPy3ACS3Yxo3hAW6bvFpCu2HsIJLgTqxgEX6BkujvzZfLpUD0Q==} engines: {node: '>=12.13.0'} @@ -2660,6 +2731,11 @@ packages: peerDependencies: devtools-protocol: '*' + chromium-bidi@14.0.0: + resolution: {integrity: sha512-9gYlLtS6tStdRWzrtXaTMnqcM4dudNegMXJxkR0I/CXObHalYeYcAMPrL19eroNZHtJ8DQmu1E+ZNOYu/IXMXw==} + peerDependencies: + devtools-protocol: '*' + ci-info@3.9.0: resolution: {integrity: sha512-NIxF55hv4nSqQswkAeiOi1r83xy8JldOFDTWiug55KBu9Jnblncd2U6ViHmYgHf01TPZS77NJBhBMKdWj9HQMQ==} engines: {node: '>=8'} @@ -2971,6 +3047,9 @@ packages: devtools-protocol@0.0.1464554: resolution: {integrity: sha512-CAoP3lYfwAGQTaAXYvA6JZR0fjGUb7qec1qf4mToyoH2TZgUFeIqYcjh6f9jNuhHfuZiEdH+PONHYrLhRQX6aw==} + devtools-protocol@0.0.1581282: + resolution: {integrity: sha512-nv7iKtNZQshSW2hKzYNr46nM/Cfh5SEvE2oV0/SEGgc9XupIY5ggf84Cz8eJIkBce7S3bmTAauFD6aysMpnqsQ==} + didyoumean@1.2.2: resolution: {integrity: sha512-gxtyfqMg7GKyhQmb056K7M3xszy/myH8w+B4RT+QXBQsvAOdc3XymqDDPHx1BgPgsdAA5SIifona89YtRATDzw==} @@ -4937,6 +5016,11 @@ packages: engines: {node: '>=18'} hasBin: true + playwright-core@1.59.0-alpha-1771104257000: + resolution: {integrity: sha512-YiXup3pnpQUCBMSIW5zx8CErwRx4K6O5Kojkw2BzJui8MazoMUDU6E3xGsb1kzFviEAE09LFQ+y1a0RhIJQ5SA==} + engines: {node: '>=18'} + hasBin: true + playwright@1.52.0: resolution: {integrity: sha512-JAwMNMBlxJ2oD1kce4KPtMkDeKGHQstdpFPcPH3maElAXon/QZeTvtsfXmTMRyO9TslfoYOXkSsvao2nE1ilTw==} engines: {node: '>=18'} @@ -4947,6 +5031,11 @@ packages: engines: {node: '>=18'} hasBin: true + playwright@1.59.0-alpha-1771104257000: + resolution: 
{integrity: sha512-6SCMMMJaDRsSqiKVLmb2nhtLES7iTYawTWWrQK6UdIGNzXi8lka4sLKRec3L4DnTWwddAvCuRn8035dhNiHzbg==} + engines: {node: '>=18'} + hasBin: true + plimit-lit@1.6.1: resolution: {integrity: sha512-B7+VDyb8Tl6oMJT9oSO2CW8XC/T4UcJGrwOVoNGwOQsQYhlpfajmrMj5xeejqaASq3V/EqThyOeATEOMuSEXiA==} engines: {node: '>=12'} @@ -5091,6 +5180,10 @@ packages: resolution: {integrity: sha512-cHArnywCiAAVXa3t4GGL2vttNxh7GqXtIYGym99egkNJ3oG//wL9LkvO4WE8W1TJe95t1F1ocu9X4xWaGsOKOA==} engines: {node: '>=18'} + puppeteer-core@24.39.1: + resolution: {integrity: sha512-AMqQIKoEhPS6CilDzw0Gd1brLri3emkC+1N2J6ZCCuY1Cglo56M63S0jOeBZDQlemOiRd686MYVMl9ELJBzN3A==} + engines: {node: '>=18'} + puppeteer@22.15.0: resolution: {integrity: sha512-XjCY1SiSEi1T7iSYuxS82ft85kwDJUS7wj1Z0eGVXKdtr5g4xnVcbjwxhq5xBnpK/E7x1VZZoJDxpjAOasHT4Q==} engines: {node: '>=18'} @@ -5401,6 +5494,11 @@ packages: engines: {node: '>=10'} hasBin: true + semver@7.7.4: + resolution: {integrity: sha512-vFKC2IEtQnVhpT78h1Yp8wzwrf8CM+MzKMHGJZfBtzhZNycRFnXsHk6E5TxIkkMsgNS7mdX3AGB7x2QM2di4lA==} + engines: {node: '>=10'} + hasBin: true + send@0.19.0: resolution: {integrity: sha512-dW41u5VfLXu8SJh5bwRmyYUbAoSB3c9uQh6L8h/KtsFREPWpbX1lrljJo186Jc4nmci/sGUZ9a0a0J2zgfq2hw==} engines: {node: '>= 0.8.0'} @@ -5672,6 +5770,9 @@ packages: tar-fs@3.1.0: resolution: {integrity: sha512-5Mty5y/sOF1YWj1J6GiBodjlDc05CUR8PKXrsnFAiSG0xA+GHeWLovaZPYUDXkH/1iKRf2+M5+OrRgzC7O9b7w==} + tar-fs@3.1.2: + resolution: {integrity: sha512-QGxxTxxyleAdyM3kpFs14ymbYmNFrfY+pHj7Z8FgtbZ7w2//VAgLMac7sT6nRpIHjppXO2AwwEOg0bPFVRcmXw==} + tar-stream@3.1.7: resolution: {integrity: sha512-qJj60CXt7IU1Ffyc3NJMjh6EkuCFej46zUqJ4J7pqYlThyd9bO0XBTmcOIhSzZJVWfsLks0+nle/j538YAW9RQ==} @@ -5886,6 +5987,9 @@ packages: resolution: {integrity: sha512-3KS2b+kL7fsuk/eJZ7EQdnEmQoaho/r6KUef7hxvltNA5DR8NAUM+8wJMbJyZ4G9/7i3v5zPBIMN5aybAh2/Jg==} engines: {node: '>= 0.4'} + typed-query-selector@2.12.1: + resolution: {integrity: 
sha512-uzR+FzI8qrUEIu96oaeBJmd9E7CFEiQ3goA5qCVgc4s5llSubcfGHq9yUstZx/k4s9dXHVKsE35YWoFyvEqEHA==} + typescript-eslint@8.56.1: resolution: {integrity: sha512-U4lM6pjmBX7J5wk4szltF7I1cGBHXZopnAXCMXb3+fZ3B/0Z3hq3wS/CCUB2NZBNAExK92mCU2tEohWuwVMsDQ==} engines: {node: ^18.18.0 || ^20.9.0 || >=21.1.0} @@ -6133,6 +6237,9 @@ packages: resolution: {integrity: sha512-QW95TCTaHmsYfHDybGMwO5IJIM93I/6vTRk+daHTWFPhwh+C8Cg7j7XyKrwrj8Ib6vYXe0ocYNrmzY4xAAN6ug==} engines: {node: '>= 14'} + webdriver-bidi-protocol@0.4.1: + resolution: {integrity: sha512-ARrjNjtWRRs2w4Tk7nqrf2gBI0QXWuOmMCx2hU+1jUt6d00MjMxURrhxhGbrsoiZKJrhTSTzbIrc554iKI10qw==} + webidl-conversions@3.0.1: resolution: {integrity: sha512-2JAn3z8AR6rjK8Sm8orRC0h/bcl/DqL7tRPdGZ4I1CjdF+EaMLmYxBHyXuKL849eucPFhvBoxMsflfOb8kxaeQ==} @@ -6233,6 +6340,18 @@ packages: utf-8-validate: optional: true + ws@8.19.0: + resolution: {integrity: sha512-blAT2mjOEIi0ZzruJfIhb3nps74PRWTCz1IjglWEEpQl5XS/UNama6u2/rjFkDDouqr4L67ry+1aGIALViWjDg==} + engines: {node: '>=10.0.0'} + peerDependencies: + bufferutil: ^4.0.1 + utf-8-validate: '>=5.0.2' + peerDependenciesMeta: + bufferutil: + optional: true + utf-8-validate: + optional: true + xml-name-validator@5.0.0: resolution: {integrity: sha512-EvGK8EJ3DhaHfbRlETOWAS5pO9MZITeauHKJyb8wyajUfQUenkIg2MvLDTZ4T/TgIcm3HU0TFBgWWboAZ30UHg==} engines: {node: '>=18'} @@ -6511,6 +6630,8 @@ snapshots: transitivePeerDependencies: - encoding + '@anthropic-ai/sdk@0.52.0': {} + '@ark/schema@0.46.0': dependencies: '@ark/util': 0.46.0 @@ -7788,10 +7909,28 @@ snapshots: '@pkgjs/parseargs@0.11.0': optional: true + '@playwright/mcp@0.0.68': + dependencies: + playwright: 1.59.0-alpha-1771104257000 + playwright-core: 1.59.0-alpha-1771104257000 + '@playwright/test@1.54.2': dependencies: playwright: 1.54.2 + '@puppeteer/browsers@2.13.0': + dependencies: + debug: 4.4.3 + extract-zip: 2.0.1 + progress: 2.0.3 + proxy-agent: 6.5.0 + semver: 7.7.4 + tar-fs: 3.1.2 + yargs: 17.7.2 + transitivePeerDependencies: + - bare-buffer + - 
supports-color + '@puppeteer/browsers@2.3.0': dependencies: debug: 4.4.3 @@ -8438,6 +8577,16 @@ snapshots: agent-base@7.1.3: {} + agent-browser-mcp@0.1.3: + dependencies: + '@anthropic-ai/sdk': 0.52.0 + '@modelcontextprotocol/sdk': 1.17.2 + zod: 3.25.76 + transitivePeerDependencies: + - supports-color + + agent-browser@0.20.6: {} + agentkeepalive@4.6.0: dependencies: humanize-ms: 1.2.1 @@ -8844,6 +8993,8 @@ snapshots: chownr@2.0.0: {} + chrome-devtools-mcp@0.20.0: {} + chrome-launcher@1.2.0: dependencies: '@types/node': 20.17.32 @@ -8861,6 +9012,12 @@ snapshots: urlpattern-polyfill: 10.0.0 zod: 3.23.8 + chromium-bidi@14.0.0(devtools-protocol@0.0.1581282): + dependencies: + devtools-protocol: 0.0.1581282 + mitt: 3.0.1 + zod: 3.25.76 + ci-info@3.9.0: {} clean-stack@4.2.0: @@ -9118,6 +9275,8 @@ snapshots: devtools-protocol@0.0.1464554: {} + devtools-protocol@0.0.1581282: {} + didyoumean@1.2.2: {} dir-glob@3.0.1: @@ -11155,8 +11314,8 @@ snapshots: micromark-extension-mdxjs@3.0.0: dependencies: - acorn: 8.15.0 - acorn-jsx: 5.3.2(acorn@8.15.0) + acorn: 8.16.0 + acorn-jsx: 5.3.2(acorn@8.16.0) micromark-extension-mdx-expression: 3.0.1 micromark-extension-mdx-jsx: 3.0.2 micromark-extension-mdx-md: 2.0.0 @@ -11541,6 +11700,21 @@ snapshots: transitivePeerDependencies: - encoding + openai@4.87.1(ws@8.19.0(bufferutil@4.0.9))(zod@4.2.1): + dependencies: + '@types/node': 18.19.87 + '@types/node-fetch': 2.6.12 + abort-controller: 3.0.0 + agentkeepalive: 4.6.0 + form-data-encoder: 1.7.2 + formdata-node: 4.4.1 + node-fetch: 2.7.0 + optionalDependencies: + ws: 8.19.0(bufferutil@4.0.9) + zod: 4.2.1 + transitivePeerDependencies: + - encoding + openai@4.96.2(ws@8.18.3(bufferutil@4.0.9))(zod@3.25.76): dependencies: '@types/node': 18.19.87 @@ -11813,6 +11987,8 @@ snapshots: playwright-core@1.54.2: {} + playwright-core@1.59.0-alpha-1771104257000: {} + playwright@1.52.0: dependencies: playwright-core: 1.52.0 @@ -11825,6 +12001,12 @@ snapshots: optionalDependencies: fsevents: 2.3.2 + 
playwright@1.59.0-alpha-1771104257000: + dependencies: + playwright-core: 1.59.0-alpha-1771104257000 + optionalDependencies: + fsevents: 2.3.2 + plimit-lit@1.6.1: dependencies: queue-lit: 1.5.2 @@ -11956,6 +12138,21 @@ snapshots: - supports-color - utf-8-validate + puppeteer-core@24.39.1(bufferutil@4.0.9): + dependencies: + '@puppeteer/browsers': 2.13.0 + chromium-bidi: 14.0.0(devtools-protocol@0.0.1581282) + debug: 4.4.3 + devtools-protocol: 0.0.1581282 + typed-query-selector: 2.12.1 + webdriver-bidi-protocol: 0.4.1 + ws: 8.19.0(bufferutil@4.0.9) + transitivePeerDependencies: + - bare-buffer + - bufferutil + - supports-color + - utf-8-validate + puppeteer@22.15.0(bufferutil@4.0.9)(typescript@5.9.3): dependencies: '@puppeteer/browsers': 2.3.0 @@ -12397,6 +12594,8 @@ snapshots: semver@7.7.3: {} + semver@7.7.4: {} + send@0.19.0: dependencies: debug: 2.6.9 @@ -12836,6 +13035,16 @@ snapshots: transitivePeerDependencies: - bare-buffer + tar-fs@3.1.2: + dependencies: + pump: 3.0.2 + tar-stream: 3.1.7 + optionalDependencies: + bare-fs: 4.1.4 + bare-path: 3.0.0 + transitivePeerDependencies: + - bare-buffer + tar-stream@3.1.7: dependencies: b4a: 1.6.7 @@ -13069,6 +13278,8 @@ snapshots: possible-typed-array-names: 1.1.0 reflect.getprototypeof: 1.0.10 + typed-query-selector@2.12.1: {} + typescript-eslint@8.56.1(eslint@10.0.2(jiti@1.21.7))(typescript@5.8.3): dependencies: '@typescript-eslint/eslint-plugin': 8.56.1(@typescript-eslint/parser@8.56.1(eslint@10.0.2(jiti@1.21.7))(typescript@5.8.3))(eslint@10.0.2(jiti@1.21.7))(typescript@5.8.3) @@ -13372,6 +13583,8 @@ snapshots: web-streams-polyfill@4.0.0-beta.3: {} + webdriver-bidi-protocol@0.4.1: {} + webidl-conversions@3.0.1: {} webidl-conversions@7.0.0: {} @@ -13482,6 +13695,10 @@ snapshots: optionalDependencies: bufferutil: 4.0.9 + ws@8.19.0(bufferutil@4.0.9): + optionalDependencies: + bufferutil: 4.0.9 + xml-name-validator@5.0.0: {} xml2js@0.6.2: diff --git a/pnpm-workspace.yaml b/pnpm-workspace.yaml index a84a85f5e..6a06b9ff3 
100644 --- a/pnpm-workspace.yaml +++ b/pnpm-workspace.yaml @@ -1,6 +1,7 @@ packages: - "packages/core" - "packages/cli" + - "packages/multiagent" - "packages/evals" - "packages/docs" - "packages/server-v3"