inworld-ai · ianbbqzy · Feb 9, 2026 · Feb 9, 2026 · Feb 9, 2026 · Feb 11, 2026
diff --git a/.cursor/rules/integrations.mdc b/.cursor/rules/integrations.mdc
@@ -0,0 +1,99 @@
+---
+description: Structure and sanity check procedures for integrations/ benchmarks and quickstarts
+globs: integrations/**
+alwaysApply: false
+---
+
+# First-Time Setup (required for fresh clone)
+
+```bash
+# Initialize git submodules (pipecat, livekit agents)
+git submodule update --init --recursive
+
+# Build LiveKit JS agents monorepo (needed by JS benchmarks and quickstart)
+cd integrations/livekit/js/agents-js && pnpm install && pnpm build && cd -
+```
+
+Each script directory needs a `.env` file with valid API keys (copy from `.env.example`).
+
+# Directory Structure
+
+```
+integrations/
+├── pipecat/
+│   ├── pipecat/                        # Submodule: pipecat-ai framework
+│   ├── pipecat-quickstart/             # Voice bot (uv sync && uv run bot.py)
+│   └── benchmarks/                     # TTFB benchmarks (uv sync && uv run python ...)
+├── livekit/
+│   ├── python/
+│   │   ├── agents/                     # Submodule: livekit agents + Inworld plugin
+│   │   ├── quickstart/                 # Voice agent (uv sync && uv run python ...)
+│   │   └── benchmarks/                 # TTFB benchmarks (uv sync && uv run python ...)
+│   └── js/
+│       ├── agents-js/                  # Submodule: livekit agents-js + Inworld plugin
+│       ├── quickstart/                 # Voice agent (pnpm install && pnpm start)
+│       └── benchmarks/                 # TTFB benchmarks (pnpm install && npx tsx ...)
+```
+
+Each script directory has:
+- `pyproject.toml` or `package.json` with editable/local deps on the plugin submodule
+- `.env.example` listing required API keys
+- `README.md` with setup and usage
+
+# Sanity Check Procedure
+
+Prerequisites: user must have `.env` files with valid API keys in each directory they want to test. Check with:
+```bash
+# Verify .env exists and INWORLD_API_KEY is non-empty
+for dir in integrations/pipecat/benchmarks integrations/livekit/python/benchmarks integrations/livekit/js/benchmarks; do
+  test -f "$dir/.env" && grep -Eq '^INWORLD_API_KEY=.+' "$dir/.env" && echo "✅ $dir" || echo "❌ $dir missing .env or INWORLD_API_KEY"
+done
+```
+
+## Benchmarks (quick smoke test: -n 1 --warmup 0)
+
+```bash
+# Pipecat Python (each command runs in a subshell, so the working directory stays at the repo root)
+(cd integrations/pipecat/benchmarks && uv sync && uv run python benchmark_http_ttfb.py --services inworld -n 1 --warmup 0 --no-save-audio)
+(cd integrations/pipecat/benchmarks && uv run python benchmark_websocket_ttfb.py --services inworld -n 1 --warmup 0 --no-save-audio)
+
+# LiveKit Python
+(cd integrations/livekit/python/benchmarks && uv sync && uv run python benchmark_http_ttfb.py --services inworld -n 1 --warmup 0 --no-save-audio)
+(cd integrations/livekit/python/benchmarks && uv run python benchmark_websocket_ttfb.py --services inworld -n 1 --warmup 0 --no-save-audio)
+
+# LiveKit JS (requires agents-js built: cd ../agents-js && pnpm install && pnpm build)
+(cd integrations/livekit/js/benchmarks && pnpm install && npx tsx benchmark_http_ttfb.ts --services inworld -n 1 --warmup 0 --no-save-audio)
+(cd integrations/livekit/js/benchmarks && npx tsx benchmark_websocket_ttfb.ts --services inworld -n 1 --warmup 0 --no-save-audio)
+```
+
+## Quickstarts (import check only — full run needs LiveKit Cloud)
+
+```bash
+# Pipecat (each command runs in a subshell, so the working directory stays at the repo root)
+(cd integrations/pipecat/pipecat-quickstart && uv sync && uv run python -c "import bot; print('ok')")
+
+# LiveKit Python
+(cd integrations/livekit/python/quickstart && uv sync && uv run python -c "import test_inworld_voice_agent; print('ok')")
+
+# LiveKit JS (needs agents-js built)
+(cd integrations/livekit/js/quickstart && pnpm install)
+```
+
+## Expected benchmark output
+
+Each benchmark should print a table with N matching the -n flag:
+```
+📊 TTFB
+Service              Avg    StdDev   Min      Max      P50      P95       N
+```
+TTFB for Inworld should be ~0.2-0.4s for HTTP, ~0.2-0.5s for WS.
+If audio_bytes=0 or TTFB is N/A, the API key is likely invalid.
+
+# Key conventions
+
+- All Python dirs use `uv sync && uv run` (never activate venvs manually)
+- All JS dirs use `pnpm install` (not npm)
+- `.env.example` exists in every script directory
+- Benchmarks: `-n` controls number of TTFB samples, `--warmup` discards cold starts
+- Inworld base URLs are hardcoded in the TTS factory functions (edit them directly to target a dev environment)
+- Pipecat InworldHttpTTSService does not accept a custom base_url parameter
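The `.env` prerequisite check in the rule file above can also be expressed in Python, which may be convenient for agents that already drive the benchmarks through `uv run python`. This is a minimal sketch, not part of the PR: the function names `read_env_value` and `check_env_dirs` are hypothetical, and the parser only handles simple `KEY=value` lines (optionally prefixed with `export`), not the full dotenv syntax.

```python
from __future__ import annotations

from pathlib import Path


def read_env_value(env_text: str, key: str) -> str | None:
    """Return the value assigned to `key` in dotenv-style text, or None if absent."""
    for raw_line in env_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        # Tolerate an optional shell-style "export " prefix.
        if line.startswith("export "):
            line = line[len("export "):].lstrip()
        name, sep, value = line.partition("=")
        if sep and name.strip() == key:
            # Drop surrounding whitespace and simple quoting.
            return value.strip().strip("'\"")
    return None


def check_env_dirs(
    repo_root: Path, dirs: list[str], key: str = "INWORLD_API_KEY"
) -> dict[str, bool]:
    """Map each directory to True when its .env exists and `key` is non-empty."""
    results: dict[str, bool] = {}
    for rel in dirs:
        env_file = repo_root / rel / ".env"
        value = read_env_value(env_file.read_text(), key) if env_file.is_file() else None
        results[rel] = bool(value)
    return results
```

Called from the repository root with the three benchmark directories listed in the rule file, `check_env_dirs(Path("."), [...])` returns a per-directory pass/fail map equivalent to the shell loop's ✅/❌ output.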
diff --git a/.gitignore b/.gitignore
@@ -327,13 +327,18 @@ public
 
 .DS_Store
 
+.claude
+
 # Audio output files from examples
 *.wav
 *.mp3
 synthesis_*.wav
 synthesis_*.mp3
+
+# Benchmark audio
+benchmark_audio/
+
 # Keep STT tests-data in repo (e.g. test-audio.wav, test-pcm-audio.pcm)
 !stt/tests-data/
 !stt/tests-data/**
-.claude
 
diff --git a/.gitmodules b/.gitmodules
@@ -0,0 +1,9 @@
+[submodule "integrations/pipecat/pipecat"]
+	path = integrations/pipecat/pipecat
+	url = https://github.com/inworld-ai/pipecat
+[submodule "integrations/livekit/python/agents"]
+	path = integrations/livekit/python/agents
+	url = https://github.com/inworld-ai/livekit_agents
+[submodule "integrations/livekit/js/agents-js"]
+	path = integrations/livekit/js/agents-js
+	url = https://github.com/inworld-ai/livekit_agents_js
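Because every benchmark and quickstart depends on these three submodules, it can help to confirm they were actually initialized before running anything. The sketch below is an illustration only (the helper names are hypothetical): it reads the `path = ...` entries from `.gitmodules` text and reports paths that are missing or empty, which is how a submodule directory typically looks before `git submodule update --init` has run.

```python
from __future__ import annotations

import re
from pathlib import Path

# Matches lines such as "\tpath = integrations/pipecat/pipecat".
_PATH_LINE = re.compile(r"^\s*path\s*=\s*(.+?)\s*$")


def submodule_paths(gitmodules_text: str) -> list[str]:
    """Extract every submodule path declared in .gitmodules text."""
    paths = []
    for line in gitmodules_text.splitlines():
        match = _PATH_LINE.match(line)
        if match:
            paths.append(match.group(1))
    return paths


def uninitialized_submodules(repo_root: Path, gitmodules_text: str) -> list[str]:
    """Return declared submodule paths that are missing or empty on disk."""
    missing = []
    for rel in submodule_paths(gitmodules_text):
        target = repo_root / rel
        if not target.is_dir() or not any(target.iterdir()):
            missing.append(rel)
    return missing
```

An empty result suggests the submodules are checked out; any path in the list points to a checkout that still needs `git submodule update --init --recursive`.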
diff --git a/integrations/CLAUDE.md b/integrations/CLAUDE.md
@@ -0,0 +1,97 @@
+# Integrations — Agent Guide
+
+## First-Time Setup (required for fresh clone)
+
+```bash
+# 1. Initialize git submodules (pipecat, livekit agents)
+git submodule update --init --recursive
+
+# 2. Build LiveKit JS agents monorepo (needed by JS benchmarks and quickstart)
+cd integrations/livekit/js/agents-js
+pnpm install && pnpm build
+cd -
+```
+
+## Directory Structure
+
+```
+integrations/
+├── pipecat/
+│   ├── pipecat/                        # Submodule: pipecat-ai framework
+│   ├── pipecat-quickstart/             # Voice bot (uv sync && uv run bot.py)
+│   └── benchmarks/                     # TTFB benchmarks (uv sync && uv run python ...)
+├── livekit/
+│   ├── python/
+│   │   ├── agents/                     # Submodule: livekit agents + Inworld plugin
+│   │   ├── quickstart/                 # Voice agent (uv sync && uv run python ...)
+│   │   └── benchmarks/                 # TTFB benchmarks (uv sync && uv run python ...)
+│   └── js/
+│       ├── agents-js/                  # Submodule: livekit agents-js + Inworld plugin
+│       ├── quickstart/                 # Voice agent (pnpm install && pnpm start)
+│       └── benchmarks/                 # TTFB benchmarks (pnpm install && npx tsx ...)
+```
+
+Each script directory has:
+- `pyproject.toml` or `package.json` with editable/local deps on the plugin submodule
+- `.env.example` listing required API keys
+- `README.md` with setup and usage
+
+## Prerequisites for Running Anything
+
+1. Git submodules initialized (see above)
+2. `.env` file in the target directory with valid API keys (copy from `.env.example`)
+3. `uv` installed (Python dirs) or `pnpm` installed (JS dirs)
+
+## Sanity Check — All Benchmarks
+
+Verify `.env` files exist:
+```bash
+for dir in pipecat/benchmarks livekit/python/benchmarks livekit/js/benchmarks; do
+  test -f "integrations/$dir/.env" && grep -Eq '^INWORLD_API_KEY=.+' "integrations/$dir/.env" \
+    && echo "✅ $dir" || echo "❌ $dir — missing .env or empty INWORLD_API_KEY"
+done
+```
+
+Run each benchmark with minimal iterations:
+```bash
+# Pipecat (Python)
+cd "$(git rev-parse --show-toplevel)/integrations/pipecat/benchmarks"
+uv sync
+uv run python benchmark_http_ttfb.py --services inworld -n 1 --warmup 0 --no-save-audio
+uv run python benchmark_websocket_ttfb.py --services inworld -n 1 --warmup 0 --no-save-audio
+
+# LiveKit (Python)
+cd "$(git rev-parse --show-toplevel)/integrations/livekit/python/benchmarks"
+uv sync
+uv run python benchmark_http_ttfb.py --services inworld -n 1 --warmup 0 --no-save-audio
+uv run python benchmark_websocket_ttfb.py --services inworld -n 1 --warmup 0 --no-save-audio
+
+# LiveKit (JS): requires agents-js built (see First-Time Setup)
+cd "$(git rev-parse --show-toplevel)/integrations/livekit/js/benchmarks"
+pnpm install
+npx tsx benchmark_http_ttfb.ts --services inworld -n 1 --warmup 0 --no-save-audio
+npx tsx benchmark_websocket_ttfb.ts --services inworld -n 1 --warmup 0 --no-save-audio
+```
+
+## Sanity Check — Quickstarts (import only, full run needs LiveKit Cloud)
+
+```bash
+(cd integrations/pipecat/pipecat-quickstart && uv sync && uv run python -c "import bot; print('ok')")
+(cd integrations/livekit/python/quickstart && uv sync && uv run python -c "import test_inworld_voice_agent; print('ok')")
+(cd integrations/livekit/js/quickstart && pnpm install)  # install only: the JS code is first loaded by pnpm start
+```
+
+## Expected Results
+
+- Benchmark output shows a TTFB table with N matching the `-n` flag
+- Inworld HTTP TTFB: ~0.2–0.4s, WS TTFB: ~0.2–0.5s
+- If TTFB is N/A or audio_bytes=0: API key is invalid or expired
+- LiveKit JS WS TTFB appears ~300ms higher than Python (a framework measurement difference, not real latency; see the JS benchmarks README)
+
+## Key Conventions
+
+- Python: always `uv sync && uv run` (never activate venvs manually)
+- JS: always `pnpm` (not npm)
+- Inworld base URLs are hardcoded in TTS factory functions — edit directly to point at a dev environment
+- Pipecat `InworldHttpTTSService` does not accept a custom `base_url` parameter
+- `-n` = number of TTFB samples, `--warmup` = throwaway iterations for connection warmup
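The "Expected Results" bullets above can be turned into a small triage helper for an agent reading benchmark output. This is a hedged sketch rather than anything the benchmarks ship: `assess_ttfb` is a hypothetical name, the ranges are copied from the guide, and the `tolerance` margin is an assumption added because the guide states the ranges only approximately.

```python
from __future__ import annotations

# Approximate Inworld TTFB ranges from the guide, in seconds.
EXPECTED_TTFB_SECONDS = {"http": (0.2, 0.4), "ws": (0.2, 0.5)}


def assess_ttfb(
    transport: str,
    ttfb_seconds: float | None,
    audio_bytes: int,
    tolerance: float = 0.1,  # assumed slack; not specified by the guide
) -> str:
    """Classify one benchmark result as 'ok', 'check API key', or out of range."""
    # A missing TTFB or empty audio is the guide's signal for a bad API key.
    if ttfb_seconds is None or audio_bytes == 0:
        return "check API key"
    low, high = EXPECTED_TTFB_SECONDS[transport]
    if low - tolerance <= ttfb_seconds <= high + tolerance:
        return "ok"
    return "outside expected range"
```

Note that for LiveKit JS WebSocket results the guide expects a higher reading because of how the framework starts its timer, so an "outside expected range" verdict there is not necessarily a regression.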
diff --git a/integrations/README.md b/integrations/README.md
@@ -0,0 +1,43 @@
+# Integrations
+
+Inworld TTS plugins and examples for voice agent frameworks.
+
+## First-Time Setup
+
+```bash
+git submodule update --init --recursive
+```
+
+Then follow the setup instructions in each directory below.
+
+## LiveKit
+
+### Python
+
+| | |
+|---|---|
+| [Quickstart](livekit/python/quickstart/README.md) | Voice agent using Inworld TTS with LiveKit Agents (Python) |
+| [Benchmarks](livekit/python/benchmarks/README.md) | HTTP & WebSocket TTFB benchmarks vs ElevenLabs, Cartesia |
+
+### JS/TypeScript
+
+| | |
+|---|---|
+| [Quickstart](livekit/js/quickstart/README.md) | Voice agent using Inworld TTS with LiveKit Agents (JS) |
+| [Benchmarks](livekit/js/benchmarks/README.md) | HTTP & WebSocket TTFB benchmarks vs ElevenLabs, Cartesia |
+
+## Pipecat
+
+| | |
+|---|---|
+| [Quickstart](pipecat/pipecat-quickstart/README.md) | Voice bot using Inworld TTS with Pipecat |
+| [Benchmarks](pipecat/benchmarks/README.md) | HTTP & WebSocket TTFB benchmarks vs ElevenLabs, Cartesia |
+
+## AI Agent Guides
+
+If you're using an AI coding assistant, these files provide full context for automated testing and development:
+
+- **Cursor**: [`.cursor/rules/integrations.mdc`](../.cursor/rules/integrations.mdc) — auto-activates when working in this directory
+- **Claude Code**: [`CLAUDE.md`](CLAUDE.md) — sanity check commands, directory structure, expected results
+
+Agents can run all benchmarks and verify quickstarts — just make sure `.env` files with valid API keys exist in each directory first (copy from `.env.example`).
diff --git a/integrations/livekit/js/agents-js b/integrations/livekit/js/agents-js
diff --git a/integrations/livekit/js/benchmarks/.env.example b/integrations/livekit/js/benchmarks/.env.example
@@ -0,0 +1,3 @@
+INWORLD_API_KEY=
+ELEVEN_API_KEY=
+CARTESIA_API_KEY=
diff --git a/integrations/livekit/js/benchmarks/README.md b/integrations/livekit/js/benchmarks/README.md
@@ -0,0 +1,75 @@
+> First time? See [integrations setup](../../../README.md) to initialize submodules.
+
+# TTS TTFB Benchmarks — LiveKit Agents (JavaScript)
+
+Measures time-to-first-byte (TTFB) for TTS providers via LiveKit Agents JS SDK.
+Compares Inworld, ElevenLabs, and Cartesia across HTTP and WebSocket transports.
+
+## Setup
+
+```bash
+# Build the agents-js monorepo (re-run after source changes)
+cd integrations/livekit/js/agents-js
+pnpm install && pnpm build
+
+# Install benchmark dependencies
+cd ../benchmarks
+pnpm install
+```
+
+Copy the example env file and fill in your API keys:
+
+```bash
+cp .env.example .env
+# edit .env with your keys
+```
+
+## Usage
+
+```bash
+# HTTP benchmark
+npx tsx benchmark_http_ttfb.ts --services inworld
+npx tsx benchmark_http_ttfb.ts --services inworld -n 10
+
+# WebSocket benchmark
+npx tsx benchmark_websocket_ttfb.ts --services inworld
+npx tsx benchmark_websocket_ttfb.ts --services all --token-delay 50
+
+```
+
+## CLI Options
+
+| Flag                 | HTTP | WS  | Default | Description                                        |
+| -------------------- | ---- | --- | ------- | -------------------------------------------------- |
+| `--text`             | yes  | yes | *       | Custom text to synthesize                          |
+| `-n` / `--iterations`| yes  | yes | 5       | Number of timed iterations                         |
+| `--services`         | yes  | yes | all     | Comma-separated: inworld,elevenlabs,cartesia       |
+| `--no-save-audio`    | yes  | yes | off     | Skip saving WAV output files                       |
+| `--debug`            | yes  | yes | off     | Enable debug logging                               |
+| `--warmup`           | yes  | yes | 1       | Warmup iterations before timing                    |
+| `--token-delay`      | —    | yes | 50ms    | Delay between simulated LLM tokens                 |
+
+\* Default text: 2 sentences
+
+## Note on WebSocket TTFB measurement
+
+The LiveKit agents-js framework starts the TTFB timer on the first `pushText()` call,
+not when the complete sentence is sent to the TTS provider. This means the WS TTFB
+metric includes token aggregation time (waiting for sentence boundary) in addition to
+the actual API latency. The LiveKit Python SDK does not have this behavior — it starts
+the timer when the sentence is dispatched to the provider.
+
+As a result, **JS WebSocket TTFB will appear ~300ms higher than Python WebSocket TTFB**
+for the same provider with the default benchmark text and 50ms token delay. Two factors
+contribute:
+
+1. **TTFB timer start**: JS starts the timer at `pushText()`, Python starts when the
+   sentence is dispatched to the provider.
+2. **Sentence tokenizer**: The JS `basic.SentenceTokenizer` has a `minSentenceLength`
+   of 20 characters, so short sentences like "Hello!" (6 chars) are merged with the
+   next sentence. The Python `blingfire.SentenceTokenizer` splits at punctuation
+   regardless of length. With the default benchmark text, the JS tokenizer waits for
+   all tokens (~300ms at 50ms/token) before yielding a single sentence, while Python
+   splits "Hello!" immediately.
+
+HTTP TTFB is unaffected and comparable across both SDKs.
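The two factors described in the note above can be illustrated with a toy model. The sketch below is not the LiveKit tokenizer or metrics code: the function names, the sentence-boundary rule, the token list, and the 250 ms API latency are all assumptions chosen to mirror the README's description (a minimum sentence length of 20 characters and a timer that starts at the first push, versus no minimum length and a timer that starts at dispatch).

```python
from __future__ import annotations

SENTENCE_END = (".", "!", "?")


def first_dispatch_ms(
    tokens: list[str], token_delay_ms: int, min_sentence_length: int
) -> int:
    """Time (ms after the first token) at which the first sentence is dispatched."""
    if not tokens:
        raise ValueError("tokens must not be empty")
    buffer = ""
    for index, token in enumerate(tokens):
        buffer += token
        text = buffer.strip()
        # Dispatch once a sentence boundary is seen and the text is long enough.
        if text.endswith(SENTENCE_END) and len(text) >= min_sentence_length:
            return index * token_delay_ms
    # No qualifying boundary: assume the buffer is flushed with the last token.
    return (len(tokens) - 1) * token_delay_ms


def reported_ttfb_ms(
    tokens: list[str],
    token_delay_ms: int,
    api_latency_ms: int,
    min_sentence_length: int,
    timer_starts_at: str,  # "first_push" or "dispatch"
) -> int:
    """TTFB as the framework would report it under the given timer convention."""
    dispatch = first_dispatch_ms(tokens, token_delay_ms, min_sentence_length)
    timer_start = 0 if timer_starts_at == "first_push" else dispatch
    return dispatch - timer_start + api_latency_ms
```

With the hypothetical tokens `["Hello!", " This", " is", " a", " test", " sentence", "."]`, a 50 ms token delay, and a 250 ms API latency, the model reports 550 ms for `min_sentence_length=20` with `timer_starts_at="first_push"` and 250 ms for `min_sentence_length=0` with `timer_starts_at="dispatch"`. The 300 ms gap comes entirely from token aggregation, while the simulated provider latency is identical in both cases.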