AI Calling Agent MVP

An outbound voice agent that places calls and walks each caller through a six-state sales funnel. Code controls the flow; the LLM only generates language.

The funnel

The conversation is a state machine. The LLM never decides where the call goes; it only writes the next line. A classifier labels each caller reply as interested, not_interested, or unclear, and the engine picks the next state from that label.

   ┌──────┐    ┌───────┐    ┌──────────┐    ┌───────┐    ┌───────┐    ┌─────┐
   │ INIT │ -> │ INTRO │ -> │ QUALIFY  │ -> │ PITCH │ -> │ CLOSE │ -> │ END │
   └──────┘    └───────┘    └──────────┘    └───────┘    └───────┘    └─────┘
                                  │                                      ▲
                                  └──────── not_interested ──────────────┘

Every transition is code-controlled, persisted to SQLite, and emitted as a structured log line.
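As a rough illustration of the diagram above, the transition logic can be sketched as a pure function over the current state and the classifier label. The names `State` and `next_state` are hypothetical, and the handling of `unclear` (stay in QUALIFY and re-ask) is an assumption; the actual engine in app/services may differ.

```python
from enum import Enum


class State(str, Enum):
    INIT = "INIT"
    INTRO = "INTRO"
    QUALIFY = "QUALIFY"
    PITCH = "PITCH"
    CLOSE = "CLOSE"
    END = "END"


# Happy-path order, read off the funnel diagram.
_NEXT = {
    State.INIT: State.INTRO,
    State.INTRO: State.QUALIFY,
    State.QUALIFY: State.PITCH,
    State.PITCH: State.CLOSE,
    State.CLOSE: State.END,
}


def next_state(current: State, intent: str) -> State:
    """Pick the next funnel state; the LLM is never consulted here."""
    if current is State.END:
        return State.END  # terminal state
    if current is State.QUALIFY:
        if intent == "not_interested":
            return State.END  # short-circuit shown in the diagram
        if intent == "unclear":
            return State.QUALIFY  # assumption: re-ask instead of advancing
    return _NEXT[current]
```

Because the function has no side effects, persistence and logging can wrap it without changing the routing rules.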

Why this exists

Voice agents fail in production from accumulated complexity, not from missing features. This repo is the smallest thing that places a real call, drives a real funnel, and persists every turn — built against a five-principle constitution:

#	Principle	Status
I	Simplicity Over Cleverness	standard
II	Deterministic Control, Generative Surface	standard
III	Latency Discipline	NON-NEGOTIABLE
IV	Conversation Is a Funnel	standard
V	Transparency to the Caller	NON-NEGOTIABLE

The agent must respond to every turn in under 3 seconds end-to-end. It identifies itself as an AI on request and honors do-not-call signals on the same call.

Architecture at a glance

       caller                Twilio                 FastAPI                OpenAI
        ───┐                  ───┐                    ───┐                  ───┐
           │  speech           │  /twilio/voice         │  generate_reply     │
           │ ─────────────────▶│ ────────────────────▶  │ ──────────────────▶ │
           │                   │                        │                     │
           │                   │                        │  ◀── classify ──────│
           │                   │  TwiML <Say>           │                     │
           │  audio   ◀──────  │ ◀───────────────────── │ ─── stream TTS ──▶  ElevenLabs
           ▼                   ▼                        ▼
                                                    SQLite (calls, messages)

Three layers, no message queue, no Redis, no Alembic:

app/
├── routes/      # POST /call/start, POST /twilio/voice — thin
├── services/    # conversation_engine, ai_service, tts_service, intent, retention
└── db/          # SQLAlchemy async + aiosqlite; two tables: calls, messages
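To make the webhook's return value concrete, here is a minimal sketch of building the TwiML response with the standard library. `build_twiml` is a hypothetical helper, and using `<Play>` with an audio URL for the streamed TTS path is an assumption; only the `<Say>` fallback is stated in this README.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def build_twiml(text: str, audio_url: Optional[str] = None) -> str:
    """Return a TwiML document that speaks one agent line."""
    response = ET.Element("Response")
    if audio_url:
        # Assumption: synthesized audio is exposed at a URL Twilio can fetch.
        ET.SubElement(response, "Play").text = audio_url
    else:
        # Fallback: let Twilio's built-in voice read the text.
        ET.SubElement(response, "Say").text = text
    return ET.tostring(response, encoding="unicode")
```

Building the XML through ElementTree rather than string formatting means characters such as `&` in the generated line are escaped automatically.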

Quickstart

Full walkthrough: specs/001-calling-agent-mvp/quickstart.md · Target: clone → first test call in 10 minutes.

Prerequisites: Docker + Docker Compose, a Twilio account with a verified test number, and a public URL for webhooks (for example, from `ngrok http 8000`).

git clone <repo-url> && cd VoiceSalesAgent
cp .env.example .env && $EDITOR .env   # paste keys + PUBLIC_BASE_URL
docker compose up --build

Place a call:

curl -X POST http://localhost:8000/call/start \
  -H 'Content-Type: application/json' \
  -d '{"phone": "+1YOUR_TEST_NUMBER"}'

Your phone rings. The agent introduces itself, qualifies, pitches, and closes — or ends politely if you signal disinterest. Every turn lands in SQLite with the funnel state at the time of the turn.
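The per-turn persistence described above can be illustrated with the standard-library `sqlite3` module. This is a sketch only: the service itself uses SQLAlchemy async with aiosqlite, and every column name below is an assumption (the real schema lives in specs/001-calling-agent-mvp/data-model.md).

```python
import sqlite3

# Hypothetical schema: the two table names come from this README,
# the columns are illustrative guesses.
SCHEMA = """
CREATE TABLE calls (
    id INTEGER PRIMARY KEY,
    phone TEXT NOT NULL,
    status TEXT NOT NULL DEFAULT 'in_progress',
    end_reason TEXT
);
CREATE TABLE messages (
    id INTEGER PRIMARY KEY,
    call_id INTEGER NOT NULL REFERENCES calls(id),
    role TEXT NOT NULL,
    state TEXT NOT NULL,
    text TEXT NOT NULL
);
"""


def open_db(path: str = ":memory:") -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def record_turn(conn, call_id: int, role: str, state: str, text: str) -> None:
    """Store one turn together with the funnel state at the time of the turn."""
    conn.execute(
        "INSERT INTO messages (call_id, role, state, text) VALUES (?, ?, ?, ?)",
        (call_id, role, state, text),
    )
    conn.commit()


def transcript(conn, call_id: int):
    """Return (state, role, text) rows for a call in insertion order."""
    rows = conn.execute(
        "SELECT state, role, text FROM messages WHERE call_id = ? ORDER BY id",
        (call_id,),
    )
    return rows.fetchall()
```

Storing the state on each message row makes it possible to reconstruct how far a call progressed from the transcript alone.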

Configuration

Variable	Required	Notes
`OPENAI_API_KEY`	yes	LLM + classifier
`TWILIO_ACCOUNT_SID`	yes	telephony
`TWILIO_AUTH_TOKEN`	yes	webhook signature verification
`TWILIO_PHONE_NUMBER`	yes	originating number
`ELEVENLABS_API_KEY`	yes	streaming TTS; falls back to Twilio `<Say>` on error
`PUBLIC_BASE_URL`	yes	https URL Twilio can reach (ngrok in dev)
`HANDOFF_NUMBER`	optional	live-transfer target; FR-021 falls back cleanly if absent
`WEBHOOK_BUDGET_SECONDS`	optional	engine short-circuits past this; default `1.8`
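One way to enforce the table above at startup is to fail fast on missing required variables and apply the documented default. `Settings` and `load_settings` are hypothetical names used for illustration; the repo's actual settings code is not shown in this README.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

REQUIRED = (
    "OPENAI_API_KEY",
    "TWILIO_ACCOUNT_SID",
    "TWILIO_AUTH_TOKEN",
    "TWILIO_PHONE_NUMBER",
    "ELEVENLABS_API_KEY",
    "PUBLIC_BASE_URL",
)


@dataclass(frozen=True)
class Settings:
    values: Mapping[str, str]
    handoff_number: Optional[str]
    webhook_budget_seconds: float


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Validate required variables and fill in optional ones."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("Missing required variables: " + ", ".join(missing))
    return Settings(
        values={name: env[name] for name in REQUIRED},
        # An unset or empty HANDOFF_NUMBER means live transfer is unavailable.
        handoff_number=env.get("HANDOFF_NUMBER") or None,
        # Default of 1.8 seconds, as listed in the table.
        webhook_budget_seconds=float(env.get("WEBHOOK_BUDGET_SECONDS") or "1.8"),
    )
```

Passing the environment as a parameter keeps the function easy to test without touching the real process environment.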

Behavior matrix

What the agent does when reality gets weird:

Scenario	What you'll hear	What gets persisted
Caller asks "are you AI?"	Confirms it's an AI, keeps going	normal turn (FR-002, SC-005)
Caller asks for a person	"Transferring you now"; Twilio dials out	`status=transferred`
Caller says "do not call"	Polite end on the same call	`end_reason=caller_dnc`
Silence ≥ 8 s	Polite end	`end_reason=silence_timeout`
LLM error	Fallback line, then end	`end_reason=llm_error`, `status=failed`
No `HANDOFF_NUMBER`	"Can't transfer right now"; ends	`end_reason=handoff_unconfigured`
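The matrix above can be read as a lookup from an event to what gets persisted. The sketch below encodes it that way; the event names, `Outcome`, and `resolve` are hypothetical, and fields are left as `None` wherever the matrix does not name a value.

```python
from typing import NamedTuple, Optional


class Outcome(NamedTuple):
    agent_continues: bool
    status: Optional[str]      # None where the matrix names no status
    end_reason: Optional[str]  # None where the matrix names no end_reason


# Rows whose outcome does not depend on configuration.
_FIXED = {
    "asks_if_ai": Outcome(True, None, None),
    "do_not_call": Outcome(False, None, "caller_dnc"),
    "silence_timeout": Outcome(False, None, "silence_timeout"),
    "llm_error": Outcome(False, "failed", "llm_error"),
}


def resolve(event: str, handoff_number: Optional[str] = None) -> Outcome:
    """Map a detected event to the persisted outcome from the matrix."""
    if event == "asks_for_person":
        if handoff_number:
            return Outcome(False, "transferred", None)
        # No HANDOFF_NUMBER configured: the agent cannot transfer.
        return Outcome(False, None, "handoff_unconfigured")
    if event not in _FIXED:
        raise ValueError(f"unknown event: {event}")
    return _FIXED[event]
```

For example, `resolve("asks_for_person")` with no handoff number yields the `handoff_unconfigured` row, while passing a number yields `status=transferred`.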

Performance baseline

The simulated harness (tests/integration/test_latency_smoke.py, run with RUN_LATENCY_SMOKE=1 pytest) measures wall-clock time from POST /twilio/voice ingress to TwiML response, with realistic per-stage delays.

Stage	Budget	Observed (p95)
LLM	1.5 s	~600 ms
TTS first byte	1.5 s	~700 ms
Webhook ack	2.0 s	~1.32 s
End-to-end	3.0 s	within budget

Twilio adds ~700 ms in production (STT silence-detect + TwiML round-trip). WEBHOOK_BUDGET_SECONDS is the load-bearing constraint; if real LLM tail latency pushes engine work past it, the call ends as telephony_error and the breach is logged as event=webhook_budget_exceeded.
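The budget short-circuit described above can be sketched with `asyncio.wait_for`. `run_with_budget` and the shape of the returned dict are assumptions for illustration; only the `telephony_error` end reason and the `webhook_budget_exceeded` event name come from this README.

```python
import asyncio


async def run_with_budget(work, budget_seconds: float = 1.8) -> dict:
    """Await `work`, giving up once the webhook budget has elapsed."""
    try:
        reply = await asyncio.wait_for(work, timeout=budget_seconds)
    except asyncio.TimeoutError:
        # wait_for cancels the pending work before raising.
        return {
            "reply": None,
            "end_reason": "telephony_error",
            "event": "webhook_budget_exceeded",
        }
    return {"reply": reply, "end_reason": None, "event": None}
```

A caller would pass the coroutine that does the engine work for the turn, for instance `await run_with_budget(generate(), 1.8)` where `generate` is whatever coroutine produces the reply.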

Project layout

.
├── app/                       # FastAPI service (routes / services / db)
├── tests/
│   ├── contract/              # OpenAPI conformance for /call/start
│   ├── integration/           # full funnel, qualify routing, failure modes,
│   │                          # handoff, idempotency, transparency, retention
│   └── unit/                  # state machine, intent service
├── specs/001-calling-agent-mvp/
│   ├── spec.md                # functional + success criteria
│   ├── plan.md                # implementation plan + constitution check
│   ├── research.md            # latency budget, library choices
│   ├── data-model.md          # tables, columns, indexes
│   ├── quickstart.md          # clone → first call (10-minute target)
│   └── contracts/             # OpenAPI + Twilio webhook shape
├── .specify/memory/constitution.md   # the five principles
├── Dockerfile
├── docker-compose.yml
└── pyproject.toml

Documentation

Specification — what the system does and why
Implementation plan — how it's built, with the constitution check
Quickstart — clone to first test call
Data model — calls and messages schemas
Constitution — the five principles every change is checked against
