# ByteBeat

Browser-based AI DJ studio — describe a sound, hear it instantly.

Talk to your AI DJ, generate custom sounds on the fly, and build beats directly in the browser. No DAW. No sample packs. Just describe what you want and drop it on a pad.
## How It Works

1. Open ByteBeat — a 16-pad controller loads in your browser with your saved session.
2. Click Enable Voice to connect to your AI DJ via ElevenLabs Conversational AI.
3. Ask for a sound — "give me a fat 808 kick on pad 1".
4. The DJ calls `generate_sound`, ElevenLabs SFX generates the audio, and the sound lands on the pad and plays automatically.
5. Keep building — the DJ suggests patterns and BPM, and complements your existing sounds.
## Sponsor Integrations

### ElevenLabs

- **Conversational AI** — a full-duplex voice agent that listens, responds, and takes action. The DJ holds context across the session, adapts its greeting for returning users, and calls client-side tools in real time.
- **Sound Effect Generation** — every pad sound is AI-generated from a text prompt via the ElevenLabs SFX API. Generated audio is returned as raw bytes, stored in R2, and served back instantly.
- **Client tools** — the agent calls `generate_sound(prompt, name, slot_number)` directly from the conversation, auto-assigning the result to the correct pad without any UI interaction.
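In rough terms, a client-tool handler for this flow could look like the sketch below. The `clientTools` object is what you would pass to `Conversation.startSession` from `@elevenlabs/client`; `API_URL`, the `/generate` endpoint path, and `assignToPad` are illustrative names, not the project's actual code.

```typescript
// Illustrative sketch of a generate_sound client-tool handler.
// API_URL, /generate, and assignToPad are hypothetical names.

const API_URL = "/api"; // placeholder for the Worker endpoint

// Convert the agent's 1-based slot_number (pads 1-16) to a 0-based pad index.
export function padIndexFromSlot(slot: number): number {
  if (!Number.isInteger(slot) || slot < 1 || slot > 16) {
    throw new Error(`slot_number must be between 1 and 16, got ${slot}`);
  }
  return slot - 1;
}

// Stand-in for the UI update that loads a sound onto a pad.
function assignToPad(padIndex: number, soundUrl: string): void {
  console.log(`pad ${padIndex} <- ${soundUrl}`);
}

// Passed as the clientTools option when starting the ElevenLabs session;
// the agent invokes the tool by name with JSON arguments, and the returned
// string is fed back to the agent as the tool result.
export const clientTools = {
  generate_sound: async (args: {
    prompt: string;
    name: string;
    slot_number: number;
  }): Promise<string> => {
    const res = await fetch(`${API_URL}/generate`, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ prompt: args.prompt, name: args.name }),
    });
    const { soundUrl } = (await res.json()) as { soundUrl: string };
    assignToPad(padIndexFromSlot(args.slot_number), soundUrl);
    return `Loaded "${args.name}" onto pad ${args.slot_number}`;
  },
};
```

Returning a string from the handler lets the agent confirm the action verbally without any UI interaction.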
### Cloudflare

- **Workers** — an edge API handles sound generation, audio serving, session persistence, and AI agent routing.
- **Durable Objects** — each session gets its own DO instance for consistent, low-latency state (pad assignments, BPM).
- **R2** — stores every generated MP3 with a 1-year cache header; sounds are shared across all sessions.
- **KV** — caches sound metadata and generation results (24-hour TTL) to avoid re-generating identical prompts.
- **Workers AI** — powers the text-based agent fallback (`@cf/meta/llama-3.1-8b-instruct`) used when voice is off.
- **Vectorize** — wired up for semantic sound search (find similar sounds by description).
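The KV-then-R2 generation path described above might be sketched as follows. The `AUDIO_CACHE` binding matches the KV namespace created in the setup steps; the `AUDIO_BUCKET` binding name, the route shape, and the ElevenLabs endpoint details are assumptions based on the public SFX API, not the project's actual code.

```typescript
// Sketch of the cache-then-generate flow on a Worker. Binding names other
// than AUDIO_CACHE are hypothetical; minimal interfaces stand in for the
// Cloudflare runtime types so the sketch is self-contained.

interface KVLite {
  get(key: string): Promise<string | null>;
  put(key: string, value: string, opts?: { expirationTtl?: number }): Promise<void>;
}
interface R2Lite {
  put(key: string, value: ArrayBuffer): Promise<unknown>;
}
interface Env {
  AUDIO_CACHE: KVLite;      // KV namespace from `wrangler kv namespace create AUDIO_CACHE`
  AUDIO_BUCKET: R2Lite;     // assumed binding for the bytebeat-audio R2 bucket
  ELEVENLABS_API_KEY: string;
}

// Deterministic cache key so identical prompts reuse the same sound.
export function cacheKey(prompt: string): string {
  return "sfx:" + prompt.trim().toLowerCase().replace(/\s+/g, " ");
}

export async function generateSound(env: Env, prompt: string): Promise<string> {
  const key = cacheKey(prompt);
  const cached = await env.AUDIO_CACHE.get(key); // 24h TTL: skip re-generation
  if (cached) return cached;                     // R2 object key of the existing MP3

  // Assumed request shape for the ElevenLabs sound-generation endpoint.
  const res = await fetch("https://api.elevenlabs.io/v1/sound-generation", {
    method: "POST",
    headers: {
      "xi-api-key": env.ELEVENLABS_API_KEY,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ text: prompt }),
  });
  const bytes = await res.arrayBuffer(); // raw MP3 bytes

  const objectKey = `sounds/${crypto.randomUUID()}.mp3`;
  await env.AUDIO_BUCKET.put(objectKey, bytes);                    // shared library
  await env.AUDIO_CACHE.put(key, objectKey, { expirationTtl: 86_400 }); // 24h
  return objectKey;
}
```

Normalizing the prompt before hashing it into a key is what lets two users asking for the same sound hit the cache rather than the SFX API.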
## Features

- 16-pad controller with reactive neon glow — pads light up based on their assigned sound's color
- Voice-first AI DJ via ElevenLabs Conversational AI — it talks, listens, and generates sounds mid-conversation
- Text-to-sound generation — describe any sound in plain English and get an MP3 in ~2 seconds
- Auto-assignment to pad slots — the agent places sounds directly on numbered pads via tool calls
- Community sound library — all generated sounds are shared and reusable across sessions
- Persistent sessions — pad layout and BPM are saved to Cloudflare Durable Objects and restored on refresh
- Pad selection mode — click "use on pad" in chat and the pads wobble to let you pick a slot visually
- MIDI-ready — each pad maps to a GM drum MIDI note (36–51)
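The pad-to-MIDI mapping in the last bullet is simple enough to sketch — 16 pads onto the GM percussion range 36–51 (36 is Bass Drum 1). The function name here is illustrative, not the project's actual API.

```typescript
// Map a 0-based pad index onto a General MIDI percussion note.
// 36 = Bass Drum 1 ... 51 = Ride Cymbal 1, covering all 16 pads.
export function padToMidiNote(padIndex: number): number {
  if (!Number.isInteger(padIndex) || padIndex < 0 || padIndex > 15) {
    throw new Error(`pad index must be 0-15, got ${padIndex}`);
  }
  return 36 + padIndex;
}
```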
## Getting Started

```sh
git clone https://github.com/your-username/byteBeat.git
cd byteBeat
bun install
```
**API service** (`services/bytebeat_api/`):

1. Set `ELEVENLABS_API_KEY` in your Cloudflare Worker secrets: `wrangler secret put ELEVENLABS_API_KEY`
2. Create the R2 bucket and KV namespace: `wrangler r2 bucket create bytebeat-audio` and `wrangler kv namespace create AUDIO_CACHE`
3. `bun run dev` — starts the Worker locally via Wrangler
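For orientation, the Worker's bindings might be declared roughly like this in `wrangler.toml` — a minimal sketch, assuming binding and class names (`AUDIO_BUCKET`, `SESSION`, `SessionDO`) that may differ from the repo's actual config; only `bytebeat-audio` and `AUDIO_CACHE` come from the steps above.

```toml
# Hypothetical wrangler.toml sketch — binding names are assumptions.
name = "bytebeat-api"
main = "src/index.ts"
compatibility_date = "2024-01-01"

[[r2_buckets]]
binding = "AUDIO_BUCKET"
bucket_name = "bytebeat-audio"

[[kv_namespaces]]
binding = "AUDIO_CACHE"
id = "<your-kv-namespace-id>"

[[durable_objects.bindings]]
name = "SESSION"
class_name = "SessionDO"

[[migrations]]
tag = "v1"
new_classes = ["SessionDO"]

[ai]
binding = "AI"
```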
**Frontend** (`apps/bytebeat_client/`):

1. Copy `.env` and set `VITE_API_URL` and `VITE_ELEVENLABS_AGENT_ID`
2. Create an ElevenLabs Conversational AI agent and add the `generate_sound` client tool (params: `prompt`, `name`, `slot_number`)
3. `bun run dev` — starts the Vite dev server
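A `.env` for local development might look like the following — both values are placeholders (8787 is Wrangler's default local port; your agent ID comes from the ElevenLabs dashboard).

```ini
# apps/bytebeat_client/.env — values are placeholders
VITE_API_URL=http://localhost:8787
VITE_ELEVENLABS_AGENT_ID=<your-agent-id>
```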
## License

MIT