ByteBeat is a browser-based, AI-powered MIDI pad controller and sequencer.
Users arrange neon/pastel sound pads, build multi-track timelines, and co-produce
music in real time with an AI DJ agent that can suggest loops, generate new
sounds via ElevenLabs, and remix your session on command.
## Tech Stack

| Layer | Technology | Why |
| --- | --- | --- |
| Frontend | Vite + React + TypeScript | Fast dev, great ecosystem |
| Styling | Tailwind CSS + custom CSS vars | Neon/pastel theming, dark mode |
| Audio Engine | Web Audio API + Tone.js | Scheduling, effects, MIDI |
| MIDI I/O | WebMIDI API | Real hardware controller support |
| API Runtime | Elysia on Cloudflare Workers | Edge-first, Bun-compatible |
| Persistence | Cloudflare Durable Objects | Stateful session + timeline |
| Sound Storage | Cloudflare R2 | Pre-baked + generated audio |
| Caching | Cloudflare KV | Generated audio blob cache |
| Sound Search | Cloudflare Vectorize | Semantic "find me a punchy kick" |
| AI Inference | Cloudflare Workers AI | Local LLM for agent logic |
| Voice/Sound Gen | ElevenLabs TTS + Sound Effects | On-demand sound generation |
| Collab | WebSocket via Durable Objects | Real-time multi-user jam |
| Deploy (FE) | Cloudflare Pages | CDN-hosted, instant preview URLs |
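The audio-engine row above hinges on sample-accurate scheduling. ByteBeat delegates this to Tone.js's transport, but the underlying step-grid math is simple; here is a minimal sketch (hypothetical helper names, not the actual ByteBeat code):

```typescript
// Hypothetical helper: compute the audio-clock times (in seconds) at which
// each step of a 16-step pattern should fire, given a BPM and a start time.
// In the real app Tone.js's Transport handles this; this only shows the math.

const STEPS_PER_BAR = 16;

function stepTimes(bpm: number, startTime: number, bars = 1): number[] {
  const secondsPerBeat = 60 / bpm;           // one quarter note
  const secondsPerStep = secondsPerBeat / 4; // 16th-note grid
  const times: number[] = [];
  for (let i = 0; i < STEPS_PER_BAR * bars; i++) {
    times.push(startTime + i * secondsPerStep);
  }
  return times;
}

// At 120 BPM a 16th note lasts 0.125 s, so pad steps land every 125 ms.
const times = stepTimes(120, 0);
console.log(times[1] - times[0]); // 0.125
```

Changing the session BPM in the Durable Object only needs to change this one divisor; every connected client recomputes the same grid.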
## Cloudflare Features Utilized

| Feature | Usage |
| --- | --- |
| Workers | All API routes, agent logic, audio proxy |
| Durable Objects | `SessionDurableObject` (pad layout, BPM), `TimelineDurableObject` (multi-track state + WS hub) |
| R2 | Store pre-baked ElevenLabs audio files + user-generated sounds |
| KV | Cache ElevenLabs API responses (audio URLs keyed by prompt hash) |
| Vectorize | Embed sound metadata; semantic search ("give me something ethereal") |
| Workers AI | Run inference for the DJ Agent (sound suggestions, timeline analysis) |
| Browser Rendering | Generate shareable session screenshots / OG images |
| Pages | Host the Vite frontend |
| WebSockets (DO) | Live collab: see other users' pad hits in real time |
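The KV row above caches generated audio keyed by a prompt hash. The exact key scheme isn't documented here, so the following is an assumed sketch: normalize the prompt so trivially different spellings share a cache entry, then hash it into a fixed-length key (Node's `crypto` is used for the sketch; inside a Worker you'd reach for Web Crypto or enable `nodejs_compat`):

```typescript
import { createHash } from "node:crypto";

// Assumed key scheme (illustrative, not the real ByteBeat format):
// collapse whitespace and case before hashing so "Punchy Kick" and
// " punchy  kick " resolve to the same KV entry.
function promptCacheKey(prompt: string): string {
  const normalized = prompt.trim().toLowerCase().replace(/\s+/g, " ");
  const digest = createHash("sha256").update(normalized).digest("hex");
  return `sfx:${digest}`;
}

// Whitespace/case variants of the same prompt map to one key.
console.log(promptCacheKey("Punchy Kick") === promptCacheKey("  punchy   kick ")); // true
```

On a cache hit the Worker returns the stored R2 audio URL directly instead of calling ElevenLabs again.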
## ElevenLabs Features Utilized

| Feature | Usage |
| --- | --- |
| Text-to-Speech | Generate vocal samples from text (e.g. "say 'let's go' in a hype voice") |
| Sound Effects API | Generate percussion, synth hits, and FX from text prompts |
| Voice Cloning | User uploads a voice → clone it → use it as an instrument pad |
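For the Sound Effects row, the Worker builds a plain HTTP request against ElevenLabs. The endpoint path and body fields below are assumptions based on the public ElevenLabs API docs at the time of writing; check the current reference before relying on them:

```typescript
// Sketch: assemble a Request for ElevenLabs' sound-generation endpoint.
// URL, header name, and body fields are assumptions, not verified against
// this project's code.

interface SoundGenOptions {
  text: string;             // prompt, e.g. "punchy 808 kick"
  durationSeconds?: number; // omit to let the model choose a length
}

function buildSoundGenRequest(apiKey: string, opts: SoundGenOptions): Request {
  return new Request("https://api.elevenlabs.io/v1/sound-generation", {
    method: "POST",
    headers: {
      "xi-api-key": apiKey,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      text: opts.text,
      ...(opts.durationSeconds !== undefined && {
        duration_seconds: opts.durationSeconds,
      }),
    }),
  });
}
```

The Worker would `fetch` this request, stream the returned audio into R2, and record the KV cache entry so repeat prompts skip the API call.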