Ojhaharsh/Heartbeat

🎤 Heartbeat

Native C++17 Inference Engine for Kokoro-82M Text-to-Speech

Heartbeat is a high-performance, standalone TTS engine that runs Kokoro-82M entirely in C++ with no Python runtime dependencies. It is built on GGML for tensor operations and uses a custom ISTFT for audio synthesis.

✨ Features

  • ⚡ Fast: under 200 ms to synthesize a 5-second sentence on AVX2 CPUs
  • 🎯 Portable: Single GGUF model file, no external dependencies at runtime
  • 🔊 High Quality: 24kHz audio output using ISTFTNet vocoder
  • 🌍 Multi-Voice: American, British, and Indian English voices

🚀 Quick Start

1. Set Up Dependencies

# Windows (PowerShell as Administrator)
.\scripts\setup_dependencies.ps1

This installs:

  • espeak-ng - Text-to-phoneme conversion
  • GGML - Tensor operations library
  • KissFFT - Fast Fourier Transform
  • Python packages - For model export

2. Download & Export Model

# Download Kokoro-82M from Hugging Face
python scripts/download_model.py

# Convert to GGUF format
python scripts/export_kokoro.py

3. Build

mkdir build && cd build
cmake .. -DCMAKE_BUILD_TYPE=Release
cmake --build . --config Release

4. Run

./heartbeat --text "Hello, world!" --voice af --output hello.wav
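
The command above ends by writing a WAV file. As a rough illustration of that last step, here is a minimal sketch that packs float samples into a 24 kHz WAV byte buffer. The README states only the sample rate; the 16-bit mono PCM layout and the function name `encode_wav_pcm16` are assumptions made for this example, not Heartbeat's actual writer.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

// Append `bytes` bytes of `value` in little-endian order.
static void put_le(std::vector<std::uint8_t>& out, std::uint32_t value, int bytes) {
    for (int i = 0; i < bytes; ++i) {
        out.push_back(static_cast<std::uint8_t>((value >> (8 * i)) & 0xFF));
    }
}

// Append a four-character chunk tag such as "RIFF".
static void put_tag(std::vector<std::uint8_t>& out, const char* tag) {
    out.insert(out.end(), tag, tag + 4);
}

// Hypothetical helper: encode samples in [-1, 1] as a mono 16-bit PCM WAV file image.
std::vector<std::uint8_t> encode_wav_pcm16(const std::vector<float>& samples,
                                           std::uint32_t sample_rate = 24000) {
    // The RIFF size field is 32 bits wide, so the payload must fit in it.
    assert(samples.size() <= (0xFFFFFFFFu - 36u) / 2u);

    const std::uint16_t channels = 1;  // assumption: mono
    const std::uint16_t bits = 16;     // assumption: 16-bit PCM
    const std::uint32_t data_bytes = static_cast<std::uint32_t>(samples.size()) * 2;

    std::vector<std::uint8_t> out;
    out.reserve(44 + data_bytes);

    put_tag(out, "RIFF");
    put_le(out, 36 + data_bytes, 4);                    // file size minus 8 bytes
    put_tag(out, "WAVE");
    put_tag(out, "fmt ");
    put_le(out, 16, 4);                                 // fmt chunk size
    put_le(out, 1, 2);                                  // format tag 1 = PCM
    put_le(out, channels, 2);
    put_le(out, sample_rate, 4);
    put_le(out, sample_rate * channels * bits / 8, 4);  // byte rate
    put_le(out, channels * bits / 8, 2);                // block align
    put_le(out, bits, 2);
    put_tag(out, "data");
    put_le(out, data_bytes, 4);

    for (float s : samples) {
        // Clamp to [-1, 1] before scaling to avoid integer overflow.
        const float clamped = std::clamp(s, -1.0f, 1.0f);
        const auto v = static_cast<std::int16_t>(std::lround(clamped * 32767.0f));
        put_le(out, static_cast<std::uint16_t>(v), 2);
    }
    return out;
}
```

The returned buffer can be written to disk in one call with `std::ofstream` opened in binary mode. Building the file in memory first keeps the encoding logic easy to test without touching the filesystem.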

🎭 Available Voices

| Voice Code | Description     |
|------------|-----------------|
| af         | American Female |
| am         | American Male   |
| bf         | British Female  |
| bm         | British Male    |
| in_f       | Indian Female   |
| in_m       | Indian Male     |
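
For callers that want to validate a `--voice` value before invoking the engine, the voice codes listed above can be mirrored in a small lookup. This is only a sketch: `kVoices` and `describe_voice` are hypothetical names for this example, and Heartbeat's own voice handling may differ.

```cpp
#include <array>
#include <optional>
#include <string_view>
#include <utility>

// Voice codes and descriptions copied from the list above.
constexpr std::array<std::pair<std::string_view, std::string_view>, 6> kVoices = {{
    {"af", "American Female"},
    {"am", "American Male"},
    {"bf", "British Female"},
    {"bm", "British Male"},
    {"in_f", "Indian Female"},
    {"in_m", "Indian Male"},
}};

// Return the description for a voice code, or std::nullopt if the code is unknown.
std::optional<std::string_view> describe_voice(std::string_view code) {
    for (const auto& [name, description] : kVoices) {
        if (name == code) {
            return description;
        }
    }
    return std::nullopt;
}
```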

📖 Usage

# Basic synthesis
heartbeat --text "Welcome to Heartbeat!" --output welcome.wav

# Specify voice
heartbeat --text "नमस्ते" --voice in_f --output namaste.wav

# Benchmark mode
heartbeat --benchmark --text "Performance test sentence."
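
The output format of benchmark mode is not documented here. One common way to read a latency figure such as "under 200 ms for a 5-second sentence" is as a real-time factor (RTF): synthesis time divided by the duration of the audio produced, which works out to 0.2 / 5 = 0.04 for that figure. The sketch below shows the calculation; `real_time_factor` and `measure_rtf` are hypothetical helpers, not part of Heartbeat's API.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>

// RTF = wall-clock synthesis time / duration of generated audio.
// Values below 1.0 mean synthesis runs faster than real time.
double real_time_factor(double synth_seconds, std::size_t num_samples,
                        int sample_rate = 24000) {
    assert(num_samples > 0 && sample_rate > 0);
    const double audio_seconds = static_cast<double>(num_samples) / sample_rate;
    return synth_seconds / audio_seconds;
}

// Time any callable that runs synthesis and returns the number of samples produced.
template <typename Synth>
double measure_rtf(Synth&& synthesize, int sample_rate = 24000) {
    const auto start = std::chrono::steady_clock::now();
    const std::size_t num_samples = synthesize();
    const auto stop = std::chrono::steady_clock::now();
    const std::chrono::duration<double> elapsed = stop - start;
    return real_time_factor(elapsed.count(), num_samples, sample_rate);
}
```

`std::chrono::steady_clock` is used because it is monotonic, so the measurement is not affected by system clock adjustments during the run.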

🏗️ Architecture

Text → Phonemizer (espeak-ng) → PL-BERT Encoder → Duration Predictor
                                      ↓
    WAV ← ISTFT ← ISTFTNet Decoder ← Length Regulator ← Style Vector
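
To make the Length Regulator stage in the diagram concrete: in TTS pipelines of this kind, the duration predictor typically assigns each phoneme a frame count, and the length regulator repeats that phoneme's feature vector that many times so the decoder receives one vector per output frame. The following is a minimal sketch of that general technique using plain `std::vector` data. The `Frame` alias and `length_regulate` function are illustrative assumptions; Heartbeat itself operates on GGML tensors and may implement this step differently.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One feature vector per phoneme (hypothetical layout for illustration).
using Frame = std::vector<float>;

// Expand phoneme-level features to frame-level features by repeating
// each phoneme's vector durations[i] times.
std::vector<Frame> length_regulate(const std::vector<Frame>& phoneme_features,
                                   const std::vector<int>& durations) {
    assert(phoneme_features.size() == durations.size());

    std::vector<Frame> frames;
    for (std::size_t i = 0; i < phoneme_features.size(); ++i) {
        // A duration of zero (or less) drops the phoneme entirely.
        for (int d = 0; d < durations[i]; ++d) {
            frames.push_back(phoneme_features[i]);
        }
    }
    return frames;
}
```

For example, three phonemes with durations `{2, 0, 3}` yield five frames: two copies of the first vector, none of the second, and three of the third.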

📁 Project Structure

Heartbeat/
├── extern/           # Third-party libraries
│   ├── ggml/         # Tensor operations
│   └── kissfft/      # FFT library
├── models/           # Model files (.pth, .gguf)
├── scripts/          # Python utilities
├── include/          # C++ headers
├── src/              # C++ implementation
└── tests/            # Unit tests

🤝 Credits

📄 License

MIT License - See LICENSE for details.
