🎙️ The Empathy Engine

Giving AI a Human Voice — A service that dynamically modulates synthesized speech based on the detected emotion of the source text.

🎯 Overview

The Empathy Engine bridges the gap between text-based sentiment and expressive, human-like audio output. Instead of flat, robotic speech, it detects the emotion behind your text and adjusts the voice's speed, pitch, and volume to match how a person would naturally say it.

Key Features:

Granular Emotion Detection — Goes beyond positive/negative/neutral to detect 7 emotions: Happy, Excited, Sad, Angry, Fearful, Surprised, and Neutral
Intensity Scaling — The degree of emotion affects the voice. "This is good" sounds different from "THIS IS THE BEST NEWS EVER!!!"
SSML-like Pause Injection — Natural pauses at punctuation for more human-like delivery
Web Interface — A polished Flask UI with instant audio playback
CLI Mode — A terminal-based interface for quick testing
API Endpoint — JSON API at POST /synthesize for programmatic access
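
The pause injection listed above could be approximated by splitting the text at punctuation and attaching a pause length to each chunk. This is a minimal sketch, not the project's implementation: the function name split_with_pauses and the durations in PAUSE_MS are hypothetical.

```python
import re

# Hypothetical pause lengths in milliseconds; the project's real values are not documented.
PAUSE_MS = {",": 200, ";": 300, ":": 300, ".": 450, "!": 450, "?": 450}


def split_with_pauses(text):
    """Split text into (segment, pause_ms) pairs, one per punctuation-delimited chunk."""
    chunks = []
    # Each match is a run of non-punctuation followed by any trailing punctuation.
    for match in re.finditer(r"[^,;:.!?]+[,;:.!?]*", text):
        segment = match.group().strip()
        if not segment:
            continue
        # The last character decides the pause; chunks without punctuation get none.
        chunks.append((segment, PAUSE_MS.get(segment[-1], 0)))
    return chunks
```

A caller could speak each segment and then wait for the paired duration. Note that this naive split also breaks on decimal points and abbreviations.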

🏗️ Architecture & Design Choices

Emotion Detection — Two-Stage Hybrid Approach

Rather than relying on a single method, the engine uses a two-stage pipeline:

VADER Sentiment Analysis — Provides a continuous compound score (–1.0 to +1.0) that captures overall valence and intensity. VADER is specifically tuned for social media text and handles slang, emojis, capitalization, and punctuation boosters (like !!!) out of the box.
Keyword Matching — A curated keyword lexicon distinguishes between emotions that share the same VADER polarity. For example, "angry" and "sad" are both negative, but keyword matching differentiates them.

The two stages are cross-checked: a keyword match is trusted only if the polarity of its emotion agrees with the VADER polarity, which prevents false classifications.
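
The polarity check can be sketched as follows. This is an illustration under stated assumptions, not the code in app.py: the keyword lists, the fallback labels, and treating Surprised as polarity-free are all hypothetical. The compound score is passed in as a plain float (in the real pipeline it would come from VADER), and the ±0.05 cutoffs follow the thresholds conventionally used with VADER.

```python
import re

# Hypothetical lexicon; the project's actual keyword lists are not shown in this README.
KEYWORDS = {
    "excited": {"thrilled", "amazing", "incredible"},
    "happy": {"happy", "glad", "pleased"},
    "surprised": {"wow", "unexpected", "unbelievable"},
    "sad": {"sad", "miss", "lonely"},
    "angry": {"angry", "unacceptable", "furious"},
    "fearful": {"worried", "afraid", "scared"},
}

# Expected polarity per emotion: +1 positive, -1 negative, 0 = no constraint (assumption).
EXPECTED_SIGN = {"excited": 1, "happy": 1, "surprised": 0, "sad": -1, "angry": -1, "fearful": -1}


def classify(text, compound):
    """Return an emotion label from keywords, trusted only when polarity agrees."""
    sign = 1 if compound >= 0.05 else -1 if compound <= -0.05 else 0
    words = set(re.findall(r"[a-z']+", text.lower()))
    for emotion, keywords in KEYWORDS.items():
        if words & keywords and EXPECTED_SIGN[emotion] in (0, sign):
            return emotion
    # Assumed fallback when no keyword is trusted: use polarity alone.
    if sign > 0:
        return "happy"
    if sign < 0:
        return "sad"
    return "neutral"
```

With this sketch, a sentence containing "happy" but scored negative by VADER is not labeled Happy; it falls through to the polarity-only fallback.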

Intensity Scaling

The intensity (0.0–1.0) is derived from VADER's absolute compound score, then boosted by:

Exclamation marks (!) → +0.08 each, capped at +0.3
ALL CAPS words → +0.06 each, capped at +0.2
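
A minimal sketch of this calculation, using the increments and caps listed above. Two details are assumptions rather than documented behavior: an ALL CAPS word is taken to be two or more consecutive uppercase letters, and the final value is clamped to 1.0 to stay inside the stated 0.0–1.0 range. The compound score is passed in as a float.

```python
import re


def intensity(text, compound):
    """Combine VADER's absolute compound score with punctuation and caps boosts."""
    base = abs(compound)
    # +0.08 per exclamation mark, capped at +0.3
    exclaim_boost = min(text.count("!") * 0.08, 0.3)
    # +0.06 per ALL CAPS word (assumed: at least two uppercase letters), capped at +0.2
    caps_words = re.findall(r"\b[A-Z]{2,}\b", text)
    caps_boost = min(len(caps_words) * 0.06, 0.2)
    # Clamp so the result stays within 0.0-1.0 (assumption)
    return min(base + exclaim_boost + caps_boost, 1.0)
```

For example, a compound score of 0.5 with a single exclamation mark gives 0.5 + 0.08 = 0.58, while a fully capitalized sentence with three exclamation marks quickly saturates at 1.0.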

This intensity value controls an interpolation between the neutral voice profile and the target emotion's profile. A low-intensity "happy" barely changes the voice; a high-intensity "excited" dramatically increases rate and pitch.

Voice Parameter Modulation

Each emotion maps to a voice profile that modulates three parameters:

Emotion	Rate (wpm)	Volume	Pitch Δ (Hz)
Excited	↑ Fast	Full	+40
Happy	↑ Slightly	Full	+20
Neutral	Normal	95%	0
Sad	↓ Slow	Soft	–30
Angry	↑ Fast	Full	+15
Fearful	↑ Fast	Soft	+25
Surprised	↑↑ Fast	Full	+35

These profiles are interpolated by intensity, so the actual values fall somewhere between the neutral profile and the target emotion's profile.
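
The interpolation can be written as a linear blend between the neutral profile and the target profile. In the sketch below, the pitch deltas and the 95% neutral volume come from the table above; the rate values and the "soft" volume of 0.7 are hypothetical placeholders, and linear interpolation itself is an assumption about how the blend is done.

```python
# Pitch deltas and neutral volume follow the table; rates and the 0.7 volume are hypothetical.
NEUTRAL = {"rate": 175, "volume": 0.95, "pitch_delta": 0}
PROFILES = {
    "excited": {"rate": 230, "volume": 1.0, "pitch_delta": 40},
    "sad": {"rate": 130, "volume": 0.7, "pitch_delta": -30},
}


def blend(emotion, intensity):
    """Linearly interpolate each parameter from neutral (0.0) to the full profile (1.0)."""
    target = PROFILES.get(emotion, NEUTRAL)
    t = max(0.0, min(intensity, 1.0))  # keep the blend factor inside [0, 1]
    return {key: NEUTRAL[key] + t * (target[key] - NEUTRAL[key]) for key in NEUTRAL}
```

At intensity 0.0 every emotion sounds like the neutral voice; at 1.0 the full profile applies. A half-intensity "sad", for instance, lowers pitch by 15 Hz rather than 30.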

TTS Engine

pyttsx3 was chosen for fully offline, zero-API-key operation. It uses the system's native TTS engine (espeak on Linux, SAPI5 on Windows, NSSpeechSynthesizer on macOS). Pitch control is best supported on Linux (espeak backend); on other platforms, rate and volume modulation still produce clearly differentiated emotional speech.
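
As a rough illustration of how a voice profile might be applied through pyttsx3, the sketch below sets only rate and volume, which are available on every backend. The function names and the profile dictionary layout are hypothetical; pitch is left out because its support varies by platform, as noted above.

```python
def apply_profile(engine, profile):
    """Set rate (words per minute) and volume (0.0-1.0) on a pyttsx3-style engine."""
    engine.setProperty("rate", int(round(profile["rate"])))
    # Clamp volume to the range pyttsx3 accepts
    engine.setProperty("volume", max(0.0, min(profile["volume"], 1.0)))


def synthesize(text, profile, out_path):
    """Render text to an audio file using the system's native TTS engine."""
    import pyttsx3  # imported lazily so apply_profile can be exercised without it

    engine = pyttsx3.init()
    apply_profile(engine, profile)
    engine.save_to_file(text, out_path)
    engine.runAndWait()  # blocks until the queued file has been written
```

Calling synthesize("I miss those days so much.", {"rate": 140, "volume": 0.8}, "out.wav") would write the audio to out.wav, assuming pyttsx3 and a system TTS backend are installed.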

🚀 Setup & Installation

Prerequisites

Python 3.8+
espeak (Linux) or native TTS (macOS/Windows)

# On Ubuntu/Debian — install espeak for pyttsx3
sudo apt-get install espeak

# On macOS — no extra install needed (uses NSSpeechSynthesizer)
# On Windows — no extra install needed (uses SAPI5)

Install

# Clone the repository
git clone https://github.com/YOUR_USERNAME/empathy-engine.git
cd empathy-engine

# Create a virtual environment (recommended)
python -m venv venv
source venv/bin/activate   # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

▶️ Running the Application

Web Interface (default)

python app.py

Open http://localhost:5000 in your browser. Type or paste text, click Synthesize Speech, and listen.

CLI Mode

python app.py --cli

Type sentences interactively in the terminal. Emotion analysis and voice parameters are printed, and audio files are saved to static/audio/.

API Usage

curl -X POST http://localhost:5000/synthesize \
  -H "Content-Type: application/json" \
  -d '{"text": "I am absolutely thrilled about this!"}'

Response:

{
  "emotion": "excited",
  "intensity": 0.82,
  "vader_scores": { "neg": 0.0, "neu": 0.359, "pos": 0.641, "compound": 0.6996 },
  "voice_profile": { "rate": 218, "volume": 0.94, "pitch_delta": 32 },
  "audio_url": "/audio/empathy_a1b2c3d4e5.wav"
}
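
The same request can be made from Python using only the standard library. This is a small sketch: the helper names build_request and synthesize are hypothetical, while the endpoint, the request body, and the response fields are the ones shown above.

```python
import json
import urllib.request


def build_request(text, base_url="http://localhost:5000"):
    """Build the POST /synthesize request with a JSON body."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/synthesize",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize(text, base_url="http://localhost:5000"):
    """Send the request to a running server and return the decoded JSON response."""
    with urllib.request.urlopen(build_request(text, base_url)) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the server running, result = synthesize("I am absolutely thrilled about this!") returns a dictionary whose "emotion", "intensity", and "audio_url" keys match the response above.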

📂 Project Structure

empathy-engine/
├── app.py                 # Main application (Flask + emotion + TTS)
├── requirements.txt       # Python dependencies
├── README.md              # This file
├── templates/
│   └── index.html         # Web interface
└── static/
    └── audio/             # Generated .wav files (auto-created)

🧪 Example Test Cases

Input Text	Expected Emotion	Intensity
"The meeting is at 3 PM."	Neutral	Low
"I'm so happy for you!"	Happy	Medium
"THIS IS THE BEST NEWS EVER!!!"	Excited	High
"This is unacceptable. I demand a refund."	Angry	High
"I'm worried about the deadline."	Fearful	Medium
"I miss those days so much."	Sad	Medium

📜 License

MIT License — free to use, modify, and distribute.
