LibreVA is a self-hosted, open-source platform for generating high-quality voice acting for indie games. It provides a web interface for managing projects and voices, and a powerful text-to-speech backend based on Chatterbox TTS, both containerized using Docker.
- Clone the repository:
git clone git@github.com:digitalbreed/libreva.git
cd libreva
- Start the application with Docker Compose:
If you are on WSL2 with CUDA support, run:
docker compose -f docker-compose.yml -f docker-compose.gpu.yml up --build
If you are on a different system, run:
TTS_DEVICE=cpu docker compose up --build
On first start, several dependencies and approximately 2 GB of model files will be downloaded automatically. This may take several minutes depending on your internet connection.
- Access the app: Open http://localhost:3000/libreva in your browser.
- Upload a voice sample:
- (This step is optional; if you don't have a sample, you can use the built-in default voice for testing.)
- Click the "Add Voice" button on the homepage or the "Voices" sidebar menu item.
- Click "Add Voice" on the voices screen and select "Upload Voice".
- Follow the instructions to upload a voice sample WAV file. For best results, use 10-20 seconds of good-quality audio.
- Create a project:
- Click "Create Project" and enter a name for your project. Projects hold your generated speech files.
- Convert text to speech:
- Open your project, enter some text, select a voice, and click "Generate" to create a voice acting audio file.
You can configure the default TTS exaggeration and temperature. Please see the Chatterbox documentation for more information.
LibreVA supports changing these values in the middle of a speech line, as well as inserting breaks, using these markers:
- <p=N> for a pause in milliseconds
- <e=N> for the exaggeration value (0 to 2)
- <t=N> for the temperature value (0 to 1)
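To illustrate how such markers can be read, here is a minimal sketch that splits a speech line into segments, each carrying the exaggeration, temperature, and preceding pause in effect at that point. It is not LibreVA's actual parser. The names `Segment` and `split_line`, the default values 0.5 and 0.8, and the clamping to the documented ranges are assumptions made for this example.

```python
import re
from dataclasses import dataclass

# Matches the three inline markers: <p=N>, <e=N>, <t=N>
MARKER = re.compile(r"<([pet])=([0-9]*\.?[0-9]+)>")


@dataclass
class Segment:
    text: str
    exaggeration: float
    temperature: float
    pause_ms_before: int = 0  # silence to insert before this segment


def split_line(line, exaggeration=0.5, temperature=0.8):
    """Split a speech line into segments, applying markers in reading order.

    The default exaggeration/temperature are placeholders (assumption).
    """
    segments = []
    pending_pause = 0
    pos = 0
    for match in MARKER.finditer(line):
        text = line[pos:match.start()].strip()
        if text:
            segments.append(Segment(text, exaggeration, temperature, pending_pause))
            pending_pause = 0
        kind, value = match.group(1), float(match.group(2))
        if kind == "p":
            pending_pause += int(value)
        elif kind == "e":
            exaggeration = min(max(value, 0.0), 2.0)  # documented range: 0 to 2
        else:
            temperature = min(max(value, 0.0), 1.0)  # documented range: 0 to 1
        pos = match.end()
    tail = line[pos:].strip()
    if tail:
        segments.append(Segment(tail, exaggeration, temperature, pending_pause))
    return segments


if __name__ == "__main__":
    for seg in split_line("Stop right there.<p=500><e=1.2>Who goes there?"):
        print(seg)
```

In this sketch, a marker affects only the text that follows it, and a pause marker at the very end of a line is dropped because no segment follows it. The real implementation may treat both cases differently.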
LibreVA supports uploading zip archives containing multiple voice samples and metadata, called voice packs.
You can create your own voice pack by creating a zip file containing a voices.json file with the following structure:
{
"version": "1.0.0",
"author": "Your Name <[email protected]>",
"repository": "https://github.com/yourname/your-repository",
"base": "voices",
"voices": [
{
"name": "Speaker Name",
"notes": "Some notes describing the voice to ease discoverability.",
"gender": "female",
"file_name": "A-female-voice.wav",
"tags": ["american", "female", "clear", "calm"]
},
...
]
}
When importing, LibreVA searches for the voices.json file in the root and on the first level of a zip archive. This lets you import zip files downloaded directly from GitHub repositories, where the content is stored in a directory named after the repository.
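The lookup rule can be sketched as follows. This is an illustration of "root or first level", not LibreVA's import code. The function name `find_manifest` is hypothetical, and preferring a root-level file over a first-level one is an assumption.

```python
import zipfile
from typing import Optional


def find_manifest(archive: zipfile.ZipFile) -> Optional[str]:
    """Return the archive path of voices.json in the root or one directory deep."""
    candidates = []
    for name in archive.namelist():
        parts = name.split("/")
        # 1 part  -> "voices.json" in the root
        # 2 parts -> "<dir>/voices.json" on the first level
        if parts[-1] == "voices.json" and len(parts) <= 2:
            candidates.append(name)
    # Assumption: a root-level manifest wins over a first-level one.
    candidates.sort(key=lambda n: n.count("/"))
    return candidates[0] if candidates else None
```

For a GitHub download whose content sits in a single top-level directory, this returns the path of voices.json inside that directory. A manifest nested two or more directories deep is ignored.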
All properties except for the voices array are optional.
The base attribute is a directory name relative to the location of voices.json. This allows storing the WAV files in a sub-directory.
The gender attribute can be "male", "female", null, or omitted entirely.
The tags attribute can be omitted entirely.
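As a rough sketch of how a voice pack could be assembled, the following script zips every WAV file in a directory together with a generated voices.json. The helper name `build_voice_pack` and the choice to derive each voice name from its file name are assumptions for illustration. Only the manifest keys come from the structure described above.

```python
import json
import zipfile
from pathlib import Path


def build_voice_pack(wav_dir: Path, out_zip: Path, base: str = "voices") -> dict:
    """Zip every WAV in wav_dir together with a generated voices.json."""
    wav_files = sorted(wav_dir.glob("*.wav"))
    manifest = {
        "version": "1.0.0",
        "base": base,
        "voices": [
            {
                # Assumption: use the file name (minus extension) as display name.
                "name": wav.stem.replace("-", " "),
                "gender": None,  # serialized as null; the key may also be omitted
                "file_name": wav.name,
            }
            for wav in wav_files
        ],
    }
    with zipfile.ZipFile(out_zip, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("voices.json", json.dumps(manifest, indent=2))
        for wav in wav_files:
            # WAV files are stored under the directory named by "base".
            zf.write(wav, arcname=f"{base}/{wav.name}")
    return manifest
```

Calling `build_voice_pack(Path("samples"), Path("my-pack.zip"))` would produce an archive with voices.json in the root and the WAV files under voices/. Edit the generated manifest to add tags, notes, or author details before sharing the pack.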
LibreVA consists of two main services, orchestrated via Docker Compose:
libreva-web
- Purpose: The Next.js web frontend and API server. Manages projects, voices, and the user interface.
- Build: Built from Dockerfile.web (production) or Dockerfile.web.dev (development).
- Ports: Exposes port 3000 by default.
- Data: Mounts the /data directory for the database, voices, and outputs.
libreva-tts
- Purpose: The TTS backend, running Chatterbox TTS for voice synthesis and waveform generation.
- Build: Built from tts_service/Dockerfile.
- Ports: Exposes port 3100 internally for API calls from libreva-web.
- Data: Mounts the /data directory for model files.
The /data/ folder is the central storage location for all persistent data:
- db/: SQLite database for projects, voices, and outputs
- voices/: Voice samples and their waveform images (<voiceId>.wav and <voiceId>.png)
- outputs/: Generated audio outputs and their waveform images, organized by project (<projectId>/<outputId>.wav and <projectId>/<outputId>.png)
- model/: Chatterbox TTS model files
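The naming scheme above can be expressed as two small path helpers, which may be handy when scripting backups or cleanup. This is a sketch only. The function names are hypothetical, and treating IDs as plain strings is an assumption, since the ID format is not specified here.

```python
from pathlib import Path

DATA_ROOT = Path("/data")  # mount point inside the containers


def voice_paths(voice_id: str, root: Path = DATA_ROOT):
    """Return (sample WAV, waveform PNG) for a voice."""
    voices = root / "voices"
    return voices / f"{voice_id}.wav", voices / f"{voice_id}.png"


def output_paths(project_id: str, output_id: str, root: Path = DATA_ROOT):
    """Return (generated WAV, waveform PNG) for an output within a project."""
    project_dir = root / "outputs" / project_id
    return project_dir / f"{output_id}.wav", project_dir / f"{output_id}.png"
```

When working from the host rather than inside a container, pass the host-side data directory as `root`, for example `Path("data")` relative to the repository.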
On first start, the start_tts.sh script in libreva-tts downloads all required model files (approximately 2 GB) into /data/model if they are missing.
The libreva-tts service defers initialization until all files are downloaded.
To reset all data (including projects, voices, and outputs):
- Stop the application:
docker compose down
- Delete the database and all generated data:
rm -rf data/db/* data/voices/* data/outputs/* data/waveforms/*
- Restart the application:
docker compose up --build
You can run LibreVA in development mode using a separate compose file. This will launch the Next.js web application in development mode and serve source code and dependencies from the host file system, enabling hot reload.
To start in development mode, run:
docker compose -f docker-compose-dev.yml -f docker-compose.gpu.yml up --build
The web app will then reload on code changes and the TTS backend will use your local tts_service code.
This project uses native Node modules (notably sqlite3), so you must install dependencies in the same environment as your container (Linux for most Docker images).
- Why do we mount node_modules from the host?
  - The editor/IDE on the host can resolve dependencies for IntelliSense, type checking, etc.
  - The container and host use the same node_modules, avoiding sync issues.
  - Hot reload works seamlessly.
- On Windows:
  - Use WSL2 (Windows Subsystem for Linux) for development.
  - Always use the WSL2-native Node.js and npm. Check with which node and which npm (should be /usr/bin/node or ~/.nvm/..., not /mnt/c/...).
  - Recommended: Use nvm (Node Version Manager) to install Node.js in WSL2:
    curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.39.7/install.sh | bash
    nvm install --lts
    nvm use --lts
  - Ensure your PATH in WSL2 does NOT include Windows Node.js or npm (e.g., /mnt/c/Program Files/nodejs).
  - This ensures all native modules are built for Linux, matching your container.
  - Open your project in WSL2 (e.g., Ubuntu) and run npm install there.
  - Do not run npm install from Windows CMD/PowerShell, or you will get Windows binaries that will not work in the container.
  - If you see errors like Exec format error or invalid ELF header, delete node_modules and package-lock.json, then reinstall in WSL2.
- On Mac/Linux:
  - Just run npm install as usual; your environment matches the container.
- To ensure you have a CUDA-capable graphics card, type nvidia-smi in a terminal and check for the CUDA Version output.
- If the libreva-tts service stops with the message container libreva-libreva-tts-1 is unhealthy at first start, Docker may have deemed your container unhealthy while it is actually still downloading dependencies or model files. Increase the healthcheck start_period parameter in docker-compose.yml.
- The default maximum voice sample file size is 5 MB. If you get a "File too large" error when trying to upload a voice sample, either try a smaller sample or change the MAX_VOICE_FILE_SIZE environment variable.
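To catch an oversized or badly sized sample before uploading, a quick local check along these lines may help. It is a sketch using only the Python standard library. The function name `check_sample` is hypothetical, and it assumes that "5 MB" means 5 * 1024 * 1024 bytes. The 10-20 second range is the recommendation from the upload step.

```python
import wave
from pathlib import Path

MAX_BYTES = 5 * 1024 * 1024  # assumption: "5 MB" is interpreted as 5 MiB


def check_sample(path: Path, max_bytes: int = MAX_BYTES) -> list:
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    size = path.stat().st_size
    if size > max_bytes:
        problems.append(f"file is {size} bytes, over the {max_bytes}-byte limit")
    # The wave module reads uncompressed PCM WAV files.
    with wave.open(str(path), "rb") as wav:
        seconds = wav.getnframes() / wav.getframerate()
    if not 10 <= seconds <= 20:
        problems.append(f"duration is {seconds:.1f} s; 10-20 s is recommended")
    return problems
```

If the size check fails, trimming the sample or converting it to mono at a lower sample rate usually brings it under the limit without changing MAX_VOICE_FILE_SIZE.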
Feel free to fork, enhance, and submit pull requests! For questions or ideas, open an issue or discussion.
- Next.js Documentation
- Chatterbox TTS (ResembleAI)
- Docker Desktop WSL2 backend
- Docker Compose
- shadcn/ui
This project is licensed under the MIT License. See the LICENSE file for details.
