Merged
168 changes: 107 additions & 61 deletions README.md
@@ -17,6 +17,112 @@ Powered by -
<img src="resources/coqui.png" width=80px height=80px></img>
<img src="resources/kokoro.jpg" width=80px height=80px></img>

## Features

- RSS feed parsing and article extraction
- Article summarization using Ollama
- Text-to-speech conversion using multiple engines:
  - Kokoro TTS (Recommended)
  - MLX Audio TTS
  - Coqui TTS
- Podcast generation with customizable settings
- Web interface for configuration and control

## Requirements

- Go 1.21 or later
- Ollama (for article summarization)
- One of the following TTS engines:
  - Kokoro TTS (recommended)
  - MLX Audio TTS
  - Coqui TTS

## Installation

1. Clone the repository:
```bash
git clone https://github.com/intothevoid/rss2podcast.git
cd rss2podcast
```

2. Install dependencies:
```bash
go mod download
```

3. Configure the application by editing `config.yaml` or using the web interface.

## Configuration

The application can be configured using the web interface or by editing the `config.yaml` file. The following settings are available:

### RSS Settings
- `url`: The RSS feed URL to parse
- `max_articles`: Maximum number of articles to process
- `filters`: List of filters to apply to articles
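
As a rough illustration of how `max_articles` and `filters` might interact, here is a minimal Go sketch. It assumes filters are keywords that exclude an article when they appear in its title (case-insensitive substring match); the `Article` type and the `applyFilters` helper are hypothetical and not taken from the project's code.

```go
package main

import (
	"fmt"
	"strings"
)

// Article is a hypothetical, minimal stand-in for a parsed feed item.
type Article struct {
	Title string
}

// applyFilters drops every article whose title contains one of the filter
// keywords (case-insensitive) and keeps at most maxArticles of the rest.
// A maxArticles value of zero or less means "no limit".
func applyFilters(articles []Article, filters []string, maxArticles int) []Article {
	kept := make([]Article, 0, len(articles))
	for _, a := range articles {
		if maxArticles > 0 && len(kept) >= maxArticles {
			break
		}
		if !matchesAny(a.Title, filters) {
			kept = append(kept, a)
		}
	}
	return kept
}

// matchesAny reports whether title contains any non-empty filter keyword.
func matchesAny(title string, filters []string) bool {
	lower := strings.ToLower(title)
	for _, f := range filters {
		if f != "" && strings.Contains(lower, strings.ToLower(f)) {
			return true
		}
	}
	return false
}

func main() {
	articles := []Article{
		{Title: "Daily briefing"},
		{Title: "Election results announced"},
		{Title: "Weekly wrap-up"},
		{Title: "New rail line opens"},
	}
	for _, a := range applyFilters(articles, []string{"Daily", "Weekly"}, 10) {
		fmt.Println(a.Title)
	}
}
```

Matching on the title only is a simplification; the real application may also inspect descriptions or apply filters differently.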

### Ollama Settings
- `end_point`: The Ollama API endpoint
- `model`: The Ollama model to use for summarization

### Podcast Settings
- `subject`: The podcast subject
- `podcaster`: The podcaster name

### TTS Settings
- `engine`: The TTS engine to use ("kokoro", "mlx", or "coqui")
- `kokoro`: Kokoro TTS settings
  - `url`: The Kokoro TTS API endpoint
  - `voice`: The voice to use
  - `speed`: The speech speed (0.25 to 4.0)
  - `format`: The audio format (mp3, opus, flac, wav, pcm)
- `mlx`: MLX Audio TTS settings
  - `url`: The MLX Audio TTS API endpoint
  - `voice`: The voice to use
  - `speed`: The speech speed (0.5 to 2.0)
  - `format`: The audio format (mp3, wav)
- `coqui`: Coqui TTS settings
  - `url`: The Coqui TTS API endpoint
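
The speed ranges and format lists above lend themselves to a small validation step before any TTS call is made. The following Go sketch shows one way this could look; the `TTSConfig`, `EngineSettings`, and `Validate` names are hypothetical and only mirror the documented settings, not the project's actual types.

```go
package main

import (
	"errors"
	"fmt"
)

// EngineSettings mirrors the per-engine keys documented above (hypothetical type).
type EngineSettings struct {
	URL    string
	Voice  string
	Speed  float64
	Format string
}

// TTSConfig mirrors the `tts` section of config.yaml (hypothetical type).
type TTSConfig struct {
	Engine string
	Kokoro EngineSettings
	MLX    EngineSettings
	Coqui  EngineSettings
}

// limits holds the documented speed range and formats for one engine.
type limits struct {
	minSpeed, maxSpeed float64
	formats            []string
}

var engineLimits = map[string]limits{
	"kokoro": {minSpeed: 0.25, maxSpeed: 4.0, formats: []string{"mp3", "opus", "flac", "wav", "pcm"}},
	"mlx":    {minSpeed: 0.5, maxSpeed: 2.0, formats: []string{"mp3", "wav"}},
}

// Validate checks only the settings of the selected engine.
func (c TTSConfig) Validate() error {
	var s EngineSettings
	switch c.Engine {
	case "kokoro":
		s = c.Kokoro
	case "mlx":
		s = c.MLX
	case "coqui":
		// Only a URL is documented for Coqui.
		if c.Coqui.URL == "" {
			return errors.New("coqui: url is required")
		}
		return nil
	default:
		return fmt.Errorf("unknown TTS engine %q", c.Engine)
	}
	if s.URL == "" {
		return fmt.Errorf("%s: url is required", c.Engine)
	}
	lim := engineLimits[c.Engine]
	if s.Speed < lim.minSpeed || s.Speed > lim.maxSpeed {
		return fmt.Errorf("%s: speed %.2f outside %.2f to %.2f", c.Engine, s.Speed, lim.minSpeed, lim.maxSpeed)
	}
	for _, f := range lim.formats {
		if f == s.Format {
			return nil
		}
	}
	return fmt.Errorf("%s: unsupported format %q", c.Engine, s.Format)
}

func main() {
	cfg := TTSConfig{
		Engine: "mlx",
		MLX:    EngineSettings{URL: "http://localhost:8000", Voice: "bm_george", Speed: 1.2, Format: "mp3"},
	}
	if err := cfg.Validate(); err != nil {
		fmt.Println("invalid config:", err)
		return
	}
	fmt.Println("config OK")
}
```

Validating only the selected engine keeps unused sections optional, which matches a config file that carries settings for all three engines at once.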

## Usage

1. Start the application:
```bash
go run cmd/rss2podcast/main.go
```

2. Access the web interface at `http://localhost:8080`

3. Configure the application using the web interface or edit `config.yaml`

4. The application will:
   - Parse the RSS feed
   - Extract and summarize articles
   - Convert the summary to audio using the selected TTS engine
   - Generate a podcast file
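
The steps above can be sketched as a small pipeline in Go. This is an illustrative outline under stated assumptions, not the project's implementation: the `Summarizer` and `Synthesizer` interfaces, the `buildPodcast` function, and the stub types are all hypothetical, and joining per-article summaries into one script is an assumption about how the summary is assembled.

```go
package main

import (
	"fmt"
	"strings"
)

// Summarizer and Synthesizer are hypothetical interfaces standing in for
// the Ollama client and the selected TTS engine.
type Summarizer interface {
	Summarize(text string) (string, error)
}

type Synthesizer interface {
	Synthesize(text string) ([]byte, error)
}

// buildPodcast summarizes each article, joins the summaries into one
// script, and converts the script to audio bytes.
func buildPodcast(articles []string, s Summarizer, t Synthesizer) ([]byte, error) {
	summaries := make([]string, 0, len(articles))
	for i, a := range articles {
		sum, err := s.Summarize(a)
		if err != nil {
			return nil, fmt.Errorf("summarize article %d: %w", i, err)
		}
		summaries = append(summaries, sum)
	}
	audio, err := t.Synthesize(strings.Join(summaries, "\n\n"))
	if err != nil {
		return nil, fmt.Errorf("synthesize: %w", err)
	}
	return audio, nil
}

// firstSentence is a stub summarizer that keeps the first sentence only.
type firstSentence struct{}

func (firstSentence) Summarize(text string) (string, error) {
	if i := strings.Index(text, "."); i >= 0 {
		return text[:i+1], nil
	}
	return text, nil
}

// fakeTTS is a stub synthesizer that returns the script as raw bytes.
type fakeTTS struct{}

func (fakeTTS) Synthesize(text string) ([]byte, error) {
	return []byte(text), nil
}

func main() {
	articles := []string{"First story. More detail.", "Second story. Even more."}
	audio, err := buildPodcast(articles, firstSentence{}, fakeTTS{})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("generated %d bytes\n", len(audio))
}
```

Hiding each stage behind an interface is one way to let the TTS engine be swapped by configuration without touching the rest of the flow.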

## TTS Engines

### Kokoro TTS (Recommended)
Kokoro TTS offers OpenAI-compatible speech synthesis with support for multiple voices and formats. It provides excellent quality with low latency.
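
To make "OpenAI-compatible" concrete, here is a minimal Go sketch that builds a speech request. It assumes the server exposes the OpenAI-style `/v1/audio/speech` route and accepts the `model`, `input`, `voice`, `response_format`, and `speed` fields; the model name `"kokoro"` and the `newSpeechRequest` helper are assumptions for illustration, so check the server's own API docs before relying on them.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"strings"
)

// speechRequest follows the field names of the OpenAI speech API.
// Whether a given Kokoro server accepts exactly these fields is an assumption.
type speechRequest struct {
	Model          string  `json:"model"`
	Input          string  `json:"input"`
	Voice          string  `json:"voice"`
	ResponseFormat string  `json:"response_format"`
	Speed          float64 `json:"speed"`
}

// newSpeechRequest builds (but does not send) a POST request for the
// assumed /v1/audio/speech route under baseURL.
func newSpeechRequest(baseURL, text, voice, format string, speed float64) (*http.Request, error) {
	body, err := json.Marshal(speechRequest{
		Model:          "kokoro", // assumed model name
		Input:          text,
		Voice:          voice,
		ResponseFormat: format,
		Speed:          speed,
	})
	if err != nil {
		return nil, fmt.Errorf("encode request: %w", err)
	}
	url := strings.TrimRight(baseURL, "/") + "/v1/audio/speech"
	req, err := http.NewRequest(http.MethodPost, url, bytes.NewReader(body))
	if err != nil {
		return nil, fmt.Errorf("build request: %w", err)
	}
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	req, err := newSpeechRequest("http://localhost:8880", "Hello from the podcast.", "bm_george", "mp3", 1.0)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// With a server running, the request could be sent via http.DefaultClient.Do(req)
	// and the response body written to an audio file.
	fmt.Println(req.Method, req.URL)
}
```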

### MLX Audio TTS
MLX Audio TTS is a powerful text-to-speech engine that provides high-quality speech synthesis with support for multiple voices and formats. It offers additional features like direct audio playback and output folder management.

### Coqui TTS
Coqui TTS provides high-quality speech synthesis with support for multiple voices and formats.

## License

This project is licensed under the MIT License - see the LICENSE file for details.

## Acknowledgments

- [Ollama](https://ollama.ai/) for the LLM API
- [Kokoro TTS](https://github.com/kokoro-tts/kokoro) for the TTS engine
- [MLX Audio TTS](https://github.com/mlx-audio/mlx-tts) for the TTS engine
- [Coqui TTS](https://github.com/coqui-ai/TTS) for the TTS engine

## How it works
The application reads an RSS feed, extracts the articles, and summarises them.

@@ -106,68 +212,8 @@ Start the container by using the following command:
docker run -d -p 5002:5002 --platform linux/amd64 --entrypoint /usr/local/bin/tts-server ghcr.io/coqui-ai/tts-cpu --model_name tts_models/en/ljspeech/vits
```

## Installation

Clone the repository and navigate into the directory:

```bash
git clone https://github.com/yourusername/your-repo.git
cd your-repo
```

Then, install the dependencies:
```bash
go mod download
```

## Usage
To run the application, navigate to the cmd/rss2podcast directory and run:
```bash
go run main.go
```

## Testing
To run the tests, use the following command:
```bash
go test ./...
```

## Configuration

The application can be configured through the web interface or by editing the `config.yaml` file directly. The configuration options include:

### Podcast Settings
- `subject`: The topic or subject of your podcast
- `podcaster`: The name of the podcaster

### RSS Feed Settings
- `url`: The RSS feed URL to fetch content from
- `max_articles`: Maximum number of articles to process
- `filters`: List of keywords to filter out unwanted articles

### Ollama Settings
- `end_point`: The Ollama API endpoint
- `model`: The Ollama model to use for text generation

### TTS Settings
- `engine`: The TTS engine to use ("coqui" or "kokoro")
- `coqui.url`: The URL for the Coqui TTS service
- `kokoro.url`: The URL for the Kokoro TTS service

### TTS Requirements

#### Coqui TTS
- Requires a running instance of Coqui TTS server
- Default URL: http://localhost:5002/api/tts
- Installation and setup instructions: [Coqui TTS Documentation](https://github.com/coqui-ai/TTS)

#### Kokoro TTS
- Requires a running instance of the Kokoro TTS FastAPI server
- Default URL: http://localhost:8880/docs
- Installation and setup instructions: [Kokoro TTS Fast API](https://github.com/remsky/Kokoro-FastAPI)

## Contributing
Contributions are welcome. Please open a pull request with your changes.

## License
This project is licensed under the terms of the MIT License.
```
35 changes: 20 additions & 15 deletions config.yaml
@@ -1,21 +1,26 @@
-podcast:
-  subject: "News"
-  podcaster: "Cody"
 rss:
-  url: "https://news.google.com/rss/search?q=australia"
-  max_articles: 15
+  url: https://news.google.com/rss/search?q=australia
+  max_articles: 10
   filters:
-    - "Daily"
-    - "Weekly"
+    - Daily
+    - Weekly
 ollama:
-  end_point: "http://localhost:11434/api/generate"
-  model: "mistral:latest"
+  end_point: http://localhost:11434/api/generate
+  model: mistral:7b
+podcast:
+  subject: News
+  podcaster: Cody
 tts:
-  engine: "kokoro" # Options: "coqui" or "kokoro"
+  engine: mlx
   coqui:
-    url: "http://localhost:5002/api/tts"
+    url: http://localhost:5002/api/tts
   kokoro:
-    url: "http://localhost:8880"
-    voice: "bm_george" # Default voice, options: af_heart, en_heart, etc.
-    speed: 1.0 # Range: 0.25 to 4.0
-    format: "mp3" # Options: mp3, opus, flac, wav, pcm
+    url: http://localhost:8880
+    voice: bm_george
+    speed: 1
+    format: mp3
+  mlx:
+    url: http://localhost:8000
+    voice: bm_george
+    speed: 1.2
+    format: mp3
2 changes: 2 additions & 0 deletions frontend/about.html
@@ -44,6 +44,8 @@ <h3 style="padding: 5px;"><strong>Ollama</strong></h3>
Default model used is mistral:7b</p>
<h4 style="padding: 5px;"><strong>Kokoro TTS (Recommended)</strong></h4>
<p>Kokoro TTS offers OpenAI-compatible speech synthesis with support for multiple voices and formats. Kokoro is the default TTS engine and provides excellent quality with low latency.</p>
<h4 style="padding: 5px;"><strong>MLX Audio TTS</strong></h4>
<p>MLX Audio TTS is a powerful text-to-speech engine that provides high-quality speech synthesis with support for multiple voices and formats. It offers additional features like direct audio playback and output folder management.</p>
<h4 style="padding: 5px;"><strong>Coqui TTS</strong></h4>
<p>The summarised article content can be converted into an audio podcast using the Coqui TTS API, which provides high-quality speech synthesis.</p>
</main>
92 changes: 91 additions & 1 deletion frontend/configuration.html
@@ -55,6 +55,7 @@
<select id="tts_engine" class="form-control" onchange="updateTtsUrlVisibility()">
<option value="coqui">Coqui</option>
<option value="kokoro">Kokoro</option>
<option value="mlx">MLX Audio</option>
</select>
</div>

@@ -70,7 +71,72 @@
<label for="kokoro_voice">Voice:</label>
<select id="kokoro_voice" class="form-control">
<option value="bm_george">bm_george</option>
<option value="en_heart">en_heart</option>
<option value="af_alloy">af_alloy</option>
<option value="af_aoede">af_aoede</option>
<option value="af_bella">af_bella</option>
<option value="af_heart">af_heart</option>
<option value="af_jadzia">af_jadzia</option>
<option value="af_jessica">af_jessica</option>
<option value="af_kore">af_kore</option>
<option value="af_nicole">af_nicole</option>
<option value="af_nova">af_nova</option>
<option value="af_river">af_river</option>
<option value="af_sarah">af_sarah</option>
<option value="af_sky">af_sky</option>
<option value="af_v0">af_v0</option>
<option value="af_v0bella">af_v0bella</option>
<option value="af_v0irulan">af_v0irulan</option>
<option value="af_v0nicole">af_v0nicole</option>
<option value="af_v0sarah">af_v0sarah</option>
<option value="af_v0sky">af_v0sky</option>
<option value="am_adam">am_adam</option>
<option value="am_echo">am_echo</option>
<option value="am_eric">am_eric</option>
<option value="am_fenrir">am_fenrir</option>
<option value="am_liam">am_liam</option>
<option value="am_michael">am_michael</option>
<option value="am_onyx">am_onyx</option>
<option value="am_puck">am_puck</option>
<option value="am_santa">am_santa</option>
<option value="am_v0adam">am_v0adam</option>
<option value="am_v0gurney">am_v0gurney</option>
<option value="am_v0michael">am_v0michael</option>
<option value="bf_alice">bf_alice</option>
<option value="bf_emma">bf_emma</option>
<option value="bf_lily">bf_lily</option>
<option value="bf_v0emma">bf_v0emma</option>
<option value="bf_v0isabella">bf_v0isabella</option>
<option value="bm_daniel">bm_daniel</option>
<option value="bm_fable">bm_fable</option>
<option value="bm_lewis">bm_lewis</option>
<option value="bm_v0george">bm_v0george</option>
<option value="bm_v0lewis">bm_v0lewis</option>
<option value="ef_dora">ef_dora</option>
<option value="em_alex">em_alex</option>
<option value="em_santa">em_santa</option>
<option value="ff_siwis">ff_siwis</option>
<option value="hf_alpha">hf_alpha</option>
<option value="hf_beta">hf_beta</option>
<option value="hm_omega">hm_omega</option>
<option value="hm_psi">hm_psi</option>
<option value="if_sara">if_sara</option>
<option value="im_nicola">im_nicola</option>
<option value="jf_alpha">jf_alpha</option>
<option value="jf_gongitsune">jf_gongitsune</option>
<option value="jf_nezumi">jf_nezumi</option>
<option value="jf_tebukuro">jf_tebukuro</option>
<option value="jm_kumo">jm_kumo</option>
<option value="pf_dora">pf_dora</option>
<option value="pm_alex">pm_alex</option>
<option value="pm_santa">pm_santa</option>
<option value="zf_xiaobei">zf_xiaobei</option>
<option value="zf_xiaoni">zf_xiaoni</option>
<option value="zf_xiaoxiao">zf_xiaoxiao</option>
<option value="zf_xiaoyi">zf_xiaoyi</option>
<option value="zm_yunjian">zm_yunjian</option>
<option value="zm_yunxi">zm_yunxi</option>
<option value="zm_yunxia">zm_yunxia</option>
<option value="zm_yunyang">zm_yunyang</option>
</select>

<label for="kokoro_speed">Speed:</label>
@@ -86,6 +152,29 @@
</select>
</div>

<div id="mlx_url_container" class="form-group" style="display: none;">
<label for="mlx_url">MLX Audio TTS URL:</label>
<input type="text" id="mlx_url" class="form-control" value="http://localhost:8000">

<label for="mlx_voice">Voice:</label>
<select id="mlx_voice" class="form-control">
<option value="af_heart">af_heart</option>
<option value="bm_george">bm_george</option>
<option value="af_alloy">af_alloy</option>
<option value="af_aoede">af_aoede</option>
<option value="af_bella">af_bella</option>
</select>

<label for="mlx_speed">Speed:</label>
<input type="number" id="mlx_speed" class="form-control" value="1.0" min="0.5" max="2.0" step="0.1">

<label for="mlx_format">Format:</label>
<select id="mlx_format" class="form-control">
<option value="mp3">MP3</option>
<option value="wav">WAV</option>
</select>
</div>

<div class="flex flex-row text-center py-5">
<button onclick="saveConfig()" class="save-button bg-blue-500 text-white"
style="padding-left: 5px; padding-right: 5px;">Save</button>
@@ -102,6 +191,7 @@
const engine = document.getElementById('tts_engine').value;
document.getElementById('coqui_url_container').style.display = engine === 'coqui' ? 'block' : 'none';
document.getElementById('kokoro_url_container').style.display = engine === 'kokoro' ? 'block' : 'none';
document.getElementById('mlx_url_container').style.display = engine === 'mlx' ? 'block' : 'none';
}

// Call on page load to set initial visibility