feat: add TTS voice message responses via edge-tts #65
asturwebs wants to merge 5 commits into six-ddc:main
Add text-to-speech support using Microsoft Edge neural voices. Final assistant responses are sent as Telegram voice notes alongside text.

Features:
- /voice: Toggle TTS on/off (per-user)
- /voice <name>: Set voice and auto-enable TTS
- /voices: Compact locale index with voice counts
- /voices <locale>: List all voices for a locale (es, en, zh...)
- Per-user voice selection with global defaults
- Graceful 503 handling for Microsoft service outages
- Smart /voices vs /voice confusion detection

Config (env vars):
- CCBOT_TTS_ENABLED (default: true)
- CCBOT_TTS_AUTO (default: false)
- CCBOT_TTS_VOICE (default: es-ES-ElviraNeural)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
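To illustrate the compact locale index behind /voices and the per-locale filter behind /voices <locale>, here is a minimal sketch. It assumes voice names follow the `language-REGION-NameNeural` pattern of the default voice. The sample list and the helper names `locale_index` and `voices_for_locale` are hypothetical and not taken from the PR, which presumably reads the real catalogue from edge-tts.

```python
from collections import Counter

# Hypothetical sample; the real list would come from the edge-tts voice catalogue.
SAMPLE_VOICES = [
    "es-ES-ElviraNeural",
    "es-ES-AlvaroNeural",
    "es-MX-DaliaNeural",
    "en-US-AriaNeural",
    "en-GB-SoniaNeural",
    "zh-CN-XiaoxiaoNeural",
]


def locale_index(voices):
    """Count voices per language prefix (the part before the first hyphen)."""
    counts = Counter(v.split("-", 1)[0].lower() for v in voices)
    return dict(sorted(counts.items()))


def voices_for_locale(voices, prefix):
    """Return the voices whose name starts with the prefix, case-insensitively."""
    p = prefix.lower()
    return sorted(v for v in voices if v.lower().startswith(p))


if __name__ == "__main__":
    print(locale_index(SAMPLE_VOICES))
    print(voices_for_locale(SAMPLE_VOICES, "es"))
```

Showing counts first keeps the reply short enough for a single Telegram message; the full list is only expanded once the user names a locale.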
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: e560196e0b
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
- Validate voice names in set_voice() to prevent invalid names like '/voices' from crashing edge-tts (ValueError)
- Add clean_text_for_tts() to strip emojis, symbols, and markdown artifacts before TTS synthesis for cleaner audio
- Preserve 'assistant' role when merging mixed-role tasks so TTS isn't skipped when user+assistant messages are batch-merged
- Add 5 new tests for text cleanup

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
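The voice-name validation described above could look roughly like the following. This is a sketch, not the PR's code: the regular expression is an assumed shape for Edge voice names, the module-level dictionaries are a hypothetical stand-in for the per-user state, and the real implementation may instead check names against the voice list returned by edge-tts.

```python
import re

DEFAULT_VOICE = "es-ES-ElviraNeural"  # default of CCBOT_TTS_VOICE in this PR

# Assumption: a voice name is a language code, one or more hyphenated parts,
# and ends in "Neural" (e.g. "es-ES-ElviraNeural").
_VOICE_RE = re.compile(r"^[a-z]{2,3}(?:-[A-Za-z]+)+Neural$")
_COMMAND_LIKE = {"list", "all"}

# Hypothetical in-memory per-user state.
_user_voice: dict[int, str] = {}
_user_enabled: dict[int, bool] = {}


def set_voice(user_id: int, name: str) -> bool:
    """Store the voice for a user; return False for invalid or command-like names."""
    name = name.strip()
    if name.startswith("/") or name.lower() in _COMMAND_LIKE:
        return False
    if not _VOICE_RE.match(name):
        return False
    _user_voice[user_id] = name
    _user_enabled[user_id] = True  # /voice <name> also enables TTS
    return True


def get_voice(user_id: int) -> str:
    """Return the user's voice, falling back to the global default."""
    return _user_voice.get(user_id, DEFAULT_VOICE)
```

Rejecting the name before it is stored means a bad value never reaches the synthesis call, so the ValueError mentioned in the commit cannot be triggered by user input.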
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: b9307f76a2
merged_role and merged_complete were derived from all drained queue items, including non-merged ones that were put back. The merge now uses [first] + items[:merge_count], so a merged task is no longer mislabeled when a later non-mergeable task has the assistant role or is_complete=True.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
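A simplified sketch of the fix, for illustration only. The `MessageTask` fields `role` and `is_complete` come from the PR; the `mergeable` flag, the `merge_tasks` helper, and the choice of `any()` for the completion flag are assumptions made to keep the example self-contained.

```python
from dataclasses import dataclass


@dataclass
class MessageTask:
    text: str
    role: str = "user"
    is_complete: bool = False
    mergeable: bool = True  # hypothetical flag used only in this sketch


def merge_tasks(first, items):
    """Merge `first` with the leading run of mergeable items.

    Returns (merged_task, leftover_items). Role and completion are derived
    only from the tasks that were actually merged, never from the leftovers.
    """
    merge_count = 0
    for item in items:
        if not item.mergeable:
            break
        merge_count += 1

    merged_items = [first] + items[:merge_count]
    has_assistant = any(t.role == "assistant" for t in merged_items)
    merged = MessageTask(
        text="\n\n".join(t.text for t in merged_items),
        role="assistant" if has_assistant else first.role,
        is_complete=any(t.is_complete for t in merged_items),
    )
    return merged, items[merge_count:]
```

With this slicing, a trailing non-mergeable assistant task is returned as a leftover and no longer causes the preceding merged user text to be voiced.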
The markdown cleanup pattern stripped backticks before the code fence pattern could match, leaving code block content as orphan text in TTS. Now code fences are removed first (with their content), then remaining inline backticks are cleaned up.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
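The ordering issue can be shown with a short sketch. The patterns below are assumptions and much simpler than whatever the PR's clean_text_for_tts() uses (emoji stripping is omitted); the point is only that the fence pattern must run before single backticks are removed.

```python
import re

FENCE = "`" * 3  # three backticks, built this way to keep the example readable

# Step 1 pattern: a fenced block together with everything inside it.
_FENCE_RE = re.compile(re.escape(FENCE) + r".*?" + re.escape(FENCE), re.DOTALL)
# Step 2 pattern: leftover inline markdown marks.
_INLINE_MARKS_RE = re.compile(r"[`*_#>~]")
_WHITESPACE_RE = re.compile(r"\s+")


def clean_text_for_tts(text):
    """Drop fenced code blocks first, then inline marks, then collapse whitespace."""
    text = _FENCE_RE.sub(" ", text)
    text = _INLINE_MARKS_RE.sub("", text)
    return _WHITESPACE_RE.sub(" ", text).strip()


if __name__ == "__main__":
    sample = "Run **this**:\n" + FENCE + "python\nprint('hi')\n" + FENCE + "\nthen call `foo`."
    print(clean_text_for_tts(sample))
```

If the two substitutions were swapped, the fence pattern would find no backticks left to match and the body of the code block would be read aloud.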
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 1d9a3312c8
OGG/Opus encoding trims the first few milliseconds, cutting the initial phoneme. Prefixing text with "... " forces edge-tts to start with a brief silence, preserving the first word intact.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
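As a rough illustration of the workaround, the prefix can be applied just before the text is handed to the synthesizer. The helper name `with_leading_pause` and the empty-input check are assumptions for this sketch, not part of the PR.

```python
SILENCE_PREFIX = "... "  # spoken as a brief pause, per the commit message


def with_leading_pause(text):
    """Prefix non-empty text with a pause so the first phoneme is not clipped."""
    text = text.strip()
    if not text:
        return ""  # nothing to synthesize
    return SILENCE_PREFIX + text
```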
Summary
- /voice toggles TTS on/off and sets voice, /voices lists available voices
- Detects /voices <VoiceName> confusion and suggests /voice; graceful 503 handling

Changes

src/ccbot/tts.py (new)
- synthesize(text, user_id) → OGG/Opus audio bytes via edge-tts
- send_voice_message() → Send voice note to Telegram with silent fallback
- is_tts_enabled(), toggle_tts(), get_voice(), set_voice(): per-user state
- clean_text_for_tts() → strip emojis, markdown, symbols, code fences before synthesis
- set_voice() → validates voice name, rejects command-like strings (/, list, all)
- Leading "..." pause to prevent first-word truncation in OGG/Opus encoding

src/ccbot/config.py
- CCBOT_TTS_ENABLED (default: true): global toggle
- CCBOT_TTS_AUTO (default: false): auto-enable for all users
- CCBOT_TTS_VOICE (default: es-ES-ElviraNeural): default voice

src/ccbot/bot.py
- /voice command: toggle or set voice (auto-enables TTS, validates voice name)
- /voices command: compact locale index or filtered voice list by locale prefix
- /voices <VoiceName> → suggests /voice

src/ccbot/handlers/message_queue.py
- MessageTask gains role and is_complete fields for TTS gating
- TTS is gated on final assistant responses (is_complete=True, role=assistant)
- Merged role is "assistant" when any merged task has that role

pyproject.toml
- edge-tts>=7.2.8 added as dependency

tests/ccbot/test_tts.py (new)

Usage
Voice names are validated; command-like strings are rejected. Use /voices to discover available voices for any language.

Config
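The three environment variables and their defaults listed in this PR could be read along these lines. This is a sketch only: the helper names and the set of accepted truthy strings are assumptions, and the actual parsing in src/ccbot/config.py may differ.

```python
import os

_TRUTHY = {"1", "true", "yes", "on"}  # assumed set of accepted "true" spellings


def _env_bool(name, default, env):
    """Parse a boolean environment variable, falling back to `default` if unset."""
    raw = env.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in _TRUTHY


def load_tts_config(env=None):
    """Return the TTS settings, using the defaults stated in this PR."""
    env = os.environ if env is None else env
    return {
        "enabled": _env_bool("CCBOT_TTS_ENABLED", True, env),  # global toggle
        "auto": _env_bool("CCBOT_TTS_AUTO", False, env),  # auto-enable for all users
        "voice": env.get("CCBOT_TTS_VOICE", "es-ES-ElviraNeural"),  # default voice
    }
```

Passing a plain dictionary as `env` makes the defaults easy to check without touching the process environment.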
Test plan
- ruff check passes
- ruff format --check passes
- pyright: 0 errors
- /voice, /voice <name>, /voices, /voices <locale> tested
- Rejects /voices, list, all as voice names
- /voices <VoiceName> confusion detection tested

🤖 Generated with BytIA