feat(backend): Add language fallback for YouTube transcription block #11057
base: dev
✅ Deploy Preview for auto-gpt-docs ready!
Thank you for starting work on this YouTube transcription fallback feature! Since this is marked as WIP, I understand it's not ready for merging yet. Here's what needs to be addressed before this can be approved:
I look forward to reviewing this PR once the implementation is complete!
- Modified get_transcript method to fall back to the first available language when English is not available
- Added comprehensive unit tests for language fallback functionality
- All tests passing, including the block integration test

Co-authored-by: ntindle <[email protected]>
Thanks for working on this YouTube transcript fallback feature! Your implementation looks technically sound with comprehensive test coverage. Before this PR can be merged, there are a few things to address:
Once these items are addressed, the PR will be ready for a full review. The implementation itself looks promising - adding fallback to any available language when English transcripts aren't available is a good solution to the problem.
Thanks for working on this YouTube transcription fallback feature! The implementation looks solid and includes comprehensive test coverage. However, before this PR can be merged, a few items need to be addressed:
The implementation itself looks good - the fallback mechanism correctly tries English first, then falls back to other available transcripts with a preference for manually created ones over auto-generated. The test coverage is thorough, covering all the different scenarios including the fallback logic.
- Added explanation of language fallback behavior
- Documented preference order for transcripts (English > manual > auto-generated)
- Clarified use case for non-English videos

Co-authored-by: ntindle <[email protected]>
Thanks for your PR implementing the YouTube transcription language fallback feature. The implementation looks solid, with comprehensive test coverage and good documentation updates. Before we can merge this PR, please update the title to follow our conventional commit format. It should be structured like:
Your implementation looks good - I particularly appreciate how you've:
Once the title is updated from the current WIP format to the conventional commit format, this should be ready for final review.
Thank you for your work on implementing the YouTube transcription fallback feature. The implementation looks solid with good test coverage and documentation updates. Before we can merge this PR:
The technical implementation looks excellent - the fallback logic is well-designed, thoroughly tested, and properly documented. Nice work on the comprehensive test suite!
Thank you for your detailed PR implementing YouTube transcription language fallback. The code changes look well-implemented with comprehensive testing and documentation updates. Before this can be merged, there are two issues that need to be addressed:
Your implementation looks solid - the code changes are focused, well-tested, and include appropriate documentation updates. Once you address these formatting issues, this PR should be ready for final review and merge.
Problem
The YouTube transcription block would fail when attempting to transcribe videos that only had transcripts available in non-English languages. Even when usable transcripts existed in other languages, the block would raise a `NoTranscriptFound` error because it only requested English transcripts.

Example video that would fail: https://www.youtube.com/watch?v=3AMl5d2NKpQ (only has Hungarian transcripts)
Error message:
Solution
Implemented intelligent language fallback in the `TranscribeYoutubeVideoBlock.get_transcript()` method:

Example behavior:
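The preference order discussed in this PR (English first, then manually created transcripts, then auto-generated ones) can be illustrated with a small standalone sketch. This is not the code from `backend/blocks/youtube.py`: `TranscriptInfo` and `pick_transcript` are hypothetical names invented for the illustration, and `NoTranscriptFound` is redefined locally as a stand-in so the snippet runs without the transcript library or network access.

```python
from dataclasses import dataclass
from typing import Iterable, Sequence


class NoTranscriptFound(Exception):
    """Local stand-in for the library error named in this PR."""


@dataclass(frozen=True)
class TranscriptInfo:
    """Hypothetical record for one transcript that a video offers."""

    language_code: str
    is_generated: bool  # True for auto-generated captions


def pick_transcript(
    available: Iterable[TranscriptInfo],
    preferred: Sequence[str] = ("en",),
) -> TranscriptInfo:
    """Choose a transcript: preferred language, then manual, then generated."""
    candidates = list(available)
    if not candidates:
        raise NoTranscriptFound("video has no transcripts in any language")

    # Stable sort: manually created transcripts come first, and the
    # original order is preserved within each group.
    manual_first = sorted(candidates, key=lambda t: t.is_generated)

    # 1. A transcript in a preferred language wins outright.
    for code in preferred:
        for transcript in manual_first:
            if transcript.language_code == code:
                return transcript

    # 2 and 3. Otherwise take the first manual transcript if one exists,
    # else the first auto-generated one.
    return manual_first[0]
```

For a video like the Hungarian-only example above, `pick_transcript([TranscriptInfo("hu", True)])` returns the Hungarian transcript instead of raising. Keeping the selection separate from the network call is one way to make the ordering easy to unit test, under the assumption that the real block can list the available transcripts before fetching one.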
Changes
- `backend/blocks/youtube.py`: Added try/except logic to fall back to any available language when English is not found
- `test/blocks/test_youtube.py`: Comprehensive test suite covering URL extraction, language fallback, transcript preferences, and error handling (7 tests)
- `docs/content/platform/blocks/youtube.md`: Documented the language fallback behavior and transcript priority order

Testing
Impact
This fix enables the YouTube transcription block to work with international content while maintaining full backward compatibility:
Fixes #10637
Fixes https://linear.app/autogpt/issue/OPEN-2626
Warning
Firewall rules blocked me from connecting to one or more addresses.
I tried to connect to the following addresses, but was blocked by firewall rules:
- `www.youtube.com`, from `/home/REDACTED/.cache/pypoetry/virtualenvs/autogpt-platform-backend-Ajv4iu2i-py3.11/bin/python3` (dns block)

If you need me to access, download, or install something from one of these locations, you can either: