Description
Describe the bug
When generating educational videos using TheoremExplainAgent, if any scene fails to render completely (video + subtitles), the entire video combination process fails. This means that even if 4 out of 5 scenes successfully generate, the system cannot produce a final video and all the successful scene generations (and the API tokens spent) are effectively wasted. There's also no straightforward way to retry just the failed scenes.
To Reproduce
Steps to reproduce the behavior:
- Run the video generation command with a topic requiring multiple scenes (e.g., "Machine Learning and Decision Trees")
python generate_video.py `
  --model "gemini/gemini-2.0-flash-001" `
  --helper_model "gemini/gemini-2.0-flash-001" `
  --output_dir "output/machine_learning_decision_trees_gemini2" `
  --topic "Machine Learning and Decision Trees" `
  --context "Introduction to basic concepts of machine learning with a focus on decision trees" `
- Wait for at least one scene to fail to generate completely (e.g., scene 1 or 2 fails because of a missing Manim component or a syntax error)
- Try to combine the videos using the --only_combine flag
- Observe the error: "Not all videos/subtitles are found, aborting video combination"
Expected behavior
The system should:
- Allow users to retry failed scenes individually without regenerating everything
- Provide an option to combine only the successfully generated scenes, or to substitute placeholder content for failed scenes
- Have better error recovery during generation to automatically adapt and retry with modified code when encountering Manim render errors
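To make the retry behavior requested above concrete, here is a minimal sketch of a bounded retry loop. It is not taken from TheoremExplainAgent: `render_with_retries`, `render`, and `repair` are hypothetical names, and the two callables stand in for whatever the project uses to render a scene and to ask the model for revised code.

```python
from typing import Callable, Optional


def render_with_retries(
    code: str,
    render: Callable[[str], None],      # hypothetical: raises on a render error
    repair: Callable[[str, str], str],  # hypothetical: (code, error text) -> revised code
    max_attempts: int = 3,
) -> Optional[str]:
    """Return the code that rendered successfully, or None if every attempt failed."""
    for attempt in range(max_attempts):
        try:
            render(code)
            return code
        except Exception as err:
            # Stop after the last allowed attempt instead of repairing again.
            if attempt == max_attempts - 1:
                break
            # Feed the error text back so the next attempt uses modified code.
            code = repair(code, str(err))
    return None
```

Returning `None` instead of raising would let a caller record the scene as failed and carry on with the remaining scenes.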
Desktop (please complete the following information):
- OS: Windows 11
- Python: 3.12.8
- Conda environment: tea
Additional context
The current design wastes considerable API tokens and time when partial failures occur. Issues encountered include:
- Scene generation failures due to missing Manim components (e.g., Flowchart class)
- LaTeX rendering errors with certain Unicode characters
- Inability to retry just the failed scenes, as the system recognizes existing files but doesn't process them
- No ability to selectively combine successful scenes
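As a rough illustration of what selective handling could look like, the sketch below splits scene folders into complete and failed ones. The layout is an assumption, not the project's documented structure: one subdirectory per scene under the output directory, with a scene counting as complete when it contains at least one video file and one subtitle file. The function name and the `.mp4`/`.srt` defaults are likewise hypothetical.

```python
import re
from pathlib import Path


def _natural_key(name: str):
    # Sort "scene2" before "scene10" by comparing digit runs as integers.
    return [int(tok) if tok.isdigit() else tok for tok in re.split(r"(\d+)", name)]


def split_scenes(output_dir, video_ext=".mp4", subtitle_ext=".srt"):
    """Return (complete, failed) lists of scene directory names.

    Assumes one subdirectory per scene; this layout is a guess.
    """
    complete, failed = [], []
    scene_dirs = [p for p in Path(output_dir).iterdir() if p.is_dir()]
    for scene in sorted(scene_dirs, key=lambda p: _natural_key(p.name)):
        has_video = any(scene.rglob(f"*{video_ext}"))
        has_subtitles = any(scene.rglob(f"*{subtitle_ext}"))
        if has_video and has_subtitles:
            complete.append(scene.name)
        else:
            failed.append(scene.name)
    return complete, failed
```

The `failed` list is what a retry option would need as input, and the `complete` list is what a partial combine would use.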
Currently, users must fall back on manual workarounds, such as running FFmpeg by hand to combine the successful scene videos, which defeats the purpose of an automated tool. Because API tokens are spent on both successful and failed generations, every run that cannot produce a final video wastes all of the tokens it consumed.
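For reference, the manual FFmpeg route can be scripted roughly as follows. This is a sketch of a workaround, not of anything the tool provides: the helper names are hypothetical, and it relies on FFmpeg's concat demuxer with stream copy, which only works when all scene videos share the same codec and encoding parameters.

```python
from pathlib import Path


def write_concat_list(video_paths, list_path):
    """Write a list file for FFmpeg's concat demuxer, one "file '...'" line per video."""
    lines = []
    for path in video_paths:
        # Absolute POSIX-style paths; single quotes are escaped as '\'' for the demuxer.
        escaped = Path(path).resolve().as_posix().replace("'", "'\\''")
        lines.append(f"file '{escaped}'")
    Path(list_path).write_text("\n".join(lines) + "\n", encoding="utf-8")


def concat_command(list_path, output_path):
    """Build the FFmpeg command; -safe 0 permits the absolute paths in the list file."""
    return [
        "ffmpeg", "-f", "concat", "-safe", "0",
        "-i", str(list_path),
        "-c", "copy",
        str(output_path),
    ]
```

The command can then be run with `subprocess.run(concat_command("scenes.txt", "combined.mp4"), check=True)`. Subtitles are not merged by this step and would need separate handling.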