feat: improve gemini performance #1017
base: main
Conversation
…oved performance and transcription handling
…lization to allow proactive audio
@plutoless, can you please review? This improves the Gemini performance dramatically and allows using it in a production environment.
var (
	logTag = slog.String("service", "HTTP_SERVER")
	MAX_GEMINI_WORKER_COUNT = getMaxGeminiWorkerCount()
what is this for?
This change makes MAX_GEMINI_WORKER_COUNT configurable via environment variables, which is essential for production deployments where performance tuning is critical. For example, my setup was capped at 3 workers, and with this change I can run far more than 3 conversations at a time.
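The helper behind this is small; the PR's version is in Go, but the idea can be sketched in Python. The function name mirrors the Go helper, while the default of 3 and the validation logic are assumptions for illustration, not taken from the diff:

```python
import os

# Assumed fallback when the variable is unset or invalid (not from the PR).
DEFAULT_GEMINI_WORKER_COUNT = 3

def get_max_gemini_worker_count() -> int:
    """Read MAX_GEMINI_WORKER_COUNT from the environment, with a safe default."""
    raw = os.environ.get("MAX_GEMINI_WORKER_COUNT", "")
    try:
        count = int(raw)
    except ValueError:
        return DEFAULT_GEMINI_WORKER_COUNT
    # Reject zero or negative values rather than disabling workers entirely.
    return count if count > 0 else DEFAULT_GEMINI_WORKER_COUNT
```

An operator can then tune concurrency per deployment, e.g. `MAX_GEMINI_WORKER_COUNT=16` in the service environment, without rebuilding the image.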
ai_agents/agents/ten_packages/extension/gemini_v2v_python/extension.py
    and len(self.buff) > 0
):
    await self._flush_audio_buffer()
    ten_env.log_debug("Flushed audio buffer due to timeout")
when is this needed?
The timeout-based buffer flush is crucial for handling edge cases in real-time audio processing where audio data might get stuck in the buffer.
Example scenario:
- The user says "Hello" (a short phrase).
- The audio data is 800 bytes, below the 1024-byte threshold.
- Without the timeout flush, this audio would never get processed.
- With the 500 ms timeout, it gets flushed automatically, ensuring responsiveness.
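The scenario above can be sketched as a minimal, self-contained version of the timeout flush. Only the 1024-byte threshold and the 500 ms timeout come from the discussion; the class, field names, and demo are illustrative assumptions:

```python
import asyncio
import contextlib

AUDIO_BUFFER_THRESHOLD = 1024  # bytes, per the PR default
FLUSH_TIMEOUT = 0.5            # seconds, per the discussion

class AudioBuffer:
    def __init__(self):
        self.buff = bytearray()
        self.flushed = []  # stands in for sending audio to the session

    async def _flush_audio_buffer(self):
        if self.buff:
            self.flushed.append(bytes(self.buff))
            self.buff.clear()

    async def on_audio(self, chunk: bytes):
        self.buff.extend(chunk)
        # Normal path: flush once enough audio has accumulated.
        if len(self.buff) >= AUDIO_BUFFER_THRESHOLD:
            await self._flush_audio_buffer()

    async def flush_loop(self):
        # Timeout path: periodically flush whatever is left so short
        # utterances (e.g. an 800-byte "Hello") are never stranded.
        while True:
            await asyncio.sleep(FLUSH_TIMEOUT)
            if len(self.buff) > 0:
                await self._flush_audio_buffer()

async def demo():
    buf = AudioBuffer()
    task = asyncio.create_task(buf.flush_loop())
    await buf.on_audio(b"\x00" * 800)  # below threshold: stays buffered
    await asyncio.sleep(0.6)           # timeout elapses: flushed anyway
    task.cancel()
    with contextlib.suppress(asyncio.CancelledError):
        await task
    return buf.flushed

flushed = asyncio.run(demo())
```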
@plutoless Please let me know what you think; I resolved some of your comments and replied to others.
This pull request introduces significant updates to the GeminiRealtimeConfig class and its associated methods within the gemini_v2v_python extension. The changes enhance audio processing, improve connection management, and update configuration defaults for better performance and reliability, significantly reducing response time and improving the overall experience.

Configuration Updates:
- Updated the model field in GeminiRealtimeConfig to a new model version (gemini-2.5-flash-preview-native-audio-dialog) for improved functionality.
- Renamed audio_chunk_size to audio_buffer_threshold for clarity and updated its usage throughout the code.
- Reduced audio_len_threshold from 5120 to 1024 and made it configurable via audio_buffer_threshold.
- Added transcribe_agent to the property.json file to allow transcription of agent responses.

Audio Processing Enhancements:
- Introduced an audio_queue for asynchronous audio processing, enabling non-blocking operations and more efficient handling of audio data.
- Added a _process_audio_queue method to process queued audio data in the background, optimizing real-time audio handling.

Connection Management Improvements:
- Replaced _loop with _connection_manager, which uses retries with exponential backoff and proper error handling for robust session management.
- Added _run_session to streamline session initialization and task handling, ensuring better resource management and logging.

Task Handling and Cleanup:
- Introduced a tasks list to manage all asynchronous tasks, enabling clean cancellation during on_stop.
- Updated on_stop to cancel all running tasks and clean up the session gracefully.

Transcription Updates:
- Made _send_transcript asynchronous and added lower-priority task creation for input and output transcriptions.
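The audio_queue pattern described in the summary can be sketched as a standalone producer/consumer, assuming the common asyncio idiom rather than the extension's actual code (the class, the `sent` list, and the `None` sentinel are illustrative):

```python
import asyncio

class AudioPipeline:
    def __init__(self):
        self.audio_queue: asyncio.Queue = asyncio.Queue()
        self.sent = []  # stands in for forwarding frames to the session

    async def on_audio_frame(self, frame: bytes):
        # Producer side: enqueue without blocking the realtime callback.
        self.audio_queue.put_nowait(frame)

    async def _process_audio_queue(self):
        # Consumer side: drain the queue in the background; a None
        # sentinel tells the consumer to stop.
        while True:
            frame = await self.audio_queue.get()
            if frame is None:
                break
            self.sent.append(frame)

async def demo():
    p = AudioPipeline()
    consumer = asyncio.create_task(p._process_audio_queue())
    for i in range(3):
        await p.on_audio_frame(bytes([i]) * 4)
    await p.audio_queue.put(None)  # stop after draining
    await consumer
    return p.sent

sent = asyncio.run(demo())
```

The point of the split is that the audio callback stays O(1) while any slow downstream work happens in the background task.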
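The _connection_manager behavior (retries with exponential backoff) can likewise be sketched under stated assumptions: the retry count, base delay, and the `connect` callable are hypothetical, not taken from the diff:

```python
import asyncio

MAX_RETRIES = 5
BASE_DELAY = 0.01  # seconds; kept tiny so the demo runs quickly

async def _connection_manager(connect, max_retries=MAX_RETRIES):
    """Retry session setup with exponential backoff on transient errors."""
    delay = BASE_DELAY
    for attempt in range(1, max_retries + 1):
        try:
            return await connect()
        except ConnectionError:
            if attempt == max_retries:
                raise  # out of retries: surface the error
            await asyncio.sleep(delay)
            delay *= 2  # exponential backoff

async def demo():
    attempts = 0

    async def flaky_connect():
        # Fails twice, then succeeds, simulating transient network errors.
        nonlocal attempts
        attempts += 1
        if attempts < 3:
            raise ConnectionError("transient failure")
        return "session"

    session = await _connection_manager(flaky_connect)
    return attempts, session

attempts, session = asyncio.run(demo())
```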
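Finally, the tasks-list cleanup from "Task Handling and Cleanup" can be sketched as follows; the `Extension` class and `spawn` helper are illustrative stand-ins, with only the `tasks` list and `on_stop` names coming from the summary:

```python
import asyncio
import contextlib

class Extension:
    def __init__(self):
        self.tasks: list[asyncio.Task] = []

    def spawn(self, coro):
        # Track every background task so on_stop can find it later.
        task = asyncio.create_task(coro)
        self.tasks.append(task)
        return task

    async def on_stop(self):
        # Cancel everything first, then await each task so cancellation
        # actually completes before shutdown proceeds.
        for task in self.tasks:
            task.cancel()
        for task in self.tasks:
            with contextlib.suppress(asyncio.CancelledError):
                await task
        self.tasks.clear()

async def demo():
    ext = Extension()
    for _ in range(3):
        ext.spawn(asyncio.sleep(3600))  # long-running background work
    await ext.on_stop()
    return len(ext.tasks)

remaining = asyncio.run(demo())
```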