Commit ead5ac4

remove line that was not allowing page to render
1 parent e0104a1 commit ead5ac4

1 file changed: +17 -10 lines changed

fern/pages/05-guides/cookbooks/streaming-stt/turn_detection_improvement_using_async.mdx

Lines changed: 17 additions & 10 deletions
@@ -5,7 +5,6 @@ title: "Determine Optimal Turn Detection Settings from Historical Audio Analysis
 This guide shows how to analyze utterance gaps from multiple pre-recorded audio files to automatically determine optimal turn detection settings for real-time streaming transcription. It processes an entire folder, aggregates gap statistics across all recordings, and configures the WebSocket with parameters tailored to your specific conversation patterns.

 ## Quickstart
-
 ```python
 import requests
 import time
@@ -575,7 +574,7 @@ if __name__ == "__main__":
     main()
 ```

-## Step-by-step guide
+## Step-By-Step Guide

 Before we begin, make sure you have an AssemblyAI account and an API key. You can [sign up](https://assemblyai.com/dashboard/signup) and get your API key from your dashboard.

@@ -586,7 +585,8 @@ pip install requests pyaudio websocket-client
 ```

 2. **Configuration and Global Variables**
-Sets up API credentials, file paths, audio parameters (16kHz sample rate, mono channel), and initializes global variables for managing WebSocket connections and audio streaming threads.
+
+Set up API credentials, file paths, audio parameters (16kHz sample rate, mono channel), and initialize global variables for managing WebSocket connections and audio streaming threads.

 ```python
 import requests
@@ -624,7 +624,8 @@ OPTIMIZED_CONFIG = {}
 ```

 3. **Define get_audio_files() Function**
-Scans a specified folder for audio/video files with supported extensions and returns a sorted list of file paths for batch processing.
+
+This function scans a specified folder for audio/video files with supported extensions and returns a sorted list of file paths for batch processing.

 ```python
 def get_audio_files(folder_path):
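
The hunk above ends at the signature of `get_audio_files()`, so the function body is not visible in this diff. As a rough illustration of the behavior the step describes (scan a folder, keep supported media files, return sorted paths), here is a minimal sketch. The name `get_audio_files_sketch` and the extension list are assumptions for illustration only; the guide's actual list is not shown here.

```python
from pathlib import Path

# Assumed extension list; the guide's real list is not visible in this diff.
SUPPORTED_EXTENSIONS = {".mp3", ".wav", ".m4a", ".flac", ".mp4"}


def get_audio_files_sketch(folder_path):
    """Return a sorted list of paths to supported media files in folder_path."""
    folder = Path(folder_path)
    if not folder.is_dir():
        # A missing or non-directory path yields no files rather than an error.
        return []
    return sorted(
        str(p)
        for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() in SUPPORTED_EXTENSIONS
    )
```

The extension check is case-insensitive, so `a.MP3` and `a.mp3` are both picked up, and subfolders are not searched.
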
@@ -650,7 +651,8 @@ def get_audio_files(folder_path):
 ```

 4. **Define `analyze_single_file()` Function**
-Uploads an audio file to AssemblyAI, requests transcription with speaker labels enabled, polls until completion, then calculates gap statistics between utterances (average, median, min, max) and saves the transcript JSON.
+
+This function uploads an audio file to AssemblyAI, requests transcription with speaker labels enabled, polls until completion, then calculates gap statistics between utterances (average, median, min, max) and saves the transcript JSON.

 ```python
 def analyze_single_file(audio_file, api_key, file_index, total_files):
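
The body of `analyze_single_file()` is elided in this diff. The gap statistics it mentions (average, median, min, max) can be sketched as follows, assuming each utterance is a dict with `start` and `end` timestamps in milliseconds. The helper name `gap_stats` and the choice to skip overlapping utterances are assumptions, not necessarily what the guide does.

```python
from statistics import mean, median


def gap_stats(utterances):
    """Compute silence-gap statistics (in ms) between consecutive utterances."""
    ordered = sorted(utterances, key=lambda u: u["start"])
    gaps = [
        nxt["start"] - cur["end"]
        for cur, nxt in zip(ordered, ordered[1:])
        if nxt["start"] > cur["end"]  # assumption: ignore overlapping speech
    ]
    if not gaps:
        return None  # fewer than two utterances, or no silent gaps at all
    return {
        "count": len(gaps),
        "average": mean(gaps),
        "median": median(gaps),
        "min": min(gaps),
        "max": max(gaps),
    }
```

For example, utterances ending at 1000 ms and starting at 1400 ms contribute a 400 ms gap; an utterance that starts before the previous one ends contributes nothing.
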
@@ -762,7 +764,8 @@ def analyze_single_file(audio_file, api_key, file_index, total_files):
 ```

 5. **Define `analyze_multiple_files()` Function**
-Orchestrates the analysis of all files in a folder by calling `analyze_single_file()` for each, aggregates all gap data across files, calculates overall statistics, displays per-file breakdowns, and saves a comprehensive summary JSON.
+
+This function orchestrates the analysis of all files in a folder by calling `analyze_single_file()` for each, aggregates all gap data across files, calculates overall statistics, displays per-file breakdowns, and saves a comprehensive summary JSON.

 ```python
 def analyze_multiple_files(folder_path, api_key):
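
The aggregation logic of `analyze_multiple_files()` is likewise not shown in this diff. One plausible way to combine per-file gaps into overall statistics is to pool every gap into a single list, so that each gap counts equally regardless of which file it came from. The sketch below assumes that approach; `summarize` and `aggregate_gap_stats` are hypothetical names, and the input is assumed to map file names to lists of gap durations in milliseconds.

```python
from statistics import mean, median


def summarize(gaps):
    """Summary statistics for a non-empty list of gap durations (ms)."""
    return {
        "count": len(gaps),
        "average": mean(gaps),
        "median": median(gaps),
        "min": min(gaps),
        "max": max(gaps),
    }


def aggregate_gap_stats(gaps_by_file):
    """Build per-file breakdowns plus overall stats over the pooled gaps."""
    per_file = {name: summarize(g) for name, g in gaps_by_file.items() if g}
    pooled = [gap for gaps in gaps_by_file.values() for gap in gaps]
    overall = summarize(pooled) if pooled else None
    return {"overall": overall, "per_file": per_file}
```

Pooling differs from averaging the per-file averages: a long recording with many gaps influences the overall figure more than a short one.
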
@@ -879,7 +882,8 @@ def analyze_multiple_files(folder_path, api_key):
 ```

 6. **Define `determine_streaming_config()` Function**
-Takes aggregated gap statistics and selects one of three preset configurations (Aggressive <500ms, Balanced 500-1000ms, Conservative >1000ms) with optimized turn detection parameters for different conversation styles.
+
+This function takes aggregated gap statistics and selects one of three preset configurations with optimized turn detection parameters for different conversation styles.

 ```python
 def determine_streaming_config(aggregated_stats):
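
The removed description line in the hunk above names three presets by gap range: Aggressive (<500 ms), Balanced (500-1000 ms), and Conservative (>1000 ms). The selection step can be sketched as below. Which statistic drives the choice is not visible in this diff, so the use of the average gap is an assumption, as is the helper name `choose_preset`. The sketch returns only the preset name, because the actual turn detection parameter values are not shown.

```python
def choose_preset(avg_gap_ms):
    """Map an average utterance gap (ms) to one of the three preset names."""
    if avg_gap_ms < 500:
        return "aggressive"    # short gaps: fast back-and-forth conversation
    if avg_gap_ms <= 1000:
        return "balanced"      # 500-1000 ms inclusive
    return "conservative"      # long pauses: wait longer before ending a turn
```

For instance, `choose_preset(320)` selects the aggressive preset, while `choose_preset(1500)` selects the conservative one.
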
@@ -944,7 +948,8 @@ def determine_streaming_config(aggregated_stats):
 ```

 7. **Create WebSocket Event Handlers (`on_open`, `on_message`, `on_error`, `on_close`)**
-Manage the real-time streaming connection lifecycle: `on_open` starts the audio streaming thread, `on_message` processes transcription results (partial and final turns), and the close/error handlers clean up resources.
+
+These functions manage the real-time streaming connection lifecycle: `on_open` starts the audio streaming thread, `on_message` processes transcription results (partial and final turns), and the close/error handlers clean up resources.

 ```python
 def on_open(ws):
@@ -1031,7 +1036,8 @@ def on_close(ws, close_status_code, close_msg):
 ```

 8. **Define `run_streaming()` Function**
-Initializes PyAudio to capture microphone input, establishes a WebSocket connection with the optimized configuration parameters, and streams audio in real-time while displaying transcription results until the user stops with Ctrl+C.
+
+This function initializes PyAudio to capture microphone input, establishes a WebSocket connection with the optimized configuration parameters, and streams audio in real-time while displaying transcription results until the user stops with Ctrl+C.

 ```python
 def run_streaming(config):
@@ -1133,7 +1139,8 @@ def run_streaming(config):
 ```

 9. **Define `main()` Workflow**
-Executes the three-step process: analyze all audio files in the folder, determine the best streaming configuration based on aggregated utterance gaps, then launch real-time streaming with the optimized settings.
+
+Execute the three-step process: analyze all audio files in the folder, determine the best streaming configuration based on aggregated utterance gaps, then launch real-time streaming with the optimized settings.

 ```python
 def main():
