Skip to content

ECHO-651 ECHO-745 Fix audio pipeline issues (#543)#544

Merged
ussaama merged 1 commit into
mainfrom
testing
Apr 16, 2026
Merged

ECHO-651 ECHO-745 Fix audio pipeline issues (#543)#544
ussaama merged 1 commit into
mainfrom
testing

Conversation

@ussaama
Copy link
Copy Markdown
Contributor

@ussaama ussaama commented Apr 16, 2026

Summary by CodeRabbit

  • Bug Fixes

    • Improved audio file processing with enhanced error detection.
    • Fixed audio duration tracking accuracy for multi-file uploads.
  • Refactor

    • Optimized conversation handling to properly track multiple file uploads.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Apr 16, 2026

Walkthrough

Frontend API refactored to track multiple per-file conversations via a Map instead of single conversation ID. Backend audio utilities switched from pipe-based to disk-backed temporary files and added duration probing directly from merge results. Conversation API updated to retrieve durations from merge operations or Directus instead of separate S3 calls. Tests updated to handle new tuple return type.

Changes

Cohort / File(s) Summary
Frontend conversation tracking
echo/frontend/src/lib/api.ts
Updated initiateAndUploadConversationChunk to manage multiple conversation IDs per file using Map<number, string>, with per-conversation finishConversation() calls via Promise.all after uploads complete.
Backend duration handling
echo/server/dembrane/api/conversation.py
Removed get_duration_from_s3 dependency; now derives duration from merge_multiple_audio_files_and_save_to_s3 return value in get_conversation_content, and from Directus query in retranscribe_conversation.
Audio processing refactor
echo/server/dembrane/audio_utils.py
Replaced pipe-based ffmpeg I/O with disk-backed temp directories across convert_and_save_to_s3, merge_multiple_audio_files_and_save_to_s3, and split_audio_chunk. Changed merge return type to tuple[str, float] with duration probing. Added probe_from_file() function for local ffprobe queries. Enhanced ffmpeg error reporting via stderr inspection.
Test updates
echo/server/tests/test_audio_utils.py
Updated merge-related test calls to unpack duration tuple and added duration > 0 assertions. Reformatted assertion message formatting.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Possibly related PRs

  • add conversation source when retranscribing #233: Modifies retranscribe_conversation in the same conversation.py file, handling merged-audio duration and conversation metadata/links alongside this PR's duration-fetching refactor.
  • ETL hot fixes #232: Updates conversation finish logic to call finish_conversation per-conversation, aligning with the frontend's new per-file conversation tracking pattern.

Suggested labels

bug, improvement

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 35.71% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title directly references the specific issues being addressed (ECHO-651, ECHO-745) and accurately describes the changeset's primary objective of fixing audio pipeline issues.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch testing

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot added bug Something isn't working improvement labels Apr 16, 2026
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
echo/server/dembrane/audio_utils.py (1)

340-341: ⚠️ Potential issue | 🟡 Minor

total_size_mb is dead code - always logs 0.

The variable is initialized but never updated. The log at line 449 will always report Total input size: 0.0MB. Ship it or fix it, but definitely don't leave stale telemetry.

🛠️ Proposed fix

Either track the size properly:

     total_size_mb = 0
+    for i_name in input_file_names:
+        try:
+            response = s3_client.head_object(
+                Bucket=STORAGE_S3_BUCKET, Key=get_sanitized_s3_key(i_name)
+            )
+            total_size_mb += response["ContentLength"] / (1024 * 1024)
+        except Exception:
+            pass

Or remove the misleading log:

     logger.info(
-        f"Completed merging {len(input_file_names)} files in {elapsed:.2f}s. "
-        f"Total input size: {total_size_mb:.1f}MB, duration: {audio_duration:.1f}s"
+        f"Completed merging {len(input_file_names)} files in {elapsed:.2f}s. "
+        f"Duration: {audio_duration:.1f}s"
     )

Also applies to: 448-449

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@echo/server/dembrane/audio_utils.py` around lines 340 - 341, The variable
total_size_mb is initialized but never updated, so the "Total input size" log
always prints 0.0MB; either remove that misleading log or update total_size_mb
as you process inputs (e.g., inside the loop that builds/reads chunks or files,
add bytes/ (1024*1024) to total_size_mb or sum os.path.getsize(file) for each
input) and then log the computed value. Update references to total_size_mb (and
the log call that prints it) so the metric reflects the actual accumulated input
size.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Outside diff comments:
In `@echo/server/dembrane/audio_utils.py`:
- Around line 340-341: The variable total_size_mb is initialized but never
updated, so the "Total input size" log always prints 0.0MB; either remove that
misleading log or update total_size_mb as you process inputs (e.g., inside the
loop that builds/reads chunks or files, add bytes/ (1024*1024) to total_size_mb
or sum os.path.getsize(file) for each input) and then log the computed value.
Update references to total_size_mb (and the log call that prints it) so the
metric reflects the actual accumulated input size.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: f2892854-7104-4f8c-87a7-8f7dac6fdb34

📥 Commits

Reviewing files that changed from the base of the PR and between 3d0dd57 and 92da09d.

📒 Files selected for processing (4)
  • echo/frontend/src/lib/api.ts
  • echo/server/dembrane/api/conversation.py
  • echo/server/dembrane/audio_utils.py
  • echo/server/tests/test_audio_utils.py

@ussaama ussaama requested a review from spashii April 16, 2026 15:53
@ussaama ussaama added this pull request to the merge queue Apr 16, 2026
Merged via the queue into main with commit d612e19 Apr 16, 2026
22 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working improvement

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants