
Sync with upstream#6

Open
AIWintermuteAI wants to merge 199 commits into AIWintermuteAI:main from collabora:main

Conversation

@AIWintermuteAI
Owner

No description provided.

fraic and others added 30 commits March 25, 2024 19:11
Improve cpu and gpu Dockerfiles, resulting in much smaller images
Add option: save network stream to local file while transcribing
fix: limit CPU usage for VAD onnxruntime inference session by setting OMP_NUM_THREADS

Signed-off-by: makaveli10 <vineet.suryan@collabora.com>
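A minimal sketch of that kind of thread cap, assuming the variable is set before onnxruntime (and its OpenMP runtime) is first imported; the value shown is illustrative, not the one chosen in the PR:

```python
import os

# Cap OpenMP worker threads for the VAD ONNX inference session.
# This must happen before onnxruntime is imported, because the
# OpenMP runtime reads OMP_NUM_THREADS once at startup.
os.environ["OMP_NUM_THREADS"] = "1"
```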
Signed-off-by: makaveli10 <suryanvineet47@gmail.com>
Make writing audio frames optional
- Use a threadlock around the model in single model mode
Signed-off-by: makaveli10 <vineet.suryan@collabora.com>
Expose the srt file location of Transcription client
fix spelling of detection in README.md.
aaron-boxer and others added 30 commits February 5, 2026 22:31
README.md: add instructions for running client
Add `--enable-timestamps` option to `run_client.py` script to print transcribed text with timestamps.

Sample output with translation enabled:
```
[0.000 -> 7.440]  And so, my fellow Americans, ask not what your country can do for you.
[7.440 -> 10.300]  Ask what you can do for your country.

TRANSLATION to fr:
[0.000 -> 7.440] Et donc, mes camarades américains, ne demandez pas ce que votre pays peut faire pour vous.
[7.440 -> 10.300] Demandez ce que vous pouvez faire pour votre pays.
```
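The bracketed timestamp lines above could come from a small formatter along these lines (the helper name is hypothetical, not taken from the PR):

```python
def format_segment(start: float, end: float, text: str) -> str:
    # Render one transcript segment as "[start -> end] text",
    # matching the sample output shown above.
    return f"[{start:.3f} -> {end:.3f}] {text.strip()}"
```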

Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>
Enable timestamps for transcribed text
feat: update to support faster whisper 1.2.0
Resolves pkg_resources missing during wheel build

Signed-off-by: makaveli10 <vineet.suryan@collabora.com>
Bump openai-whisper version to 20250625.
Replace hardcoded [-4:] truncation with a configurable display_segments
parameter (default: 4) in both Client and TranscriptionClient classes.

Fixes #377
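The change described above can be sketched as follows (function name is illustrative; the PR adds the parameter to the Client and TranscriptionClient classes):

```python
def tail_segments(segments, display_segments=4):
    # Show only the most recent N transcript segments, generalizing
    # the previously hardcoded segments[-4:] truncation.
    return segments[-display_segments:]
```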
Add cross-client GPU batch inference for faster_whisper backend
When VAD removes all speech from an audio chunk, transcriber.transcribe() returns (None, info). Calling list(None) raises TypeError. The _process_multi path already handles this case; this aligns _process_single to match.
Fix NoneType crash in _process_single when VAD filters all audio
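The guard described above amounts to checking for None before materializing the segment iterator; a sketch with assumed names:

```python
def transcribe_chunk(transcriber, audio):
    # When VAD strips all speech, transcribe() may return (None, info);
    # calling list(None) would raise TypeError, so guard first.
    result, info = transcriber.transcribe(audio)
    segments = list(result) if result is not None else []
    return segments, info
```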
feat: make display_segments configurable in Client/TranscriptionClient
Signed-off-by: makaveli10 <vineet.suryan@collabora.com>
Expose __version__ in package root and update dependencies in setup.py
Fix crash when no --files provided; use microphone input instead
fix: render transcript text safely in browser extensions
These new tests cover issues such as thread safety, VAD thresholding,
message routing, and error handling that weren't covered by existing
tests. Mocking is used to avoid dependencies on GPU, ONNX, etc.
- All ClientManager methods (add_client, get_client, remove_client,
  get_wait_time, is_server_full, is_client_timeout) now protected by
  a threading.Lock
- cleanup() called outside the lock to avoid holding it during I/O
- is_server_full() computes wait time inline under lock instead of
  calling get_wait_time() to avoid nested lock acquisition
- Added concurrent thread safety tests for add/remove and get operations
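The locking scheme in the bullets above can be sketched as follows. Attribute and method names are assumed for illustration; only the pattern (one non-reentrant lock, cleanup outside it, no nested acquisition) reflects the change:

```python
import threading
import time

class ClientManager:
    """Minimal sketch of the described locking scheme."""

    def __init__(self, max_clients=4):
        self.clients = {}
        self.start_times = {}
        self.max_clients = max_clients
        self._lock = threading.Lock()  # non-reentrant, so never nest it

    def add_client(self, websocket, client):
        with self._lock:
            self.clients[websocket] = client
            self.start_times[websocket] = time.time()

    def is_server_full(self):
        # Compute state inline under the lock instead of calling another
        # locked helper, which would deadlock on a non-reentrant Lock.
        with self._lock:
            return len(self.clients) >= self.max_clients

    def remove_client(self, websocket):
        with self._lock:
            client = self.clients.pop(websocket, None)
            self.start_times.pop(websocket, None)
        # cleanup() runs outside the lock so slow I/O doesn't block
        # other threads touching the manager.
        if client is not None:
            client.cleanup()
```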
CI: expand test suite coverage
audio: add support for raw pcm input via server flag
Add thread safety to client manager with threading lock