Detect and auto switch languages #373

@mbalc

Description

Related to #147

Auto models have the following cons when it comes to real-life multi-language usage:

  • only basic models available
  • models too general - often ( = "in my case" :v ) you only need 2 or 3 specific languages:
    • you can't get Intermediate Results, even if the specific languages you use do support them,
    • at least theoretically, a model specialized in a single language should be more accurate than one generalized over a set of languages you never really use
      • I think it is reasonable to assume that a single listening session will in most cases not mix languages, and if it does, it should be easy to restart listening when you switch
        • this could be less intuitive for some, but I think (at least in my case) it would fit my workflow better overall
  • auto models might be slower, at least that's what was suggested in a comment on #147 (STT language auto-detection)

Feature draft:

  • say you have a single "favorite" model selected for each language that's enabled in the app
  • you have an extra "language recognition" enumeration model
  1. in an "auto" language mode, every time an STT listening session starts and you start speaking, the language recognition model runs first while input is buffered for the STT model
  2. once the language is recognized with enough certainty, the STT model is chosen based on the recognized language and the "favorite" model selected for that language
  3. the audio buffered so far is passed to the STT model
     • here we can probably assume that at most the last 3-5 seconds will be needed, provided the language recognition model is fast and accurate enough
       • (maybe the language recognition model could even point out the exact moment the speech actually starts?)
  4. the rest of the current listening session proceeds as normal
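
The steps above could be sketched roughly as follows. This is only an illustration of the draft, not the app's actual code: `AutoLanguageSession`, the `detect` callable (returning a language code and a confidence between 0 and 1), the `favorites` mapping, and the `threshold` and `max_buffered_chunks` parameters are all hypothetical names. The bounded buffer stands in for "keep only the last 3-5 seconds".

```python
from collections import deque
from typing import Callable, Deque, Dict, List, Optional, Tuple

Chunk = bytes
# Hypothetical interfaces, assumed for this sketch only:
#   detector: buffered audio chunks -> (language code, confidence in [0, 1])
#   STT sink: consumes one audio chunk of the "favorite" model for a language
LangDetector = Callable[[List[Chunk]], Tuple[str, float]]
SttSink = Callable[[Chunk], None]


class AutoLanguageSession:
    """One listening session in "auto" language mode."""

    def __init__(
        self,
        detect: LangDetector,
        favorites: Dict[str, SttSink],
        threshold: float = 0.8,
        max_buffered_chunks: int = 50,
    ) -> None:
        self.detect = detect
        self.favorites = favorites
        self.threshold = threshold
        # Bounded buffer: the oldest chunks are dropped once the limit is hit.
        self.buffer: Deque[Chunk] = deque(maxlen=max_buffered_chunks)
        self.language: Optional[str] = None
        self.model: Optional[SttSink] = None

    def feed(self, chunk: Chunk) -> None:
        if self.model is not None:
            # Step 4: language already chosen, the session proceeds as normal.
            self.model(chunk)
            return

        # Step 1: buffer input while the language recognition model runs.
        self.buffer.append(chunk)
        lang, confidence = self.detect(list(self.buffer))

        # Step 2: pick the "favorite" model once the language is certain enough.
        if confidence >= self.threshold and lang in self.favorites:
            self.language = lang
            self.model = self.favorites[lang]
            # Step 3: pass everything buffered so far to the STT model.
            for buffered in self.buffer:
                self.model(buffered)
            self.buffer.clear()
```

A caller would create one `AutoLanguageSession` per listening session and call `feed()` for every captured audio chunk. Restarting the session after switching languages, as suggested above, simply means creating a new instance.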

Metadata

Assignees

No one assigned

    Labels

    enhancement: New feature or request

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests
