feat(assemblyai): add language_code streaming param for language steering #6219

Open
gsharp-aai wants to merge 2 commits into livekit:main from gsharp-aai:gsharp/assemblyai-language-code

gsharp-aai (Contributor) commented Jun 24, 2026

Summary

Adds a language_code connect-time parameter to the AssemblyAI STT plugin so users can steer transcription toward a specific language (e.g. "en", "es", "fr") instead of relying on automatic detection / code-switching.

Today the plugin only exposes language_detection, which is an output toggle (whether language_code/language_confidence are returned on turn messages) — there is no way to steer the model toward a language. The AssemblyAI streaming API already accepts language_code as a connect-time parameter, so this just plumbs it through.

This is useful for known-monolingual sessions, where steering improves accuracy on short/ambiguous utterances (e.g. disambiguating "see" vs. "si").

Details

  • Added language_code: NotGivenOr[str] to STTOptions and the STT.__init__ signature, forwarded into the connect-time live_config querystring (omitted when unset).
  • Gated to the u3-rt-pro family (u3-rt-pro, u3-rt-pro-beta-1, universal-3-5-pro) via the existing _U3_PRO_MODELS validation — passing it with another model raises ValueError. Language steering is applied by the u3-pro ASR; on the universal-streaming models language_code does not steer, so this matches the parameter's documented behavior and how mode/voice_focus are handled.
  • Connect-time only, matching the AssemblyAI streaming API — language_code is not part of UpdateConfiguration, so it is not added to update_options.
  • Follows the structure of the recent mode param PR (feat(assemblyai): add streaming mode (latency/accuracy preset) param #6156).
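
The plumbing described above can be sketched roughly as follows. This is a minimal, standalone illustration and not the plugin's actual code: `SketchOptions`, `build_query`, the local `NOT_GIVEN` sentinel, and the `speech_model` query key are hypothetical stand-ins, and only the model names and the `language_code` parameter come from this PR.

```python
from __future__ import annotations

from dataclasses import dataclass
from urllib.parse import urlencode


class _NotGiven:
    """Hypothetical stand-in for the plugin's NOT_GIVEN sentinel."""

    def __repr__(self) -> str:
        return "NOT_GIVEN"


NOT_GIVEN = _NotGiven()

# Model names as listed in the PR description (mirrors _U3_PRO_MODELS).
U3_PRO_MODELS = frozenset({"u3-rt-pro", "u3-rt-pro-beta-1", "universal-3-5-pro"})


@dataclass
class SketchOptions:
    model: str
    language_code: str | _NotGiven = NOT_GIVEN

    def __post_init__(self) -> None:
        # Gate: language_code is only accepted for the u3-rt-pro family.
        given = not isinstance(self.language_code, _NotGiven)
        if given and self.model not in U3_PRO_MODELS:
            raise ValueError(
                f"language_code is not supported for model {self.model!r}"
            )


def build_query(opts: SketchOptions) -> str:
    """Build a connect-time querystring, omitting language_code when unset."""
    # "speech_model" is an assumed key name, used here for illustration only.
    params: dict[str, str] = {"speech_model": opts.model}
    if not isinstance(opts.language_code, _NotGiven):
        params["language_code"] = opts.language_code
    return urlencode(params)
```

With this sketch, `build_query(SketchOptions(model="universal-3-5-pro", language_code="es"))` includes `language_code=es`, while constructing `SketchOptions` with a model outside the family and a `language_code` raises `ValueError`.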

Test plan

Added unit tests in tests/test_plugin_assemblyai_stt.py:

  • default is NOT_GIVEN
  • value is stored from the constructor
  • raises ValueError on a non-u3-rt-pro-family model
  • accepted across every u3-rt-pro-family model
  • present in the connect config querystring when set
  • absent from the querystring when unset
  • not exposed via update_options (connect-time only)
65 passed   # full tests/test_plugin_assemblyai_stt.py

ruff check and ruff format pass on both changed files.

@gsharp-aai gsharp-aai requested a review from a team as a code owner June 24, 2026 23:22
devin-ai-integration[bot]

This comment was marked as resolved.

@gsharp-aai gsharp-aai force-pushed the gsharp/assemblyai-language-code branch from 0d6f56c to b9de240 Compare June 24, 2026 23:27
devin-ai-integration[bot]

This comment was marked as resolved.

Comment thread livekit-plugins/livekit-plugins-assemblyai/livekit/plugins/assemblyai/stt.py Outdated
Comment thread livekit-plugins/livekit-plugins-assemblyai/livekit/plugins/assemblyai/stt.py Outdated
…ring

Adds a language_code connect-time parameter to the AssemblyAI STT plugin,
steering transcription toward a specific language (e.g. 'en', 'es') instead of
automatic detection/code-switching. The plugin previously only exposed
language_detection (an output toggle), with no way to steer language, even
though the AssemblyAI streaming API accepts language_code at connect time.

Language steering is applied by the u3-pro ASR, so language_code is gated to
the u3-rt-pro family (u3-rt-pro, u3-rt-pro-beta-1, universal-3-5-pro) via the
existing _U3_PRO_MODELS validation, matching how mode and voice_focus are
handled. Connect-time only, matching the AssemblyAI streaming API, where
language_code is not part of UpdateConfiguration.
…O 639-1 normalization

- Type language_code as LanguageCode (accepting str input) so common
  formats ('en-US', 'english') are normalized to a bare ISO 639-1 code
  before being sent, matching the language steering expectation.
- Use the "Universal-3 Pro family" umbrella term in the docstring.
- Add a normalization test covering region/name/ISO input forms.
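
The normalization described in this commit can be illustrated with a small standalone function. This is a sketch under stated assumptions and not the `LanguageCode` implementation: `normalize_language` and the tiny `_NAME_TO_ISO` table are hypothetical, and a real implementation would cover many more languages and validate codes against ISO 639-1.

```python
# Hypothetical lookup table, limited to the languages mentioned in this PR.
_NAME_TO_ISO = {"english": "en", "spanish": "es", "french": "fr"}


def normalize_language(value: str) -> str:
    """Reduce inputs like 'en-US' or 'english' to a bare two-letter code."""
    cleaned = value.strip().lower().replace("_", "-")
    # Full language names map directly to their code.
    if cleaned in _NAME_TO_ISO:
        return _NAME_TO_ISO[cleaned]
    # Region-tagged forms ('en-us') keep only the primary subtag.
    primary = cleaned.split("-", 1)[0]
    if len(primary) == 2 and primary.isascii() and primary.isalpha():
        return primary
    raise ValueError(f"unrecognized language: {value!r}")
```

Note that this sketch only checks the shape of the code (two ASCII letters); it does not confirm that the code is an assigned ISO 639-1 value.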
@gsharp-aai gsharp-aai force-pushed the gsharp/assemblyai-language-code branch from b9de240 to f75c01f Compare June 25, 2026 19:02
"universal-3-5-pro",
] = "universal-3-5-pro",
language_detection: NotGivenOr[bool] = NOT_GIVEN,
language_code: NotGivenOr[LanguageCode | str] = NOT_GIVEN,


I think we can just accept a str here, so we don't need to normalize it later like this: LanguageCode(LanguageCode(language_code).language)

Suggested change
- language_code: NotGivenOr[LanguageCode | str] = NOT_GIVEN,
+ language_code: NotGivenOr[str] = NOT_GIVEN,
