feat(assemblyai): add language code streaming option #1908

Open
rosetta-livekit-bot[bot] wants to merge 1 commit into main from opioid-might-steams

rosetta-livekit-bot (Bot, Contributor) commented Jun 29, 2026

Summary

  • Add languageCode to AssemblyAI streaming STT options for language steering
  • Normalize the option to a base language code before connecting
  • Send it as the language_code streaming query param only when set
  • Keep it restricted to the Universal-3 Pro model family and connect-time only
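The normalize-then-send-only-when-set behavior in the summary can be sketched roughly as follows. This is a minimal illustration, not the plugin's code: `baseLanguage` is a hypothetical stand-in for the `getBaseLanguage` helper (assumed here to keep only the primary subtag), and `StreamingOptions` and `buildStreamingQuery` are names invented for the example.

```typescript
// Hypothetical stand-in for getBaseLanguage. Assumption: it keeps only the
// primary language subtag, so 'en-US' becomes 'en'.
function baseLanguage(code: string): string {
  return code.trim().split(/[-_]/)[0]!.toLowerCase();
}

// Illustrative option shape; the real STT options carry many more fields.
interface StreamingOptions {
  sampleRate: number;
  languageCode?: string;
}

// Build the connect-time query string. language_code is added only when the
// caller supplied a language, so the default request is unchanged.
function buildStreamingQuery(opts: StreamingOptions): string {
  const params = new URLSearchParams({ sample_rate: String(opts.sampleRate) });
  if (opts.languageCode !== undefined) {
    params.set('language_code', baseLanguage(opts.languageCode));
  }
  return params.toString();
}
```

Leaving the parameter out entirely when it is unset, rather than sending an empty value, keeps existing connections byte-for-byte the same as before the change.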

Testing

  • pnpm --filter @livekit/agents-plugin-assemblyai lint
  • pnpm build:agents
  • pnpm --filter @livekit/agents-plugin-silero build
  • pnpm --filter @livekit/agents-plugins-test build
  • pnpm --filter @livekit/agents-plugin-assemblyai build
  • pnpm test plugins/assemblyai/src/stt.test.ts

Ported from livekit/agents#6219

Original PR description

Summary

Adds a language_code connect-time parameter to the AssemblyAI STT plugin so users can steer transcription toward a specific language (e.g. "en", "es", "fr") instead of relying on automatic detection / code-switching.

Today the plugin only exposes language_detection, which is an output toggle (whether language_code/language_confidence are returned on turn messages) — there is no way to steer the model toward a language. The AssemblyAI streaming API already accepts language_code as a connect-time parameter, so this just plumbs it through.

This is useful for known-monolingual sessions, where steering improves accuracy on short/ambiguous utterances (e.g. disambiguating "see" vs. "si").

Details

  • Added language_code: NotGivenOr[str] to STTOptions and the STT.__init__ signature, forwarded into the connect-time live_config querystring (omitted when unset).
  • Gated to the u3-rt-pro family (u3-rt-pro, u3-rt-pro-beta-1, universal-3-5-pro) via the existing _U3_PRO_MODELS validation — passing it with another model raises ValueError. Language steering is applied by the u3-pro ASR; on the universal-streaming models language_code does not steer, so this matches the parameter's documented behavior and how mode/voice_focus are handled.
  • Connect-time only, matching the AssemblyAI streaming API — language_code is not part of UpdateConfiguration, so it is not added to update_options.
  • Follows the structure of the recent mode param PR (#6156).
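For readers of the TypeScript port, the model gating described above might look like the sketch below. The model names are taken from the description; the set name `U3_PRO_MODELS`, the function `assertLanguageCodeAllowed`, and the error wording are assumptions made for illustration (the Python original raises `ValueError` from its `_U3_PRO_MODELS` validation).

```typescript
// Model names come from the PR description; the identifiers are hypothetical.
const U3_PRO_MODELS: ReadonlySet<string> = new Set([
  'u3-rt-pro',
  'u3-rt-pro-beta-1',
  'universal-3-5-pro',
]);

// Reject languageCode on models where it would not steer transcription.
// An unset languageCode is always allowed, whatever the model.
function assertLanguageCodeAllowed(model: string, languageCode?: string): void {
  if (languageCode !== undefined && !U3_PRO_MODELS.has(model)) {
    const supported = Array.from(U3_PRO_MODELS).join(', ');
    throw new Error(
      `languageCode is only supported on ${supported}; got model '${model}'`,
    );
  }
}
```

Failing at construction time, instead of silently dropping the option, makes a misconfigured session visible before any audio is streamed.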

Test plan

Added unit tests in tests/test_plugin_assemblyai_stt.py:

  • default is NOT_GIVEN
  • value is stored from the constructor
  • raises ValueError on a non-u3-rt-pro-family model
  • accepted across every u3-rt-pro-family model
  • present in the connect config querystring when set
  • absent from the querystring when unset
  • not exposed via update_options (connect-time only)
65 passed   # full tests/test_plugin_assemblyai_stt.py

ruff check and ruff format pass on both changed files.

changeset-bot (Bot) commented Jun 29, 2026

🦋 Changeset detected

Latest commit: f722850

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 35 packages
Name Type
@livekit/agents-plugin-assemblyai Patch
@livekit/agents Patch
@livekit/agents-plugin-anam Patch
@livekit/agents-plugin-baseten Patch
@livekit/agents-plugin-bey Patch
@livekit/agents-plugin-cartesia Patch
@livekit/agents-plugin-cerebras Patch
@livekit/agents-plugin-deepgram Patch
@livekit/agents-plugin-did Patch
@livekit/agents-plugin-elevenlabs Patch
@livekit/agents-plugin-fishaudio Patch
@livekit/agents-plugin-google Patch
@livekit/agents-plugin-hedra Patch
@livekit/agents-plugin-hume Patch
@livekit/agents-plugin-inworld Patch
@livekit/agents-plugin-lemonslice Patch
@livekit/agents-plugin-liveavatar Patch
@livekit/agents-plugin-livekit Patch
@livekit/agents-plugin-minimax Patch
@livekit/agents-plugin-mistral Patch
@livekit/agents-plugin-mistralai Patch
@livekit/agents-plugin-neuphonic Patch
@livekit/agents-plugin-openai Patch
@livekit/agents-plugin-perplexity Patch
@livekit/agents-plugin-phonic Patch
@livekit/agents-plugin-resemble Patch
@livekit/agents-plugin-rime Patch
@livekit/agents-plugin-runway Patch
@livekit/agents-plugin-sarvam Patch
@livekit/agents-plugin-silero Patch
@livekit/agents-plugin-soniox Patch
@livekit/agents-plugin-tavus Patch
@livekit/agents-plugin-trugen Patch
@livekit/agents-plugin-xai Patch
@livekit/agents-plugins-test Patch

devin-ai-integration (Bot, Contributor) left a comment

Devin Review found 2 potential issues.

🟡 Language code normalization is skipped when options are updated after construction

The language code is stored without normalization ({ ...this.#opts, ...opts } at plugins/assemblyai/src/stt.ts:189) when updated after initial setup, so a value like 'en-US' is sent raw to AssemblyAI instead of being reduced to 'en'.

Impact: Future streaming connections after an option update may send an un-normalized language code to AssemblyAI, potentially causing rejected or misinterpreted language steering.

Constructor normalizes but updateOptions does not

In the constructor at plugins/assemblyai/src/stt.ts:171-172, languageCode is normalized via getBaseLanguage() and then explicitly set after the spread at line 179, ensuring the normalized value wins. However, STT.updateOptions at lines 188-189 simply spreads the raw opts over this.#opts without applying getBaseLanguage(). This means:

  1. User calls stt.updateOptions({ languageCode: 'en-US' })
  2. this.#opts.languageCode is set to 'en-US' (raw, un-normalized)
  3. Next stt.stream() call creates a SpeechStream with these opts
  4. #connectWS() at line 335 sends language_code: 'en-US' to AssemblyAI instead of 'en'

The fix should apply getBaseLanguage() to opts.languageCode in updateOptions before merging, consistent with the constructor logic.

(Refers to lines 188-189)
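One way the suggested fix could look is sketched below. It is not the code in stt.ts: `SketchSTT`, its reduced `STTOptions` shape, and `baseLanguage` (a hypothetical stand-in for `getBaseLanguage`, assumed to keep only the primary subtag) are invented for illustration, and a plain `private` field is used in place of `#opts`.

```typescript
// Reduced, illustrative option shape.
interface STTOptions {
  model: string;
  languageCode?: string;
}

// Hypothetical stand-in for getBaseLanguage ('en-US' -> 'en').
function baseLanguage(code: string): string {
  return code.trim().split(/[-_]/)[0]!.toLowerCase();
}

class SketchSTT {
  private opts: STTOptions;

  constructor(opts: STTOptions) {
    // Normalized value is set after the spread so it wins over the raw one.
    this.opts = {
      ...opts,
      languageCode:
        opts.languageCode !== undefined ? baseLanguage(opts.languageCode) : undefined,
    };
  }

  updateOptions(opts: Partial<STTOptions>): void {
    const next: STTOptions = { ...this.opts, ...opts };
    // Apply the same normalization as the constructor before storing.
    if (opts.languageCode !== undefined) {
      next.languageCode = baseLanguage(opts.languageCode);
    }
    this.opts = next;
  }

  get options(): Readonly<STTOptions> {
    return this.opts;
  }
}
```

With this shape, a later connection reads the same normalized value whether the language was supplied at construction or through an update.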

Comment on lines +171 to +172
const languageCode =
  opts.languageCode !== undefined ? getBaseLanguage(opts.languageCode) : undefined;

🚩 getBaseLanguage strips region subtags, which may be intentional for AssemblyAI but lossy

The constructor normalizes languageCode via getBaseLanguage() at plugins/assemblyai/src/stt.ts:171-172, which strips region subtags (e.g. 'en-US' → 'en', 'pt-BR' → 'pt'). This is consistent with how other plugins (ElevenLabs, Cartesia) use getBaseLanguage. However, if AssemblyAI's API supports or benefits from region-level language codes (e.g. distinguishing pt-BR from pt-PT), this normalization would silently drop useful information. Worth confirming with AssemblyAI's API docs whether region subtags are supported for the language_code parameter.
