feat(assemblyai): add language code streaming option#1908
rosetta-livekit-bot[bot] wants to merge 1 commit into
🦋 Changeset detected
Latest commit: f722850
The changes in this PR will be included in the next version bump. This PR includes changesets to release 35 packages.
🟡 Language code normalization is skipped when options are updated after construction
The language code is stored without normalization ({ ...this.#opts, ...opts } at plugins/assemblyai/src/stt.ts:189) when updated after initial setup, so a value like 'en-US' is sent raw to AssemblyAI instead of being reduced to 'en'.
Impact: Future streaming connections after an option update may send an un-normalized language code to AssemblyAI, potentially causing rejected or misinterpreted language steering.
Constructor normalizes but updateOptions does not
In the constructor at plugins/assemblyai/src/stt.ts:171-172, languageCode is normalized via getBaseLanguage() and then explicitly set after the spread at line 179, ensuring the normalized value wins. However, STT.updateOptions at lines 188-189 simply spreads the raw opts over this.#opts without applying getBaseLanguage(). This means:
- User calls stt.updateOptions({ languageCode: 'en-US' })
- this.#opts.languageCode is set to 'en-US' (raw, un-normalized)
- Next stt.stream() call creates a SpeechStream with these opts
- #connectWS() at line 335 sends language_code: 'en-US' to AssemblyAI instead of 'en'
The fix should apply getBaseLanguage() to opts.languageCode in updateOptions before merging, consistent with the constructor logic.
(Refers to lines 188-189)
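A minimal sketch of the suggested fix, assuming a single normalization helper shared by both code paths. The names SketchSTT, STTOptions, normalizeOptions and the sampleRate field are hypothetical, and getBaseLanguage here is a simplified stand-in for the real helper, which may behave differently. A plain private field is used in place of this.#opts to keep the sketch portable.

```typescript
// Hypothetical stand-in for getBaseLanguage(): keep only the primary subtag,
// so 'en-US' becomes 'en'. The real helper may differ.
function getBaseLanguage(code: string): string {
  return code.replace(/[-_].*$/, '').toLowerCase();
}

// Illustrative option shape; sampleRate is an assumed extra field.
interface STTOptions {
  languageCode?: string;
  sampleRate: number;
}

// One normalization path used by both the constructor and updateOptions.
function normalizeOptions(opts: Partial<STTOptions>): Partial<STTOptions> {
  if (opts.languageCode === undefined) return opts;
  return { ...opts, languageCode: getBaseLanguage(opts.languageCode) };
}

class SketchSTT {
  private opts: STTOptions;

  constructor(opts: Partial<STTOptions> = {}) {
    this.opts = { sampleRate: 16000, ...normalizeOptions(opts) };
  }

  updateOptions(opts: Partial<STTOptions>): void {
    // Normalize before merging so later streams never see a raw 'en-US'.
    this.opts = { ...this.opts, ...normalizeOptions(opts) };
  }

  get languageCode(): string | undefined {
    return this.opts.languageCode;
  }
}

const stt = new SketchSTT({ languageCode: 'en-US' });
stt.updateOptions({ languageCode: 'pt-BR' });
console.log(stt.languageCode); // 'pt'
```

Routing both entry points through one helper means a future option that needs normalization only has to be handled in one place.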
const languageCode =
  opts.languageCode !== undefined ? getBaseLanguage(opts.languageCode) : undefined;
🚩 getBaseLanguage strips region subtags, which may be intentional for AssemblyAI but lossy
The constructor normalizes languageCode via getBaseLanguage() at plugins/assemblyai/src/stt.ts:171-172, which strips region subtags (e.g. 'en-US' → 'en', 'pt-BR' → 'pt'). This is consistent with how other plugins (ElevenLabs, Cartesia) use getBaseLanguage. However, if AssemblyAI's API supports or benefits from region-level language codes (e.g. distinguishing pt-BR from pt-PT), this normalization would silently drop useful information. Worth confirming with AssemblyAI's API docs whether region subtags are supported for the language_code parameter.
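To make the information loss concrete, here is a small sketch that groups codes by their normalized form. primarySubtag is a hypothetical stand-in for getBaseLanguage (the real helper may differ), and groupByBase is an illustrative name, not part of the plugin.

```typescript
// Hypothetical stand-in for getBaseLanguage(): drop everything after the primary subtag.
function primarySubtag(code: string): string {
  return code.replace(/[-_].*$/, '').toLowerCase();
}

// Group input codes by their normalized form to show which distinctions collapse.
function groupByBase(codes: string[]): Record<string, string[]> {
  const groups: Record<string, string[]> = {};
  for (const code of codes) {
    const base = primarySubtag(code);
    groups[base] = (groups[base] || []).concat(code);
  }
  return groups;
}

const grouped = groupByBase(['pt-BR', 'pt-PT', 'en-US', 'es']);
console.log(grouped['pt']); // [ 'pt-BR', 'pt-PT' ]
```

Any group with more than one entry is a pair of inputs the API could no longer tell apart after normalization, which is the case worth checking against AssemblyAI's documentation.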
Summary
- Add languageCode to AssemblyAI streaming STT options for language steering
- Send the language_code streaming query param only when set
Testing
- pnpm --filter @livekit/agents-plugin-assemblyai lint
- pnpm build:agents
- pnpm --filter @livekit/agents-plugin-silero build
- pnpm --filter @livekit/agents-plugins-test build
- pnpm --filter @livekit/agents-plugin-assemblyai build
- pnpm test plugins/assemblyai/src/stt.test.ts
Ported from livekit/agents#6219
Original PR description
Summary
Adds a language_code connect-time parameter to the AssemblyAI STT plugin so users can steer transcription toward a specific language (e.g. "en", "es", "fr") instead of relying on automatic detection / code-switching.
Today the plugin only exposes language_detection, which is an output toggle (whether language_code / language_confidence are returned on turn messages); there is no way to steer the model toward a language. The AssemblyAI streaming API already accepts language_code as a connect-time parameter, so this just plumbs it through.
This is useful for known-monolingual sessions, where steering improves accuracy on short/ambiguous utterances (e.g. disambiguating "see" vs. "si").
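The "only when set" plumbing can be sketched as follows. This is an illustration under assumptions rather than the plugin's code: ConnectOptions, buildQuery and the sample_rate entry are hypothetical, and only language_code comes from the description above.

```typescript
// Illustrative connect-time options; only languageCode is taken from the PR text.
interface ConnectOptions {
  sampleRate: number;
  languageCode?: string;
}

// Build the connect-time query string, skipping parameters that are unset.
function buildQuery(opts: ConnectOptions): string {
  const params: Array<[string, string | undefined]> = [
    ['sample_rate', String(opts.sampleRate)],
    ['language_code', opts.languageCode],
  ];
  return params
    .filter((p): p is [string, string] => p[1] !== undefined)
    .map(([key, value]) => `${encodeURIComponent(key)}=${encodeURIComponent(value)}`)
    .join('&');
}

console.log(buildQuery({ sampleRate: 16000 }));
// sample_rate=16000
console.log(buildQuery({ sampleRate: 16000, languageCode: 'es' }));
// sample_rate=16000&language_code=es
```

Omitting the key entirely, rather than sending an empty value, leaves the server's default behavior untouched for callers who never set a language.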
Details
- Adds language_code: NotGivenOr[str] to STTOptions and the STT.__init__ signature, forwarded into the connect-time live_config querystring (omitted when unset).
- Restricted to the u3-rt-pro family (u3-rt-pro, u3-rt-pro-beta-1, universal-3-5-pro) via the existing _U3_PRO_MODELS validation; passing it with another model raises ValueError. Language steering is applied by the u3-pro ASR; on the universal-streaming models language_code does not steer, so this matches the parameter's documented behavior and how mode / voice_focus are handled.
- language_code is not part of UpdateConfiguration, so it is not added to update_options.
- mode param PR (#6156).
Test plan
Added unit tests in tests/test_plugin_assemblyai_stt.py:
- NOT_GIVEN
- ValueError on a non-u3-rt-pro-family model
- update_options (connect-time only)
ruff check and ruff format pass on both changed files.
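The model restriction described under Details could look roughly like this in TypeScript, written here only to match the language of the port. Whether the JS plugin enforces the same check is not stated in this PR; validateLanguageCode and the 'some-other-model' name are hypothetical, while the three model names are taken from the description above.

```typescript
// Model names from the PR description; the constant mirrors _U3_PRO_MODELS.
const U3_PRO_MODELS: readonly string[] = ['u3-rt-pro', 'u3-rt-pro-beta-1', 'universal-3-5-pro'];

// Reject language_code on models where it would not steer transcription.
function validateLanguageCode(model: string, languageCode?: string): void {
  if (languageCode !== undefined && U3_PRO_MODELS.indexOf(model) === -1) {
    throw new Error(
      `language_code is only supported on ${U3_PRO_MODELS.join(', ')}; got model '${model}'`,
    );
  }
}

validateLanguageCode('u3-rt-pro', 'en'); // accepted
validateLanguageCode('some-other-model'); // accepted: nothing to validate when unset

try {
  validateLanguageCode('some-other-model', 'en'); // hypothetical unsupported model
} catch (err) {
  console.log((err as Error).message);
}
```

Failing at construction time surfaces the misconfiguration before any connection is opened, instead of silently ignoring the parameter.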