feat(assemblyai): add language code streaming option #1908
base: main
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,5 @@ | ||
| --- | ||
| '@livekit/agents-plugin-assemblyai': patch | ||
| --- | ||
|
|
||
| Add AssemblyAI streaming language steering via the `languageCode` option. |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -10,6 +10,7 @@ import { | |
| Task, | ||
| createTimedString, | ||
| delay, | ||
| getBaseLanguage, | ||
| log, | ||
| normalizeLanguage, | ||
| stt, | ||
|
|
@@ -66,6 +67,8 @@ export interface STTOptions { | |
| encoding: STTEncoding; | ||
| speechModel: STTModels; | ||
| languageDetection?: boolean; | ||
| /** Only supported with the Universal-3 Pro model family. Set at connection time only. */ | ||
| languageCode?: string; | ||
| endOfTurnConfidenceThreshold?: number; | ||
| /** Minimum silence (ms) before a confident end-of-turn is finalized. */ | ||
| minTurnSilence?: number; | ||
|
|
@@ -146,6 +149,7 @@ export class STT extends stt.STT { | |
| 'voiceFocus', | ||
| 'voiceFocusThreshold', | ||
| 'mode', | ||
| 'languageCode', | ||
| ] as const) { | ||
| if (opts[param] !== undefined) { | ||
| throw new Error( | ||
|
|
@@ -164,12 +168,15 @@ export class STT extends stt.STT { | |
|
|
||
| // Minimize latency by default, but let AssemblyAI's mode preset control silence tuning. | ||
| const minTurnSilence = opts.minTurnSilence ?? (opts.mode === undefined ? 100 : undefined); | ||
| const languageCode = | ||
| opts.languageCode !== undefined ? getBaseLanguage(opts.languageCode) : undefined; | ||
|
Comment on lines +171 to +172
Contributor
🚩 `getBaseLanguage` strips region subtags, which may be intentional for AssemblyAI but lossy
The constructor normalizes |
||
|
|
||
| this.#opts = { | ||
| ...defaultSTTOptions, | ||
| ...opts, | ||
| apiKey, | ||
| minTurnSilence, | ||
| languageCode, | ||
| }; | ||
| } | ||
|
|
||
|
|
@@ -325,6 +332,7 @@ export class SpeechStream extends stt.SpeechStream { | |
| ? JSON.stringify(this.#opts.keytermsPrompt) | ||
| : undefined, | ||
| language_detection: languageDetection, | ||
| language_code: this.#opts.languageCode, | ||
| prompt: this.#opts.prompt, | ||
| agent_context: this.#opts.agentContext, | ||
| previous_context_n_turns: this.#opts.previousContextNTurns, | ||
|
|
||
🟡 Language code normalization is skipped when options are updated after construction
The language code is stored without normalization (`{ ...this.#opts, ...opts }` at `plugins/assemblyai/src/stt.ts:189`) when updated after initial setup, so a value like `'en-US'` is sent raw to AssemblyAI instead of being reduced to `'en'`.

Impact: Future streaming connections after an option update may send an un-normalized language code to AssemblyAI, potentially causing rejected or misinterpreted language steering.

Constructor normalizes but updateOptions does not

In the constructor at `plugins/assemblyai/src/stt.ts:171-172`, `languageCode` is normalized via `getBaseLanguage()` and then explicitly set after the spread at line 179, ensuring the normalized value wins. However, `STT.updateOptions` at lines 188-189 simply spreads the raw `opts` over `this.#opts` without applying `getBaseLanguage()`. This means:

1. `stt.updateOptions({ languageCode: 'en-US' })`
2. `this.#opts.languageCode` is set to `'en-US'` (raw, un-normalized)
3. `stt.stream()` call creates a `SpeechStream` with these opts
4. `#connectWS()` at line 335 sends `language_code: 'en-US'` to AssemblyAI instead of `'en'`

The fix should apply `getBaseLanguage()` to `opts.languageCode` in `updateOptions` before merging, consistent with the constructor logic.

(Refers to lines 188-189)
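
A minimal sketch of what that merge could look like, written as a standalone function so it can be run in isolation. `baseLanguage`, `SketchOptions`, and `mergeOptions` are hypothetical names for illustration only: `baseLanguage` is a simplified stand-in for `getBaseLanguage` (the real helper may handle more cases), and the actual `updateOptions` signature in the plugin may differ.

```typescript
// Hypothetical stand-in for getBaseLanguage from @livekit/agents: keeps only
// the primary subtag ('en-US' -> 'en'). The real helper may behave differently.
function baseLanguage(code: string): string {
  const [primary = code] = code.split(/[-_]/);
  return primary.toLowerCase();
}

// Hypothetical, trimmed-down options shape; the real STTOptions has more fields.
interface SketchOptions {
  languageCode?: string;
  prompt?: string;
}

// Merge an update into the current options, normalizing languageCode the same
// way the constructor does. languageCode is only touched when the update sets it.
function mergeOptions(
  current: SketchOptions,
  update: Partial<SketchOptions>,
): SketchOptions {
  const merged: SketchOptions = { ...current, ...update };
  if (update.languageCode !== undefined) {
    merged.languageCode = baseLanguage(update.languageCode);
  }
  return merged;
}

const updated = mergeOptions({ languageCode: 'en', prompt: 'hi' }, { languageCode: 'es-MX' });
console.log(updated.languageCode); // prints "es"
```

Assigning the normalized value after the spread mirrors the constructor, where the explicit `languageCode` entry follows `...opts` so the normalized value wins.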