fix(voice): resume speech interrupted before its first frame (#1909) by enriqueespaillat-gyde · Pull Request #1910 · livekit/agents-js

enriqueespaillat-gyde · 2026-06-29T21:18:03Z

Summary

Port of livekit/agents#5039 (Python issue livekit/agents#5038) to agents-js.

When the agent is in the "thinking" state and the user makes a brief sound before the first TTS audio frame is forwarded, the speech is silently dropped: no audio reaches the user and the turn is dropped from conversation history. This happens with resumeFalseInterruption: true (the default).

Closes #1909.

Root cause

Two behaviors combine in agents/src/voice/:

- agent_activity.ts → onStartOfSpeech pauses the current speech when the agent is not yet "speaking" (thinking state). This pause is intentional: it stops the agent from talking over a user who barges in during the thinking state, and this PR preserves it.
- generation.ts → forwardAudio registered its PLAYBACK_STARTED listener inside the forwarding task and, in its finally block, rejected firstFrameFut whenever no frame had played. The thinking-state pause buffers the (short) TTS frames, so the forwarding task finishes before playback starts; firstFrameFut is then rejected and the listener removed. When the false interruption clears and the output resumes, the buffered first frame plays but nothing is listening, so the future stays rejected. The reply tasks gate transcript preservation on firstFrameFut.done && !firstFrameFut.rejected, so the resumed turn is blanked from history even though audio reached the user.
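The sequence described above can be reproduced in a few lines. This is a simplified sketch under stated assumptions, not the code in generation.ts: SketchFuture, SketchOutput, and forwardAudioBeforeFix are hypothetical stand-ins, and only firstFrameFut, PLAYBACK_STARTED, and the done && !rejected gate are taken from the description.

```typescript
// Hypothetical stand-in for the Future in agents/src/utils.ts (state flags only).
class SketchFuture {
  done = false;
  rejected = false;
  resolve(): void {
    if (this.done) return;
    this.done = true;
  }
  reject(): void {
    if (this.done) return;
    this.done = true;
    this.rejected = true;
  }
}

// Hypothetical pausable output: frames are buffered while paused, and the
// PLAYBACK_STARTED listeners fire when the first buffered frame actually plays.
class SketchOutput {
  paused = false;
  private buffered = 0;
  private started = false;
  private listeners: Array<() => void> = [];

  onPlaybackStarted(l: () => void): void {
    this.listeners.push(l);
  }
  offPlaybackStarted(l: () => void): void {
    this.listeners = this.listeners.filter((x) => x !== l);
  }
  captureFrame(): void {
    this.buffered += 1;
    if (!this.paused) this.flush();
  }
  resume(): void {
    this.paused = false;
    this.flush();
  }
  private flush(): void {
    if (this.buffered > 0 && !this.started) {
      this.started = true;
      this.listeners.forEach((l) => l());
    }
    this.buffered = 0;
  }
}

// Pre-fix shape (assumed): the listener lives and dies with the forwarding task.
function forwardAudioBeforeFix(out: SketchOutput, frames: number, fut: SketchFuture): void {
  const onStarted = (): void => fut.resolve();
  out.onPlaybackStarted(onStarted);
  try {
    for (let i = 0; i < frames; i++) out.captureFrame();
  } finally {
    out.offPlaybackStarted(onStarted);
    if (!fut.done) fut.reject(); // no frame has played yet
  }
}

const output = new SketchOutput();
const firstFrameFut = new SketchFuture();
output.paused = true; // thinking-state pause from onStartOfSpeech
forwardAudioBeforeFix(output, 3, firstFrameFut);
output.resume(); // false interruption clears: audio plays, nobody is listening

// The reply-task gate: false here, so the turn is blanked from history.
const transcriptKept = firstFrameFut.done && !firstFrameFut.rejected;
```

The point of the sketch is the ordering: the forwarding task completes while the output is still paused, so the rejection is recorded before the only event that could have resolved the future.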

Fix

Mirrors the spirit of #5039 — make the pre-first-frame pause recoverable while keeping the thinking-state pause:

- Move the PLAYBACK_STARTED listener from forwardAudio to performAudioForwarding so it outlives the forwarding task; a late first frame (e.g. after a resumeFalseInterruption resume) can still resolve firstFrameFut.
- Stop rejecting firstFrameFut in forwardAudio's finally.
- Settle the future in the reply tasks (say / pipeline / realtime) after playout finishes or is interrupted, which also removes the listener.
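The three changes above can be sketched together as follows. This is an illustrative sketch, not the PR's diff: SketchFuture, SketchOutput, and performAudioForwardingSketch are hypothetical, and it assumes that settling a still-pending future means rejecting it, which the description does not spell out.

```typescript
// Hypothetical stand-in for the Future in agents/src/utils.ts (state flags only).
class SketchFuture {
  done = false;
  rejected = false;
  resolve(): void {
    if (this.done) return;
    this.done = true;
  }
  reject(): void {
    if (this.done) return;
    this.done = true;
    this.rejected = true;
  }
}

// Hypothetical pausable output that buffers frames while paused.
class SketchOutput {
  paused = false;
  private buffered = 0;
  private started = false;
  private listeners: Array<() => void> = [];

  onPlaybackStarted(l: () => void): void {
    this.listeners.push(l);
  }
  offPlaybackStarted(l: () => void): void {
    this.listeners = this.listeners.filter((x) => x !== l);
  }
  listenerCount(): number {
    return this.listeners.length;
  }
  captureFrame(): void {
    this.buffered += 1;
    if (!this.paused) this.flush();
  }
  resume(): void {
    this.paused = false;
    this.flush();
  }
  private flush(): void {
    if (this.buffered > 0 && !this.started) {
      this.started = true;
      this.listeners.forEach((l) => l());
    }
    this.buffered = 0;
  }
}

// Post-fix shape (assumed): the wrapper owns the listener and returns a settle
// hook, so the listener outlives the forwarding loop.
function performAudioForwardingSketch(out: SketchOutput, frames: number) {
  const firstFrameFut = new SketchFuture();
  const onStarted = (): void => firstFrameFut.resolve();
  out.onPlaybackStarted(onStarted);

  // Forwarding loop: it no longer rejects the future when it finishes early.
  for (let i = 0; i < frames; i++) out.captureFrame();

  const settle = (): void => {
    out.offPlaybackStarted(onStarted);
    if (!firstFrameFut.done) firstFrameFut.reject(); // assumption: pending means rejected
  };
  return { firstFrameFut, settle };
}

const output = new SketchOutput();
output.paused = true; // thinking-state pause, still in place
const { firstFrameFut, settle } = performAudioForwardingSketch(output, 3);
const doneBeforeResume = firstFrameFut.done; // still pending after forwarding

output.resume(); // the late first frame resolves the future
settle(); // reply task: after playout finishes or is interrupted

const transcriptKept = firstFrameFut.done && !firstFrameFut.rejected;
const listenersLeft = output.listenerCount();
```

Returning the settle hook next to the future keeps listener removal tied to the reply task's lifetime rather than to the forwarding loop's.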

JS-specific note

Unlike Python, the JS Future (agents/src/utils.ts) has no cancel() distinct from reject() — reject() sets rejected = true. So a literal "cancel instead of reject" transcription of #5039 would not change the downstream !firstFrameFut.rejected gate. This fix preserves the audio/transcript on the no-first-frame path by relocating resolution so the late first frame resolves the future, rather than relying on a cancel/reject distinction.
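To make the gate argument concrete, here is a minimal sketch. SketchFuture is a hypothetical, flags-only stand-in for the Future in agents/src/utils.ts, and keepsTranscript is a hypothetical name for the done && !rejected check quoted above.

```typescript
// Hypothetical, flags-only stand-in for the Future in agents/src/utils.ts.
class SketchFuture {
  done = false;
  rejected = false;
  resolve(): void {
    if (this.done) return;
    this.done = true;
  }
  reject(): void {
    if (this.done) return;
    this.done = true;
    this.rejected = true;
  }
}

// The gate the reply tasks apply before preserving the transcript.
function keepsTranscript(fut: SketchFuture): boolean {
  return fut.done && !fut.rejected;
}

// Literal port: with no cancel(), Python's cancel has to be spelled reject().
const literalPort = new SketchFuture();
literalPort.reject();

// Relocated resolution: the future stays pending until a late first frame.
const relocated = new SketchFuture();
const pendingKeeps = keepsTranscript(relocated); // pending, so not kept yet
relocated.resolve();

const literalKeeps = keepsTranscript(literalPort);
const relocatedKeeps = keepsTranscript(relocated);
```

In this sketch only the relocated future passes the gate, which is why the port changes where resolution happens instead of how the future is terminated.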

Tests

New regression test generation_interrupt_before_first_frame.test.ts reproduces the thinking-state pause before the first frame with a pausable mock output:

- False interruption → speech resumes, frames are forwarded, firstFrameFut resolves, and the synchronized transcript would be preserved.
- Genuine interruption after a resume → the partial synchronized transcript is kept (turn not lost from history).

Both cases fail on main and pass with this change. The existing generation_tts_timeout.test.ts "ignores PLAYBACK_STARTED from another segment" assertion is updated to the new contract (forwardAudio no longer rejects the future).

Full agents test suite: green (0 failed).
pnpm build, ESLint, Prettier: green.

…#1909)

Port of livekit/agents#5039 (Python issue #5038) to agents-js.

When the agent is in the "thinking" state and the user makes a brief sound before the first TTS audio frame is forwarded, `onStartOfSpeech` pauses the not-yet-playing speech. That thinking-state pause is intentional and is preserved. The frames are still captured into the paused output buffer, but `forwardAudio`'s finally block rejected `firstFrameFut` (and removed its PLAYBACK_STARTED listener) whenever no frame had played yet. When a false interruption then cleared and the output resumed, the buffered first frame played but nothing was listening, so the future stayed rejected. Because the reply tasks gate transcript preservation on `firstFrameFut.done && !firstFrameFut.rejected`, the resumed turn was dropped from history even though audio reached the user.

Fix:
- Move the PLAYBACK_STARTED listener from `forwardAudio` to `performAudioForwarding` so it outlives the forwarding task; a late first frame (e.g. after a `resumeFalseInterruption` resume) can still resolve `firstFrameFut`.
- Stop rejecting `firstFrameFut` in `forwardAudio`'s finally.
- Settle the future in the reply tasks (say / pipeline / realtime) after playout finishes or is interrupted, which also removes the listener.

JS note: unlike Python, the JS `Future` has no `cancel()` distinct from `reject()` (reject sets `rejected = true`), so the fix preserves audio on the no-first-frame path by relocating resolution rather than relying on a cancel/reject distinction in the downstream gate.

Adds a regression test reproducing the thinking-state pause before the first frame for both a false interruption (resumes and plays, transcript preserved) and a genuine interruption after a resume (partial transcript kept, turn not lost). Both fail on main and pass with this change.

changeset-bot · 2026-06-29T21:18:08Z

🦋 Changeset detected

Latest commit: 2e087f7

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 35 packages

Name	Type
@livekit/agents	Patch
@livekit/agents-plugin-anam	Patch
@livekit/agents-plugin-assemblyai	Patch
@livekit/agents-plugin-baseten	Patch
@livekit/agents-plugin-bey	Patch
@livekit/agents-plugin-cartesia	Patch
@livekit/agents-plugin-cerebras	Patch
@livekit/agents-plugin-deepgram	Patch
@livekit/agents-plugin-did	Patch
@livekit/agents-plugin-elevenlabs	Patch
@livekit/agents-plugin-fishaudio	Patch
@livekit/agents-plugin-google	Patch
@livekit/agents-plugin-hedra	Patch
@livekit/agents-plugin-hume	Patch
@livekit/agents-plugin-inworld	Patch
@livekit/agents-plugin-lemonslice	Patch
@livekit/agents-plugin-liveavatar	Patch
@livekit/agents-plugin-livekit	Patch
@livekit/agents-plugin-minimax	Patch
@livekit/agents-plugin-mistral	Patch
@livekit/agents-plugin-mistralai	Patch
@livekit/agents-plugin-neuphonic	Patch
@livekit/agents-plugin-openai	Patch
@livekit/agents-plugin-perplexity	Patch
@livekit/agents-plugin-phonic	Patch
@livekit/agents-plugin-resemble	Patch
@livekit/agents-plugin-rime	Patch
@livekit/agents-plugin-runway	Patch
@livekit/agents-plugin-sarvam	Patch
@livekit/agents-plugin-silero	Patch
@livekit/agents-plugin-soniox	Patch
@livekit/agents-plugin-tavus	Patch
@livekit/agents-plugins-test	Patch
@livekit/agents-plugin-trugen	Patch
@livekit/agents-plugin-xai	Patch

Wrap the post-forwarding body of ttsTask in try/finally so settleFirstFrameFut runs even if an await throws between the performAudioForwarding call and the end of the method. Otherwise the PLAYBACK_STARTED listener registered in performAudioForwarding would leak on the shared audioOutput EventEmitter on the exception path. Mirrors the finally-block pattern already used in forwardSegment and processOneMessage.
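The leak this commit guards against can be illustrated with a small sketch. It is a hedged, synchronous stand-in for the real async ttsTask: playbackListeners and ttsTaskSketch are hypothetical, and the thrown error simulates an await failing after performAudioForwarding has registered its listener.

```typescript
// Stands in for the listener list on the shared audioOutput EventEmitter.
let playbackListeners: Array<() => void> = [];

function ttsTaskSketch(failAfterForwarding: boolean): void {
  // What performAudioForwarding would register (placeholder handler).
  const onStarted = (): void => {};
  playbackListeners.push(onStarted);

  // Hypothetical settle hook: removes the listener registered above.
  const settleFirstFrameFut = (): void => {
    playbackListeners = playbackListeners.filter((l) => l !== onStarted);
  };

  try {
    // Post-forwarding body: anything here may throw.
    if (failAfterForwarding) throw new Error("simulated failure after forwarding");
  } finally {
    // Runs on both the normal path and the exception path.
    settleFirstFrameFut();
  }
}

let threw = false;
try {
  ttsTaskSketch(true);
} catch (err) {
  threw = true;
}
const listenersAfterFailure = playbackListeners.length;
```

Without the finally block, the simulated failure would skip settleFirstFrameFut and leave the handler attached to the shared output for the rest of the session.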

…#1909) Use the bare livekit#1909 short form referenced once at the core fix points, matching existing comments (e.g. livekit#1662, livekit#1430, livekit#1124), instead of the verbose cross-repo 'livekit#1909 (port of livekit/agents#5039)' form and the per-call-site repetition. The port context lives in the commit/PR/changeset.

enriqueespaillat-gyde

r? @toubatbrian - please let me know if I should stop tagging you and if there's another process to follow :) There's been an uptick in PRs over the past few months, so I just want to make sure you see this one. It's hit us a few times in production.

enriqueespaillat-gyde mentioned this pull request Jun 29, 2026

Agent speech silently dropped when interrupted before the first audio frame (resumeFalseInterruption) — port of Python #5039 #1909

Open

enriqueespaillat-gyde marked this pull request as ready for review June 29, 2026 21:19

enriqueespaillat-gyde added 2 commits June 29, 2026 17:27

enriqueespaillat-gyde commented Jun 30, 2026

enriqueespaillat-gyde wants to merge 3 commits into livekit:main from enriqueespaillat-gyde:fix/interrupt-before-first-frame
