Fix stale playout state across multi-step voice turns by jayeshp19 · Pull Request #5533 · livekit/agents

jayeshp19 · 2026-04-23T06:55:02Z

This PR fixes a bug where paused-speech state could become stale across generation steps within a single SpeechHandle. The key changes are:

New current_generation_playout_active flag on SpeechHandle — tracks whether the current generation's audio is actively playing, reset on each step advance.
New generation_step field on _PausedSpeechInfo — ensures paused speech is only applied/resumed if it matches the current generation step.
Centralized _on_generation_playout_started / _on_generation_playout_finished — extracts duplicated playout lifecycle logic (state updates, audio recognition notifications, interruption toggling) into two reusable methods.

The audio-interruption condition now also gates on current_generation_playout_active, preventing interruptions when the agent isn't actually playing audio for the current generation.
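The gating described above can be illustrated with a minimal sketch. This is not the code from the PR: `SpeechHandleSketch`, `advance_step`, `mark_playout_started`, `mark_playout_finished`, and `should_interrupt_audio` are hypothetical names, and only the `current_generation_playout_active` flag is taken from the description.

```python
from dataclasses import dataclass


@dataclass
class SpeechHandleSketch:
    """Hypothetical stand-in for SpeechHandle; only the fields needed here."""

    generation_step: int = 0
    current_generation_playout_active: bool = False

    def advance_step(self) -> None:
        # A new generation step starts silent until its audio actually begins,
        # so the flag is reset on every step advance.
        self.generation_step += 1
        self.current_generation_playout_active = False

    def mark_playout_started(self) -> None:
        self.current_generation_playout_active = True

    def mark_playout_finished(self) -> None:
        self.current_generation_playout_active = False


def should_interrupt_audio(handle: SpeechHandleSketch, user_speaking: bool) -> bool:
    # Assumption: user speech only counts as an audio interruption while the
    # current generation is actually playing audio.
    return user_speaking and handle.current_generation_playout_active
```

In this sketch, a silent tool-call step never sets the flag, so user speech during that step is not treated as an interruption of agent audio.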

devin-ai-integration

✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 4 additional findings.

longcw · 2026-04-27T02:24:33Z

paused-speech state could become stale across generation steps within a single SpeechHandle

Do you mean that when a tool reply is created after a speech was paused, the paused_speech is still treated as the same speech as the new tool reply? Can you share an example that demonstrates the issue this caused?

jayeshp19 · 2026-04-28T18:55:46Z

Yes, that is the case.

A specific interleaving example is:

1. User asks “what’s the weather in Tokyo?”
2. The model emits only a function call, with no spoken preamble/audio.
3. While that silent tool-call generation is still active, user speech/noise starts, so false-interruption handling records _paused_speech for the current SpeechHandle with agent_state="thinking".
4. The same SpeechHandle advances to the tool-reply generation.
5. The tool reply starts playing, and pause handling runs again while the user is still speaking.

Before this change, _PausedSpeechInfo was keyed only by SpeechHandle, so _update_paused_speech() treated the later tool-reply pause as the same paused speech and only updated the timeout. It kept the metadata captured during the earlier silent/tool-call step.

The visible effect in the regression test is that false-interruption resume transitions the agent from listening back to thinking while the tool-reply audio resumes, instead of resuming to speaking.

With generation-step tracking, the tool-reply pause refreshes the paused state for the current generation, so resume restores the correct state.
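The difference between the two behaviors can be replayed in a small sketch. It is a simplified model, not the library code: `PausedSpeechInfo` and `update_paused_speech` here are stand-ins for `_PausedSpeechInfo` and `_update_paused_speech()`, and the `track_step` switch, `handle_id`, and `timeout_s` are assumptions added for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PausedSpeechInfo:
    """Simplified stand-in for _PausedSpeechInfo."""

    handle_id: str
    generation_step: int
    agent_state: str
    timeout_s: float


def update_paused_speech(
    current: Optional[PausedSpeechInfo],
    handle_id: str,
    generation_step: int,
    agent_state: str,
    timeout_s: float,
    *,
    track_step: bool,
) -> PausedSpeechInfo:
    same_handle = current is not None and current.handle_id == handle_id
    same_step = same_handle and current.generation_step == generation_step
    if same_handle and (same_step or not track_step):
        # Treated as the same paused speech: only the timeout is refreshed,
        # so the metadata captured earlier is kept.
        current.timeout_s = timeout_s
        return current
    # Otherwise record fresh metadata for the current generation step.
    return PausedSpeechInfo(handle_id, generation_step, agent_state, timeout_s)


def replay(track_step: bool) -> str:
    """Replay the interleaving above and return the state a resume would restore."""
    # Pause recorded during the silent tool-call step.
    paused = update_paused_speech(
        None, "speech_1", 0, "thinking", 2.0, track_step=track_step
    )
    # Pause handling runs again once the tool reply is playing.
    paused = update_paused_speech(
        paused, "speech_1", 1, "speaking", 2.0, track_step=track_step
    )
    return paused.agent_state
```

With `track_step=False` the replay keeps the `"thinking"` metadata from the silent step; with `track_step=True` the tool-reply pause replaces it with `"speaking"`.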

I rebased the PR and added a regression test for this specific interleaving.

https://github.com/livekit/agents/pull/5533/changes#diff-8fc06608b976184d49adbab8eccbd212aa7252e3e17cc5d9ec34dced8365f217R833

longcw · 2026-04-29T06:01:13Z

@jayeshp19 thanks for the details! And yeah, I see the issue. It could be fixed more simply by resetting the paused speech in _scheduling_task after _wait_for_generation for the current generation. I created an alternative PR in #5594.
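A rough sketch of that idea, assuming a loop that awaits one generation at a time, might look like the following. The actual change in #5594 is not shown in this thread; `ActivitySketch`, its fields, and the simulated pause are hypothetical.

```python
import asyncio
from typing import List, Optional


class ActivitySketch:
    """Hypothetical stand-in for the object that owns the scheduling loop."""

    def __init__(self) -> None:
        self.paused_speech: Optional[str] = None
        self.log: List[str] = []

    async def wait_for_generation(self, step: int) -> None:
        # Stand-in for awaiting one generation step. Step 0 simulates a pause
        # being recorded while a silent tool-call generation is active.
        await asyncio.sleep(0)
        if step == 0:
            self.paused_speech = "recorded during step 0"

    async def scheduling_task(self, steps: int) -> None:
        for step in range(steps):
            self.log.append(f"step {step}: paused_speech={self.paused_speech!r}")
            await self.wait_for_generation(step)
            # Reset after each generation so the next step never inherits
            # pause state captured during the previous one.
            self.paused_speech = None
```

Running `asyncio.run(ActivitySketch().scheduling_task(2))` in this sketch leaves no pause state for step 1 to inherit from step 0.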

jayeshp19 · 2026-04-29T13:40:47Z

Thanks, it makes sense for the bug I was trying to fix.

One small note: I also tested an edge case where the false-interruption timer fires before the silent tool-call generation finishes. In that case the runtime can still do a thinking -> listening -> thinking false-resume during the silent step itself. That seems like a separate edge case around pausing while no audio is actually playing, and my current PR does not fully solve it either, because the pause-on-thinking path can still record _paused_speech.

So I’m good with #5594 superseding this PR for the cross-step stale-state bug. I can close this PR once #5594 lands.

devin-ai-integration Bot reviewed Apr 23, 2026

jayeshp19 force-pushed the Fix-stale-playout-state-across-multi-step-voice-turns branch from efc3ae3 to efcb35c Compare April 24, 2026 08:15

jayeshp19 mentioned this pull request Apr 24, 2026

Track voice playout and pause state by generation step instead of reused SpeechHandle #5545

Open

jayeshp19 added 3 commits April 29, 2026 00:08

Fix stale playout state across multi-step voice turns

981dd4e

fix resume audio when clearing stale pause state

c7cc559

Add silent tool-call pause regression test

741c6cf

jayeshp19 force-pushed the Fix-stale-playout-state-across-multi-step-voice-turns branch from 8c7fa33 to 741c6cf Compare April 28, 2026 18:50

longcw mentioned this pull request Apr 29, 2026

fix: clear stale paused speech state across generation steps #5594

Merged

jayeshp19 wants to merge 3 commits into livekit:main from jayeshp19:Fix-stale-playout-state-across-multi-step-voice-turns
