fix(agents): persist _speech_start_time across intra-turn VAD bursts by AlessandroElyos · Pull Request #5585 · livekit/agents

AlessandroElyos · 2026-04-28T14:05:28Z

Within a single user turn, VAD can fire multiple START_OF_SPEECH/END_OF_SPEECH cycles separated by short silences (e.g. the user says "Hello." then pauses briefly before continuing). End-of-turn detection is decoupled from VAD — a turn is only considered ended once _bounce_eou_task runs and clears the per-turn state.

_vad_speech_started was being reset to False in the END_OF_SPEECH branch of _on_vad_event. The next START_OF_SPEECH within the same turn would therefore overwrite _speech_start_time with the latest burst's start, losing the original turn-start timestamp. This propagates to started_speaking_at on the EOT metrics report, which downstream consumers (recording-to-transcript alignment, analytics) rely on to know when the user actually began speaking.

The fix removes the _vad_speech_started = False reset from the END_OF_SPEECH branch. The flag is now cleared only by the EOT cleanup in _bounce_eou_task — which is already the lifecycle owner of _speech_start_time — so both fields are governed symmetrically by turn boundaries rather than burst boundaries.

_vad_speech_started is read at exactly one site (the SOS guard) and is also defensively reset in the VAD task finally block on handoff/teardown, so removing the EOS reset has no other side effects.

Adds tests/test_speech_start_time_persistence.py with three cases following the existing test_audio_recognition_aclose.py style:

test_first_sos_sets_speech_start_time — sanity check.
test_speech_start_time_persists_across_intra_turn_bursts — the bug reproducer; fails before the fix, passes after.
test_eos_does_not_clear_vad_speech_started — encodes the new invariant.

devin-ai-integration

✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 2 additional findings.

AlessandroElyos · 2026-04-28T15:20:30Z

The failing test_interruption_before_speaking[True-3.5] looks like a pre-existing flake unrelated to this PR — it's a timing-sensitive assertion (5× speed factor, 0.75s tolerance on check_timestamp). I ran it 3× locally with the fix applied and it passed each time. Happy to fix it if it can help

longcw · 2026-04-29T01:45:16Z

            with trace.use_span(self._ensure_user_turn_span()):
                self._hooks.on_end_of_speech(ev)

-            self._vad_speech_started = False


this makes the _speech_start_time always the turn start time, may break the tracing that we create user_speaking spans using this timestamp as the start of the span (ref).

using the start time of the user_turn_span could be a better solution if the goal is to make started_speaking_at in the EOT info the first time user started speaking. we already have _ensure_user_turn_span that create the span only when there is no active user turn span, and the span is ended only after EOT committed.

the OTEL span is designed for write-only, so maybe add a _user_turn_start alongside the user_turn_span.

thanks for the feedback @longcw.

I reverted the change on _vad_speech_started.

I also added a _user_turn_start variable as you suggested that should reflect the actual turn boundaries.

Let me know your thoughts.

…stays untouched

AlessandroElyos added 2 commits April 28, 2026 15:02

fix(agents): persist _speech_start_time across intra-turn VAD bursts

ce1e57c

style: ruff format

c0d9327

devin-ai-integration Bot reviewed Apr 28, 2026

View reviewed changes

longcw reviewed Apr 29, 2026

View reviewed changes

AlessandroElyos added 3 commits April 29, 2026 10:09

refactor: introduce _user_turn_start so per-burst _speech_start_time …

1b7dc23

…stays untouched

fix: restore _vad_speech_started reset on EOS

e7564d6

drop verbose comment on _user_turn_start argument

c1b8e4c

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(agents): persist _speech_start_time across intra-turn VAD bursts#5585

fix(agents): persist _speech_start_time across intra-turn VAD bursts#5585
AlessandroElyos wants to merge 5 commits intolivekit:mainfrom
AlessandroElyos:alessandro/speech-start-time-persistence

AlessandroElyos commented Apr 28, 2026

Uh oh!

devin-ai-integration Bot left a comment

Uh oh!

AlessandroElyos commented Apr 28, 2026 •

edited

Loading

Uh oh!

longcw Apr 29, 2026

Uh oh!

AlessandroElyos Apr 29, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

AlessandroElyos commented Apr 28, 2026

Uh oh!

devin-ai-integration Bot left a comment

Choose a reason for hiding this comment

✅ Devin Review: No Issues Found

Uh oh!

AlessandroElyos commented Apr 28, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

longcw Apr 29, 2026

Choose a reason for hiding this comment

Uh oh!

AlessandroElyos Apr 29, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

AlessandroElyos commented Apr 28, 2026 •

edited

Loading