fix: retry Soniox TTS on 408 timeout mid-stream (#6225) #6228

Open
C1-BA-B1-F3 wants to merge 3 commits into livekit:main from C1-BA-B1-F3:fix/soniox-tts-408-retry-v2

Conversation

@C1-BA-B1-F3

Fixes #6225

Problem

When a retryable error (408 timeout / 429 rate limit) occurs mid-stream, after partial audio has already been sent to the user, the stream raises the exception instead of handling it gracefully. This crashes the entire TTS stream and the agent's turn.

The root cause is in the retry guard: it only retries when no audio has been sent yet. If any audio was already pushed, it falls through to the raise path.

Fix

When a retryable error occurs mid-stream (pushed_duration > 0), end the segment gracefully instead of crashing. The already-sent audio remains usable. A warning is logged and the error is emitted as recoverable.

Crashing the entire stream is worse than a partial utterance: the user at least hears something rather than nothing.

Changes

  • livekit-agents/livekit/agents/tts/tts.py: In SynthesizeStream._main_task, when a retryable error occurs with pushed_duration > 0, call output_emitter.end_input() + join() and return gracefully instead of raising.
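
The guard described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code in tts.py: `RetryableError`, `FakeEmitter`, `run_segment`, and `max_retries` are hypothetical stand-ins. Only the `pushed_duration > 0` check, the `retryable` flag, and the `end_input()` + `join()` calls are taken from the PR description.

```python
import asyncio


class RetryableError(Exception):
    """Hypothetical stand-in for a TTS API error carrying a `retryable` flag."""

    def __init__(self, message: str, retryable: bool) -> None:
        super().__init__(message)
        self.retryable = retryable


class FakeEmitter:
    """Hypothetical stand-in for the output emitter; tracks pushed audio."""

    def __init__(self) -> None:
        self.pushed_duration = 0.0
        self.ended = False

    def push(self, seconds: float) -> None:
        self.pushed_duration += seconds

    def end_input(self) -> None:
        self.ended = True

    async def join(self) -> None:
        await asyncio.sleep(0)  # pretend to flush pending audio


async def run_segment(attempt_fn, emitter: FakeEmitter, max_retries: int = 2) -> str:
    """Return "ok", or "partial" when a retryable error hits mid-stream."""
    for attempt in range(max_retries + 1):
        try:
            await attempt_fn(emitter)
            emitter.end_input()
            await emitter.join()
            return "ok"
        except RetryableError as e:
            if emitter.pushed_duration > 0.0 and e.retryable:
                # Audio already reached the user: keep it and end the segment.
                emitter.end_input()
                await emitter.join()
                return "partial"
            if not e.retryable or attempt == max_retries:
                raise
            # Nothing was sent yet, so a retry is safe.
    raise AssertionError("unreachable")


async def fails_mid_stream(emitter: FakeEmitter) -> None:
    emitter.push(0.5)  # half a second of audio goes out first
    raise RetryableError("408 timeout", retryable=True)


result = asyncio.run(run_segment(fails_mid_stream, FakeEmitter()))
print(result)
```

In this sketch a non-retryable error still propagates even when audio was pushed, which matches the scope stated in the PR (graceful handling for retryable errors only).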

🤖 Generated with Claude Code

C1-BA-B1-F3 and others added 2 commits June 25, 2026 18:48
wait_if_not_interrupted used asyncio.gather(return_exceptions=True) which
swallowed exceptions from the generate_reply future. Now checks results
and re-raises any non-cancelled exceptions.
When a retryable error (408/429) occurs mid-stream after partial audio
was already sent to the user, don't crash the stream. Instead, end the
segment gracefully so the already-sent audio remains usable.

Crashing the entire stream is worse than a partial utterance: the user
at least hears something rather than nothing.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
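
The first commit message above refers to `asyncio.gather(return_exceptions=True)` swallowing exceptions. A small standalone demonstration of that standard-library behavior is shown below; `generate_reply` and `other_work` are hypothetical coroutines used only for illustration and are not the functions in the repository.

```python
import asyncio


async def generate_reply() -> str:
    # Hypothetical stand-in for the work behind the generate_reply future.
    raise RuntimeError("generation failed")


async def other_work() -> str:
    return "done"


async def main() -> list:
    # With return_exceptions=True, gather() does not raise. The exception
    # object is placed in the results list at the failing awaitable's index.
    results = await asyncio.gather(
        generate_reply(), other_work(), return_exceptions=True
    )
    return results


results = asyncio.run(main())
failures = [r for r in results if isinstance(r, BaseException)]
print(results)
print(len(failures))
```

A caller that never inspects `results` will not notice the failure, which is the swallowing behavior the commit message describes.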
@C1-BA-B1-F3 C1-BA-B1-F3 requested a review from a team as a code owner June 25, 2026 12:46
@CLAassistant

CLAassistant commented Jun 25, 2026

CLA assistant check
All committers have signed the CLA.

@devin-ai-integration devin-ai-integration Bot left a comment

Devin Review found 4 potential issues.

Comment thread livekit-agents/livekit/agents/voice/speech_handle.py Outdated
Comment thread livekit-agents/livekit/agents/voice/agent_activity.py
Comment thread livekit-agents/livekit/agents/voice/speech_handle.py Outdated
Comment on lines +514 to +531
if pushed_duration > 0.0 and e.retryable:
    # Retryable error (408/429) mid-stream: the user already
    # heard some audio. Crashing the stream is worse than a
    # partial utterance, so end the segment gracefully and
    # let the already-sent audio be used.
    logger.warning(
        "TTS failed after partial audio was already sent to the user, "
        "ending segment gracefully.",
        extra={
            "tts": self._tts._label,
            "streamed": True,
            "pushed_duration": pushed_duration,
        },
    )
    self._emit_error(e, recoverable=True)
    output_emitter.end_input()
    await output_emitter.join()
    return

🚩 The TTS graceful recovery condition may be too narrow for non-retryable mid-stream errors

The new graceful recovery at tts.py:514 only activates when pushed_duration > 0.0 and e.retryable. Non-retryable errors with partial audio still fall through to self._emit_error(e, recoverable=False) and raise at lines 542-543, which crashes the stream. The comment and log message suggest the intent is to avoid crashing when partial audio was already sent to the user, but this only applies to retryable errors (408/429). A non-retryable error (e.g. 500) with partial audio still crashes. This may be intentional (non-retryable errors are considered unrecoverable), but worth confirming the design intent since the user experience is the same β€” partial audio was already heard.

Remove the exception re-raising in wait_if_not_interrupted that was
breaking normal segment-stream completion and silently swallowing
realtime errors.

- speech_handle.py: Remove the else block that re-raises exceptions from
  gathered futures. The return_exceptions=True gather captures exceptions
  as results, but re-raising them from wait_if_not_interrupted bypassed
  callers' own error handlers (e.g. the ChanClosed handler in
  _next_segment, the RealtimeError handler in _realtime_reply_task).

- agent_activity.py: Add explicit pre-check for generate_reply_fut
  exceptions in _realtime_reply_task to ensure RealtimeError is properly
  propagated through speech_handle._mark_done(error=e), even when the
  gather's return_exceptions=True captures the exception as a result.

- test_agent_session.py: Minor whitespace fix.

Addresses all 4 issues from Devin review on PR livekit#6228:
- BUG: ChanClosed re-raised before _next_segment handler → pipeline crash
- BUG: RealtimeError re-raised before _mark_done(error=e) → silently swallowed
- ANALYSIS: Multiple other call sites no longer affected by spurious re-raises
- ANALYSIS: TTS graceful recovery intentionally limited to retryable errors

@devin-ai-integration devin-ai-integration Bot left a comment

Devin Review found 1 new potential issue.

Comment on lines 3272 to 3277
" after tool execution" if tool_reply else "",
str(e),
)
speech_handle._mark_done(error=e)
self._session._update_agent_state("listening")
return

🚩 New exception check makes the subsequent try/except dead code for error paths

The new check at lines 3253-3265 catches ALL exception types from generate_reply_fut (not just RealtimeError), then returns early. Since wait_if_not_interrupted uses gather(return_exceptions=True), after it returns without interruption the future is always done. This means the try: generation_ev = await generate_reply_fut / except llm.RealtimeError block at lines 3267-3277 can never catch an exception β€” it will only execute in the success case where .exception() is None. The existing try/except is now effectively dead code for error paths. This also represents a semantic change: previously, non-RealtimeError exceptions (e.g. ValueError, RuntimeError) from generate_reply would propagate up and be caught by @utils.log_exceptions. Now they are caught, logged, and gracefully handled via _mark_done. This is likely intentional for robustness but changes error propagation behavior.

(Refers to lines 3253-3277)
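
The control-flow point in this comment can be reproduced with a small standalone sketch. The names below (`RealtimeError`, `handle`) are hypothetical stand-ins, not the code in agent_activity.py; the sketch only assumes a done future, a `.exception()` pre-check, and a later `try`/`except` around `await fut`. It does not cover a cancelled future, for which `exception()` would raise instead of returning.

```python
import asyncio


class RealtimeError(Exception):
    """Hypothetical stand-in for llm.RealtimeError."""


async def handle(fut: asyncio.Future) -> str:
    # Wait for the future without raising, as gather(return_exceptions=True) does.
    await asyncio.gather(fut, return_exceptions=True)

    # Pre-check: the future is done here, so exception() returns immediately.
    if fut.exception() is not None:
        return "handled by pre-check"

    try:
        await fut
        return "success"
    except RealtimeError:
        # Not reached for failures: any exception was already caught above.
        return "handled by except"


async def main() -> tuple:
    loop = asyncio.get_running_loop()
    failed = loop.create_future()
    failed.set_exception(RealtimeError("boom"))
    ok = loop.create_future()
    ok.set_result("event")
    return await handle(failed), await handle(ok)


outcomes = asyncio.run(main())
print(outcomes)
```

Under these assumptions the failed future is always resolved by the pre-check, so the `except` branch only guards a path that can no longer fail.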


Development

Successfully merging this pull request may close these issues.

Soniox TTS throws 408 timeout and breaks the stream

2 participants