fix: retry Soniox TTS on 408 timeout mid-stream (#6225) #6228
C1-BA-B1-F3 wants to merge 3 commits
wait_if_not_interrupted used asyncio.gather(return_exceptions=True) which swallowed exceptions from the generate_reply future. Now checks results and re-raises any non-cancelled exceptions.
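The behaviour described in this commit message can be reproduced in isolation. The following is a minimal sketch, not the LiveKit implementation: `failing_reply`, `ok_task`, and `gather_and_check` are hypothetical stand-ins. It shows that `asyncio.gather(..., return_exceptions=True)` returns raised exceptions as ordinary results, so they stay silent unless the caller inspects the result list.

```python
import asyncio


async def failing_reply() -> str:
    # Hypothetical stand-in for the generate_reply future; not LiveKit code.
    raise RuntimeError("generation failed")


async def ok_task() -> str:
    # Hypothetical second awaitable that completes normally.
    return "done"


async def gather_and_check() -> list:
    # With return_exceptions=True, exceptions come back as list items
    # instead of propagating to the caller.
    results = await asyncio.gather(
        failing_reply(), ok_task(), return_exceptions=True
    )
    # Collect every exception result except cancellations.
    return [
        r
        for r in results
        if isinstance(r, BaseException)
        and not isinstance(r, asyncio.CancelledError)
    ]


errors = asyncio.run(gather_and_check())
```

Whether the helper should re-raise what it finds or leave that to its callers is a separate design question, which the later commits in this PR revisit.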
When a retryable error (408/429) occurs mid-stream after partial audio was already sent to the user, don't crash the stream. Instead, end the segment gracefully so the already-sent audio remains usable. Crashing the entire stream is worse than a partial utterance: the user at least hears something rather than nothing.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
if pushed_duration > 0.0 and e.retryable:
    # Retryable error (408/429) mid-stream: the user already
    # heard some audio. Crashing the stream is worse than a
    # partial utterance, so end the segment gracefully and
    # let the already-sent audio be used.
    logger.warning(
        "TTS failed after partial audio was already sent to the user, "
        "ending segment gracefully.",
        extra={
            "tts": self._tts._label,
            "streamed": True,
            "pushed_duration": pushed_duration,
        },
    )
    self._emit_error(e, recoverable=True)
    output_emitter.end_input()
    await output_emitter.join()
    return
🚩 The TTS graceful recovery condition may be too narrow for non-retryable mid-stream errors
The new graceful recovery at tts.py:514 only activates when pushed_duration > 0.0 and e.retryable. Non-retryable errors with partial audio still fall through to self._emit_error(e, recoverable=False) and raise at lines 542-543, which crashes the stream. The comment and log message suggest the intent is to avoid crashing when partial audio was already sent to the user, but this only applies to retryable errors (408/429). A non-retryable error (e.g. 500) with partial audio still crashes. This may be intentional (non-retryable errors are considered unrecoverable), but it is worth confirming the design intent since the user experience is the same: partial audio was already heard.
Remove the exception re-raising in wait_if_not_interrupted that was breaking normal segment-stream completion and silently swallowing realtime errors.

- speech_handle.py: Remove the else block that re-raises exceptions from gathered futures. The return_exceptions=True gather captures exceptions as results, but re-raising them from wait_if_not_interrupted bypassed callers' own error handlers (e.g. the ChanClosed handler in _next_segment, the RealtimeError handler in _realtime_reply_task).
- agent_activity.py: Add explicit pre-check for generate_reply_fut exceptions in _realtime_reply_task to ensure RealtimeError is properly propagated through speech_handle._mark_done(error=e), even when the gather's return_exceptions=True captures the exception as a result.
- test_agent_session.py: Minor whitespace fix.

Addresses all 4 issues from Devin review on PR livekit#6228:

- BUG: ChanClosed re-raised before _next_segment handler → pipeline crash
- BUG: RealtimeError re-raised before _mark_done(error=e) → silently swallowed
- ANALYSIS: Multiple other call sites no longer affected by spurious re-raises
- ANALYSIS: TTS graceful recovery intentionally limited to retryable errors
        " after tool execution" if tool_reply else "",
        str(e),
    )
    speech_handle._mark_done(error=e)
    self._session._update_agent_state("listening")
    return
🚩 New exception check makes the subsequent try/except dead code for error paths
The new check at lines 3253-3265 catches ALL exception types from generate_reply_fut (not just RealtimeError), then returns early. Since wait_if_not_interrupted uses gather(return_exceptions=True), after it returns without interruption the future is always done. This means the try: generation_ev = await generate_reply_fut / except llm.RealtimeError block at lines 3267-3277 can never catch an exception: it will only execute in the success case where .exception() is None. The existing try/except is now effectively dead code for error paths. This also represents a semantic change: previously, non-RealtimeError exceptions (e.g. ValueError, RuntimeError) from generate_reply would propagate up and be caught by @utils.log_exceptions. Now they are caught, logged, and gracefully handled via _mark_done. This is likely intentional for robustness but changes error propagation behavior.
(Refers to lines 3253-3277)
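The dead-code observation can be checked with plain asyncio futures. This is a minimal sketch under stated assumptions: `handle_reply` is a hypothetical function that mimics the shape of the pre-check followed by the try/except, and the future is assumed to be done and not cancelled by the time it is inspected (on a cancelled future, `Future.exception()` raises `CancelledError` instead of returning).

```python
import asyncio


async def handle_reply(fut: asyncio.Future) -> str:
    # Hypothetical handler; assumes fut is already done and not cancelled.
    exc = fut.exception()
    if exc is not None:
        # The pre-check returns early for every exception type.
        return f"pre-check handled {type(exc).__name__}"
    try:
        value = await fut
    except ValueError:
        # Not reachable for a done future: any stored exception was
        # already handled by the pre-check above.
        return "except branch"
    return f"ok: {value}"


async def main() -> tuple:
    loop = asyncio.get_running_loop()
    failed = loop.create_future()
    failed.set_exception(ValueError("boom"))
    succeeded = loop.create_future()
    succeeded.set_result("generation")
    return await handle_reply(failed), await handle_reply(succeeded)


outcomes = asyncio.run(main())
```

In this sketch the `except` branch is never taken, which matches the reviewer's point that only the success case reaches the `await`.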
Fixes #6225
Problem
When a retryable error (408 timeout / 429 rate limit) occurs mid-stream after partial audio was already sent to the user, the TTS stream raises the exception instead of handling it gracefully. This crashes the entire TTS stream and the agent's turn.
The root cause is in the retry guard: it only retries when no audio was sent yet. If any audio was already pushed, it falls through to the raise path.
Fix
When a retryable error occurs mid-stream (pushed_duration > 0), end the segment gracefully instead of crashing. The already-sent audio remains usable. A warning is logged and the error is emitted as recoverable.
Crashing the entire stream is worse than a partial utterance: the user at least hears something rather than nothing.
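The resulting decision can be summarised as a small pure function. This is a sketch only: `Action` and `decide` are hypothetical names, and the handling of a non-retryable error with no audio sent (raising) is an assumption, as is ignoring any retry-count limit the real code may apply.

```python
from enum import Enum


class Action(Enum):
    RETRY = "retry"              # nothing sent yet, safe to retry
    END_SEGMENT = "end_segment"  # partial audio sent, finish gracefully
    RAISE = "raise"              # treated as unrecoverable


def decide(pushed_duration: float, retryable: bool) -> Action:
    # Hypothetical summary of the branches described in this PR.
    if retryable and pushed_duration == 0.0:
        # Existing retry guard: only retries when no audio was sent yet.
        return Action.RETRY
    if retryable and pushed_duration > 0.0:
        # New branch: keep the already-sent audio and end the segment.
        return Action.END_SEGMENT
    # Non-retryable errors still raise, with or without partial audio
    # (the no-audio case is an assumption of this sketch).
    return Action.RAISE
```

For example, `decide(1.2, True)` yields `Action.END_SEGMENT`, while `decide(1.2, False)` still yields `Action.RAISE`, which is the case the review comment asks about.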
Changes
livekit-agents/livekit/agents/tts/tts.py: In SynthesizeStream._main_task, when a retryable error occurs with pushed_duration > 0, call output_emitter.end_input() + join() and return gracefully instead of raising.

🤖 Generated with Claude Code