fix(soniox): send per-stream keepalive to prevent 408 timeout mid-utterance #6237

Open: tsushanth wants to merge 1 commit into livekit:main from tsushanth:fix/soniox-tts-408-stream-keepalive
@tsushanth

Fixes #6225

Root cause

In agents 1.6.x the framework drives a single long-lived SynthesizeStream per turn and feeds text incrementally as LLM tokens arrive. Between token chunks — and especially during tool calls — the stream can sit idle for many seconds. Soniox enforces a per-stream idle timeout and responds with error code 408 when it fires. That exception propagates up through _main_task and tears down the agent's TTS turn entirely.

The existing connection-level keepalive ({"keep_alive": true} without a stream_id, sent every 10 s) keeps the WebSocket alive between streams but does not reset the per-stream idle timer on the Soniox server — the server tracks each stream independently.

Fix

Add a _KeepAliveStream outbound message type that carries a stream_id. The _keepalive_loop now runs on a 5 s tick and, every 15 s, enqueues {"stream_id": "...", "keep_alive": true} for every active stream that has already sent its config but has not yet seen audio_end. The message is routed through the same _input_queue the send loop already serialises, so there is no new locking surface.

15 s is well within Soniox's ~30 s per-stream idle limit, so the timeout no longer fires during normal inter-token pauses or tool-call gaps.
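
The two cadences described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the plugin's actual code: `KeepaliveScheduler`, `StreamState`, and the constant names other than the intervals themselves are hypothetical, and the sketch yields payload dicts directly, whereas the PR enqueues a `_KeepAliveStream` object and leaves serialisation to the send loop.

```python
import asyncio
from dataclasses import dataclass
from typing import Dict, Iterable, List

# Intervals taken from the PR description (seconds).
TICK_INTERVAL = 5.0
CONNECTION_KEEPALIVE_INTERVAL = 10.0
STREAM_KEEPALIVE_INTERVAL = 15.0


@dataclass
class StreamState:
    """Hypothetical per-stream bookkeeping; the real plugin's fields may differ."""
    stream_id: str
    config_sent: bool = False
    audio_ended: bool = False


class KeepaliveScheduler:
    """Tracks elapsed time per cadence and reports which keepalives are due."""

    def __init__(self) -> None:
        self._since_conn = 0.0
        self._since_stream = 0.0

    def tick(self, dt: float, streams: Iterable[StreamState]) -> List[Dict]:
        self._since_conn += dt
        self._since_stream += dt
        due: List[Dict] = []
        if self._since_conn >= CONNECTION_KEEPALIVE_INTERVAL:
            self._since_conn = 0.0
            # Connection-level keepalive: no stream_id.
            due.append({"keep_alive": True})
        if self._since_stream >= STREAM_KEEPALIVE_INTERVAL:
            self._since_stream = 0.0
            for s in streams:
                # Only streams that have sent config and not yet seen audio_end.
                if s.config_sent and not s.audio_ended:
                    due.append({"stream_id": s.stream_id, "keep_alive": True})
        return due


async def keepalive_loop(queue: asyncio.Queue, streams: Dict[str, StreamState]) -> None:
    """Runs until cancelled, pushing due keepalives onto the shared outbound queue."""
    scheduler = KeepaliveScheduler()
    while True:
        await asyncio.sleep(TICK_INTERVAL)
        for msg in scheduler.tick(TICK_INTERVAL, list(streams.values())):
            queue.put_nowait(msg)
```

Keeping the "what is due" decision in a plain method, separate from the sleeping loop, makes the cadence checkable without waiting on real timers.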

Changes

  • livekit-plugins/livekit-plugins-soniox/livekit/plugins/soniox/tts.py
    • Add _STREAM_KEEPALIVE_INTERVAL = 15 constant with explanatory comment
    • Add _KeepAliveStream dataclass and include it in _OutboundMsg
    • Handle _KeepAliveStream in _send_loop by emitting {"stream_id": ..., "keep_alive": true}
    • Refactor _keepalive_loop to tick every 5 s; fire the connection-level keepalive on the existing 10 s cadence and the per-stream keepalive on the 15 s cadence

…erance

In agents 1.6.x the agent drives a single long-lived TTS stream per turn
and feeds text incrementally as LLM tokens arrive.  Between token chunks
(and especially during tool calls) the stream sits idle for many seconds.
Soniox enforces a per-stream idle timeout and responds with a 408 error
when it fires, which tears down the stream and crashes the agent's turn.

The existing connection-level keepalive ({"keep_alive": true} without a
stream_id) keeps the WebSocket alive but does not reset the per-stream
idle counter on the server.

This adds a _KeepAliveStream outbound message type sent through the same
_input_queue used for text and cancel messages.  The _keepalive_loop now
runs on a 5 s tick and, every 15 s, enqueues a per-stream keepalive
({"stream_id": "...", "keep_alive": true}) for every stream that has
already sent its config but has not yet seen audio_end.  15 s is well
within Soniox's ~30 s idle limit so the timeout no longer fires during
normal inter-token pauses.

Fixes livekit#6225
@tsushanth tsushanth requested a review from a team as a code owner June 26, 2026 02:12

@devin-ai-integration devin-ai-integration Bot left a comment

Contributor

✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no bugs or issues to report.

@davidzhao

Member

@tsushanth could you run make fix to get CI back to passing?

Development

Successfully merging this pull request may close these issues.

Soniox TTS throws 408 timeout and breaks the stream

2 participants