fix: propagate RealtimeError from generate_reply timeout (#6224)#6226
fix: propagate RealtimeError from generate_reply timeout (#6224)#6226C1-BA-B1-F3 wants to merge 2 commits into
Conversation
wait_if_not_interrupted used asyncio.gather(return_exceptions=True) which swallowed exceptions from the generate_reply future. Now checks results and re-raises any non-cancelled exceptions.
…upted The else branch added to wait_if_not_interrupted re-raised ALL exceptions from gathered futures, which broke callers that relied on the old behavior where exceptions were captured via return_exceptions=True and callers inspected them manually via Future.result(). Specifically: - ChanClosed (normal end-of-generation) was re-raised before _next_segment could catch it via recv.result(), crashing the pipeline. - Other callers (generation.py, filler_scheduler.py, tool execution) were also affected. The PR's real fix - adding speech_handle._mark_done(error=e) at the caller sites in agent_activity.py - is preserved and correctly propagates RealtimeError to the SpeechHandle without needing wait_if_not_interrupted to re-raise exceptions.
| session = create_session(FakeActions(), speed_factor=1.0) | ||
| # Replace the LLM with our fake realtime model | ||
| session._llm = _FakeRealtimeModel() | ||
|
|
||
| agent = MyAgent() | ||
|
|
||
| with pytest.raises(RealtimeError, match="generate_reply timed out"): | ||
| async with session: | ||
| await agent.start(session) | ||
| # generate_reply should raise RealtimeError instead of hanging | ||
| handle = session.generate_reply(instructions="say hello") | ||
| await handle |
There was a problem hiding this comment.
🚩 Test monkey-patches session._llm which may not exercise the full realtime path
The test at tests/test_agent_session.py:1811 sets session._llm = _FakeRealtimeModel() after session creation. Since create_session(FakeActions()) constructs the session with a non-realtime LLM, the AgentActivity created when the agent starts may not initialize its _rt_session properly. Whether this test actually triggers the _realtime_reply_task code path depends on whether AgentActivity re-reads session._llm at startup vs. caching it. If the realtime session is not set up, the assertion assert self._rt_session is not None at line 3183 would fire instead of reaching the intended error path. Worth verifying the test passes in CI.
Was this helpful? React with 👍 or 👎 to provide feedback.
Fixes #6224. The wait_if_not_interrupted method used asyncio.gather(return_exceptions=True) which swallowed RealtimeError exceptions from the generate_reply timeout future.