fix(google): pause audio input during synchronous tool execution on t… by vedevpatel · Pull Request #5556 · livekit/agents

vedevpatel · 2026-04-26T13:49:49Z

Gemini 3.1 live model

Gemini 3.1 forces synchronous tool calling, so the model blocks until tool responses arrive. The plugin's _send_task kept forwarding microphone audio while tools executed, which caused the server to treat the incoming audio as a new turn and cancel the pending tool call after ~12s. The result was duplicate tool execution with already-resolved call_ids and corrupted conversation state.

Adds a _tool_call_pending flag (for Gemini 3.1 only) that drops push_audio frames from the moment a toolCall is received until send_tool_response is flushed.
Also clears the flag on tool_call_cancellation so the session never stalls. No behavior change for Gemini 2.5 models.
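
The gating described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the plugin's code: `AudioGate`, `gate_enabled`, and the `on_*` method names are hypothetical, and frames are plain `bytes` rather than `rtc.AudioFrame`.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class AudioGate:
    """Hypothetical stand-in for the session's audio path (not the real plugin class)."""

    # Assumption: True only for models that force synchronous tool calls.
    gate_enabled: bool
    tool_call_pending: bool = False
    forwarded: list[bytes] = field(default_factory=list)

    def on_tool_call(self) -> None:
        # Start dropping audio as soon as a tool call arrives.
        if self.gate_enabled:
            self.tool_call_pending = True

    def on_tool_response_sent(self) -> None:
        # Resume audio once the tool response has been flushed.
        self.tool_call_pending = False

    def on_tool_call_cancelled(self) -> None:
        # Also resume on cancellation so the session never stalls.
        self.tool_call_pending = False

    def push_audio(self, frame: bytes) -> None:
        if self.tool_call_pending:
            return  # frame is dropped while the tool call is in flight
        self.forwarded.append(frame)
```

With `gate_enabled=False` the flag is never set, which matches the stated intent of leaving other models unchanged.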

Copilot

Pull request overview

Updates the Google Gemini Realtime session to prevent microphone audio from being streamed while Gemini 3.1 synchronous tool calls are in-flight, avoiding server-side “new turn” detection that cancels pending tool calls and corrupts state.

Changes:

Added a _tool_call_pending flag to pause push_audio() during Gemini 3.1 tool execution.
Set the flag when a tool call is received (Gemini 3.1 only), clear it after sending tool responses.
Clear the flag on tool_call_cancellation to prevent stalling.


Copilot · 2026-04-26T13:53:24Z


    def push_audio(self, frame: rtc.AudioFrame) -> None:
+        if self._tool_call_pending:


When audio is dropped due to _tool_call_pending, the AudioByteStream (and potentially the resampler) may still contain buffered partial samples from before the tool call. When _tool_call_pending flips back to False, the next push_audio() call can combine that stale buffered audio with new audio, creating a discontinuity/corrupted stream. Consider clearing _bstream (and resetting _input_resampler if needed) when entering the pending state (or before returning early here).

Suggested change

-    def push_audio(self, frame: rtc.AudioFrame) -> None:
-        if self._tool_call_pending:
+    def _clear_pending_audio_state(self) -> None:
+        flush_bstream = getattr(self._bstream, "flush", None)
+        if callable(flush_bstream):
+            for _ in flush_bstream():
+                pass
+        input_resampler = getattr(self, "_input_resampler", None)
+        if input_resampler is None:
+            return
+        reset_resampler = getattr(input_resampler, "reset", None)
+        if callable(reset_resampler):
+            reset_resampler()
+            return
+        flush_resampler = getattr(input_resampler, "flush", None)
+        if callable(flush_resampler):
+            for _ in flush_resampler():
+                pass
+    def push_audio(self, frame: rtc.AudioFrame) -> None:
+        if self._tool_call_pending:
+            self._clear_pending_audio_state()
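
The stale-buffer concern raised in this comment can be illustrated with a toy chunker. This is only a sketch under assumptions: `ChunkBuffer` is a hypothetical class that emits fixed-size chunks and keeps the remainder, and it is not the real `AudioByteStream`.

```python
class ChunkBuffer:
    """Hypothetical fixed-size chunker that retains any partial remainder."""

    def __init__(self, chunk_size: int) -> None:
        self._chunk_size = chunk_size
        self._buf = bytearray()

    def push(self, data: bytes) -> list[bytes]:
        # Append new bytes, then emit as many full chunks as are available.
        self._buf.extend(data)
        chunks = []
        while len(self._buf) >= self._chunk_size:
            chunks.append(bytes(self._buf[: self._chunk_size]))
            del self._buf[: self._chunk_size]
        return chunks

    def clear(self) -> None:
        # Discard the partial remainder left over from earlier pushes.
        self._buf.clear()


# Without clearing, bytes from before the pause leak into the first new chunk.
stale = ChunkBuffer(4)
stale.push(b"AAAAAA")            # emits b"AAAA", keeps b"AA"
mixed = stale.push(b"BBBB")      # first chunk mixes old and new bytes

# Clearing when the pause begins keeps the resumed stream clean.
fresh = ChunkBuffer(4)
fresh.push(b"AAAAAA")
fresh.clear()
clean = fresh.push(b"BBBB")
```

Here `mixed` starts with leftover `A` bytes, while `clean` contains only the audio pushed after the pause.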

Copilot · 2026-04-26T13:53:24Z

+        # true while synchronous tool call is in flight for 3.1 only
+        # Audio frames dropped here to prevent server from thinking incoming audio is a
+        # new turn and cancelling the pending tool call
+        self._tool_call_pending = False


_tool_call_pending is only cleared on tool response send and server tool-call cancellation. If the session is restarted/disconnected while a tool call is pending (e.g., send/recv task errors trigger _mark_restart_needed(on_error=True)), the flag can remain True across reconnects and permanently mute push_audio() for Gemini 3.1. Suggest resetting _tool_call_pending as part of session restart/close (e.g., in _close_active_session, _mark_restart_needed, or at the start of each connect loop).
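
One way to picture the suggested reset is to treat the flag as per-connection state that is cleared at the start of every connect. The sketch below is an assumption-laden model, not the plugin's session: `GatedSession`, `connect`, and `accepts_audio` are hypothetical names.

```python
class GatedSession:
    """Hypothetical model of a session; only the pending flag is represented."""

    def __init__(self) -> None:
        self.tool_call_pending = False
        self.connections = 0

    def connect(self) -> None:
        # Reset per-connection state so a tool call that was pending when the
        # previous connection died cannot keep audio muted on the new one.
        self.tool_call_pending = False
        self.connections += 1

    def on_tool_call(self) -> None:
        self.tool_call_pending = True

    def accepts_audio(self) -> bool:
        return not self.tool_call_pending


session = GatedSession()
session.connect()
session.on_tool_call()                     # tool call in flight, audio is gated
muted_before_restart = not session.accepts_audio()
session.connect()                          # simulated reconnect after an error
audio_after_restart = session.accepts_audio()
```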

Copilot · 2026-04-26T13:53:24Z

                )
            )
        self._mark_current_generation_done()
+        if "3.1" in self._opts.model:


The model gating if "3.1" in self._opts.model is imprecise and can accidentally match non-Live models or future model names (the file already enumerates known Live model names via KNOWN_GEMINI_API_MODELS / LiveAPIModels). Prefer an exact match (or a well-scoped prefix check like model.startswith("gemini-3.1-")) to keep the behavior tightly bound to Gemini 3.1 Live only.

Suggested change

-        if "3.1" in self._opts.model:
+        if self._opts.model.startswith("gemini-3.1-"):
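
The difference between the two checks can be seen on a few strings. The model names below are made up purely for illustration and are not real model identifiers.

```python
def matches_by_substring(model: str) -> bool:
    # Original check: true for any name containing "3.1" anywhere.
    return "3.1" in model


def matches_by_prefix(model: str) -> bool:
    # Suggested check: true only for names starting with "gemini-3.1-".
    return model.startswith("gemini-3.1-")


# Hypothetical names, chosen only to exercise both checks.
names = [
    "gemini-3.1-live-example",
    "gemini-2.5-example-v3.1",
    "other-3.1-model",
]

substring_results = [matches_by_substring(n) for n in names]
prefix_results = [matches_by_prefix(n) for n in names]
```

The substring check accepts all three names, while the prefix check accepts only the first.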

longcw · 2026-04-27T07:04:22Z

                )
            )
        self._mark_current_generation_done()
+        if "3.1" in self._opts.model:


First of all, I don't think dropping audio while a tool call is in flight is the right solution, but I'm also wondering why it is only applied to 3.1 here.

FYI, we support Gemini NON_BLOCKING tool calls via the tool_behavior option. Could you check that option instead of the model name? Or even make this configurable when the tool behavior is blocking?


devin-ai-integration

Devin Review found 1 new potential issue.


devin-ai-integration · 2026-04-27T18:46:49Z

+        is_blocking = (
+            not is_given(self._opts.tool_behavior)
+            or self._opts.tool_behavior == types.Behavior.BLOCKING
+        )
+        if is_blocking:
+            self._tool_call_pending = True
+            self._bstream.clear()


🔴 Audio silently dropped during blocking tool calls on all models, not just 3.1 as intended

The comment on line 498 says "for 3.1 only" and the commit message says "on the Gemini 3.1 live model", but the is_blocking check at lines 1306-1309 has no model guard — it evaluates to True for any model when tool_behavior is NOT_GIVEN (the default) or BLOCKING. This means push_audio silently drops all audio frames during blocking tool execution on the default 2.5 models (gemini-2.5-flash-native-audio-preview-12-2025) as well, where the underlying server issue (audio being interpreted as a new turn that cancels the pending tool call) may not exist. Users speaking during tool execution on 2.5 models will have their audio silently discarded.

Prompt for agents

The _handle_tool_calls method sets _tool_call_pending = True for all models with blocking tool behavior, but the comment and commit message state this should only apply to 3.1 models. The model name is available via self._opts.model. The fix should add a model check, e.g. checking if '3.1' is in self._opts.model (similar to how the RealtimeModel.__init__ uses '3.1 in model' to determine mutability at realtime_api.py:289). For example, the is_blocking check should also verify the model is a 3.1 model before setting _tool_call_pending = True and clearing the byte stream.
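
A combined guard along the lines requested here might look like the sketch below. It is self-contained and uses local stand-ins: `Behavior`, `NOT_GIVEN`, and `should_gate_audio` are hypothetical substitutes for the plugin's `types.Behavior` and `is_given`, and `gemini-3.1-example` is a made-up model name. The prefix check is one option; the substring check mentioned in the prompt would slot in the same way.

```python
from enum import Enum


class Behavior(Enum):
    """Local stand-in for the tool behavior enum used in the diff."""

    BLOCKING = "BLOCKING"
    NON_BLOCKING = "NON_BLOCKING"


class _NotGiven:
    """Local stand-in for an 'option not provided' sentinel."""


NOT_GIVEN = _NotGiven()


def should_gate_audio(model: str, tool_behavior: object) -> bool:
    # Blocking is assumed to be the default when the option is not given.
    is_blocking = isinstance(tool_behavior, _NotGiven) or tool_behavior is Behavior.BLOCKING
    # Model guard: only gate audio for Gemini 3.1 models.
    is_gemini_31 = model.startswith("gemini-3.1-")
    return is_blocking and is_gemini_31
```

For example, `should_gate_audio("gemini-2.5-flash-native-audio-preview-12-2025", NOT_GIVEN)` is false under this guard, so audio on the default 2.5 model would not be dropped.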


Copilot AI review requested due to automatic review settings April 26, 2026 13:49

Copilot started reviewing on behalf of vedevpatel April 26, 2026 13:50

Copilot AI reviewed Apr 26, 2026


vedevpatel force-pushed the fix/gemini-3.1-sync-tool-audio-gate branch from 0dbb2e9 to 5ae5e69 April 26, 2026 14:51

vedevpatel force-pushed the fix/gemini-3.1-sync-tool-audio-gate branch from 5ae5e69 to 9325de1 April 26, 2026 14:56

longcw reviewed Apr 27, 2026


vedevpatel force-pushed the fix/gemini-3.1-sync-tool-audio-gate branch from 9325de1 to db77c65 April 27, 2026 18:41

devin-ai-integration Bot reviewed Apr 27, 2026

vedevpatel wants to merge 1 commit into livekit:main from vedevpatel:fix/gemini-3.1-sync-tool-audio-gate


3 participants

