Skip to content

fix(voice): don't re-issue in-flight tool calls on a new turn#6227

Merged
longcw merged 1 commit into
mainfrom
longc/inflight-tool-placeholder
Jun 26, 2026
Merged

fix(voice): don't re-issue in-flight tool calls on a new turn#6227
longcw merged 1 commit into
mainfrom
longc/inflight-tool-placeholder

Conversation

@longcw

@longcw longcw commented Jun 25, 2026

Copy link
Copy Markdown
Contributor

When a new generation fires while a tool from a previous turn is still running, the in-flight call has no entry in the turn's chat context, so the model re-issues it and duplicates side effects (e.g. booking a flight twice).

This injects an ephemeral in-progress function_call/function_call_output pair for the running tool calls (from the session-wide _RunningTasks registry, covering both activity- and session-scoped executors) into the copy of the chat context fed to the LLM only. It is recomputed every turn and never persisted or forwarded, so it is superseded as soon as the real output lands and a placeholder can never go stale.

It also stops spawning a tool-response generation for an interrupted turn: that reply would be dropped by the interrupted check anyway and waste an LLM call. The completed tool outputs are committed to the chat context directly instead, so an interrupted tool's result is preserved rather than lost.

@longcw longcw requested a review from a team as a code owner June 25, 2026 11:48

@devin-ai-integration devin-ai-integration Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Devin Review found 1 potential issue.

Open in Devin Review


tool_messages = new_calls + new_fnc_outputs
if fnc_executed_ev._reply_required:
if fnc_executed_ev._reply_required and not speech_handle.interrupted:

@devin-ai-integration devin-ai-integration Bot Jun 25, 2026

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚩 Interruption guard only applied to the pipeline path, not the realtime path

The new and not speech_handle.interrupted guard at livekit-agents/livekit/agents/voice/agent_activity.py:3130 prevents scheduling a tool response when the speech was interrupted during tool execution. However, the analogous realtime model path at livekit-agents/livekit/agents/voice/agent_activity.py:3781-3804 does not have this guard. The realtime path's fnc_executed_ev._reply_required check at line 3782 does not account for interruption. This is a pre-existing inconsistency (not introduced by this PR), but worth noting since this PR explicitly addresses the pipeline case.

Open in Devin Review

Was this helpful? React with 👍 or 👎 to provide feedback.

When a new generation fires while a tool from a previous turn is still running, the in-flight call has no entry in the turn's chat context, so the model re-issues it and duplicates side effects.

This injects an in-progress function_call/output pair for the running tool calls (from the session-wide _RunningTasks registry, covering both activity- and session-scoped executors) before the LLM call, so the model leaves the call alone. The pair is flagged and stripped again before the context is forwarded to the tool-reply turn, which re-injects from the live running set, so a placeholder is never persisted and can't go stale. Injecting in place rather than into a copy preserves any edits a custom llm_node makes to the context.

It also stops spawning a tool-response generation for an interrupted turn: that reply would be dropped by the interrupted check anyway and waste an LLM call. The completed tool outputs are committed to the chat context directly instead, so an interrupted tool's result is preserved rather than lost.
@longcw longcw force-pushed the longc/inflight-tool-placeholder branch from 09fd12d to 7e6e03f Compare June 25, 2026 13:25

@gyx09212214-prog gyx09212214-prog left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for tackling the duplicate in-flight tool call case. One edge case I think still needs coverage: _inject_running_tool_calls(chat_ctx, ...) mutates chat_ctx before the LLM call, but _strip_running_tool_calls(chat_ctx) currently only runs inside the _reply_required and not speech_handle.interrupted branch. If a turn is interrupted, or if _reply_required is false, the injected function_call/function_call_output pair can remain in this activity's chat context even though the PR description says placeholders are stripped and never persisted/stale.

Could we move the strip into a finally/common cleanup path after inference, or add a regression test for interrupted / no-reply-required tool completion showing the placeholder is removed from chat_ctx?

@longcw

longcw commented Jun 26, 2026

Copy link
Copy Markdown
Contributor Author

currently only runs inside the _reply_required and not speech_handle.interrupted branch. If a turn is interrupted, or if _reply_required is false, the injected function_call/function_call_output pair can remain in this activity's chat context even though the PR description says placeholders are stripped and never persisted/stale.

the chat_ctx it modified is a copy and will be discarded after the pipeline generation task, the only chance it reused is passing it to next tool reply turn

@longcw longcw merged commit 3f95602 into main Jun 26, 2026
26 checks passed
@longcw longcw deleted the longc/inflight-tool-placeholder branch June 26, 2026 11:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants