fix: retry when model returns empty response after tool execution #5859
Open
epa-davita wants to merge 8 commits into
Some models (notably Claude, and some Gemini preview models) occasionally return an empty content array (parts: []) after processing tool results. ADK's is_final_response() treats this as a valid completed turn because it only checks for the absence of function calls, not the presence of actual content. The agent loop stops and the user sees nothing.
This adds a retry mechanism in BaseLlmFlow.run_async() that detects empty/meaningless final responses and re-prompts the model, up to a configurable maximum (default 2 retries) to prevent infinite loops.
Closes google#3754
Related: google#3467, google#4090, google#3034
…kashbangad/adk-python into edk-python-issue-3754
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up to date status, view the checks section at the bottom of the pull request.
Response from ADK Triaging Agent
Hello @epa-davita, thank you for submitting this pull request! This PR is a great fix for the empty model response issue. To help us move forward with reviewing your contribution, please make sure you follow the Contribution Guidelines:
Thank you for your contribution to the project!
…adk-python-issue-3754
Bug
Some models (notably Claude, and some Gemini preview models) return an empty content array (parts: []) after processing tool results. ADK's is_final_response() treats this as a valid completed turn because it only checks for the absence of function calls — not the presence of actual content. The agent loop stops and the user sees nothing.
Observed with:
Claude (Opus/Sonnet/Haiku) via AnthropicLlm — after run_shell, computer_use tool results
Gemini preview models — after tool execution with streaming enabled
Example session history showing the bug:
Event 19: agent calls run_shell({"command": "cloudflared --version"})
Event 20: tool responds: {"output": "cloudflared version 2026.3.0", "exit_code": 0}
Event 21: agent responds with parts: [] ← EMPTY, agent loop ends, user sees nothing
Root Cause
In BaseLlmFlow.run_async() (line 757):
if not last_event or last_event.is_final_response() or last_event.partial:
    break
And is_final_response() in event.py:
return (
    not self.get_function_calls()
    and not self.get_function_responses()
    and not self.partial
    and not self.has_trailing_code_execution_result()
)
An event with parts: [] passes all these checks — no function calls, no function responses, not partial — so is_final_response() returns True and the loop breaks.
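The behaviour described above can be reproduced without ADK installed. The following is a minimal sketch, not the library's code: `Part` and `Event` are simplified stand-ins (the real event nests its parts under a content object), and only `is_final_response()` copies the four checks quoted above.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Part:
    """Simplified stand-in for a content part (not the real ADK type)."""
    text: Optional[str] = None
    function_call: Optional[dict] = None
    function_response: Optional[dict] = None


@dataclass
class Event:
    """Simplified stand-in for an ADK event; parts are flattened here."""
    parts: List[Part] = field(default_factory=list)
    partial: bool = False

    def get_function_calls(self):
        return [p.function_call for p in self.parts if p.function_call]

    def get_function_responses(self):
        return [p.function_response for p in self.parts if p.function_response]

    def has_trailing_code_execution_result(self):
        # Code execution is not modelled in this sketch.
        return False

    def is_final_response(self):
        # Same four checks as the snippet quoted above.
        return (
            not self.get_function_calls()
            and not self.get_function_responses()
            and not self.partial
            and not self.has_trailing_code_execution_result()
        )


empty_event = Event(parts=[])
tool_call_event = Event(parts=[Part(function_call={"name": "run_shell"})])

print(empty_event.is_final_response())      # True: nothing marks it as unfinished
print(tool_call_event.is_final_response())  # False: a function call is pending
```

None of the four checks looks at whether any part carries text or data, which is why the empty event is indistinguishable from a real final answer.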
Fix
Added a retry mechanism in BaseLlmFlow.run_async():
_has_meaningful_content(event) — helper that checks if an event actually contains content worth showing (non-empty text, function calls, inline data, etc.)
When is_final_response() is True but the event has no meaningful content, the loop continues instead of breaking, re-prompting the model
A maximum retry count (_MAX_EMPTY_RESPONSE_RETRIES = 2) prevents infinite loops if the model keeps returning empty responses
Tests
Added 10 new tests in test_empty_response_retry.py:
_has_meaningful_content tests (7):
test_no_content — None content → not meaningful
test_empty_parts — parts: [] → not meaningful
test_only_empty_text_part — text="" → not meaningful
test_only_whitespace_text_part — text=" \n " → not meaningful
test_non_empty_text — actual text → meaningful
test_function_call — function call → meaningful
test_function_response — function response → meaningful
Integration tests (3):
test_empty_response_retried_then_succeeds — empty response triggers retry, second call succeeds
test_empty_response_stops_after_max_retries — stops after max retries to prevent infinite loop
test_non_empty_response_not_retried — normal responses are not retried
All 10 tests pass. All 356 pre-existing flows/llm_flows/ tests pass.
pytest tests/unittests/flows/llm_flows/test_empty_response_retry.py -v
Closes #3754
Related: #3467, #4090, #3034
Note: This is the same as #4982 but I've fixed the failing unit tests.
🤖 Generated with Claude Code