fix: make `_EvalMetricResultWithInvocation.expected_invocation` Optional for `conversation_scenario` support (#5215)
When using `conversation_scenario` for user simulation, `expected_invocation` is `None` because conversations are dynamically generated. The public model `EvalMetricResultPerInvocation` already types this as `Optional[Invocation]`, but the private `_EvalMetricResultWithInvocation` requires non-`None`, causing a pydantic `ValidationError` during post-processing.

- Make `expected_invocation` `Optional[Invocation] = None`
- Guard attribute accesses in `_print_details` to handle `None`
- Fall back to `actual_invocation.user_content` for the prompt column

Fixes google#5214
> Response from ADK Triaging Agent
>
> Hello @ASRagab, thank you for submitting this pull request! To help the reviewers, could you please add a testing plan? Including the logs or a screenshot showing that the fix works as expected would be very helpful. You can find more details in our contribution guidelines. Thanks!
Testing Evidence for PR #5215

Reproduction Script

A targeted script that exercises the exact codepath fixed by this PR:
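The original script is not reproduced here; a minimal stand-in that exercises the same codepath could look like the following (the `Invocation` stub and `_Before`/`_After` class names are simplified placeholders, not the real ADK classes):

```python
from typing import Optional

from pydantic import BaseModel, ValidationError


class Invocation(BaseModel):
    # Minimal stand-in for the real ADK Invocation model.
    user_content: Optional[str] = None


class _Before(BaseModel):
    # Old private model: expected_invocation was required,
    # so passing None raised a pydantic ValidationError.
    actual_invocation: Invocation
    expected_invocation: Invocation


class _After(BaseModel):
    # Fixed model: Optional with a None default, matching the public
    # EvalMetricResultPerInvocation.
    actual_invocation: Invocation
    expected_invocation: Optional[Invocation] = None


actual = Invocation(user_content="Hello")

# conversation_scenario cases supply no expected invocation.
try:
    _Before(actual_invocation=actual, expected_invocation=None)
    raised = False
except ValidationError:
    raised = True

fixed = _After(actual_invocation=actual, expected_invocation=None)
print(raised, fixed.expected_invocation)
```

With the old required field the `None` value fails validation; with the `Optional` field it is accepted and flows through to `_print_details`.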
Before (PyPI

| Check | Result |
|---|---|
| `expected_invocation: Optional[Invocation] = None` (line 93) | `None` accepted without `ValidationError` |
| `_print_details` prompt fallback to `actual_invocation.user_content` | Works correctly |
| `_print_details` expected_response fallback to `None` | `_convert_content_to_text(None)` returns `""` |
| `_print_details` expected_tool_calls fallback to `None` | `_convert_tool_calls_to_text(None)` returns `""` |
| Non-`None` `expected_invocation` (regression) | Still works as before |
Context
This was tested using a `conversation_scenario`-based evalset from an agent project. The multi-turn evalset has 5 cases that all use `conversation_scenario` (no explicit `conversation` array), which is exactly the codepath where `local_eval_service.py` sets `expected_invocation=None` during post-processing.
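That codepath can be approximated as follows (a simplified sketch using strings in place of real invocation objects; `pair_invocations` is a hypothetical helper, not a function from `local_eval_service.py`):

```python
from typing import List, Optional, Tuple


def pair_invocations(
    actual: List[str], conversation: Optional[List[str]]
) -> List[Tuple[str, Optional[str]]]:
    # When eval_case.conversation is None (conversation_scenario cases),
    # there is no recorded expected invocation to pair with, so the
    # expected slot is filled with None.
    pairs = []
    for i, act in enumerate(actual):
        expected = conversation[i] if conversation is not None else None
        pairs.append((act, expected))
    return pairs


# conversation_scenario case: dynamically generated turns, no expected side.
print(pair_invocations(["turn1", "turn2"], None))
# Recorded-conversation case: each turn pairs with its expected invocation.
print(pair_invocations(["turn1"], ["expected1"]))
```

Before the fix, each `(actual, None)` pair failed model validation in `_get_eval_metric_results_with_invocation`; after the fix it is accepted.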
Summary
- `_EvalMetricResultWithInvocation.expected_invocation` is typed as `Invocation` (required), but `local_eval_service.py:285-287` intentionally sets it to `None` when `eval_case.conversation` is `None` (i.e., `conversation_scenario` user-simulation cases)
- `EvalMetricResultPerInvocation` in `eval_metrics.py:323` already types this field as `Optional[Invocation] = None`
- The mismatch causes a `ValidationError` during post-processing in `_get_eval_metric_results_with_invocation`, after all metrics have been computed

Changes

- Make `expected_invocation` `Optional[Invocation] = None` in `_EvalMetricResultWithInvocation`
- Update `_print_details` to handle `None` (fall back to `actual_invocation.user_content` for the prompt column, `None` for expected response/tool calls)
- `_convert_content_to_text` and `_convert_tool_calls_to_text` already accept `Optional` parameters

Testing Plan
Verified with a pytest-based evaluation using `AgentEvaluator.evaluate()` against an evalset containing `conversation_scenario` cases (LLM-backed user simulation, no explicit `conversation` arrays).

Before fix — crashes after ~33 minutes of metric computation during post-processing:

After fix — the `ValidationError` is eliminated. The `None` `expected_invocation` flows through correctly because:

- the field is now `Optional[Invocation]`, matching the upstream `EvalMetricResultPerInvocation` model
- `_print_details` gracefully handles `None` by falling back to `actual_invocation.user_content` for the prompt column and passing `None` to `_convert_content_to_text` / `_convert_tool_calls_to_text` (both already accept `Optional` inputs)

Reproduction evalset (any evalset with `conversation_scenario` triggers this):

```json
{
  "eval_set_id": "test",
  "eval_cases": [{
    "eval_id": "scenario_1",
    "conversation_scenario": {
      "starting_prompt": "Hello",
      "conversation_plan": "Ask the agent a question and accept the answer."
    },
    "session_input": {"app_name": "my_agent", "user_id": "user1", "state": {}}
  }]
}
```

Fixes #5214