[bug] client.evals.create_evaluation_run(agent=<reasoningEngine>) fails for A2A engines — hardcodes stream_query() on agent_engine that only exposes on_message_send #6837

@Abeansits

Description

TL;DR

vertexai._genai._evals_common._execute_agent_run_with_retry unconditionally calls agent_engine.stream_query(...), but Agent Engines deployed with the A2A template (vertexai.preview.reasoning_engines.A2aAgent / agents-cli --agent adk_a2a) expose only on_message_send (api_mode=a2a_extension). The resulting AttributeError is swallowed by a broad except Exception, surfaces as the opaque INTERNAL: Internal error occurred., and the eval items never reach the engine container.

Repros 100% against any A2A engine; affects all client.evals.create_evaluation_run(agent=...) and client.evals.run_inference(agent=...) calls.

Two distinct defects in this report (see Root Cause):

  1. Wrong method dispatch — A2A engines have no stream_query
  2. Overbroad exception masking — the real AttributeError is invisible to the caller

A2A is a first-class Vertex Agent Engine deployment path (docs, SDK reference), not an edge case, so failing silently here breaks a documented happy path.

Versions

  • google-cloud-aiplatform==1.148.1 (also present at HEAD of the affected files in 1.149.x – 1.153.x)
  • a2a-sdk==0.3.26
  • Python 3.12, Linux + macOS

Reproduce

from vertexai import Client
from vertexai._genai import types as evals_types
import pandas as pd

client = Client(project="<PROJECT>", location="us-central1")
run = client.evals.create_evaluation_run(
    dataset=evals_types.EvaluationDataset(
        eval_dataset_df=pd.DataFrame({"prompt": ["hello"]})
    ),
    dest="gs://<BUCKET>/eval/",
    metrics=[evals_types.RubricMetric.GENERAL_QUALITY],
    agent="projects/<PROJECT>/locations/us-central1/reasoningEngines/<A2A_ENGINE_ID>",
)
# Run state: FAILED
# Item errors: "INTERNAL: Internal error occurred."
# Engine container: zero log entries during the inference window

The swallowed exception under the opaque INTERNAL (captured by calling stream_query directly against the engine):

Traceback (most recent call last):
  File ".../vertexai/_genai/_evals_common.py", line 2058, in _execute_agent_run_with_retry
    for event in agent_engine.stream_query(
                 ^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'AgentEngine' object has no attribute 'stream_query'

The retry loop catches this on every attempt and returns {"error": "Failed after retries: ..."}, which is then surfaced to the user as INTERNAL: Internal error occurred. with no mention of AttributeError.

Root cause

Defect 1: Wrong method dispatch (_evals_common.py:2058)

for event in agent_engine.stream_query(  # type: ignore[attr-defined]
    user_id=user_id,
    session_id=session_id,
    message=contents,
):

There's no branching on the engine's API mode. A2A engines' operation_schemas() reports:

[
  {
    "name": "on_message_send",
    "api_mode": "a2a_extension",
    "parameters": {"type": "object", "required": ["request", "context"]}
  }
]

Verified live:

ae = client.agent_engines.get(name=A2A_ENGINE_RESOURCE)
hasattr(ae, "stream_query")     # False
hasattr(ae, "on_message_send")  # True

Defect 2: Overbroad exception masking (_evals_common.py:2078-2089)

except Exception as e:  # pylint: disable=broad-exception-caught
    logger.error("Unexpected error during generate_content on attempt %d/%d: %s", ...)
    if attempt == max_retries - 1:
        return {"error": f"Failed after retries: {e}"}

The broad except Exception plus 3 retries means a deterministic AttributeError is retried pointlessly and then surfaces as a generic post-retry error string. The original exception type is lost from the public error path; a user has to read the SDK source to work out what is happening.
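To make the masking concrete, here is a minimal standalone sketch that imitates the shape of the retry loop quoted above. It does not import the SDK: run_with_broad_except and FakeA2aEngine are hypothetical names invented for illustration, and the fake engine only mirrors the fact that an A2A engine has on_message_send but no stream_query.

```python
def run_with_broad_except(call, max_retries=3):
    """Imitates the quoted retry shape: every Exception is caught and flattened."""
    for attempt in range(max_retries):
        try:
            return {"response": call()}
        except Exception as e:  # same broad catch as in the report
            if attempt == max_retries - 1:
                # Only str(e) survives; the exception class is dropped.
                return {"error": f"Failed after retries: {e}"}


class FakeA2aEngine:
    """Hypothetical stand-in for an A2A engine: no stream_query attribute."""

    def on_message_send(self, **kwargs):
        return {"ok": True}


engine = FakeA2aEngine()
# The lambda defers the attribute lookup so it fails inside the try block.
result = run_with_broad_except(lambda: engine.stream_query(message="hello"))
print(result)
```

The returned dict carries the message text but not the AttributeError type, and the deterministic failure is attempted max_retries times before being reported. In the real path even less survives, since the user only sees INTERNAL: Internal error occurred.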

Two acceptable outcomes for the fix

I'd consider either of these resolving the report:

Option A — Add A2A dispatch path

In _execute_agent_run_with_retry, detect A2A via operation_schemas() and dispatch via on_message_send (Vertex A2A proto-JSON <engine_base>/a2a/v1/message:send):

schemas = agent_engine.operation_schemas()
a2a_method = next(
    (s["name"] for s in schemas if s.get("api_mode") == "a2a_extension"),
    None,
)
if a2a_method:
    # Dispatch via A2A message:send and translate the response to the
    # eval service's expected event shape.
    ...
else:
    for event in agent_engine.stream_query(...):
        ...
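A runnable sketch of that selection step, using stub engines instead of the real SDK, might look like the following. select_inference_method and the three Fake* classes are hypothetical; the "stream" api_mode on the non-A2A stub is an assumption made for the example, and translating the A2A response into the eval service's event shape is deliberately left out.

```python
def select_inference_method(agent_engine):
    """Return (kind, bound_method) for the engine, or raise a descriptive error."""
    schemas = agent_engine.operation_schemas()
    a2a_names = [s["name"] for s in schemas if s.get("api_mode") == "a2a_extension"]
    if "on_message_send" in a2a_names:
        return "a2a", agent_engine.on_message_send
    if hasattr(agent_engine, "stream_query"):
        return "stream_query", agent_engine.stream_query
    # Neither path is available: say so explicitly instead of failing later.
    exposed = sorted(s["name"] for s in schemas)
    raise TypeError(f"Engine exposes no supported inference method: {exposed}")


class FakeA2aEngine:
    """Stub mirroring the operation_schemas() output shown in this report."""

    def operation_schemas(self):
        return [{"name": "on_message_send", "api_mode": "a2a_extension"}]

    def on_message_send(self, **kwargs):
        return {"ok": True}


class FakeStreamEngine:
    """Stub for a non-A2A engine; the 'stream' api_mode value is assumed."""

    def operation_schemas(self):
        return [{"name": "stream_query", "api_mode": "stream"}]

    def stream_query(self, **kwargs):
        yield {"content": "hi"}


class FakeBareEngine:
    """Stub exposing nothing usable for inference."""

    def operation_schemas(self):
        return []


a2a_kind, _ = select_inference_method(FakeA2aEngine())
stream_kind, _ = select_inference_method(FakeStreamEngine())
print(a2a_kind, stream_kind)
```

Matching on on_message_send by name rather than taking the first a2a_extension entry is a design choice in this sketch: it stays correct if an engine reports more than one A2A operation.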

Option B — Explicit unsupported error

If A2A inference is out of scope for the eval SDK in the near term, fail fast at create_evaluation_run / run_inference time with an actionable error:

schemas = agent_engine.operation_schemas()
if any(s.get("api_mode") == "a2a_extension" for s in schemas):
    raise NotImplementedError(
        "A2A agent engines are not yet supported by client.evals inference. "
        "Use the two-step workaround: invoke the engine directly via "
        "with pre-populated 'response' column."
    )

In both options, the broad except Exception at line 2078 should be narrowed (or moved to catch only transport / quota retryable errors) so genuine programming defects don't surface as INTERNAL.
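One way the narrowed handler could look is sketched below, independent of the SDK. run_with_narrow_retry and the RETRYABLE tuple are hypothetical; built-in ConnectionError and TimeoutError stand in for whatever transport and quota errors the real code would treat as retryable.

```python
import time

# Assumption: placeholder retryable types. The SDK would more plausibly list
# transport/quota exceptions here rather than these built-ins.
RETRYABLE = (ConnectionError, TimeoutError)


def run_with_narrow_retry(call, max_retries=3, base_delay=0.0):
    """Retry only RETRYABLE errors; let everything else propagate unchanged."""
    if max_retries < 1:
        raise ValueError("max_retries must be >= 1")
    for attempt in range(max_retries):
        try:
            return {"response": call()}
        except RETRYABLE as e:
            if attempt == max_retries - 1:
                # Keep the exception class name in the public error string.
                return {"error": f"Failed after retries: {type(e).__name__}: {e}"}
            time.sleep(base_delay * (2 ** attempt))  # exponential backoff
```

With this shape, a transient ConnectionError is retried, while an AttributeError from a missing stream_query propagates to the caller on the first attempt with its original type and traceback.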
