[bug] client.evals.create_evaluation_run(agent=<reasoningEngine>) fails for A2A engines: hardcodes stream_query() on agent_engine that only exposes on_message_send #6837


  ## TL;DR

  `vertexai._genai._evals_common._execute_agent_run_with_retry` unconditionally calls `agent_engine.stream_query(...)`, but Agent Engines deployed with the [A2A template](https://google.github.io/adk-docs/a2a/) (`vertexai.preview.reasoning_engines.A2aAgent` / `agents-cli --agent adk_a2a`) expose only `on_message_send` (`api_mode=a2a_extension`). The resulting `AttributeError` is swallowed by a broad `except Exception`, surfaces as the opaque `INTERNAL: Internal error occurred.`, and the eval items never reach the engine container.

  Reproduces 100% of the time against any A2A engine and affects all `client.evals.create_evaluation_run(agent=...)` and `client.evals.run_inference(agent=...)` calls.

  **Two distinct defects** in this report (see Root Cause):
  1. Wrong method dispatch — A2A engines have no `stream_query`
  2. Overbroad exception masking — the real `AttributeError` is invisible to the caller

  A2A is a first-class Vertex Agent Engine deployment path ([docs](https://google.github.io/adk-docs/a2a/), [SDK reference](https://docs.cloud.google.com/vertex-ai/generative-ai/docs/agent-engine/develop/adk-a2a)), not an edge case, so failing opaquely here breaks a documented happy path.

  ## Versions

  - `google-cloud-aiplatform==1.148.1` (also present at HEAD of the affected files in 1.149.x – 1.153.x)
  - `a2a-sdk==0.3.26`
  - Python 3.12, Linux + macOS

  ## Reproduce

  ```python
  from vertexai import Client
  from vertexai._genai import types as evals_types
  import pandas as pd

  client = Client(project="<PROJECT>", location="us-central1")
  run = client.evals.create_evaluation_run(
      dataset=evals_types.EvaluationDataset(
          eval_dataset_df=pd.DataFrame({"prompt": ["hello"]})
      ),
      dest="gs://<BUCKET>/eval/",
      metrics=[evals_types.RubricMetric.GENERAL_QUALITY],
      agent="projects/<PROJECT>/locations/us-central1/reasoningEngines/<A2A_ENGINE_ID>",
  )
  # Run state: FAILED
  # Item errors: "INTERNAL: Internal error occurred."
  # Engine container: zero log entries during the inference window
  ```

  The swallowed exception under the opaque `INTERNAL` (captured by calling `stream_query` directly against the engine):

  ```
  Traceback (most recent call last):
    File ".../vertexai/_genai/_evals_common.py", line 2058, in _execute_agent_run_with_retry
      for event in agent_engine.stream_query(
                   ^^^^^^^^^^^^^^^^^^^^^^^^^
  AttributeError: 'AgentEngine' object has no attribute 'stream_query'
  ```

  The retry loop catches this on every attempt and finally returns `{"error": "Failed after retries: ..."}`, which is then surfaced to the user as `INTERNAL: Internal error occurred.` with no mention of `AttributeError`.

  ## Root cause

  ### Defect 1: Wrong method dispatch (`_evals_common.py:2058`)

  ```python
  for event in agent_engine.stream_query(  # type: ignore[attr-defined]
      user_id=user_id,
      session_id=session_id,
      message=contents,
  ):
  ```

  There's no branching on the engine's API mode. A2A engines' `operation_schemas()` reports:

  ```json
  [
    {
      "name": "on_message_send",
      "api_mode": "a2a_extension",
      "parameters": {"type": "object", "required": ["request", "context"]}
    }
  ]
  ```

  Verified live:
  ```python
  ae = client.agent_engines.get(name=A2A_ENGINE_RESOURCE)
  hasattr(ae, "stream_query")     # False
  hasattr(ae, "on_message_send")  # True
  ```
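For a quick diagnostic across several engines, the schema list can be grouped by API mode. This is a minimal sketch, assuming `operation_schemas()` returns a list of dicts shaped like the JSON above; `methods_by_api_mode` is a hypothetical helper, not an SDK function.

```python
from collections import defaultdict


def methods_by_api_mode(schemas):
    """Group operation names by their api_mode (hypothetical helper)."""
    grouped = defaultdict(list)
    for schema in schemas:
        # Schemas without an explicit api_mode are grouped under "".
        grouped[schema.get("api_mode", "")].append(schema["name"])
    return dict(grouped)


# Sample mirroring the A2A schema reported above.
a2a_schemas = [
    {
        "name": "on_message_send",
        "api_mode": "a2a_extension",
        "parameters": {"type": "object", "required": ["request", "context"]},
    }
]

modes = methods_by_api_mode(a2a_schemas)
print(modes)
```

Against a live engine this would be called as `methods_by_api_mode(ae.operation_schemas())`; an engine whose only key is `a2a_extension` is one the current eval path cannot drive.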

  ### Defect 2: Overbroad exception masking (`_evals_common.py:2078-2089`)

  ```python
  except Exception as e:  # pylint: disable=broad-exception-caught
      logger.error("Unexpected error during generate_content on attempt %d/%d: %s", ...)
      if attempt == max_retries - 1:
          return {"error": f"Failed after retries: {e}"}
  ```

  The broad `except Exception` plus 3 retries means a deterministic `AttributeError` is retried three times and then surfaces as a generic post-retry error string. The original exception type is lost from the public error path; a user has to read SDK source to figure out what's happening.
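The masking can be reproduced without any cloud resources. The sketch below is a simulation, not the SDK's actual code: `FakeA2AEngine` and `run_with_broad_retry` are hypothetical stand-ins that only mimic the retry shape quoted above.

```python
class FakeA2AEngine:
    """Stand-in for an A2A engine: exposes on_message_send only."""

    def on_message_send(self, **kwargs):
        return {"ok": True}


def run_with_broad_retry(engine, max_retries=3):
    """Mimic the broad-catch retry loop; returns (result, attempts made)."""
    attempts = 0
    for attempt in range(max_retries):
        attempts += 1
        try:
            # Raises AttributeError on every attempt for an A2A engine.
            return list(engine.stream_query(message="hello")), attempts
        except Exception as e:  # same broad catch as in the report
            if attempt == max_retries - 1:
                return {"error": f"Failed after retries: {e}"}, attempts


result, attempts = run_with_broad_retry(FakeA2AEngine())
print(attempts)
print(result)
```

The deterministic failure consumes every retry, and the returned string carries only the exception message, not its type.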

  ## Two acceptable outcomes for the fix

  I'd consider either of these resolving the report:

  ### Option A — Add A2A dispatch path

  In `_execute_agent_run_with_retry`, detect A2A via `operation_schemas()` and dispatch via `on_message_send` (Vertex A2A proto-JSON `<engine_base>/a2a/v1/message:send`):

  ```python
  schemas = agent_engine.operation_schemas()
  a2a_method = next(
      (s["name"] for s in schemas if s.get("api_mode") == "a2a_extension"),
      None,
  )
  if a2a_method:
      # Dispatch via A2A message:send and translate the response to the
      # eval service's expected event shape.
      ...
  else:
      for event in agent_engine.stream_query(...):
          ...
  ```
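The translation step will depend on what the eval service expects, which this report does not pin down. As a rough sketch of the first half only (pulling text out of an A2A response), the helper below assumes the response is a dict with `parts` on a message and `artifacts[].parts` on a task, each text part carrying a `text` field. That layout is an assumption and should be checked against the actual Vertex response; `extract_text` is a hypothetical name.

```python
def extract_text(a2a_response):
    """Collect text from message parts and task artifacts (assumed layout)."""
    texts = []

    def collect(parts):
        for part in parts or []:
            # Accept both {"kind": "text", "text": ...} and {"text": ...}.
            if isinstance(part, dict) and isinstance(part.get("text"), str):
                texts.append(part["text"])

    collect(a2a_response.get("parts"))
    for artifact in a2a_response.get("artifacts") or []:
        collect(artifact.get("parts"))
    return "\n".join(texts)


# Illustrative payloads only; not captured from a real engine.
message_like = {"role": "agent", "parts": [{"kind": "text", "text": "hi"}]}
task_like = {"artifacts": [{"parts": [{"text": "a"}, {"text": "b"}]}]}

message_text = extract_text(message_like)
task_text = extract_text(task_like)
print(message_text)
print(task_text)
```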

  ### Option B — Explicit unsupported error

  If A2A inference is out of scope for the eval SDK in the near term, fail fast at `create_evaluation_run` / `run_inference` time with an actionable error:

  ```python
  schemas = agent_engine.operation_schemas()
  if any(s.get("api_mode") == "a2a_extension" for s in schemas):
      raise NotImplementedError(
          "A2A agent engines are not yet supported by client.evals inference. "
          "Use the two-step workaround: invoke the engine directly, then "
          "evaluate a dataset with a pre-populated 'response' column."
      )
  ```

  In both options, the broad `except Exception` at line 2078 should be narrowed (or moved to catch only transport / quota retryable errors) so genuine programming defects don't surface as `INTERNAL`.
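One possible shape for the narrowed handler is sketched below. `RETRYABLE` uses built-in exception types as stand-ins; which transport or quota errors the SDK should actually treat as retryable is an assumption left open here, and `call_with_narrow_retry` is a hypothetical name.

```python
import time

# Assumption: placeholders for the SDK's real transport/quota error types.
RETRYABLE = (ConnectionError, TimeoutError)


def call_with_narrow_retry(fn, max_retries=3, delay_s=0.0):
    """Retry only RETRYABLE errors; anything else propagates immediately."""
    for attempt in range(max_retries):
        try:
            return fn()
        except RETRYABLE:
            if attempt == max_retries - 1:
                raise  # out of retries: surface the real exception
            time.sleep(delay_s)


calls = {"flaky": 0, "broken": 0}


def flaky():
    # Fails twice with a transient error, then succeeds.
    calls["flaky"] += 1
    if calls["flaky"] < 3:
        raise ConnectionError("transient")
    return "ok"


def broken():
    # Deterministic programming defect, like the missing stream_query.
    calls["broken"] += 1
    raise AttributeError("no attribute 'stream_query'")


flaky_result = call_with_narrow_retry(flaky)
try:
    call_with_narrow_retry(broken)
    broken_error = None
except AttributeError as e:
    broken_error = e

print(flaky_result, calls)
```

With this shape the transient failure is retried to success, while the `AttributeError` is raised on the first attempt with its type intact.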
