[integration] Agent workflows (big-agents) by mmabrouk · Pull Request #4791 · Agenta-AI/agenta

mmabrouk · 2026-06-22T11:58:09Z

Context

big-agents is the integration branch for the agent-workflows feature. Every agent PR targets big-agents (directly, or by stacking on one that does). The plan is to review and merge each sub-PR into big-agents, then merge big-agents into main as a single unit.

This PR is a draft tracker. It stays open until all the sub-PRs below are merged into big-agents. The branch starts from an empty commit, so the diff fills in as sub-PRs land.

Integrated PRs

Each box gets checked when that PR is merged into big-agents. Indented items stack on the item above them.

SDK and service

feat(sdk): agent runtime behind backend/harness ports #4771 (merged)
- feat(agent): agent workflow service and tool-resolution API #4772
  - fix(tools): support no-auth Composio toolkits + server-owned connection flags #4785

Runner

feat(agent): runner wire contract and tool execution #4773
- feat(agent): runner engines, HTTP server, tracing, and docker image #4778
  - test(agent): vitest suite + CI for the agent runner; fix relay error bug #4784 (draft)

Frontend

feat(frontend): agent config playground controls #4775
- feat(frontend): agent chat streaming slice + RAG example demo #4780

Hosting

chore(hosting): wire the agent runner sidecar into compose #4776

Sandbox-agent deployment

chore(agent): make sandbox-agent runner first-class #4786
- chore(railway): add sandbox-agent preview deployment #4787
- chore(kubernetes): deploy sandbox-agent sidecar #4788
- ci(agent): build and test sandbox-agent images #4789

Docs

docs(agent): agent-workflows design wiki, ground truth, and archived POCs #4779

Branch-only (no PR yet)

These design-doc branches are stacked on big-agents but have no PR. Open one if you want them reviewed separately, otherwise they fold in with the docs.

docs/agent-model-config-and-provider-auth
docs/agent-skills-config
docs/agent-code-tool-sandbox
docs/agent-harness-capabilities

Notes

Deferred to the review pass: make #4773 (runner wire contract and tool execution) its own series and apply #4784 (vitest suite + CI for the agent runner; fix relay error bug) with the rivet → sandbox-agent rename folded in. The in-place apply of #4784 conflicts with the rename, so it gets rebuilt during review.
Closed and not part of this integration: #4774 (feat(agent): runner engines, server, and tracing; superseded by #4778), #4777 (docs(agent): agent-workflows design and ground truth; superseded by #4779), and #4782 (feat(agent): run the Agenta harness on the rivet/ACP backend with forced skills; rivet harness, abandoned).


vercel · 2026-06-22T11:58:16Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
agenta-documentation	Ready	Preview, Comment	Jun 22, 2026 12:21pm

coderabbitai · 2026-06-22T11:58:20Z

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: edbefce6-68d5-40aa-89aa-bef622ea8a6b

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.


coderabbitai

Actionable comments posted: 10

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 76c33a7d-feff-4e5f-acc0-962498f74cfc

📥 Commits

Reviewing files that changed from the base of the PR and between a97e608 and 2eed5d0.

📒 Files selected for processing (70)

sdks/python/agenta/__init__.py
sdks/python/agenta/sdk/agents/__init__.py
sdks/python/agenta/sdk/agents/adapters/__init__.py
sdks/python/agenta/sdk/agents/adapters/_runner_config.py
sdks/python/agenta/sdk/agents/adapters/agenta_builtins.py
sdks/python/agenta/sdk/agents/adapters/harnesses.py
sdks/python/agenta/sdk/agents/adapters/in_process.py
sdks/python/agenta/sdk/agents/adapters/local.py
sdks/python/agenta/sdk/agents/adapters/sandbox_agent.py
sdks/python/agenta/sdk/agents/adapters/vercel/__init__.py
sdks/python/agenta/sdk/agents/adapters/vercel/messages.py
sdks/python/agenta/sdk/agents/adapters/vercel/routing.py
sdks/python/agenta/sdk/agents/adapters/vercel/sse.py
sdks/python/agenta/sdk/agents/adapters/vercel/stream.py
sdks/python/agenta/sdk/agents/dtos.py
sdks/python/agenta/sdk/agents/errors.py
sdks/python/agenta/sdk/agents/interfaces.py
sdks/python/agenta/sdk/agents/mcp/__init__.py
sdks/python/agenta/sdk/agents/mcp/errors.py
sdks/python/agenta/sdk/agents/mcp/interfaces.py
sdks/python/agenta/sdk/agents/mcp/models.py
sdks/python/agenta/sdk/agents/mcp/parsing.py
sdks/python/agenta/sdk/agents/mcp/resolver.py
sdks/python/agenta/sdk/agents/mcp/wire.py
sdks/python/agenta/sdk/agents/streaming.py
sdks/python/agenta/sdk/agents/tools/__init__.py
sdks/python/agenta/sdk/agents/tools/compat.py
sdks/python/agenta/sdk/agents/tools/errors.py
sdks/python/agenta/sdk/agents/tools/interfaces.py
sdks/python/agenta/sdk/agents/tools/models.py
sdks/python/agenta/sdk/agents/tools/parsing.py
sdks/python/agenta/sdk/agents/tools/resolver.py
sdks/python/agenta/sdk/agents/tools/wire.py
sdks/python/agenta/sdk/agents/ui_messages.py
sdks/python/agenta/sdk/agents/utils/__init__.py
sdks/python/agenta/sdk/agents/utils/ts_runner.py
sdks/python/agenta/sdk/agents/utils/wire.py
sdks/python/agenta/sdk/decorators/routing.py
sdks/python/agenta/sdk/engines/running/interfaces.py
sdks/python/agenta/sdk/engines/running/utils.py
sdks/python/agenta/sdk/middlewares/running/normalizer.py
sdks/python/agenta/sdk/models/workflows.py
sdks/python/agenta/sdk/utils/types.py
sdks/python/agenta/tests/agents/test_streaming.py
sdks/python/oss/tests/pytest/integration/agents/__init__.py
sdks/python/oss/tests/pytest/integration/agents/test_transport_roundtrip.py
sdks/python/oss/tests/pytest/unit/agents/__init__.py
sdks/python/oss/tests/pytest/unit/agents/conftest.py
sdks/python/oss/tests/pytest/unit/agents/golden/run_request.claude.json
sdks/python/oss/tests/pytest/unit/agents/golden/run_request.pi.json
sdks/python/oss/tests/pytest/unit/agents/golden/run_result.error.json
sdks/python/oss/tests/pytest/unit/agents/golden/run_result.ok.json
sdks/python/oss/tests/pytest/unit/agents/mcp/__init__.py
sdks/python/oss/tests/pytest/unit/agents/mcp/test_resolver.py
sdks/python/oss/tests/pytest/unit/agents/test_dtos_agent_config.py
sdks/python/oss/tests/pytest/unit/agents/test_dtos_capabilities_events.py
sdks/python/oss/tests/pytest/unit/agents/test_dtos_content_blocks.py
sdks/python/oss/tests/pytest/unit/agents/test_dtos_harness_configs.py
sdks/python/oss/tests/pytest/unit/agents/test_environment_lifecycle.py
sdks/python/oss/tests/pytest/unit/agents/test_harness_adapters.py
sdks/python/oss/tests/pytest/unit/agents/test_runner_adapter_config.py
sdks/python/oss/tests/pytest/unit/agents/test_ui_messages.py
sdks/python/oss/tests/pytest/unit/agents/test_wire_contract.py
sdks/python/oss/tests/pytest/unit/agents/tools/__init__.py
sdks/python/oss/tests/pytest/unit/agents/tools/test_models.py
sdks/python/oss/tests/pytest/unit/agents/tools/test_parsing.py
sdks/python/oss/tests/pytest/unit/agents/tools/test_resolver.py
sdks/python/oss/tests/pytest/unit/test_normalizer_passthrough.py
sdks/python/oss/tests/pytest/utils/test_messages_endpoint.py
sdks/python/oss/tests/pytest/utils/test_routing.py

coderabbitai · 2026-06-22T13:25:59Z

+NOTE on packaging: the Node runner is NOT part of this Python wheel (``pip install agenta``
+stays pure Python; the wheel contains zero ``.ts``/``.js``). How a standalone Pi user obtains
+the runner -- an ``npx`` npm package, a local checkout, or a Docker sidecar over HTTP -- is an
+open distribution decision; see ``docs/design/agent-workflows/typescript-structure/``. Do NOT
+silently bundle a JS runner into the wheel.


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Align LocalBackend wording with the stated packaging contract.

Lines 9-13 say the wheel must not bundle a JS runner, but line 30 and the NotImplementedError messages still say “bundled JS”. This contradiction will confuse integrators.

Suggested wording fix

-class LocalBackend(Backend):
-    """Run Pi (bundled JS) or Claude (``claude-agent-sdk``) on this machine."""
+class LocalBackend(Backend):
+    """Run Pi (external Node runner) or Claude (``claude-agent-sdk``) on this machine."""
 ...
         raise NotImplementedError(
-            "LocalBackend is not implemented yet (Phase 3: Pi via bundled JS, "
+            "LocalBackend is not implemented yet (Phase 3: Pi via external Node runner, "
             "Phase 4: Claude via claude-agent-sdk)."
         )
 ...
         raise NotImplementedError(
-            "LocalBackend is not implemented yet (Phase 3: Pi via bundled JS, "
+            "LocalBackend is not implemented yet (Phase 3: Pi via external Node runner, "
             "Phase 4: Claude via claude-agent-sdk)."
         )

Also applies to: 30-38, 50-53

coderabbitai · 2026-06-22T13:25:59Z

+    def __init__(
+        self,
+        *,
+        sandbox: str = "local",
+        url: Optional[str] = None,
+        command: Optional[Sequence[str]] = None,
+        cwd: Optional[str] = None,
+        timeout: float = float(os.getenv("AGENTA_AGENT_RUNNER_TIMEOUT_SECONDS", "180")),
+    ) -> None:
+        self._sandbox = sandbox
+        self._url = url


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Validate sandbox at construction time.

Line 129 currently accepts any string; invalid values get sent over the wire and fail late. Restrict this to supported values (local, daytona) and raise a configuration error early.

Suggested validation

 from ..dtos import (
@@
 )
+from ..errors import AgentRunnerConfigurationError
@@
     def __init__(
         self,
         *,
         sandbox: str = "local",
@@
         timeout: float = float(os.getenv("AGENTA_AGENT_RUNNER_TIMEOUT_SECONDS", "180")),
     ) -> None:
+        allowed_sandboxes = {"local", "daytona"}
+        if sandbox not in allowed_sandboxes:
+            raise AgentRunnerConfigurationError(
+                f"Unsupported sandbox '{sandbox}'. Expected one of: {sorted(allowed_sandboxes)}."
+            )
         self._sandbox = sandbox
         self._url = url
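The fail-early idea in this comment can be tried in isolation. The following is a minimal sketch, not the SDK's code: `RunnerBackend`, `ALLOWED_SANDBOXES`, and `try_build` are hypothetical names, the error class is a local stand-in for the SDK's `AgentRunnerConfigurationError`, and the allowed set is taken from the comment (`local`, `daytona`), so it may not match what the runner really supports.

```python
from typing import Optional


class AgentRunnerConfigurationError(ValueError):
    """Local stand-in for the SDK's configuration error type (assumption)."""


# Values named in the review comment; the real supported set may differ.
ALLOWED_SANDBOXES = frozenset({"local", "daytona"})


class RunnerBackend:
    """Hypothetical, trimmed-down backend that keeps only the validated fields."""

    def __init__(self, *, sandbox: str = "local", url: Optional[str] = None) -> None:
        # Reject unknown values here, before anything is sent over the wire.
        if sandbox not in ALLOWED_SANDBOXES:
            raise AgentRunnerConfigurationError(
                f"Unsupported sandbox {sandbox!r}. "
                f"Expected one of: {sorted(ALLOWED_SANDBOXES)}."
            )
        self._sandbox = sandbox
        self._url = url


def try_build(sandbox: str) -> str:
    """Report whether construction succeeds for a given sandbox value."""
    try:
        RunnerBackend(sandbox=sandbox)
    except AgentRunnerConfigurationError as exc:
        return f"rejected: {exc}"
    return "accepted"
```

With this shape, a typo such as `sandbox="docker"` surfaces where the backend is constructed rather than as a late failure from the runner.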

coderabbitai · 2026-06-22T13:25:59Z

+def _tool_part_blocks(part: Dict[str, Any], ptype: str) -> List[ContentBlock]:
+    """A Vercel tool part -> neutral tool-call/result content blocks."""
+    tool_call_id = part.get("toolCallId") or part.get("tool_call_id")
+    tool_name = part.get("toolName") or part.get("tool_name")
+    if (
+        tool_name is None
+        and ptype.startswith("tool-")
+        and ptype != TOOL_OUTPUT_AVAILABLE
+    ):
+        tool_name = ptype[len("tool-") :]
+
+    blocks: List[ContentBlock] = []
+    if ptype != TOOL_OUTPUT_AVAILABLE or "input" in part:
+        blocks.append(
+            ContentBlock(
+                type="tool_call",
+                tool_call_id=tool_call_id,
+                tool_name=tool_name,
+                input=part.get("input"),
+            )
+        )
+
+    state = part.get("state")
+    error_text = part.get("errorText")
+    if error_text is not None or state == "output-error":
+        blocks.append(
+            ContentBlock(
+                type="tool_result",
+                tool_call_id=tool_call_id,
+                tool_name=tool_name,
+                output=error_text if error_text is not None else part.get("output"),
+                is_error=True,
+            )
+        )
+    elif "output" in part or state == "output-available":
+        blocks.append(
+            ContentBlock(
+                type="tool_result",
+                tool_call_id=tool_call_id,
+                tool_name=tool_name,
+                output=part.get("output"),
+                is_error=False,
+            )
+        )
+    return blocks


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Output-only tool-* parts are incorrectly turned into synthetic tool_call blocks.

_tool_part_blocks currently emits a tool_call for most tool-* parts even when the part is already in an output state. That fabricates an extra call (often with input=None) and distorts stored history for subsequent turns.

Proposed fix

 def _tool_part_blocks(part: Dict[str, Any], ptype: str) -> List[ContentBlock]:
     """A Vercel tool part -> neutral tool-call/result content blocks."""
     tool_call_id = part.get("toolCallId") or part.get("tool_call_id")
     tool_name = part.get("toolName") or part.get("tool_name")
@@
-    blocks: List[ContentBlock] = []
-    if ptype != TOOL_OUTPUT_AVAILABLE or "input" in part:
+    state = part.get("state")
+    error_text = part.get("errorText")
+    has_input = "input" in part
+    emits_output = (
+        error_text is not None
+        or state in {"output-available", "output-error"}
+        or "output" in part
+    )
+
+    blocks: List[ContentBlock] = []
+    if has_input or not emits_output:
         blocks.append(
             ContentBlock(
                 type="tool_call",
                 tool_call_id=tool_call_id,
                 tool_name=tool_name,
                 input=part.get("input"),
             )
         )
-    state = part.get("state")
-    error_text = part.get("errorText")
     if error_text is not None or state == "output-error":
         blocks.append(
             ContentBlock(
                 type="tool_result",
                 tool_call_id=tool_call_id,
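To see the effect of the proposed condition, here is a self-contained sketch of the patched function. It assumes a simplified `ContentBlock` dataclass and a guessed value for `TOOL_OUTPUT_AVAILABLE`; neither is taken from the SDK, and the function is renamed `tool_part_blocks` to mark it as an illustration.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional

TOOL_OUTPUT_AVAILABLE = "tool-output-available"  # assumed constant value


@dataclass
class ContentBlock:
    """Simplified stand-in for the SDK's content block model (assumption)."""

    type: str
    tool_call_id: Optional[str] = None
    tool_name: Optional[str] = None
    input: Any = None
    output: Any = None
    is_error: Optional[bool] = None


def tool_part_blocks(part: Dict[str, Any], ptype: str) -> List[ContentBlock]:
    tool_call_id = part.get("toolCallId") or part.get("tool_call_id")
    tool_name = part.get("toolName") or part.get("tool_name")
    if tool_name is None and ptype.startswith("tool-") and ptype != TOOL_OUTPUT_AVAILABLE:
        tool_name = ptype[len("tool-"):]

    state = part.get("state")
    error_text = part.get("errorText")
    has_input = "input" in part
    emits_output = (
        error_text is not None
        or state in {"output-available", "output-error"}
        or "output" in part
    )

    blocks: List[ContentBlock] = []
    # Emit a call only when the part carries input or has no output yet.
    if has_input or not emits_output:
        blocks.append(ContentBlock("tool_call", tool_call_id, tool_name, input=part.get("input")))

    if error_text is not None or state == "output-error":
        output = error_text if error_text is not None else part.get("output")
        blocks.append(ContentBlock("tool_result", tool_call_id, tool_name, output=output, is_error=True))
    elif "output" in part or state == "output-available":
        blocks.append(
            ContentBlock("tool_result", tool_call_id, tool_name, output=part.get("output"), is_error=False)
        )
    return blocks


def block_types(part: Dict[str, Any], ptype: str = "tool-search") -> List[str]:
    """Helper for inspection: just the block types, in order."""
    return [block.type for block in tool_part_blocks(part, ptype)]
```

Under these assumptions an output-only part yields a single `tool_result`, a part with both `input` and `output` yields a call followed by a result, and a part still waiting for output yields only the call.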

coderabbitai · 2026-06-22T13:25:59Z

+        llm_config = prompt_cfg.get("llm_config") or {}
+        model = llm_config.get("model") or defaults.model
+        instructions = _system_text(prompt_cfg.get("messages")) or defaults.instructions
+        raw_tools = llm_config.get("tools")
+        if raw_tools is None:
+            raw_tools = prompt_cfg.get("tools")
+    else:


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Guard llm_config type before dictionary access.

Line 694 assumes prompt["llm_config"] is a dict. If it’s a truthy non-dict value (a non-empty string or list, for example), this path crashes with AttributeError instead of applying defaults.

Proposed fix

     prompt_cfg = params.get("prompt")
     if isinstance(prompt_cfg, dict):
-        llm_config = prompt_cfg.get("llm_config") or {}
+        raw_llm_config = prompt_cfg.get("llm_config")
+        llm_config = raw_llm_config if isinstance(raw_llm_config, dict) else {}
         model = llm_config.get("model") or defaults.model
         instructions = _system_text(prompt_cfg.get("messages")) or defaults.instructions
         raw_tools = llm_config.get("tools")
         if raw_tools is None:
             raw_tools = prompt_cfg.get("tools")
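The difference between `or {}` and an `isinstance` guard is easy to miss, so a small sketch may help. `resolve_model` and `DEFAULT_MODEL` are hypothetical names used only for illustration; the sketch covers just the model lookup, not instructions or tools.

```python
from typing import Any, Dict

DEFAULT_MODEL = "default-model"  # hypothetical default, not the SDK's value


def resolve_model(params: Dict[str, Any], default_model: str = DEFAULT_MODEL) -> str:
    """Pick the configured model, falling back to the default on malformed input."""
    prompt_cfg = params.get("prompt")
    if not isinstance(prompt_cfg, dict):
        return default_model
    raw_llm_config = prompt_cfg.get("llm_config")
    # `raw or {}` would keep a truthy non-dict such as "oops" and then fail on
    # `.get`; the isinstance check replaces it with an empty dict instead.
    llm_config = raw_llm_config if isinstance(raw_llm_config, dict) else {}
    return llm_config.get("model") or default_model
```

A payload like `{"prompt": {"llm_config": "oops"}}` then resolves to the default model instead of raising.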

coderabbitai · 2026-06-22T13:25:59Z

+        sandbox = await self._sandbox()
+        if provisioning:
+            await sandbox.add_files(provisioning)
+        return await self._backend.create_session(
+            sandbox,
+            config,
+            harness=harness,
+            secrets=session_config.secrets,
+            trace=session_config.trace,
+            session_id=session_config.session_id,
+        )


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Destroy per-session sandbox on setup/session-creation failure.

If Line 224 (add_files) or Line 225 (create_session) raises, a per-session sandbox is left alive with no owner to tear it down.

Proposed fix

     async def create_session(
         self,
         config: HarnessAgentConfig,
         *,
         harness: HarnessType,
         session_config: SessionConfig,
         provisioning: Optional[Mapping[str, bytes]] = None,
     ) -> Session:
         """Provision a sandbox per policy, then open a session in it."""
         sandbox = await self._sandbox()
-        if provisioning:
-            await sandbox.add_files(provisioning)
-        return await self._backend.create_session(
-            sandbox,
-            config,
-            harness=harness,
-            secrets=session_config.secrets,
-            trace=session_config.trace,
-            session_id=session_config.session_id,
-        )
+        try:
+            if provisioning:
+                await sandbox.add_files(provisioning)
+            return await self._backend.create_session(
+                sandbox,
+                config,
+                harness=harness,
+                secrets=session_config.secrets,
+                trace=session_config.trace,
+                session_id=session_config.session_id,
+            )
+        except Exception:
+            if self._sandbox_per_session:
+                try:
+                    await sandbox.destroy()
+                except Exception:
+                    pass
+            raise
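The cleanup-on-failure pattern proposed above can be exercised with a fake sandbox. This is a sketch under stated assumptions: `FakeSandbox`, `open_session`, and `demo` are hypothetical, the returned string stands in for a real session object, and only the `add_files` failure path is simulated.

```python
import asyncio
from typing import Mapping, Optional, Tuple


class FakeSandbox:
    """Hypothetical sandbox that can be told to fail during provisioning."""

    def __init__(self, fail_on_add: bool = False) -> None:
        self.fail_on_add = fail_on_add
        self.destroyed = False

    async def add_files(self, files: Mapping[str, bytes]) -> None:
        if self.fail_on_add:
            raise RuntimeError("upload failed")

    async def destroy(self) -> None:
        self.destroyed = True


async def open_session(
    sandbox: FakeSandbox,
    provisioning: Optional[Mapping[str, bytes]],
    *,
    sandbox_per_session: bool,
) -> str:
    try:
        if provisioning:
            await sandbox.add_files(provisioning)
        return "session"  # stand-in for the backend's create_session result
    except Exception:
        # Only tear down sandboxes this call owns; a shared one stays alive.
        if sandbox_per_session:
            try:
                await sandbox.destroy()
            except Exception:
                pass  # never mask the original failure with a cleanup error
        raise


async def demo() -> Tuple[bool, bool]:
    owned = FakeSandbox(fail_on_add=True)
    shared = FakeSandbox(fail_on_add=True)
    for sandbox, per_session in ((owned, True), (shared, False)):
        try:
            await open_session(sandbox, {"a.txt": b"x"}, sandbox_per_session=per_session)
        except RuntimeError:
            pass  # the original error still reaches the caller
    return owned.destroyed, shared.destroyed
```

One caveat worth checking during review: `except Exception` does not catch `asyncio.CancelledError` (a `BaseException` subclass since Python 3.8), so a cancelled setup would still skip this cleanup unless it is handled separately.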


coderabbitai · 2026-06-22T13:25:59Z

+        session = await self.create_session(config)
+
+        def _absorb(result: AgentResult) -> None:
+            if result.session_id:
+                config.session_id = result.session_id
+
+        return session.stream(messages).on_result(_absorb).on_cleanup(session.destroy)


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Ensure session cleanup if stream setup fails synchronously.

Line 321 only registers cleanup after session.stream(messages) succeeds. If stream construction raises, the session is leaked.

Proposed fix

         session = await self.create_session(config)
+        try:
+            run = session.stream(messages)
+        except Exception:
+            await session.destroy()
+            raise
         def _absorb(result: AgentResult) -> None:
             if result.session_id:
                 config.session_id = result.session_id
-        return session.stream(messages).on_result(_absorb).on_cleanup(session.destroy)
+        return run.on_result(_absorb).on_cleanup(session.destroy)
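A reduced sketch of the same guard, assuming a hypothetical `FakeSession` whose `stream` can raise synchronously. The `on_result`/`on_cleanup` chaining from the real code is left out, and the returned list merely stands in for the run object.

```python
import asyncio
from typing import List, Tuple


class FakeSession:
    """Hypothetical session whose stream construction may fail synchronously."""

    def __init__(self, fail_stream: bool = False) -> None:
        self.fail_stream = fail_stream
        self.destroyed = False

    def stream(self, messages: List[str]) -> List[str]:
        if self.fail_stream:
            raise ValueError("bad messages")
        return list(messages)  # stand-in for the real run object

    async def destroy(self) -> None:
        self.destroyed = True


async def start_stream(session: FakeSession, messages: List[str]) -> List[str]:
    try:
        run = session.stream(messages)
    except Exception:
        # No cleanup hook is registered yet, so destroy the session here.
        await session.destroy()
        raise
    return run


async def demo() -> Tuple[bool, bool, List[str]]:
    broken = FakeSession(fail_stream=True)
    try:
        await start_stream(broken, ["hi"])
    except ValueError:
        pass
    healthy = FakeSession()
    run = await start_stream(healthy, ["hi"])
    return broken.destroyed, healthy.destroyed, run
```

On the success path the session is left alive, since in the real code its teardown belongs to the cleanup hook attached to the run.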

coderabbitai · 2026-06-22T13:25:59Z

+from agenta.sdk.agents.tools.models import MissingSecretPolicy
+
+from .errors import MissingMCPSecretError
+from .interfaces import MCPSecretProvider
+from .models import MCPServerConfig, ResolvedMCPServer
+
+
+class MCPResolver:
+    def __init__(
+        self,
+        *,
+        secret_provider: MCPSecretProvider,
+        missing_secret_policy: MissingSecretPolicy = MissingSecretPolicy.ERROR,
+    ) -> None:


⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Breaks declared layer direction by importing tools model into MCP.

MCPResolver currently depends on agenta.sdk.agents.tools.models.MissingSecretPolicy, but this cohort declares tools as depending on MCP, not the other way around. This reverse edge can create import-order fragility and circular dependency risk as the stack evolves. Move MissingSecretPolicy to a neutral/shared module (or MCP/shared contract module) and import it from both subsystems.

Possible direction

-from agenta.sdk.agents.tools.models import MissingSecretPolicy
+from agenta.sdk.agents.shared.missing_secret_policy import MissingSecretPolicy

(then define/move the enum in that shared module and update tools imports accordingly)

coderabbitai · 2026-06-22T13:25:59Z

+    out = stdout.decode("utf-8", "replace")
+    err = stderr.decode("utf-8", "replace")
+    if not out.strip():
+        raise RuntimeError(
+            f"Agent runner returned no output. exit={proc.returncode} stderr={err[-2000:]}"
+        )
+    try:
+        return json.loads(out)
+    except json.JSONDecodeError as exc:


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Treat non-zero subprocess exit as transport failure even with parseable JSON.

Line 74 returns parsed JSON without checking proc.returncode; a crashed runner can look successful if it emitted partial/legacy JSON before exiting non-zero.

Suggested fix

@@ async def deliver_subprocess(...):
     out = stdout.decode("utf-8", "replace")
     err = stderr.decode("utf-8", "replace")
+    if proc.returncode not in (0, None):
+        raise RuntimeError(
+            "Agent runner exited non-zero. "
+            f"exit={proc.returncode} stderr={err[-2000:]} stdout={out[:500]}"
+        )
     if not out.strip():
         raise RuntimeError(
             f"Agent runner returned no output. exit={proc.returncode} stderr={err[-2000:]}"
         )
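The failure mode described here (valid JSON on stdout, then a non-zero exit) can be reproduced with a child Python process. The sketch below is an illustration, not the SDK's `deliver_subprocess`: `run_runner` and `outcome` are hypothetical names, and the child commands are throwaway one-liners.

```python
import asyncio
import json
import sys
from typing import Any, Dict, Sequence


async def run_runner(command: Sequence[str]) -> Dict[str, Any]:
    proc = await asyncio.create_subprocess_exec(
        *command,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )
    stdout, stderr = await proc.communicate()
    out = stdout.decode("utf-8", "replace")
    err = stderr.decode("utf-8", "replace")
    # communicate() waits for the process to exit, so returncode is set here.
    if proc.returncode != 0:
        raise RuntimeError(
            "Agent runner exited non-zero. "
            f"exit={proc.returncode} stderr={err[-2000:]} stdout={out[:500]}"
        )
    if not out.strip():
        raise RuntimeError(f"Agent runner returned no output. exit={proc.returncode}")
    return json.loads(out)


def outcome(command: Sequence[str]) -> str:
    """Run a command and summarize the result as a string for inspection."""
    try:
        return json.dumps(asyncio.run(run_runner(command)))
    except RuntimeError as exc:
        return f"error: {exc}"


# A child that prints parseable JSON and then exits with status 3.
CRASHING = [sys.executable, "-c", "import sys; print('{\"ok\": true}'); sys.exit(3)"]
HEALTHY = [sys.executable, "-c", "print('{\"ok\": true}')"]
```

Without the exit-status check, `CRASHING` would be reported as a success because its stdout parses cleanly.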


coderabbitai · 2026-06-22T13:25:59Z

+            async for line in response.aiter_lines():
+                line = line.strip()
+                if line:
+                    record = json.loads(line)
+                    if record.get("kind") == "result":
+                        saw_result = True
+                    yield record
+    if not saw_result:
+        raise RuntimeError("Agent runner stream ended without a terminal result record")


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Enforce a single terminal result record in stream transports.

Line 120 and Line 168 set saw_result=True but still allow additional records, which violates the “exactly one terminal result” stream contract and can cause ambiguous terminal state handling.

Suggested fix

@@ async def deliver_http_stream(...):
-            async for line in response.aiter_lines():
+            async for line in response.aiter_lines():
                 line = line.strip()
                 if line:
                     record = json.loads(line)
+                    if saw_result:
+                        raise RuntimeError(
+                            "Agent runner emitted records after terminal result record"
+                        )
                     if record.get("kind") == "result":
-                        saw_result = True
+                        if saw_result:
+                            raise RuntimeError(
+                                "Agent runner emitted multiple terminal result records"
+                            )
+                        saw_result = True
                     yield record
@@ async def deliver_subprocess_stream(...):
             line = raw.decode("utf-8", "replace").strip()
             if line:
                 record = json.loads(line)
+                if saw_result:
+                    raise RuntimeError(
+                        "Agent runner emitted records after terminal result record"
+                    )
                 if record.get("kind") == "result":
-                    saw_result = True
+                    if saw_result:
+                        raise RuntimeError(
+                            "Agent runner emitted multiple terminal result records"
+                        )
+                    saw_result = True
                 yield record

Also applies to: 167-175
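The "exactly one terminal result" contract can be sketched independently of the transport. `read_records`, `lines_from`, `collect`, and `run_case` are hypothetical helpers, and `lines_from` stands in for `response.aiter_lines()`. Note that in the suggested diff the nested "multiple terminal result records" check appears unreachable, because the earlier `if saw_result` already raises for any record after the first result; this sketch therefore uses a single check.

```python
import asyncio
import json
from typing import Any, AsyncIterator, Dict, Iterable, List, Union


async def lines_from(items: Iterable[str]) -> AsyncIterator[str]:
    """Stand-in for an HTTP or subprocess line stream (assumption)."""
    for item in items:
        yield item


async def read_records(lines: AsyncIterator[str]) -> AsyncIterator[Dict[str, Any]]:
    saw_result = False
    async for line in lines:
        line = line.strip()
        if not line:
            continue
        if saw_result:
            # Covers both a second result and any trailing non-result record.
            raise RuntimeError("Agent runner emitted records after terminal result record")
        record = json.loads(line)
        if record.get("kind") == "result":
            saw_result = True
        yield record
    if not saw_result:
        raise RuntimeError("Agent runner stream ended without a terminal result record")


async def collect(items: Iterable[str]) -> List[Dict[str, Any]]:
    return [record async for record in read_records(lines_from(items))]


def run_case(items: Iterable[str]) -> Union[List[Dict[str, Any]], str]:
    """Return the parsed records, or the error message if the contract is broken."""
    try:
        return asyncio.run(collect(items))
    except RuntimeError as exc:
        return str(exc)
```

One design consequence: records yielded before the violation have already reached the consumer, so callers that buffer output may still need to discard partial state when the error arrives.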

coderabbitai · 2026-06-22T13:25:59Z

    # agenta:builtin:* — application-only (not evaluators)
    ("builtin", "chat"): (True, False, False),
    ("builtin", "completion"): (True, False, False),
+    ("builtin", "agent"): (True, False, False),


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

is_agent is never inferred, so agent workflows keep WorkflowFlags.is_agent=False.

You added the built-in agent role mapping, but infer_flags_from_data still never computes/passes is_agent into WorkflowFlags, so the new agent flag/filter path won’t work as intended.

💡 Proposed fix

@@
-    is_chat = key == "chat" or _has_messages_input(inputs_schema)
+    is_chat = key == "chat" or _has_messages_input(inputs_schema)
+    is_agent = key == "agent"
@@
     return WorkflowFlags(
@@
         # schema-derived
         is_chat=is_chat,
+        is_agent=is_agent,
         # interface-derived
         has_url=has_url,
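A stripped-down sketch of the inference the comment asks for. `WorkflowFlags` here is a two-field stand-in for the real model, and `infer_flags` is a hypothetical reduction of `infer_flags_from_data` that takes the schema check as a boolean argument.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkflowFlags:
    """Two-field stand-in for the SDK's flags model (assumption)."""

    is_chat: bool = False
    is_agent: bool = False


def infer_flags(key: str, has_messages_input: bool = False) -> WorkflowFlags:
    # Schema-derived flag, as in the existing code path.
    is_chat = key == "chat" or has_messages_input
    # The missing piece: derive is_agent from the builtin key and pass it on.
    is_agent = key == "agent"
    return WorkflowFlags(is_chat=is_chat, is_agent=is_agent)
```

Under this sketch an agent workflow whose inputs include messages would be flagged as both chat and agent; whether that overlap is intended is a question for the actual flag semantics.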

mmabrouk added 4 commits June 19, 2026 18:27

feat(sdk): agent runtime ports, adapters, tool resolution, and messages protocol

b9e62f9

fix(sdk): validate agent runner configuration

741fc73

refactor(sdk): rename rivet adapter/backend to sandbox-agent

2a7c129

chore(agent): open big-agents integration branch for agent workflows

1a3f330

vercel Bot deployed to Preview June 22, 2026 11:58 View deployment

mmabrouk and others added 2 commits June 22, 2026 14:16

fix(sdk): address review feedback (locking, input validation, stream/error handling)

0beb120

Merge pull request #4771 from Agenta-AI/feat/agent-sdk-runtime

2eed5d0

feat(sdk): agent runtime behind backend/harness ports

vercel Bot deployed to Preview June 22, 2026 12:21 View deployment

coderabbitai Bot reviewed Jun 22, 2026
