chore(hosting): wire the agent runner sidecar into compose#4776

Open
mmabrouk wants to merge 4 commits into big-agents from chore/agent-hosting-compose

@mmabrouk mmabrouk commented Jun 19, 2026

Member

Agent-workflows: functional PR set

Sliced by functional area, final code only (no intermediate churn). Most PRs are independent off main; two pairs are stacked. This PR's base is main.

Context

The agent runner needs a place to live in the dev stack. The Python service in api/ decides per run whether to call a Pi process in its own checkout or an in-network sidecar. Until now the EE dev compose had no sidecar to call. This PR adds one. It branches off main and is the only hosting change in the agent-workflows slice: the runner stack, nothing else.

What this changes

It adds an agent-pi service to hosting/docker-compose/ee/docker-compose.dev.yml. The service builds from services/agent (the sandbox-agent server runtime) and listens on port 8765.

It also wires the services container to reach the sidecar. Four env vars carry the routing and the run defaults:

  • AGENTA_AGENT_PI_URL=http://agent-pi:8765 routes calls to the sidecar over the compose network.
  • AGENTA_AGENT_RUNTIME picks the runtime (default rivet).
  • AGENTA_AGENT_HARNESS picks the harness (default pi).
  • AGENTA_AGENT_SANDBOX picks the sandbox (default local).
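
As a rough illustration of how the services container could consume these four variables, here is a minimal Python sketch. The names `AgentRunConfig` and `load_agent_run_config` are hypothetical and are not taken from services/oss/src/agent/app.py; only the variable names and their defaults come from the list above. Treating an empty string as unset is also an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class AgentRunConfig:
    """Hypothetical holder for the sidecar URL and the three run defaults."""

    pi_url: Optional[str]
    runtime: str
    harness: str
    sandbox: str


def load_agent_run_config(env: Mapping[str, str] = os.environ) -> AgentRunConfig:
    """Read the four variables, falling back to the defaults listed in the PR.

    Assumption: an empty value is treated the same as an unset variable.
    """
    return AgentRunConfig(
        pi_url=env.get("AGENTA_AGENT_PI_URL") or None,
        runtime=env.get("AGENTA_AGENT_RUNTIME") or "rivet",
        harness=env.get("AGENTA_AGENT_HARNESS") or "pi",
        sandbox=env.get("AGENTA_AGENT_SANDBOX") or "local",
    )
```

Passing a plain dict instead of `os.environ` keeps the function easy to exercise in a test without touching the real environment.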

The agent-pi service runs with no env_file on purpose. The Pi sandbox must not inherit the stack secrets. It gets only its port, the mounted Pi login, an OTLP export fallback, and the Daytona credentials the SDK reads when the sandbox axis is daytona.

Key architectural decision to review

Confirm the in-network URL wiring lines up with the runner. The service reads AGENTA_AGENT_PI_URL to choose HTTP-to-sidecar over a local subprocess (services/oss/src/agent/app.py:63). The compose value http://agent-pi:8765 (line 398) is the Docker DNS name of the agent-pi service (line 421), so the names must stay in sync.
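
The transport choice described above can be sketched in a few lines of Python. This is an assumption about the shape of the logic, not a copy of services/oss/src/agent/app.py; `choose_transport` and the `"http"` / `"subprocess"` labels are hypothetical names used only for illustration.

```python
from typing import Mapping, Optional, Tuple


def choose_transport(env: Mapping[str, str]) -> Tuple[str, Optional[str]]:
    """Return ("http", url) when AGENTA_AGENT_PI_URL is set, else ("subprocess", None).

    Assumption: a blank or whitespace-only value counts as unset.
    """
    url = (env.get("AGENTA_AGENT_PI_URL") or "").strip()
    if url:
        return "http", url
    return "subprocess", None
```

With the compose value in place, `choose_transport({"AGENTA_AGENT_PI_URL": "http://agent-pi:8765"})` selects the HTTP path; an environment without the variable falls back to the local subprocess.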

Confirm the two axes are independent. Harness (pi/claude) and sandbox (local/daytona) are separate choices the Python service sends per run, not a fixed pairing (line 400).
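
To make the independence of the two axes concrete, here is a small Python sketch. The request field names `harness` and `sandbox` and the function `resolve_axes` are hypothetical; the sketch assumes a run may override either default on its own, and that the only valid values are the ones named above.

```python
from itertools import product
from typing import Mapping, Tuple

HARNESSES = ("pi", "claude")
SANDBOXES = ("local", "daytona")

# Every harness can be paired with every sandbox: four combinations in total.
ALL_PAIRINGS = list(product(HARNESSES, SANDBOXES))


def resolve_axes(request: Mapping[str, str], defaults: Mapping[str, str]) -> Tuple[str, str]:
    """Resolve each axis on its own: per-run value first, then the default."""
    harness = request.get("harness") or defaults["harness"]
    sandbox = request.get("sandbox") or defaults["sandbox"]
    if harness not in HARNESSES:
        raise ValueError(f"unknown harness: {harness!r}")
    if sandbox not in SANDBOXES:
        raise ValueError(f"unknown sandbox: {sandbox!r}")
    return harness, sandbox
```

Because each axis is resolved separately, overriding the sandbox leaves the harness default untouched, and the reverse holds as well.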

Confirm the licensing note. The comment at line 456 says we ship the snapshot builder, not the snapshot, so we never distribute a Claude-containing image. That matches the posture in services/agent/docker/README.md: Pi (MIT) is baked freely, Claude Code is installed from Anthropic at runtime and never repackaged.

How to review this PR

Read the agent-pi block top to bottom and check the four points above: the matching service name and URL, the absent env_file, the scoped Daytona credentials, and the licensing comment.

The regression to watch: a stack that does not mount the sidecar. If a deploy omits agent-pi but the services container still has AGENTA_AGENT_PI_URL set, every agent run hits an unresolved host. The URL and the service must land together or not at all.
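
One way to guard against this regression is a small check over the parsed compose file. The Python sketch below is an illustration only: `find_wiring_problems` is a hypothetical helper, and it assumes the compose file has already been loaded into a plain dict (for example with a YAML parser, which is not shown here).

```python
from typing import Any, Dict, List, Mapping
from urllib.parse import urlsplit


def _env_as_dict(environment: Any) -> Dict[str, str]:
    """Normalize a compose `environment` block (mapping or KEY=VALUE list)."""
    if isinstance(environment, Mapping):
        return {str(k): "" if v is None else str(v) for k, v in environment.items()}
    result: Dict[str, str] = {}
    for item in environment or []:
        key, _, value = str(item).partition("=")
        result[key] = value
    return result


def find_wiring_problems(compose: Mapping[str, Any]) -> List[str]:
    """List services whose AGENTA_AGENT_PI_URL host is not a service in the file."""
    services = compose.get("services") or {}
    problems: List[str] = []
    for name, spec in services.items():
        env = _env_as_dict((spec or {}).get("environment"))
        url = env.get("AGENTA_AGENT_PI_URL")
        if not url:
            continue  # No sidecar URL set, so nothing to resolve.
        host = urlsplit(url).hostname
        if host not in services:
            problems.append(
                f"{name}: AGENTA_AGENT_PI_URL points at {host!r}, "
                "which is not a service in this compose file"
            )
    return problems
```

An empty result means the URL and the service landed together; a non-empty one is the unresolved-host case described above.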

@dosubot dosubot Bot added the size:XS This PR changes 0-9 lines, ignoring generated files. label Jun 19, 2026
vercel Bot commented Jun 19, 2026

Deployment status:

Project: agenta-documentation
Deployment: Ready
Updated (UTC): Jun 22, 2026 11:26am

@dosubot dosubot Bot added the devops label Jun 19, 2026
coderabbitai Bot commented Jun 19, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 019209d5-4646-4e99-95e7-727544fba33c

📥 Commits

Reviewing files that changed from the base of the PR and between 76ad769 and 14ab328.

📒 Files selected for processing (1)
  • hosting/docker-compose/ee/docker-compose.dev.yml

📝 Walkthrough

Summary by CodeRabbit

  • New Features
    • Added a sandbox-agent sidecar to the development stack to enable in-network agent communication, including routing for agent runner traffic.
  • Chores
    • Updated development compose configuration: consolidated web caching into a single Next.js cache, refined API hot-reload watch paths, and simplified Python worker file watching to run recursively under /app/.
    • MCP is now gated off by default via an environment setting.

Walkthrough

The EE dev docker-compose config removes named Next.js and Turbo cache volumes from .web, simplifies uvicorn --reload-dir args for api, collapses four workers' watchmedo watchers to a single recursive /app/ path, adds agent env vars to services, and introduces a new sandbox-agent sidecar service built from services/agent.

Changes

EE Dev docker-compose Updates

Layer / File(s) Summary
Hot-reload and cache volume simplification
hosting/docker-compose/ee/docker-compose.dev.yml
Removes named cache volume mounts from the .web service; updates uvicorn --reload-dir paths in the api service to watch source directories only (adding /app/oss/src, removing database directories); replaces per-directory watchmedo invocations in all four worker services (worker-evaluations, worker-tracing, worker-webhooks, worker-events) with a single recursive /app/ watcher (still ignoring */tests/*); adds nextjs_cache to named volume declarations while removing prior cache entries.
sandbox-agent sidecar service and env wiring
hosting/docker-compose/ee/docker-compose.dev.yml
Adds a .sandbox-agent stub service definition. Introduces a full sandbox-agent service (built from services/agent) that rebuilds an extension and runs tsx src/server.ts on startup, with explicit environment, read-only Pi login mounts and local src/skills volumes, network attachment to agenta-network, and HTTP health checks with restart: always. Updates the services container to route agent requests via AGENTA_AGENT_RUNNER_URL and gates MCP via AGENTA_AGENT_ENABLE_MCP (defaulting to false), adding a depends_on condition to ensure sandbox-agent health before startup.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and specifically describes the main change: adding and configuring the agent runner sidecar in the compose setup.
Description check ✅ Passed The description is comprehensive and directly related to the changeset, explaining the agent-pi service addition, environment variable configuration, and architectural decisions.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 3


ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: e6864c15-c994-4861-8084-b3f87649d438

📥 Commits

Reviewing files that changed from the base of the PR and between a97e608 and 143edd6.

📒 Files selected for processing (1)
  • hosting/docker-compose/ee/docker-compose.dev.yml

- watchmedo auto-restart --directory=/app/ee/src --directory=/app/ee/databases --directory=/app/oss/src --directory=/app/oss/databases --directory=/app/entrypoints --directory=/sdks/python/agenta --directory=/clients/python/agenta_client --pattern=*.py --recursive --ignore-patterns=*/tests/* --
+ watchmedo auto-restart --directory=/app/ --pattern=*.py --recursive --ignore-patterns=*/tests/* --

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Description: Inspect worker watch directories and mounted source directories.
# Expected: worker watchmedo commands include the mounted SDK/client paths or those mounts are intentionally not hot-reloaded.

sed -n '141,264p' hosting/docker-compose/ee/docker-compose.dev.yml

Repository: Agenta-AI/agenta

Length of output: 5516


🏁 Script executed:

# Check structure of SDK and client directories
find ../../../sdks/python -maxdepth 2 -type d 2>/dev/null | head -20
echo "---"
find ../../../clients/python -maxdepth 2 -type d 2>/dev/null | head -20

Repository: Agenta-AI/agenta

Length of output: 64


🏁 Script executed:

# Search for API service hot-reload configuration to understand expected pattern
rg -A 10 "watchmedo.*directory=" hosting/docker-compose/ee/ --no-heading

Repository: Agenta-AI/agenta

Length of output: 4521


🏁 Script executed:

# Check if there are any existing references to watching SDK/client paths
rg "sdks/python|clients/python" hosting/docker-compose/ -B 2 -A 2

Repository: Agenta-AI/agenta

Length of output: 13615


Add SDK and client paths to worker watchmedo scopes.

The four worker services watch only /app/, but they mount SDK and client code at /sdks/python and /clients/python. This means edits to those dependencies will not trigger restarts, unlike the API service in the same file which includes them. Align worker watchers with the OSS version and API service by adding both directories:

♻️ Proposed watcher scope fix
-            watchmedo auto-restart --directory=/app/ --pattern=*.py --recursive --ignore-patterns=*/tests/* --
+            watchmedo auto-restart --directory=/app/ --directory=/sdks/python/agenta --directory=/clients/python/agenta_client --pattern=*.py --recursive --ignore-patterns=*/tests/* --

Apply to lines 146, 183, 220, and 263 (worker-evaluations, worker-tracing, worker-webhooks, worker-events).

Comment thread hosting/docker-compose/ee/docker-compose.dev.yml Outdated
Comment on lines +444 to +447
# Tracing export fallback (used when a request carries no usable OTLP
# credential). Must be reachable from this container.
AGENTA_HOST: ${AGENTA_HOST:-http://144.76.237.122:8280}
AGENTA_API_KEY: ${AGENTA_API_KEY:-}

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Do not default telemetry to a public HTTP IP.

Line 446 sends the sidecar’s fallback tracing export to http://144.76.237.122:8280 whenever AGENTA_HOST is unset. For a dev stack, that can leak local trace data/metadata to an external endpoint by default.

🛡️ Proposed safer default
-            AGENTA_HOST: ${AGENTA_HOST:-http://144.76.237.122:8280}
+            # Set AGENTA_HOST explicitly when exporting traces outside the local stack.
+            AGENTA_HOST: ${AGENTA_HOST:-}

If tracing must work out of the box, default this to an in-stack service URL instead of a public IP.
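
A reviewer or CI step could flag this kind of default automatically. The Python sketch below is a minimal illustration using only the standard library; `points_at_public_ip` is a hypothetical helper and it inspects a plain URL string, so extracting the default from the `${AGENTA_HOST:-...}` expression is left out.

```python
import ipaddress
from urllib.parse import urlsplit


def points_at_public_ip(url: str) -> bool:
    """Return True when the URL's host is a literal, globally routable IP address.

    DNS names such as in-stack service names are not resolved and return False.
    """
    host = urlsplit(url).hostname
    if host is None:
        return False
    try:
        return ipaddress.ip_address(host).is_global
    except ValueError:
        # Not an IP literal (for example a compose service name).
        return False
```

Under this check the fallback quoted above is flagged, while an in-stack name such as `http://agent-pi:8765` or a loopback address is not.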

@mmabrouk

Member Author

Reviewer guide: interesting code

A dev-compose-only change. The spots worth a close look:

  • hosting/docker-compose/ee/docker-compose.dev.yml:398 AGENTA_AGENT_PI_URL: http://agent-pi:8765 wires the services container to the sidecar in-network. The host and port must match the agent-pi service name and its PORT. This is read by services/oss/src/agent/app.py.
  • hosting/docker-compose/ee/docker-compose.dev.yml:402 AGENTA_AGENT_HARNESS and AGENTA_AGENT_SANDBOX set the per-run defaults for the two independent axes (harness pi/claude, sandbox local/daytona). A request can override either. Defaults are also resolved in services/agent/src/engines/rivet.ts:679.
  • hosting/docker-compose/ee/docker-compose.dev.yml:436 — "Deliberately NO env_file": the sidecar never inherits the stack secrets. It only gets its own port, the Pi login, the OTLP fallback, and the Daytona credentials. Worth confirming this scoping is intended.
  • hosting/docker-compose/ee/docker-compose.dev.yml:458 AGENTA_RIVET_DAYTONA_SNAPSHOT points at a prebaked snapshot so Daytona runs skip a slow per-invoke npm install pi. The repo ships the builder, not the snapshot, so no Claude-containing image is distributed.

@mmabrouk

Member Author

Reviewer guide: interesting code

A short tour of the parts worth a close read. All paths are in hosting/docker-compose/ee/docker-compose.dev.yml.

  • hosting/docker-compose/ee/docker-compose.dev.yml:398 — the in-network wiring. AGENTA_AGENT_PI_URL: http://agent-pi:8765 is what flips the runner from a local subprocess to the sidecar (services/oss/src/agent/app.py:63). The host part must equal the service name below.
  • hosting/docker-compose/ee/docker-compose.dev.yml:400 — the axes note. Harness (pi/claude) and sandbox (local/daytona) are independent choices the Python service sends per run. The defaults here are just dev defaults, not a fixed pairing.
  • hosting/docker-compose/ee/docker-compose.dev.yml:421 — the agent-pi service. Note the deliberate absence of env_file: the sandbox does not inherit stack secrets. It gets its port, the mounted Pi login, an OTLP fallback, and the Daytona keys only.
  • hosting/docker-compose/ee/docker-compose.dev.yml:456 — the licensing comment. We ship the snapshot builder, not a built snapshot, so we never distribute a Claude-containing image. Cross-check against services/agent/docker/README.md.

environment:
DOCKER_NETWORK_MODE: ${DOCKER_NETWORK_MODE:-bridge}
# Agent workflow: reach the agent runner sidecar in-network.
AGENTA_AGENT_PI_URL: http://agent-pi:8765

Member Author

This is the single line that routes runs to the sidecar. The runner reads AGENTA_AGENT_PI_URL and only goes over HTTP when it is set; unset means a local subprocess (services/oss/src/agent/app.py:63). The agent-pi host here must match the service name at line 421, or every run hits an unresolved host.

# === LIFECYCLE ============================================ #
restart: always

agent-pi:

Member Author

The agent-pi service has no env_file by design. The sandbox must not inherit the stack secrets (Composio, Stripe, PostHog, Google). It gets only its port, the mounted Pi login, the OTLP fallback, and the Daytona keys for the daytona sandbox axis. Tools run server-side via /tools/call, so the sandbox never needs the broad secret set. Worth confirming this stays empty as the stack grows.
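
To keep this boundary from eroding as the stack grows, the absence of env_file could be asserted in a test. The Python sketch below is a hypothetical check over a compose file already parsed into a dict; `inherits_env_file` is not an existing helper in the repo.

```python
from typing import Any, Mapping


def inherits_env_file(compose: Mapping[str, Any], service_name: str) -> bool:
    """Return True when the named service declares a non-empty `env_file`."""
    services = compose.get("services") or {}
    if service_name not in services:
        raise KeyError(f"service {service_name!r} is not defined in this compose file")
    spec = services[service_name] or {}
    return bool(spec.get("env_file"))
```

A test could then assert that `inherits_env_file(compose, "agent-pi")` stays false, so adding an env_file to the sidecar later fails loudly instead of silently widening its secret set.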

@mmabrouk mmabrouk left a comment

Member Author

Codex subagent review for #4776

Blocking finding:

  • hosting/docker-compose/ee/docker-compose.dev.yml:426 / :427 adds agent-pi with build.context: ../../../services/agent and dockerfile: docker/Dockerfile.dev, but this PR is based on main and services/agent/* is not present on main. I checked the related runner PR #4778, and that is where services/agent/package.json, services/agent/src/server.ts, and services/agent/docker/Dockerfile.dev are introduced. As-is, if #4776 merges independently as advertised, ./hosting/docker-compose/run.sh --build --license ee --dev ... fails before the dev stack can start because the build context/dockerfile is missing. Please either stack/retarget this PR on the runner PR, enforce merge order by landing #4778 first, or include/gate the sidecar so the compose file remains buildable from its declared base.

Context checked:

  • hosting/docker-compose/ee/docker-compose.dev.yml:398 matches the #4772 service contract: AGENTA_AGENT_PI_URL selects the HTTP sidecar transport, and http://agent-pi:8765 matches the new service name and runner server port from #4778.
  • hosting/docker-compose/ee/docker-compose.dev.yml:421 keeps the sidecar outside the stack env_file, which looks like the right secret boundary. The explicit Daytona and trace envs are scoped rather than inheriting the full API/service secret set.
  • Residual risk after the stack/base issue is fixed: because AGENTA_AGENT_PI_URL is unconditional, any override/profile that drops or fails agent-pi leaves services configured to call an unavailable host. A healthcheck/depends_on or tying the env var to the sidecar inclusion would make that failure mode clearer. I did not mark that separately because the sidecar is present in this file and existing review comments already called out startup ordering.
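
The healthcheck/depends_on idea in the last point can be verified mechanically. The Python sketch below is an illustration under assumptions: `waits_for_healthy` is a hypothetical helper, the compose file is assumed to be parsed into a dict already, and only the long depends_on form with `condition: service_healthy` counts as health gating.

```python
from typing import Any, Mapping


def waits_for_healthy(compose: Mapping[str, Any], service: str, dependency: str) -> bool:
    """Return True when `service` waits for `dependency` to report healthy.

    The short list form of depends_on only orders startup, so it returns False.
    """
    spec = (compose.get("services") or {}).get(service) or {}
    depends_on = spec.get("depends_on")
    if isinstance(depends_on, Mapping):
        entry = depends_on.get(dependency) or {}
        return entry.get("condition") == "service_healthy"
    return False
```

Pairing this with a check that the sidecar defines a healthcheck would make the unavailable-host failure mode show up at startup rather than on the first agent run.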

I could not submit this as REQUEST_CHANGES because GitHub rejects change requests on your own PR for the authenticated account, but I would treat the missing runner build context as blocking.

I did not run the compose stack locally; this review used GitHub PR metadata, patches, and files plus the #4779 ground-truth/stack docs and the #4772/#4778 contracts.

Member Author

Codex subagent review for #4776 follow-up

Additional review-map issue:

I checked the compose diff itself and did not find comments that directly name #4774/#4778; the stale references are in the PR body/review guidance. This reinforces the earlier blocking finding that #4776 is not independently buildable from its declared main base unless #4778 lands first or this PR is retargeted/stacked.

mmabrouk added a commit that referenced this pull request Jun 20, 2026
Adds docs/design/agent-workflows/qa/: the autohealing QA recipe (README), the Gherkin
scenario matrix with a live scoreboard, the findings log (F-001..F-010 in the open-issues
style), a reusable /invoke driver with captured runs, and the regression-test research plus
the replay-test skill draft. Produced by a live end-to-end QA pass across the harness x
environment x capability matrix; it documents and motivates the runner fixes in the sibling
PRs (#4776, #4778).

Claude-Session: https://claude.ai/code/session_01KsGSJQwsUdgWcNSEt2P2qD
mmabrouk added a commit that referenced this pull request Jun 20, 2026
Adds docs/design/agent-workflows/qa/: the autohealing QA recipe (README), the Gherkin
scenario matrix with a live scoreboard, the findings log (F-001..F-010 in the open-issues
style), a reusable /invoke driver with captured runs, and the regression-test research plus
the replay-test skill draft. Produced by a live end-to-end QA pass across the harness x
environment x capability matrix; it documents and motivates the runner fixes in the sibling
PRs (#4776, #4778).

Claude-Session: https://claude.ai/code/session_01KsGSJQwsUdgWcNSEt2P2qD
The dev agent-pi compose command replaces the image CMD, so the extension bundle was never
rebuilt and went stale, silently dropping custom tools on the Rivet path (QA finding F-005).
Rebuild it from the mounted src on start.

Claude-Session: https://claude.ai/code/session_01KsGSJQwsUdgWcNSEt2P2qD
@mmabrouk mmabrouk changed the base branch from main to big-agents June 22, 2026 11:49