feat(channels): add extensible channel abstraction with generic webho…#327
feat(channels): add extensible channel abstraction with generic webho…#327danielmillerp wants to merge 1 commit into
Conversation
2e41d47 to
744749b
Compare
✱ Stainless preview buildsThis PR will update the openapi python typescript Edit this comment to update them. They will appear in their respective SDK's changelogs. ✅ agentex-sdk-typescript studio · code · diff
✅ agentex-sdk-python studio · code · diff
✅ agentex-sdk-openapi studio · code · diff
This comment is auto-generated by GitHub Actions and is automatically kept up to date as you push. |
| agent = await agents_use_case.get(name=binding.agent_name) | ||
|
|
||
| inbound = _WEBHOOK.to_inbound(route_id, body) | ||
| result = await ChannelRouter(agents_acp_use_case).dispatch( | ||
| inbound, binding, agent.acp_type |
There was a problem hiding this comment.
Unhandled agent-not-found propagates to webhook callers
agents_use_case.get(name=binding.agent_name) is called without error handling. If the agent_name in CHANNELS_WEBHOOK_ROUTES doesn't match a registered agent, the exception thrown by the use case will propagate directly to the webhook caller — either as an AttributeError (if get() returns None and agent.acp_type is dereferenced) or as a misrouted 404 that looks identical to an unknown route error. A misconfigured env var would cause every request to that route to fail with a confusing 500/404, with no log message pointing at the root cause. Wrapping this in a try/except and returning a clear 500 or 503 would make misconfiguration much easier to diagnose.
Prompt To Fix With AI
This is a comment left during a code review.
Path: agentex/src/api/routes/channels.py
Line: 95-99
Comment:
**Unhandled agent-not-found propagates to webhook callers**
`agents_use_case.get(name=binding.agent_name)` is called without error handling. If the `agent_name` in `CHANNELS_WEBHOOK_ROUTES` doesn't match a registered agent, the exception thrown by the use case will propagate directly to the webhook caller — either as an `AttributeError` (if `get()` returns `None` and `agent.acp_type` is dereferenced) or as a misrouted 404 that looks identical to an unknown route error. A misconfigured env var would cause every request to that route to fail with a confusing 500/404, with no log message pointing at the root cause. Wrapping this in a try/except and returning a clear 500 or 503 would make misconfiguration much easier to diagnose.
How can I resolve this? If you propose a fix, please make it concise.…ok ingress
Introduce a channel layer that normalizes external surfaces into agent turns and,
for channels that respond, delivers the agent's reply back. The agent-driving core
is channel-agnostic; a new channel (e.g. Slack) implements one interface without
touching the router.
- domain/channels/base.py: InboundMessage, ChannelBinding (route -> agent + an OPAQUE
params dict forwarded verbatim to task/create; the platform does not interpret it),
timing-safe shared-secret / HMAC-SHA256 verifiers, and the Channel ABC. Ingress:
authenticate + to_inbound. Outbound (optional, OpenClaw-style): deliver + chunk
(textChunkLimit), plus deliver_reply() — the buffered dispatcher that chunks a reply
and delivers each block. A plain webhook leaves outbound unset (its reply is the
HTTP response); push channels (Slack) set supports_outbound and implement deliver.
- domain/channels/router.py: ChannelRouter.dispatch() — task/create (get-or-create on
a per-conversation session key) + the turn, branching on ACP type (sync: message/send
returns the reply inline; async: event/send, reply lands on the stream). await_reply()
retrieves an async agent's settled reply for responding channels.
- domain/channels/webhook.py: WebhookChannel — generic HTTP ingress with a per-route
secret. Accepts Authorization: Bearer / x-openclaw-webhook-secret, or an
X-Hub-Signature-256 HMAC (GitHub/Gitea). Generic JSON normalization.
- api/routes/channels.py: POST /channels/webhook/{route_id} (+ optional ?wait to return
the reply for synchronous callers). Route bindings load from CHANNELS_WEBHOOK_ROUTES
env for now (seam for a future config store).
- Whitelist /channels/webhook in the auth middleware so it bypasses the agent API-key
check and verifies its own per-route secret instead; register the router.
- Regenerate openapi.yaml.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
744749b to
239fb1a
Compare
…ok ingress
Introduce a channel layer that normalizes external surfaces into agent turns, so platforms beyond the chat UI can drive agents. The agent-driving core is channel-agnostic; a new channel (e.g. Slack) implements one interface without touching the router.
task/create (get-or-create on a per-conversation session key) + event/send via the in-process ACP use case.
Greptile Summary
This PR introduces a channel abstraction layer that normalises external HTTP surfaces (webhooks, and future platforms like Slack) into agent turns, keeping the agent-driving core channel-agnostic. A new channel implements two methods (
authenticate+to_inbound) without touching the router or existing API paths.domain/channels/—InboundMessage,ChannelABC,ChannelBinding, timing-safe HMAC-SHA256 and shared-secret helpers, andChannelRouter(task get-or-create + sync/async dispatch + quiescence-based reply polling).domain/channels/webhook.py—WebhookChannelacceptingAuthorization: Bearer,x-openclaw-webhook-secret, orX-Hub-Signature-256; generic JSON text normalization with an 8 KB fallback.api/routes/channels.py—POST /channels/webhook/{route_id}with size/auth/content-type guards and an optionalwaitparam for synchronous callers; route bindings loaded fromCHANNELS_WEBHOOK_ROUTESenv var.middleware_utils.py—/channels/webhookwhitelisted from agentex API-key auth so per-route secret verification takes over.Confidence Score: 5/5
Safe to merge — auth helpers are timing-safe, the middleware whitelist correctly scopes only webhook sub-paths, and the sync/async dispatch paths are structurally sound.
No correctness defects were found in the changed code. The auth branching (HMAC-SHA256 preferred over bearer), the get-or-create session-key continuity, and the after_id cursor for reply isolation all work as intended. The two flagged items are quality improvements, not bugs that produce wrong behavior.
No files require special attention before merging, though router.py and routes/channels.py have two findings worth addressing in a follow-up.
Important Files Changed
Sequence Diagram
%%{init: {'theme': 'neutral'}}%% sequenceDiagram participant Caller as Webhook Caller participant MW as Auth Middleware participant Route as POST /channels/webhook/{route_id} participant WC as WebhookChannel.authenticate() participant CR as ChannelRouter.dispatch() participant ACP as AgentsACPUseCase participant MS as TaskMessageService Caller->>MW: "POST /channels/webhook/{route_id}" MW->>MW: is_whitelisted_route() skip API-key check MW->>Route: forward request Route->>Route: _webhook_binding(route_id) ChannelBinding Route->>WC: authenticate(binding, request, raw_body) WC-->>Route: True / False (HMAC or bearer) Route->>CR: dispatch(inbound, binding, acp_type) CR->>ACP: TASK_CREATE (get-or-create, session_key) ACP-->>CR: task_id alt "acp_type == SYNC" CR->>ACP: MESSAGE_SEND(task_id, content) ACP-->>CR: reply messages inline CR-->>Route: "DispatchResult(reply=text)" else "acp_type == ASYNC" CR->>MS: "get_messages(limit=1) after_id" CR->>ACP: EVENT_SEND(task_id, content) CR-->>Route: "DispatchResult(reply=None, after_id)" end alt "wait=True and reply is None" loop every 2s until stable 6s or 120s timeout Route->>MS: get_messages(after_id, asc) MS-->>Route: agent messages end end Route-->>Caller: ok, task_id, reply?%%{init: {'theme': 'base', 'themeVariables': {"darkMode": true, "background": "#0d1117", "primaryColor": "#21262d", "primaryTextColor": "#e6edf3", "primaryBorderColor": "#8b949e", "lineColor": "#8b949e", "textColor": "#e6edf3", "edgeLabelBackground": "#161b22", "actorBkg": "#21262d", "actorBorder": "#8b949e", "actorTextColor": "#e6edf3", "actorLineColor": "#8b949e", "signalColor": "#8b949e", "signalTextColor": "#e6edf3", "noteBkgColor": "#373320", "noteBorderColor": "#d4a72c", "noteTextColor": "#f0e6c0", "labelBoxBkgColor": "#21262d", "labelBoxBorderColor": "#8b949e", "labelTextColor": "#e6edf3", "loopTextColor": "#e6edf3", "activationBkgColor": "#30363d", "activationBorderColor": "#8b949e"}}}%% sequenceDiagram participant Caller as Webhook Caller participant MW as Auth Middleware participant Route as POST /channels/webhook/{route_id} participant WC as WebhookChannel.authenticate() participant CR as ChannelRouter.dispatch() participant ACP as AgentsACPUseCase participant MS as TaskMessageService Caller->>MW: "POST /channels/webhook/{route_id}" MW->>MW: is_whitelisted_route() skip API-key check MW->>Route: forward request Route->>Route: _webhook_binding(route_id) ChannelBinding Route->>WC: authenticate(binding, request, raw_body) WC-->>Route: True / False (HMAC or bearer) Route->>CR: dispatch(inbound, binding, acp_type) CR->>ACP: TASK_CREATE (get-or-create, session_key) ACP-->>CR: task_id alt "acp_type == SYNC" CR->>ACP: MESSAGE_SEND(task_id, content) ACP-->>CR: reply messages inline CR-->>Route: "DispatchResult(reply=text)" else "acp_type == ASYNC" CR->>MS: "get_messages(limit=1) after_id" CR->>ACP: EVENT_SEND(task_id, content) CR-->>Route: "DispatchResult(reply=None, after_id)" end alt "wait=True and reply is None" loop every 2s until stable 6s or 120s timeout Route->>MS: get_messages(after_id, asc) MS-->>Route: agent messages end end Route-->>Caller: ok, task_id, reply?Prompt To Fix All With AI
Reviews (3): Last reviewed commit: "feat(channels): add extensible channel a..." | Re-trigger Greptile