[codex] Add tool-driven Aevatar core invocation sources#830
Records the architectural decision to collapse ChatRouteAction to Reject + ForwardToModel, exposing GAgent/Team/Workflow invocation as IAgentToolSource tools through the existing ToolCallLoop. Supersedes ADR-0024 §D5 (v1 action set) and ADR-0025 (voice v1 ForwardToGAgent); ADR-0024 D1/D2/D3/D4/D6 stand. Tracked end-to-end in epic #808; voice GA prerequisite in #809.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Implements ADR-0026 Stage 1 unit-1 (epic #808). New project src/Aevatar.AI.ToolProviders.AevatarInvocation/ exposes aevatar_invoke_gagent / _invoke_team / _start_workflow / _observe_run / _query_readmodel as IAgentToolSource, so the LLM can drive orchestration through the existing ToolCallLoop instead of parallel router branches.
Design:
- Tool payloads are proto-derived strict JSON-Schema (no map<string,string> bags)
- wait=ack|stream|complete supported; stream is default for long-running tools; GAgent/workflow wait=complete returns wait_complete_unavailable until Stage 2 session actor lands
- Caller scope flows through AgentToolRequestContext only; protected caller-scope keys (LLMRequestMetadataKeys.*) are stripped from LLM-supplied payload.headers before server values are stamped, so the LLM cannot inject overrides for nyxid.access_token / scope_id / owner_subject etc.
- query_readmodel is bounded to a closed registered set
- Dispatch reuses existing surfaces (IActorDispatchPort, ITeamEntryMemberResolver + IStaticGAgentStreamInvocationPort<AGUIEvent>, ICommandDispatchService<WorkflowChatRunRequest,...>); no new dispatch chain
21 tests pass (4 credential-injection regression + 1 ObserveRun fast-fail added in post-review hardening); arch_guards + test_stability + docs lint all PASS.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
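The strip-then-stamp rule for caller-scope headers described above can be sketched as follows. This is a minimal Python illustration, not the repository's C# code: the function name, the concrete key strings, and the case-insensitive match are assumptions made for the example.

```python
# Hypothetical key names; the real protected set is LLMRequestMetadataKeys.* in the repo.
PROTECTED_KEYS = frozenset({
    "nyxid.access_token",
    "aevatar.scope_id",
    "aevatar.owner_subject",
})


def stamp_caller_scope(llm_headers, trusted):
    """Drop protected keys supplied by the LLM, then stamp trusted server values.

    A protected key is removed even when the server has no value for it,
    so an LLM-supplied override can never survive by default.
    """
    cleaned = {
        key: value
        for key, value in llm_headers.items()
        if key.lower() not in PROTECTED_KEYS  # case-insensitive match is an assumption
    }
    cleaned.update(trusted)  # server-side values are applied last
    return cleaned
```

Stripping before stamping, rather than relying on the stamp to overwrite, matters for the case where the server has nothing to stamp for a given key.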
Implements ADR-0026 Stage 1 unit-2 (D7 prerequisite) for the Lark
outbound caller-scope guarantee. After auditing the existing path
(LarkMessagesSendTool → LarkNyxClient → NyxIdApiClient) no production
refactor was required: the tool already reads
AgentToolRequestContext.NyxIdAccessToken (no credential parameters) and
forwards the caller bearer through NyxID's api-lark-bot proxy, which
exchanges to a Lark tenant_access_token without seeing the caller's
authorization header. The metadata-bag credential-injection surface that
unit-1 had to harden is structurally absent here (no headers/metadata
bag at the dispatch boundary).
Added 2 regression tests:
- Asserts the dispatched NyxID call carries AgentToolRequestContext's
trusted typed NyxIdAccessToken
- Asserts a malicious LLM payload (smuggled nyx_id_access_token, fake
headers, ExternalMetadata overriding LLMRequestMetadataKeys.NyxIdAccessToken)
cannot override the trusted caller token at dispatch
NyxID investigation summary (verified via gh against ChronoAIProject/NyxID
backend source): /api/v1/proxy/s/api-lark-bot/open-apis/im/v1/messages
accepts only the caller's NyxID bearer; NyxID resolves caller's
api-lark-bot binding, exchanges {app_id, app_secret} → tenant_access_token
per channel_adapters/lark.rs::lark_family_token_exchange_config, strips
the inbound authorization, and injects bearer for outbound to Lark.
Semantic: messages post as the caller's bound Lark bot (NyxID-mediated),
not as the human user's OAuth identity and not as Aevatar's service-level
identity. This satisfies ADR-0026 §D7's "lands in the caller's Lark
account" use case.
61/61 tests pass; arch_guards + test_stability + docs lint all PASS.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes ADR-0026 Stage 1 (epic #808). Integration test demonstrates the new tool-first ingress path works end-to-end after units 1+2 landed, without touching any production code.
Test: MainnetResponsesEndpointsTests.PostResponses_StreamWithAevatarInvokeGAgentAdditiveTool_ShouldDispatchActorEnvelope
Scenario:
- /v1/responses streamed request with real DI registration of unit-1's AddAevatarInvocationTools (5 production IAgentToolSource instances)
- Stubbed LLM emits aevatar_invoke_gagent tool call with a malicious payload that smuggles nyxid.access_token + aevatar.scope_id overrides
- ResponsesCompletionApplicationService executes the local tool call inline (not as function_call SSE output; verified against production StreamAsync behavior)
- AevatarInvocationDispatcher dispatches through IActorDispatchPort (captured by RecordingActorDispatchPort)
- LLM round 2 continues after tool result, SSE lifecycle completes
Assertions:
- Dispatched envelope's Route.PublisherActorId == DirectGAgentPublisherId
- Dispatched ChatRequestEvent.Headers carry the trusted bearer/scope (caller-scope protection from unit-1 verified end-to-end)
- ThrowingStaticGAgentStreamInvocationPort.InvocationCount == 0 (the legacy ForwardToGAgent/ForwardToTeam path in ResponsesEndpoints.cs:779-927 is NOT entered)
202/202 tests pass in Aevatar.Hosting.Tests; arch_guards + test_stability_guards + docs lint all PASS. No production code changes.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Codecov Report
✅ All modified and coverable lines are covered by tests.
@@            Coverage Diff            @@
##              dev     #830     +/-   ##
=========================================
+ Coverage   83.06%   83.07%   +0.01%
=========================================
  Files         981      981
  Lines       61936    61936
  Branches     8069     8069
=========================================
+ Hits        51447    51454       +7
+ Misses       7009     6996      -13
- Partials     3480     3486       +6
…hint and add ToolSetRegistry
Stage 2 Unit 1 of ADR-0026 (epic #808). Lays the proto + composition foundation that Stage 2 Units 2 (resolver translation) and 3 (ChatRunActor) consume.
Proto changes (src/Aevatar.ChatRouting.Abstractions/chat_route_policy.proto):
- ForwardToModel.tool_set_ref (field 2): ChatRouteToolSetRef typed message
- ForwardToModel.tool_choice_hint (field 3): ChatRouteToolChoiceHint typed message with google.protobuf.Struct prefilled_arguments (NOT map<string,string> per CLAUDE.md typed-proto rule)
- No oneof reshape; legacy ForwardToGAgent/Team/Workflow variants untouched
New project src/Aevatar.AI.ToolProviders.ToolSetRegistry/:
- IToolSetRegistry: resolves ChatRouteToolSetRef.name to IAgentToolSource list at boundary time via DI factories (NOT cached at registration; preserves per-request scope for tool sources carrying caller context)
- ToolSetResolveResult / ToolSetResolveError: structured errors (tool_set_name_required / unknown_tool_set), no exceptions for normal failure
- ResponsesAevatarToolProvider now also implements IAgentToolSource so existing Responses substitute tools participate in named composition without changing the IResponsesToolProvider path
Three default named sets (registered in MainnetHostBuilderExtensions):
- workspace.default: comprehensive (Stage 1 invocation + substitute/additive tools + NyxID/Lark/Telegram/ChronoStorage/Web)
- lark.self_notify: minimal (lark_messages_send + aevatar_query_readmodel), for the ADR-0026 §D7 "push to my Lark" use case
- voice.realtime: placeholder for Stage 5 (currently same shape as workspace.default; tightens when voice convergence lands)
Argument-merge policy documented for Unit 2 boundary code: server-set prefilled_arguments are trusted route policy; LLM-supplied arguments that conflict on the same key MUST be rejected, not silently overwritten.
10 tests added across 3 test projects (registry resolve / proto round-trip / Mainnet DI composition incl. lark.self_notify minimal-set assertion). Build + arch_guards + test_stability + docs lint all PASS.
No ChatRouteResolver changes (Unit 2). No ChatRunActor (Unit 3). No legacy deletion (Stage 4). No external repo changes.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
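The registry behaviour described in this commit (factories invoked at resolve time, structured errors instead of exceptions) might look roughly like the Python sketch below. The class and method names are hypothetical stand-ins for IToolSetRegistry and ToolSetResolveResult; only the two error codes come from the commit message.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ResolveResult:
    """Either a tuple of tool sources or a structured error code, never an exception."""
    tools: tuple = ()
    error: Optional[str] = None


class ToolSetRegistry:
    def __init__(self):
        self._factories = {}  # tool-set name -> tuple of zero-argument factories

    def register(self, name, *factories):
        self._factories[name] = factories

    def resolve(self, name):
        if not name or not name.strip():
            return ResolveResult(error="tool_set_name_required")
        factories = self._factories.get(name)
        if factories is None:
            return ResolveResult(error="unknown_tool_set")
        # Factories run on every resolve, so each request gets fresh instances
        # instead of objects cached at registration time.
        return ResolveResult(tools=tuple(factory() for factory in factories))
```

Calling `resolve("lark.self_notify")` twice yields two distinct sets of instances, which is the property the commit attributes to resolving "at boundary time".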
Reconciles ADR-0026 Stage 2 unit-1 with concurrent fix(status-probes) work on feature/core-loop. Both modified MainnetHostBuilderExtensions.cs DI chain; both additions kept (StatusProbeAuthorizationResolver + AddToolSetRegistry). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The loop maintains state, prompts, logs, reviews, run summaries, and worktrees under .implement-loop/ at the repo root. They are session-local artifacts, not source of truth; ignoring them prevents accidental staging during merges back into the working branch. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ool_choice_hint
Stage 2 Unit 2 of ADR-0026. ChatRouteResolver now performs runtime
translation so existing policy rules emitting legacy ForwardToGAgent /
ForwardToTeam / ForwardToWorkflow continue to work while the consumer
side (Unit 3's ChatRunActor and the existing /v1/responses path) only
needs to understand ForwardToModel + tool_choice_hint. Persisted policy
proto is unchanged; Stage 4 will delete the legacy oneof variants.
Translations (in-memory only, ChatRouteDecision stays transient per
ADR-0024 D2):
  ForwardToGAgent(actor_id) → ForwardToModel {
    tool_set_ref = workspace.default
    tool_choice_hint = {
      tool_name = aevatar_invoke_gagent
      prefilled = { actor_id }
    }
  }
  ForwardToTeam(team_id, endpoint) → ForwardToModel {
    tool_set_ref = workspace.default
    tool_choice_hint = {
      tool_name = aevatar_invoke_team
      prefilled = { team_id, endpoint_id,
                    [scope_id if non-empty] }
    }
  }
  ForwardToWorkflow(workflow_id) → ForwardToModel {
    tool_set_ref = workspace.default
    tool_choice_hint = {
      tool_name = aevatar_start_workflow
      prefilled = { workflow_id }
    }
  }
  ForwardToModel → pass-through (preserve new fields)
  Reject → pass-through
ChatRoutePolicyMigrator.MigrateLegacyActions(snapshot): pure in-memory
transform that rewrites legacy actions to the canonical shape; idempotent
on already-new-shape snapshots. Persistence flow is Stage 3 work.
Tool-set name sourced from ToolSetNames.WorkspaceDefault (no magic strings).
prefilled_arguments built as google.protobuf.Struct (NOT map<string,string>)
via Value.ForString. voice_module_name from legacy ForwardToGAgent is NOT
emitted because aevatar_invoke_gagent proto has no such field.
Known asymmetry to address in Unit 3: scope_id is emitted for
aevatar_invoke_team but InvokeTeamToolRequest proto has no scope_id field;
the boundary consumer must strip unknown args before JsonParser.Default.Parse
or surface scope_id through a different channel.
36/36 ChatRouting.Core.Tests pass (16 new); build + arch_guards +
test_stability all PASS.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
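The translation table in this commit can be illustrated with a small Python sketch. Actions are modelled as plain dicts with a hypothetical `kind` field; the real code operates on proto oneof variants and builds prefilled_arguments as a google.protobuf.Struct, so treat the shapes here as assumptions.

```python
WORKSPACE_DEFAULT = "workspace.default"  # stands in for ToolSetNames.WorkspaceDefault


def _forward_to_model(tool_name, prefilled):
    return {
        "kind": "forward_to_model",
        "tool_set_ref": WORKSPACE_DEFAULT,
        "tool_choice_hint": {"tool_name": tool_name, "prefilled": prefilled},
    }


def translate(action):
    """Rewrite a legacy action into the ForwardToModel shape; other actions pass through."""
    kind = action["kind"]
    if kind == "forward_to_gagent":
        return _forward_to_model("aevatar_invoke_gagent", {"actor_id": action["actor_id"]})
    if kind == "forward_to_team":
        prefilled = {"team_id": action["team_id"], "endpoint_id": action["endpoint_id"]}
        if action.get("scope_id"):  # emitted only when non-empty
            prefilled["scope_id"] = action["scope_id"]
        return _forward_to_model("aevatar_invoke_team", prefilled)
    if kind == "forward_to_workflow":
        return _forward_to_model("aevatar_start_workflow", {"workflow_id": action["workflow_id"]})
    return action  # forward_to_model and reject are returned unchanged
```

Because already-translated actions fall through to the final `return`, applying `translate` twice gives the same result as applying it once, which is the idempotence the commit claims for ChatRoutePolicyMigrator.MigrateLegacyActions.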
… integration
Stage 2 Unit 3 of ADR-0026 (epic #808). Closes Stage 2.
ChatRunActor (src/platform/Aevatar.GAgentService.Core/GAgents/ChatRunActor.cs):
- Business-named, session-scoped (keyed chat-run:{response_id})
- Typed proto State (ChatRunState in llm_sessions.proto): LLM context + tool_call_history + active_sub_run_subscriptions as repeated typed ChatRunSubRunSubscription (NOT Dictionary, per the CLAUDE.md middle-layer state constraint)
- Event-sourced via PersistDomainEventAsync + reducer
- Single-threaded event loop (no lock / ConcurrentDictionary / GetAwaiter)
- Self-continuation via PublishAsync(..., TopologyAudience.Self)
Cross-actor sub-run observation (the architectural pattern this unit settles):
- ChatRunActor itself owns the observation via IStreamProvider.GetStream(targetActorId).UpsertRelayAsync(StreamForwardingMode.HandleThenForward, EventTypeFilter:CommittedStateEventPublished), mirroring the src/Aevatar.CQRS.Projection.Core/Orchestration/ProjectionScopeGAgentBase precedent
- Forwarded envelopes are consumed via [AllEventHandler] and correlated by run_id + publisher_actor_id; the terminal result is folded on the actor event loop and published as a typed ChatRunToolResultReady on the self-stream
- ChatRunToolCompletionCoordinator (Application layer) only awaits typed ChatRunToolResultReady through IChatRunActorPort: NO raw SubscribeAsync<EventEnvelope> from Application (round-1 review caught the original violation; round-2 verified the fix)
Boundary integration in ResponsesEndpoints + ResponsesCompletionApplicationService:
- Resolves ForwardToModel.tool_set_ref via IToolSetRegistry → turn's additive tool set
- tool_choice_hint pinned to tool_name; prefilled_arguments stamped on; LLM arguments conflicting on prefilled keys → structured tool_choice_prefill_conflict error so LLM self-corrects (Stage 1 unit-1 caller-scope hardening pattern extended to this layer)
- Only aevatar_invoke_gagent/_invoke_team/_start_workflow with wait=complete route through ChatRunActor; everything else (wait=ack/stream, other tools) keeps current inline behavior (minimal viable integration)
scope_id asymmetry (unit-2 review concern) resolved via ProtoToolArguments.Parse(WithIgnoreUnknownFields(true)), scoped to invocation-tool-request parsing only, NOT global. The caller-scope channel via dispatcher metadata is authoritative; inline scope_id from legacy ForwardToTeam is silently dropped from InvokeTeamToolRequest parsing. Regression test asserts the dispatched envelope still has correct caller scope.
Stage 1 wait_complete_unavailable migrated: AevatarInvocationDispatcher no longer rejects wait=complete; it dispatches and returns a receipt; completion is folded by ChatRunActor when the path goes through the coordinator. Regression test updated accordingly.
Test for user-owned ResponsesForwardTeamInternalProbeExecutorTests.cs was renamed + reasserted to reflect Stage 2 unit-2's cascade: resolver now translates legacy ForwardToTeam → ForwardToModel + tool_choice_hint, so the probe correctly reports Down for that route shape until the user migrates the probe's expectation. Production probe code untouched. Explanatory comment added linking to ChatRoutePolicyMigrator + ADR-0026 D2.
814/814 tests pass (36+23+207+548 across 4 affected projects). SubscribeAsync<EventEnvelope> ban in dispatch_projection_boundary_guard passes when arch_guards run from worktree (canonical CI invocation).
Non-blocking follow-up: RemoveRelayAsync not called on ChatRunActor HandleTerminateAsync; bounded leak per session, see review round-2.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
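The prefill-conflict rule (server-set prefilled arguments win, conflicting LLM arguments are rejected with a structured error) could be sketched as below. This is a hedged Python illustration: the function name and result shape are hypothetical, and treating an LLM value that equals the prefilled value as non-conflicting is an assumption the commit message does not settle.

```python
def merge_prefilled(prefilled, llm_args):
    """Merge trusted prefilled arguments with LLM-supplied ones.

    Returns {"arguments": ...} on success, or a structured error naming the
    conflicting keys so the model can correct its next call.
    """
    conflicts = sorted(
        key
        for key in prefilled
        if key in llm_args and llm_args[key] != prefilled[key]  # equal values allowed (assumption)
    )
    if conflicts:
        return {"error": "tool_choice_prefill_conflict", "keys": conflicts}
    # Prefilled values are applied last, so trusted route policy is authoritative.
    return {"arguments": {**llm_args, **prefilled}}
```

Returning the error as data rather than raising keeps the failure inside the tool-call loop, which is what lets the LLM see it and retry.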
…ndpoint
Stage 3 Unit 1 of ADR-0026 (epic #808). Makes Stage 4's upcoming deletion of ForwardToGAgent/Team/Workflow safe by giving operators visible signal and a runtime-accessible migration tool.
Resolver deprecation signaling:
- ChatRouteDecision.deprecations: repeated ChatRouteDeprecation typed proto sub-messages (NOT map<string,string>); transient per ADR-0024 D2
- ChatRouteDeprecation carries: code, message, action_kind, matched_rule_id, translated_target
- ChatRouteResolver emits LogWarning chat_route_legacy_action_used with structured fields when a rule resolves to a legacy action (one log per matched legacy rule per resolve call, not per request)
- Covers both rule-match and default_target paths
Consumer header propagation (RFC 9745 Deprecation + RFC 9110 Warning):
- ApplyChatRouteDeprecationHeaders helper called from ResponsesEndpoints and MessagesEndpoints after route resolution
- Sets `Deprecation: true` + `Warning: 299 - "chat_route_legacy_action_used: <details>"`
- Voice boundary log-only (WebSocket can't carry response headers after upgrade)
Migration helper exposure:
- New admin endpoint POST /api/scopes/{scopeId}/chat-route-policy/migrate
- Dry-run default: returns the migrated UpsertChatRoutePolicyRequested payload
- ?apply=true: dispatches as single atomic UpsertChatRoutePolicyRequested command to ChatRoutePolicyGAgent (ADR-0024 D5: no temporary invalid state)
- Reuses existing chat-route-policy admin auth (scope ownership / admin role)
Tests cover: resolver deprecation per legacy action variant + empty for ForwardToModel + structured warning fields; admin endpoint dry-run + apply; SSE response Deprecation+Warning headers for legacy ForwardToGAgent.
No deletion of legacy proto variants (Stage 4). No actor changes. No external repo changes.
Build + ChatRouting.Core.Tests + Hosting.Tests + test_stability_guards + architecture_guards (through workflow_binding boundary) all PASS canonically from worktree; playground_asset_drift_guard env-blocked in worktree per known infrastructure gap.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
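As a rough illustration of the header propagation step, the Python sketch below builds the two header values quoted in the commit message from a list of deprecation records. The function name and the `<details>` layout (action kind and translated target) are assumptions; only the `Deprecation: true` value, the `299 -` prefix, and the chat_route_legacy_action_used code are taken from the commit.

```python
def deprecation_headers(deprecations):
    """Return response headers for a resolve result; empty when nothing is deprecated."""
    if not deprecations:
        return {}
    # Hypothetical details layout: "<action_kind> -> <translated_target>" per record.
    details = "; ".join(
        f'{d["action_kind"]} -> {d["translated_target"]}' for d in deprecations
    )
    text = f"chat_route_legacy_action_used: {details}"
    # Escape backslashes and quotes so the text stays a valid quoted-string.
    text = text.replace("\\", "\\\\").replace('"', '\\"')
    return {"Deprecation": "true", "Warning": f'299 - "{text}"'}
```

A boundary that cannot send response headers, such as the voice WebSocket after upgrade, would skip this helper and only log, as the commit notes.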
Stage 3 Unit 2 of ADR-0026 (follow-up from Stage 2 unit-3 review). Closes the bounded forwarding-registry-binding leak per chat run session.
Mirrors src/Aevatar.CQRS.Projection.Core/Orchestration/ProjectionScopeGAgentBase RemoveObservationRelayAsync precedent: ChatRunActor.HandleTerminateAsync snapshots active_sub_run_subscriptions before persisting ChatRunTerminatedEvent (which clears them via the existing reducer), then iterates the snapshot and calls IStreamProvider.GetStream(targetActorId).RemoveRelayAsync(Id, ct) for each target.
Cleanup runs sequentially on the actor event loop (no Task.Run / Timer); failures are best-effort with Logger.LogWarning; iteration continues; state remains cleared regardless. OperationCanceledException not swallowed.
Regression tests (2 new in ChatRunActorTests):
- Terminate_ShouldRemoveRelayAsync_ForEachActiveSubscription: N=2 active subscriptions, observable IStreamProvider, asserts RemoveRelayAsync called per target_actor_id + state cleared
- Terminate_WhenRemoveRelayThrows_ShouldStillClearState: simulated throw, asserts cleanup attempted for all targets, warnings logged, no exception propagated, state cleared
550/550 GAgentService.Tests pass; arch_guards + test_stability PASS canonically from worktree.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
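The snapshot-then-clean-up order described in this commit can be sketched in Python as follows. Everything here is a simplified stand-in: state is a dict, `remove_relay` is a hypothetical callable replacing IStreamProvider.GetStream(...).RemoveRelayAsync, and the synchronous loop stands in for the actor's sequential event loop.

```python
import logging

logger = logging.getLogger("chat_run_sketch")


def terminate(state, remove_relay):
    """Clear subscriptions first, then remove each relay on a best-effort basis.

    Returns the list of targets whose relay removal failed.
    """
    # Snapshot before clearing: the terminated event's reducer wipes the list.
    snapshot = list(state["active_sub_run_subscriptions"])
    state["active_sub_run_subscriptions"] = []  # stands in for persisting the terminated event
    failed = []
    for target in snapshot:
        try:
            remove_relay(target)
        except Exception as exc:  # cancellation-style BaseExceptions are not caught here
            logger.warning("relay removal failed for %s: %s", target, exc)
            failed.append(target)  # keep going; state is already cleared
    return failed
```

Taking the snapshot before the state change is the key step: once the reducer has cleared the subscriptions there is nothing left to iterate.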
    CancellationToken ct)
{
    if (!string.IsNullOrWhiteSpace(request.ActorId))
        return ActorTargetResolution.Success(request.ActorId.Trim());
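The caller-supplied actor_id must not be trusted directly here. The actor_name branch resolves within the caller's scope via IGAgentActorRegistryQueryPort.ListActorsAsync(scope.ScopeId, ...), but the actor_id branch bypasses that ownership check: anyone who knows or guesses an actor id from another scope can make aevatar_invoke_gagent dispatch straight to that actor. Suggest validating actor_id membership against the caller scope's registry/readmodel as well, or only allowing target ids that were resolved through the scope.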
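One way to read the reviewer's suggestion is sketched below in Python. The function, its result shape, and the error codes are hypothetical, and `scope_actors` is a stand-in for whatever the caller-scope registry query returns; the point is only that both branches go through the same scope-bounded lookup.

```python
def resolve_actor_target(actor_id, actor_name, scope_actors):
    """Resolve a target actor using only actors registered in the caller's scope.

    scope_actors: mapping of actor_name -> actor_id for the caller's scope
    (a stand-in for the registry lookup bounded by scope id).
    """
    if actor_id and actor_id.strip():
        candidate = actor_id.strip()
        # Membership check: an id from another scope is rejected, not dispatched.
        if candidate not in scope_actors.values():
            return {"error": "actor_not_in_caller_scope"}
        return {"actor_id": candidate}
    if actor_name and actor_name in scope_actors:
        return {"actor_id": scope_actors[actor_name]}
    return {"error": "actor_not_found"}
```

With this shape, guessing an id that belongs to another scope produces a structured error instead of a dispatch.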
    _ = ObserveDetachedInvocationAsync(invocationTask);

    var accepted = await acceptedSource.Task.WaitAsync(ct);
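InvokeTeamToAcceptanceAsync only awaits acceptedSource.Task, but if invocationTask fails, or returns Succeeded=false without ever triggering onAcceptedAsync, this TCS never completes. For example, when service resolution/admission fails inside _teamInvocationPort.InvokeAsync, ObserveDetachedInvocationAsync merely logs the failure, and the caller hangs until the request is cancelled or times out instead of receiving a structured tool error. The failure / not-accepted outcome of invocationTask needs to be raced or propagated into acceptedSource as well, or the invocation task should be awaited directly and an error returned when it was not accepted.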
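The race the reviewer asks for can be illustrated with Python's asyncio, as a language-neutral sketch rather than the C# fix. The function name, the `invoke(on_accepted)` callback contract, and the error codes are assumptions made for the example.

```python
import asyncio


async def invoke_to_acceptance(invoke):
    """Wait for acceptance OR for the invocation to finish, whichever comes first."""
    loop = asyncio.get_running_loop()
    accepted = loop.create_future()

    async def on_accepted(receipt):
        if not accepted.done():
            accepted.set_result(receipt)

    invocation = asyncio.ensure_future(invoke(on_accepted))
    await asyncio.wait({accepted, invocation}, return_when=asyncio.FIRST_COMPLETED)

    if accepted.done():
        # Accepted: hand back the receipt; the invocation may keep running detached.
        return {"accepted": accepted.result()}

    # The invocation ended without ever signalling acceptance.
    accepted.cancel()
    exc = invocation.exception()
    if exc is not None:
        return {"error": "invocation_failed", "detail": str(exc)}
    return {"error": "invocation_not_accepted"}
```

Waiting on both futures means a failure before acceptance surfaces as a structured error right away, instead of leaving the caller blocked until cancellation or timeout.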
One addition: locally, against the current branch HEAD (7 commits ahead of the PR's remote head), I ran …
Background
This PR is Stage 1 of ADR-0026: it repositions aevatar's core capabilities as LLM tool sources, so that the model decides through function calls when to use workflow, GAgent, team, readmodel observation and similar capabilities, instead of the ingress layer continuing to maintain parallel routing dialects such as ForwardToGAgent/ForwardToTeam.
Changes
- docs/adr/0026-tool-first-chat-ingress.md: states the goals, boundaries, and later stages of tool-first chat ingress.
- Aevatar.AI.ToolProviders.AevatarInvocation: provides 5 invocation tools: aevatar_invoke_gagent, aevatar_invoke_team, aevatar_start_workflow, aevatar_observe_run, aevatar_query_readmodel.
- AevatarInvocationDispatcher: handles proto argument parsing, caller-scope injection, dispatch, readmodel queries, and structured error returns in one place.
- IAgentToolSource discovery path.
- AgentToolRequestContext.NyxIdAccessToken: payload / external metadata cannot override the caller's credentials.
- /v1/responses E2E test: proves that an aevatar_invoke_gagent additive tool call emitted by the model goes through the tool loop and dispatches an actor envelope via IActorDispatchPort, rather than going through the legacy ForwardToGAgent static invocation path.
Impact
- wait=complete still returns a structured wait_complete_unavailable; the current stage supports ack/stream, and long-running task continuation will later be taken over by the session actor / observation path.
- aevatar_query_readmodel only allows querying a closed set of readmodels; it does not expose arbitrary document collections.
Verification
- dotnet test test/Aevatar.AI.ToolProviders.AevatarInvocation.Tests/Aevatar.AI.ToolProviders.AevatarInvocation.Tests.csproj --nologo: passed, 21 passed.
- dotnet test test/Aevatar.AI.ToolProviders.Lark.Tests/Aevatar.AI.ToolProviders.Lark.Tests.csproj --nologo: passed, 61 passed.
- dotnet test test/Aevatar.Hosting.Tests/Aevatar.Hosting.Tests.csproj --filter FullyQualifiedName~PostResponses_StreamWithAevatarInvokeGAgentAdditiveTool_ShouldDispatchActorEnvelope --nologo: passed, 1 passed.
- bash tools/ci/test_stability_guards.sh: passed.
- bash tools/ci/architecture_guards.sh: passed.
- git diff --check origin/dev..HEAD: passed.
Note: local test runs still show pre-existing NuGet source mapping / analyzer warnings; there are no test failures.