iter57 cluster-071: typed ChatRequestEvent.llm_control + WorkflowChatSource (#953) by loning · Pull Request #957 · aevatarAI/aevatar

loning · 2026-05-24T08:07:01Z

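Summary

iter57 cluster-071 (severity: high): add typed ChatRequestEvent.llm_control + WorkflowChatSource; remove the dual-semantics use of Metadata/ToolContext as an LLM-control carrier

Old: workflow and LLM control information (NyxID/model/route) was crammed into the ToolContext / Metadata bag → dual semantics (violates "API fields have a single semantic")
New: typed LLMControlContext + typed WorkflowChatSource; removes the Metadata fallback + ToolContext-as-LLM-control paths

Violated: CLAUDE.md rules "core semantics are strongly typed", "Metadata field-naming decision tree", "API fields have a single semantic"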
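The difference between the two carriers can be illustrated with a minimal sketch. It is not the PR's actual code: `LlmControlSketch`, `ChatRequestSketch`, and the `nyxid.access_token` key are hypothetical stand-ins, and the three fields are assumed from the access_token / route_preference / model_override values mentioned later in this thread.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Old shape (hypothetical key name): LLM control values share one string bag
// with unrelated metadata, so the bag carries two different meanings.
var legacyBag = new Dictionary<string, string>
{
    ["nyxid.access_token"] = "token-1", // control value hidden in the bag
    ["trace.id"] = "abc",               // ordinary metadata
};

// New shape: control values travel in their own typed carrier and the
// metadata bag keeps only ordinary metadata.
var request = new ChatRequestSketch(
    Prompt: "hello",
    LlmControl: new LlmControlSketch("token-1", "auto", null),
    Metadata: new Dictionary<string, string> { ["trace.id"] = "abc" });

Console.WriteLine(legacyBag.ContainsKey("nyxid.access_token"));        // True
Console.WriteLine(request.LlmControl?.AccessToken);                    // token-1
Console.WriteLine(request.Metadata.ContainsKey("nyxid.access_token")); // False

// Hypothetical stand-in for the typed LLMControlContext contract.
public sealed record LlmControlSketch(
    string? AccessToken, string? RoutePreference, string? ModelOverride);

// Hypothetical stand-in for ChatRequestEvent with a dedicated llm_control field.
public sealed record ChatRequestSketch(
    string Prompt,
    LlmControlSketch? LlmControl,
    IReadOnlyDictionary<string, string> Metadata);
```

With a typed field, a reader of the request no longer has to know which string keys are "really" control inputs, which is the single-semantics point the summary makes.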

Phase 9 consensus chain (4 rounds + reflector)

Round	Verdict
r1	converge: typed contract scope
r2	converge: NyxID/model/route carrier + step-control scope
r3	escalate: stalled (2:1 ToolContext vs llm_control)
reflector r1	retry-fix: the CLAUDE single-semantics rule rules out ToolContext reuse
r4	`META_JUDGE_DONE:consensus:structural:add narrow ChatRequestEvent.llm_control and typed WorkflowChatSource; remove Metadata/ToolContext carrier paths` ✅

Scope

New typed contract LLMControlContext + LLMControlContextMapper
New ITypedConversationReplyGenerator
New typed WorkflowChatSource
Touches: NyxID/channel reply generation / Studio authoring / scheduled skill runner / workflow / streaming proxy / response paths
Tests cover typed carrier behavior

local PASS: architecture + test_stability + build + focused suites

closes #953

🤖 Generated with Claude Code via codex-refactor-loop iter57

⟦AI:AUTO-LOOP⟧

…Source(#953)
Phase 9 r4 consensus (4 rounds + reflector r1 rescue; the CLAUDE single-semantics rule ruled out ToolContext):
- New typed LLMControlContext + ChatRequestEvent.llm_control proto field
- New typed WorkflowChatSource + normalized chat source handling
- Remove Metadata fallback + ToolContext-as-LLM-control paths (single semantics)
- Migrate active LLM routing/auth control to LLMControlContext
- Touches: NyxID/channel reply generation / Studio authoring / scheduled skill runner / workflow / streaming proxy / response paths
- Tests for typed carrier behavior
closes #953
local PASS: architecture + test_stability + build + focused channel/runtime suites
⟦AI:AUTO-LOOP⟧

…rId field + Mapper actually passes it
PR #957 r1 reject P1:
- ChatRunRequestNormalizer.cs:159 parsed actorId but did not pass it to InlineYamlBundle
- WorkflowRunActorResolver relies on source.ActorId to resolve / validate the source actor
- typed inline-yaml + source.actorId was silently treated as a new inline run (the specified actor was not used)
- the legacy workflowYamls+agentId path preserves this field → typed-source regression
Fix:
- Add an ActorId field to WorkflowChatSource.InlineYamlBundle
- Mapper now actually passes actorId to the typed bundle
- ChatRunRequestNormalizer.cs:159 now actually passes the value
- New regression test verifying that typed inline-yaml + source.actorId behaves the same as the legacy path
⟦AI:AUTO-LOOP⟧

PR #957 r3 reviewer reject: the r2 regression test covered only typed inline-yaml; the legacy Workflow+AgentId non-inline path was not covered. Fix:
- New test covering the legacy Workflow + AgentId non-inline path
- Verify that both the typed and legacy paths correctly resolve / validate the source actor
⟦AI:AUTO-LOOP⟧

loning · 2026-05-24T08:40:40Z

📊 codex progress: fix-pr957-r3 (⏳ in progress; running for 3 min)

+        result.Succeeded.Should().BeTrue();
+        result.Request!.Source.Should().BeEquivalentTo(
+            WorkflowChatSource.DefinitionActor("source-actor-1", "direct"));
+        result.Request.WorkflowName.Should().Be("direct");
+        result.Request.ActorId.Should().Be("source-actor-1");
+    }
+
+    [Fact]
     public void ChatRunRequestNormalizer_ShouldDerivePromptAndInputParts_FromMultimodalInput()
     {
         var input = new ChatInput

tokens used
111,306
Fixed PR #957 round 3.

Changed:
- Legacy non-inline `workflow + agentId` now normalizes to `WorkflowChatSource.DefinitionActor(agentId, workflowName)` instead of dropping `agentId` into catalog workflow path.
- Added API regression test for legacy `Workflow + AgentId`.
- Added resolver regression test proving the source actor binding is actually resolved and used.

Verified:
- `dotnet test test/Aevatar.Workflow.Host.Api.Tests/Aevatar.Workflow.Host.Api.Tests.csproj --nologo --filter "FullyQualifiedName~WorkflowCapabilityEndpointsCoverageTests"`
- `dotnet test test/Aevatar.Workflow.Application.Tests/Aevatar.Workflow.Application.Tests.csproj --nologo --filter "FullyQualifiedName~WorkflowRunActorResolverTests"`
- `bash tools/ci/test_stability_guards.sh`

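Auto-updated every 10 minutes; edited in place rather than stacking comments; this comment is deleted automatically once codex finishes (per Auric: "just delete it when it's done, otherwise it takes up too much space").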
🤖 controller progress reporter

loning · 2026-05-24T08:42:30Z

📊 codex progress: review-pr957-omnibus-r3 (⏳ in progress; running for 12 min)


REVIEW_DONE:957:omnibus:reject
⟦AI:AUTO-LOOP⟧
tokens used
111,324
reject

[P1] Legacy non-inline `workflow + agentId` compatibility is broken. In [ChatRunRequestNormalizer.cs](/tmp/aevatar-pr957-review/src/workflow/Aevatar.Workflow.Infrastructure/CapabilityApi/ChatRunRequestNormalizer.cs:131), legacy input with both `Workflow` and `AgentId` now returns `WorkflowChatSource.CatalogWorkflow(...)` before considering `AgentId`. The final `WorkflowChatRunRequest.ActorId` becomes `null`, and [WorkflowRunActorResolver.cs](/tmp/aevatar-pr957-review/src/workflow/Aevatar.Workflow.Application/Runs/WorkflowRunActorResolver.cs:69) bypasses source-actor resolution and creates a fresh registry-backed run. Before this PR, the same request preserved `ActorId` and resolved through the specified actor binding. This violates the "preserve legacy compatibility" requirement and can still silently create a new run for a caller that explicitly supplied an actor.

Verified fixed parts:
- typed inline-yaml `source.ActorId` is now preserved by the normalizer.
- resolver reads `request.Source.ActorId`.
- regression tests cover typed inline-yaml and legacy inline-yaml behavior.

Coverage gap: the regression test does not cover legacy non-inline `Workflow + AgentId`, which is the path now regressed.

Focused tests run on PR head `c83751e9`:
- `dotnet test test/Aevatar.Workflow.Host.Api.Tests/Aevatar.Workflow.Host.Api.Tests.csproj --no-restore --filter "FullyQualifiedName~WorkflowCapabilityEndpointsCoverageTests|FullyQualifiedName~ChatEndpointsInternalTests" --nologo`
- `dotnet test test/Aevatar.Workflow.Application.Tests/Aevatar.Workflow.Application.Tests.csproj --no-restore --filter "FullyQualifiedName~WorkflowRunActorResolverTests|FullyQualifiedName~WorkflowRunFallbackCoverageTests" --nologo`

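Auto-updated every 10 minutes; edited in place rather than stacking comments; this comment is deleted automatically once codex finishes (per Auric: "just delete it when it's done, otherwise it takes up too much space").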
🤖 controller progress reporter

loning · 2026-05-24T08:43:20Z

Verified r3 fix at head 0283beb.

Verdict: approve. GitHub refused a formal approve review because this checkout is authenticated as the PR author, so recording this as a PR comment.

No blocking findings. The legacy Workflow + AgentId non-inline path now normalizes to DefinitionActor(agentId, workflowName), and resolver coverage validates that the source actor binding is read and used for the run. The typed inline-yaml + source.actorId path is also covered at both normalizer and resolver levels.
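The normalization rule described above can be sketched as follows. This is an illustrative reconstruction, not the code in ChatRunRequestNormalizer.cs: `SourceSketch` and `FromLegacy` are hypothetical names, and the nested records only mimic the `DefinitionActor(agentId, workflowName)` and `CatalogWorkflow(...)` shapes named in the review.

```csharp
#nullable enable
using System;

// A legacy request that names an actor must keep that actor binding.
var withActor = SourceSketch.FromLegacy(workflow: "direct", agentId: "source-actor-1");
// Without an actor, the request falls back to the catalog workflow path.
var catalogOnly = SourceSketch.FromLegacy(workflow: "direct", agentId: null);

Console.WriteLine(withActor is SourceSketch.DefinitionActor);   // True
Console.WriteLine(catalogOnly is SourceSketch.CatalogWorkflow); // True

// Hypothetical stand-in for WorkflowChatSource.
public abstract record SourceSketch
{
    public sealed record DefinitionActor(string ActorId, string WorkflowName) : SourceSketch;
    public sealed record CatalogWorkflow(string WorkflowName) : SourceSketch;

    // The r3 regression came from returning the catalog source before looking
    // at agentId; checking agentId first preserves the explicit actor.
    public static SourceSketch FromLegacy(string workflow, string? agentId)
    {
        if (!string.IsNullOrWhiteSpace(agentId))
        {
            return new DefinitionActor(agentId, workflow);
        }

        return new CatalogWorkflow(workflow);
    }
}
```

The order of the two branches is the whole fix in this sketch: once the actor check runs first, a caller that explicitly supplied an actor can no longer end up with a silently created new run.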

Validation run from detached PR worktree:

dotnet test test/Aevatar.Workflow.Host.Api.Tests/Aevatar.Workflow.Host.Api.Tests.csproj --nologo --filter "FullyQualifiedName~WorkflowCapabilityEndpointsCoverageTests"
dotnet test test/Aevatar.Workflow.Application.Tests/Aevatar.Workflow.Application.Tests.csproj --nologo --filter "FullyQualifiedName~WorkflowRunActorResolverTests"
bash tools/ci/test_stability_guards.sh

⟦AI:AUTO-LOOP⟧

…rce / mapper coverage (coverage-quality fail)
PR #957 coverage-quality fail. Add narrow tests covering 194 production lines on:
- LLMControlContext typed contract
- LLMControlContextMapper
- WorkflowChatSource normalize
- ITypedConversationReplyGenerator
local PASS: LLMControl filter 3/3 + test_stability_guards
⟦AI:AUTO-LOOP⟧

loning · 2026-05-24T09:35:57Z

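📊 Current status: 🔧 fixing (test r4, no human intervention needed)

Dimension	Value
Phase	Phase 8 fix r4 (test-only; migrate old metadata tests to typed LLMControl)
Previous trigger	test-add r1 (commit `757feb7c`) only added coverage for LLMControlContext/Mapper/WorkflowChatSource and did not migrate NyxIdChatEndpointsCoverageTests / NyxIdLLMProviderRoutingTests / StreamingProxyNyxParticipantCoordinatorTests / ConnectedServicesContextMiddlewareTests / ScopeServiceEndpointsTests (~30 old metadata-based tests in total)
CI status	coverage-quality FAIL (fails at the `dotnet test` stage); fast-gates / projection-provider-e2e / host-composition-smoke / slow-test-guards all pass
Dispatched	fix codex r4 (test-only, timeout 5400s)
Human intervention needed?	❌ No (pure test migration; production has already switched to typed; no disagreement over rule interpretation)

Next automatic step: fix codex migrates each failing test's Metadata-based setup/assertion to the typed LLMControlContext API → controller commit/push → CI re-check → merge once everything is green.

When human intervention is needed:

If the same class of failure persists after the r4 fix (meaning the failures are not just a carrier issue but a real production bug), the reflector is triggered at cumulative r6
Only if the reflector also returns escalate-human is the PR escalated to a maintainer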

🤖 controller status banner

⟦AI:AUTO-LOOP⟧

… tests to typed LLMControl
PR #957 coverage-quality fail root cause: test-add r1 commit 757feb7 only added coverage for LLMControlContext/Mapper/WorkflowChatSource and did not migrate the old endpoint tests; ~30 tests fail at the dotnet test stage (NyxIdChatEndpointsCoverageTests / NyxIdLLMProviderRoutingTests / StreamingProxyNyxParticipantCoordinatorTests / ConnectedServicesContextMiddlewareTests / ScopeServiceEndpointsTests).
fix r4 (28 test assertions migrated; test-only, production untouched):
- NyxId LLM provider routing: LLMControlContext carries access_token/route_preference/model_override
- NyxId chat endpoint/interaction: assert ChatRequestEvent.LlmControl instead of Metadata
- StreamingProxy participant coordinator: assert LlmControl token/route
- ConnectedServices middleware: typed ToolContext.ConnectedServices
- ScopeService helper: BuildScopedLlmControlAsync; headers keep only real header/passthrough values
Verification:
- Aevatar.AI.Tests: 653 passed
- Aevatar.GAgentService.Integration.Tests: 297 passed
- Aevatar.AI.Core build: passed
⟦AI:AUTO-LOOP⟧

loning · 2026-05-24T09:53:54Z

🤖 Phase 8 fix r4: all 28 tests migrated to typed LLMControl

Dimension	Value
Round	r4 (test-only, production untouched)
Applied	28 (NyxIdLLMProviderRouting + NyxIdChatEndpointsCoverage + StreamingProxyNyxParticipantCoordinator + ConnectedServicesContextMiddleware + ScopeServiceEndpoints)
Rejected as false-positive	0
Blocked	1 (codex wrongly concluded that gh was unavailable; the controller took over posting this comment)
dotnet test Aevatar.AI.Tests	653 passed
dotnet test Aevatar.GAgentService.Integration.Tests	297 passed
dotnet build Aevatar.AI.Core	passed
Fix commit	`1b7170fc`

Change summary:

NyxId LLM provider routing tests now use LLMControlContext to carry access_token / route_preference / model_override
NyxId chat endpoint / interaction tests assert ChatRequestEvent.LlmControl (not Metadata["nyxid.*"])
StreamingProxy participant coordinator tests assert LlmControl token/route
ConnectedServices middleware tests use typed ToolContext.ConnectedServices
ScopeService helper tests: scoped LLM config → BuildScopedLlmControlAsync; headers keep only real header/passthrough values

Next automatic step: CI re-check (coverage-quality) → once everything is green, Phase 8 unanimous + auto-merge → close issue #953.

⟦AI:AUTO-LOOP⟧

codecov · 2026-05-24T10:09:13Z

Codecov Report

❌ Patch coverage is 84.19811% with 67 lines in your changes missing coverage. Please review.
✅ Project coverage is 83.03%. Comparing base (87e9f2d) to head (1b7170f).
⚠️ Report is 2 commits behind head on auto-refact-dev.

Files with missing lines	Patch %	Lines
...ervice.Hosting/Endpoints/ScopeWorkflowEndpoints.cs	50.81%	25 Missing and 5 partials ⚠️
...Service.Hosting/Endpoints/ScopeServiceEndpoints.cs	75.92%	9 Missing and 4 partials ⚠️
...low/Aevatar.Workflow.Core/Modules/LLMCallModule.cs	55.55%	8 Missing ⚠️
...tructure/CapabilityApi/ChatRunRequestNormalizer.cs	93.68%	0 Missing and 6 partials ⚠️
...atar.AI.Core/LLMProviders/OwnerLlmConfigApplier.cs	80.00%	0 Missing and 3 partials ⚠️
...ication/Runs/WorkflowChatRequestEnvelopeFactory.cs	0.00%	1 Missing and 1 partial ⚠️
....AI.Abstractions/LLMProviders/LLMControlContext.cs	97.43%	0 Missing and 1 partial ⚠️
...s/ToolProviders/AgentToolExecutionContextMapper.cs	50.00%	0 Missing and 1 partial ⚠️
src/Aevatar.AI.Core/AIGAgentBase.cs	85.71%	0 Missing and 1 partial ⚠️
src/Aevatar.AI.Core/Chat/ChatRuntime.cs	95.00%	1 Missing ⚠️
... and 1 more
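The headline patch figure can be cross-checked against the rows above with a little arithmetic. The sketch below assumes that Codecov counts partial lines as uncovered in the patch percentage (covered / (covered + missing + partial)); under that assumption the numbers imply roughly 424 changed lines, of which 357 are covered.

```csharp
using System;
using System.Linq;

// Missing + partial lines per listed file, copied from the rows above.
int[] listed = { 30, 13, 8, 6, 3, 2, 1, 1, 1, 1 };
int listedTotal = listed.Sum();

// Headline figures from the report.
int uncovered = 67;
double patchPct = 84.19811;

// Lines that must come from the one file hidden behind "... and 1 more".
int fromUnlistedFile = uncovered - listedTotal;

// Assumption: patchPct = covered / (covered + uncovered) * 100.
int patchLines = (int)Math.Round(uncovered / (1 - patchPct / 100));
int covered = patchLines - uncovered;

Console.WriteLine($"listed={listedTotal}, unlisted={fromUnlistedFile}");
Console.WriteLine($"covered={covered}, patchLines={patchLines}");
```

The ten listed rows account for 66 of the 67 uncovered lines, so the remaining file contributes one, and 357 / 424 reproduces the reported 84.19811%.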

@@                 Coverage Diff                 @@
##           auto-refact-dev     #957      +/-   ##
===================================================
- Coverage            83.04%   83.03%   -0.02%     
===================================================
  Files                  987      990       +3     
  Lines                63558    63862     +304     
  Branches              8214     8248      +34     
===================================================
+ Hits                 52780    53025     +245     
- Misses                7246     7282      +36     
- Partials              3532     3555      +23

Flag	Coverage Δ
ci	`83.03% <84.19%> (-0.02%)`	⬇️

Files with missing lines	Coverage Δ
...stractions/LLMProviders/LLMControlContextMapper.cs	`100.00% <100.00%> (ø)`
...Aevatar.AI.Abstractions/LLMProviders/LLMRequest.cs	`77.61% <100.00%> (+0.33%)`	⬆️
src/Aevatar.AI.Core/RoleGAgent.cs	`82.38% <100.00%> (+0.01%)`	⬆️
src/Aevatar.AI.Core/Tools/ToolCallLoop.cs	`90.45% <100.00%> (+0.07%)`	⬆️
...ice.Application/Responses/MessagesCommandFacade.cs	`79.67% <100.00%> (+0.40%)`	⬆️
...ce.Application/Responses/ResponsesCommandFacade.cs	`77.10% <100.00%> (+0.20%)`	⬆️
...ication.Abstractions/Runs/WorkflowChatRunModels.cs	`100.00% <100.00%> (ø)`
...w.Application/Runs/WorkflowDirectFallbackPolicy.cs	`74.54% <100.00%> (+0.47%)`	⬆️
...kflow.Application/Runs/WorkflowRunActorResolver.cs	`93.01% <100.00%> (+0.42%)`	⬆️
...orkflow/Aevatar.Workflow.Core/WorkflowRunGAgent.cs	`83.07% <100.00%> (ø)`
... and 12 more

... and 6 files with indirect coverage changes


loning · 2026-05-24T10:20:19Z

🤖 Phase 8 Re-review: `tests-r6-quickverify` (head SHA `1b7170f`)

Verdict: approve-r6-quick

I quick-verified the r4-only range 757feb7c..1b7170fc against PR head 1b7170fc1bd3c42c2aefd22e10007bed385380b5.

Checks performed:

Scope is test-only: all 5 changed files are under test/**; no production file changed in this range.
Carrier migration is real: old keyed Metadata[LLMRequestMetadataKeys.*] / nyxid.* assertions and fixtures were replaced with typed LlmControl / LLMControlContext / AgentToolConnectedServicesContext assertions.
Assertions remain concrete: token, route, model override, max tool rounds, user memory, connected services, and custom metadata preservation are still checked with exact expected values or strict NotContainKey checks for removed metadata keys.
No test disabling found: no [Skip], Skip = ..., Explicit, or disable-style marker in the r4 diff.
CI state on GitHub is green for the relevant gates: coverage-quality, codecov/patch, fast-gates, host-composition-smoke, projection-provider-e2e, and slow-test-guards are successful on this head.
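The assertion style checked above (exact values on the typed carrier, strict absence of removed keys, custom metadata preserved) can be sketched without any test framework. Everything here is hypothetical: `ChatEventSketch`, `LlmControlSketch`, the `nyxid.access_token` key, and the `Check` helper stand in for the real types and FluentAssertions calls used in the repository.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Build an event the way a migrated test fixture might (all names hypothetical).
var evt = new ChatEventSketch(
    new LlmControlSketch("token-1", "gateway-a"),
    new Dictionary<string, string> { ["custom.tag"] = "kept" });

// 1. Control values are asserted on the typed carrier with exact expected values.
Check(evt.LlmControl.AccessToken == "token-1", "token travels in LlmControl");
Check(evt.LlmControl.RoutePreference == "gateway-a", "route travels in LlmControl");

// 2. Removed keys must be absent from the bag (a strict NotContainKey-style check).
Check(!evt.Metadata.ContainsKey("nyxid.access_token"), "no control key left in Metadata");

// 3. Unrelated custom metadata is still preserved.
Check(evt.Metadata["custom.tag"] == "kept", "custom metadata preserved");

Console.WriteLine("all checks passed");

// Minimal assertion helper: throws with a description when a check fails.
static void Check(bool condition, string what)
{
    if (!condition) throw new InvalidOperationException($"failed: {what}");
}

public sealed record LlmControlSketch(string AccessToken, string RoutePreference);

public sealed record ChatEventSketch(
    LlmControlSketch LlmControl,
    IReadOnlyDictionary<string, string> Metadata);
```

Pairing each positive assertion with an absence check is what keeps a carrier migration from quietly loosening the tests: a value that is still written to the old key would fail step 2 even if step 1 passes.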

No blocking issue found. This r4 push is a narrow test-only typed-carrier migration and is OK to merge.

REVIEW_DONE:957:tests-r6-quickverify:approve-r6-quick
⟦AI:AUTO-LOOP⟧

loning · 2026-05-24T10:20:48Z

✅ Phase 8 r4 + r6 quick-verify approve — auto-merging

Review	Verdict
r5 (3/3) on `757feb7`	architect approve · tests approve · quality approve
r6 quick-verify on `1b7170f`	tests approve (test-only carrier migration, no production changes, no [Skip], assertions not loosened)

The r4 fix migrates 28 test assertions (NyxIdLLMProviderRouting / NyxIdChatEndpointsCoverage / StreamingProxyNyxParticipantCoordinator / ConnectedServicesContextMiddleware / ScopeServiceEndpoints) to the typed LLMControlContext carrier; production code has 0 changes. CI is currently all green (coverage-quality / codecov/patch / fast-gates / host-composition-smoke / projection-provider-e2e / slow-test-guards).

Next step: gh pr merge 957 --squash --delete-branch, close #953.

🤖 controller consensus + auto-merge

⟦AI:AUTO-LOOP⟧

loning requested review from eanzhao, jason-aelf and louis4li as code owners May 24, 2026 08:07

loning added auto-loop Created by codex-refactor-loop skill 🚀 phase:pr-open 👀 phase:reviewing 🤖 human:auto-推进 🔧 phase:fixing and removed 👀 phase:reviewing labels May 24, 2026

loning added 👀 phase:reviewing 🔧 phase:fixing and removed 🔧 phase:fixing 👀 phase:reviewing labels May 24, 2026

loning added 👀 phase:reviewing and removed 🔧 phase:fixing labels May 24, 2026

loning added 🔧 phase:fixing and removed 👀 phase:reviewing labels May 24, 2026

loning merged commit e98af86 into auto-refact-dev May 24, 2026
12 checks passed

loning mentioned this pull request May 24, 2026

[refactor-design] cluster-070-string-bag-control-semantics residual after #949 #953

Closed

loning merged 5 commits into auto-refact-dev from refactor/iter57-cluster-071-llm-control-typed-carrier
