Extract AgentRun typed generation executor #952

Merged
loning merged 2 commits into auto-refact-dev from refactor/2026-05-24_agentrun-typed-executor
May 24, 2026

@loning loning commented May 24, 2026

Problem

#947 asks to fix a structural problem in AgentRunGAgent: it runs metadata enrichment, LLM/tool streaming generation, and streaming chunk dispatch directly inside the actor turn. The r5 consensus is to keep AgentRunGAgent as the sole run fact owner and extract only a single typed, stateless generation executor; the accepted-only LlmReplyReadyEvent handoff stays in AgentRunGAgent.

Approach

  • Add IAgentRunReplyGenerationExecutorPort / AgentRunReplyGenerationExecutor, moving metadata construction, LLM reply generation, interactive intent capture, streaming chunk/finalize, and generation failure classification out of AgentRunGAgent.
  • agent_run.proto adds a ReplyGenerationRequested state and typed requested/completed/failed/timed-out continuation payloads.
  • AgentRunGAgent now persists a generation requested fact after the admission/drop/stale gate and performs only the accepted-only executor handoff; completion/failure/timeout return to the actor, which then persists the produced/dispatched/failed/cleanup facts.
  • Output dispatch retry, terminal cleanup, drop notification, and the LlmReplyReadyEvent SendToAsync stay inside AgentRunGAgent; no new actor type, envelope kind, projection phase, or second output executor stage is added.
  • Update AgentRunGAgentTests to cover the requested handoff and that a duplicate requested does not start the executor a second time, while preserving the existing ready handoff, streaming, metadata, retry, and token echo behavior.
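
The handoff described in the list above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `GenerationResult`, `IReplyGenerationExecutorSketch`, `RecordingExecutor`, and `RunOwnerSketch` are hypothetical names, and the real port signature and protobuf payloads are assumed rather than quoted. It only shows the ordering: persist the requested fact, hand off once, and let the completion come back as a separate typed message.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the typed continuation payloads in agent_run.proto.
public sealed record GenerationResult(string RunId, bool Succeeded, string Payload);

// Hypothetical port shape: the executor is stateless and owns no run facts.
public interface IReplyGenerationExecutorSketch
{
    void Start(string runId);
}

// Test double that only records which runs were handed off.
public sealed class RecordingExecutor : IReplyGenerationExecutorSketch
{
    public List<string> Started { get; } = new();
    public void Start(string runId) => Started.Add(runId);
}

// Hypothetical stand-in for the run fact owner (AgentRunGAgent in the PR).
public sealed class RunOwnerSketch
{
    private readonly IReplyGenerationExecutorSketch _executor;
    private bool _requested;
    private bool _terminal;

    public List<string> Facts { get; } = new();

    public RunOwnerSketch(IReplyGenerationExecutorSketch executor) => _executor = executor;

    public void HandleRequested(string runId)
    {
        if (_requested) return;            // duplicate requested: do not start the executor again
        _requested = true;
        Facts.Add("generation_requested"); // persist the fact before the handoff
        _executor.Start(runId);            // generation itself runs outside the actor turn
    }

    public void HandleCompleted(GenerationResult result)
    {
        if (!_requested || _terminal) return; // ignore stale or late continuations
        _terminal = true;
        Facts.Add(result.Succeeded ? "produced" : "failed");
    }
}
```

In this sketch, calling `HandleRequested` twice for the same run starts the executor once, which mirrors the duplicate-requested case the updated tests are said to cover.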

Verification

  • dotnet test test/Aevatar.GAgents.ChannelRuntime.Tests/Aevatar.GAgents.ChannelRuntime.Tests.csproj --no-restore --nologo --filter AgentRunGAgentTests
  • bash tools/ci/architecture_guards.sh
  • bash tools/ci/test_stability_guards.sh
  • dotnet build aevatar.slnx --nologo

IMPLEMENT_DONE:cluster-069-agentrun-typed-executor:ok
⟦AI:AUTO-LOOP⟧

loning commented May 24, 2026

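📊 Status card: reviewer dispatched

| Dimension | Value |
| --- | --- |
| Stage | dispatched codex (role=reviewer) |
| codex log | review-pr952-omnibus-r1.log |
| Working directory | /Users/auric/aevatar |
| timeout | 5400s (~90 min cap) |
| Context | omnibus r1 (combined architecture/testing/quality review) |
| Next automatic steps | 1. The three reviewers finish their verdict markers 2. controller computes consensus 3. unanimous → auto-merge / reject → fix r<N+1> |
| Human intervention needed | ❌ No (advances automatically) |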

🤖 controller status banner

⟦AI:AUTO-LOOP⟧

loning commented May 24, 2026

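📊 Status card: fix dispatched

| Dimension | Value |
| --- | --- |
| Stage | dispatched codex (role=fix) |
| codex log | fix-pr952-r2.log |
| Working directory | /Users/auric/aevatar |
| timeout | 5400s (~90 min cap) |
| Context | r1 reject (timeout race + conflict with dev) → fix r2 fixes the race + rebase |
| Next automatic steps | 1. fix r finishes with marker FIX_DONE:... 2. controller commit + push 3. dispatch reviewer r<N+1> |
| Human intervention needed | ❌ No (advances automatically) |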

🤖 controller status banner

⟦AI:AUTO-LOOP⟧

loning commented May 24, 2026

📊 codex progress review-pr952-omnibus-r1 (⏳ in progress; running for 10 min)

exec
/bin/zsh -lc 'git worktree remove /tmp/aevatar-pr-952-review --force' in /Users/auric/aevatar
 succeeded in 352ms:
codex
reject

Finding:
- **P1** `AgentRunGAgent` schedules `AgentRunReplyGenerationTimedOut` before starting the executor, but the timeout handler only persists `AgentRunFailedEvent` and cleanup. It does not send `LlmReplyReadyEvent` or a drop notification. If the scheduler timeout fires before the executor’s own timeout completion is delivered, the run becomes terminal `Failed`; the executor’s later `AgentRunReplyGenerationCompleted` is ignored by `IsCurrentGenerationContinuation`, so the conversation actor never receives the user-visible timeout fallback and the pending relay/loading state can remain orphaned. See `agents/Aevatar.GAgents.NyxidChat/AgentRunGAgent.cs:382`, `:265`, `:271`, `:866`, and `agents/Aevatar.GAgents.NyxidChat/AgentRunReplyGenerationExecutor.cs:190`.

Other notes:
- GitHub reports PR #952 as `CONFLICTING` against `dev`.
- Targeted verification passed after restore: `dotnet test test/Aevatar.GAgents.ChannelRuntime.Tests/Aevatar.GAgents.ChannelRuntime.Tests.csproj --no-restore --nologo --filter AgentRunGAgentTests` -> 48 passed.

REVIEW_DONE:952:omnibus:reject
⟦AI:AUTO-LOOP⟧
tokens used
151,106

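Auto-updates every 10 minutes; edited in place instead of stacking comments; this comment is deleted automatically once codex finishes (per Auric: "just delete it when it's done, otherwise it takes up too much space").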
🤖 controller progress reporter

@loning loning changed the base branch from dev to auto-refact-dev May 24, 2026 05:34
loning added a commit that referenced this pull request May 24, 2026
Incident:
- #952 (cluster-069 AgentRun typed executor): the implement codex ran gh pr create on its own
- Default base = repo default branch = dev (should have been auto-refact-dev)
- Result: wrong PR base, CONFLICTING with dev, and mistakenly published against an external base

Fix:
- SKILL.md hard rule #4 now explicitly lists git push / gh pr create / gh pr edit / git branch
- The red-line section of prompts/implement.md adds a verbatim ban on 'gh pr create' / 'gh pr edit'
- References the #952 incident record

⟦AI:AUTO-LOOP⟧
loning added 2 commits May 24, 2026 13:42
…LlmReplyDroppedEvent)+ rebase

PR #952 r1 reject blocker:
- scheduler timeout fires early → terminal Failed → the executor's later completion is ignored → orphaned pending/loading state

Fix:
- scheduler timeout now dispatches DeferredLlmReplyDroppedEvent (reason=llm_reply_timeout) first, then persists terminal Failed
- the conversation actor receives the timeout signal and clears the pending/loading state
- late executor completion is still ignored by terminal continuation gating (no second emit)
- added regression test: scheduler-timeout-before-executor-completion
- 49/49 AgentRunGAgentTests pass
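
The ordering in the fix above can be illustrated with a small state machine. This is a hedged sketch under stated assumptions, not the actual AgentRunGAgent code: `RunPhase` and `TimeoutRaceSketch` are hypothetical names, and events are modelled as log strings instead of real dispatches. It shows the timeout handler notifying the conversation side before going terminal, and the terminal gate dropping a late completion.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical phases; the real run state lives in agent_run.proto.
public enum RunPhase { GenerationRequested, Completed, Failed }

public sealed class TimeoutRaceSketch
{
    public RunPhase Phase { get; private set; } = RunPhase.GenerationRequested;

    // Ordered record of what would be dispatched or persisted.
    public List<string> Log { get; } = new();

    // Scheduler timeout: tell the conversation actor first, then go terminal.
    public void OnSchedulerTimeout()
    {
        if (Phase != RunPhase.GenerationRequested) return;
        Log.Add("dispatch:DeferredLlmReplyDroppedEvent(reason=llm_reply_timeout)");
        Phase = RunPhase.Failed;
        Log.Add("persist:AgentRunFailedEvent");
    }

    // Executor completion: only honoured while the run is still waiting.
    public void OnExecutorCompleted()
    {
        if (Phase != RunPhase.GenerationRequested) return; // terminal gate: no second emit
        Phase = RunPhase.Completed;
        Log.Add("dispatch:LlmReplyReadyEvent");
    }
}
```

If the timeout arrives first and the completion second, the sketch ends in `Failed` with exactly two log entries, and no `LlmReplyReadyEvent` is emitted; the drop event is what lets the conversation side clear its pending state.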

⟦AI:AUTO-LOOP⟧
@loning loning force-pushed the refactor/2026-05-24_agentrun-typed-executor branch from a5dbf60 to da696aa Compare May 24, 2026 05:45
loning commented May 24, 2026

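📊 Status card: reviewer dispatched

| Dimension | Value |
| --- | --- |
| Stage | dispatched codex (role=reviewer) |
| codex log | review-pr952-omnibus-r3.log |
| Working directory | /Users/auric/aevatar |
| timeout | 5400s (~90 min cap) |
| Context | fix r2 done (timeout race fixed + rebase + base retarget) → omnibus r3 |
| Next automatic steps | 1. The three reviewers finish their verdict markers 2. controller computes consensus 3. unanimous → auto-merge / reject → fix r<N+1> |
| Human intervention needed | ❌ No (advances automatically) |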

🤖 controller status banner

⟦AI:AUTO-LOOP⟧

loning commented May 24, 2026

📊 codex progress review-pr952-omnibus-r3 (⏳ in progress; running for 6 min)

approve

Verified PR #952 head `da696aae18ed979749cb108cd1a2c44a50c04917` against base `auto-refact-dev`.

Findings:
- Scheduler timeout now dispatches `DeferredLlmReplyDroppedEvent` with `reason=llm_reply_timeout` before persisting terminal `AgentRunFailedEvent`.
- Conversation actor already consumes that drop event by emitting a not-retryable failure and removing the matching `PendingLlmReplyRequests` entry, so the pending/loading orphan race is addressed.
- Late executor completion is gated by terminal/current-generation checks and does not dispatch `LlmReplyReadyEvent` after timeout.
- No new actor class or projection topology was introduced by the timeout fix path. The executor still reports back through typed commands over the existing dispatch port.
- `dotnet test test/Aevatar.GAgents.ChannelRuntime.Tests/Aevatar.GAgents.ChannelRuntime.Tests.csproj --filter FullyQualifiedName~AgentRunGAgentTests --nologo` passed: 49/49.

REVIEW_DONE:952:omnibus:approve
⟦AI:AUTO-LOOP⟧
tokens used
137,068

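Auto-updates every 10 minutes; edited in place instead of stacking comments; this comment is deleted automatically once codex finishes (per Auric: "just delete it when it's done, otherwise it takes up too much space").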
🤖 controller progress reporter

codecov Bot commented May 24, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 83.53%. Comparing base (f82a239) to head (da696aa).
⚠️ Report is 42 commits behind head on auto-refact-dev.

@@                 Coverage Diff                 @@
##           auto-refact-dev     #952      +/-   ##
===================================================
+ Coverage            82.83%   83.53%   +0.70%     
===================================================
  Files                  998      987      -11     
  Lines                63092    63559     +467     
  Branches              8161     8214      +53     
===================================================
+ Hits                 52261    53094     +833     
+ Misses                7304     6906     -398     
- Partials              3527     3559      +32     
Flag Coverage Δ
ci 83.53% <ø> (+0.70%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.
see 74 files with indirect coverage changes

@loning loning marked this pull request as ready for review May 24, 2026 06:03
@loning loning merged commit ecf47d9 into auto-refact-dev May 24, 2026
12 checks passed
@loning loning deleted the refactor/2026-05-24_agentrun-typed-executor branch May 24, 2026 06:03
loning commented May 24, 2026

📊 codex progress prompts-audit-iter57 (⏳ in progress; running for 4 min)

+   - `_markers.md`: marker regex catalog.
+2. Replace every role-local GitHub post block with a short reference to `_github-post-rules.md`.
+3. Delete bilingual templates and `_en/_zh` fields from `audit.md`, `design-issue-body.md`, `design-issue-reply.md`, `solver-*.md`, and `meta-judge.md`.
+4. Remove dead `comment-monitor.sh` wording from prompt layer.
+5. Normalize role boundaries: reviewer/read-only, solver/plan-only, implement/test-add/fix/write-only-in-worktree, triage/issue-edit-only.
+
+## Totals
+
+- P0: 1
+- P1: 35
+- P2: 24
+
+PROMPTS_AUDIT_DONE:iter57:p0=1:p1=35:p2=24
+⟦AI:AUTO-LOOP⟧

tokens used
97,923
Done: a read-only audit, with a new report added:

[.refactor-loop/runs/prompts-audit-iter57.md](/Users/auric/aevatar/.refactor-loop/runs/prompts-audit-iter57.md)

Check results:
- No files under `.claude/skills/codex-refactor-loop/prompts/` were modified.
- The end of the report contains `PROMPTS_AUDIT_DONE:iter57:p0=1:p1=35:p2=24`
- The end of the report contains the sentinel `⟦AI:AUTO-LOOP⟧` on its own line

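Auto-updates every 10 minutes; edited in place instead of stacking comments; this comment is deleted automatically once codex finishes (per Auric: "just delete it when it's done, otherwise it takes up too much space").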
🤖 controller progress reporter

loning commented May 24, 2026

📊 codex progress implement-iter72-cluster-072-nyxid-stub (⏳ in progress; running for 9 min)

-        "This tool no longer functions. All channel bot management is now handled by the channel_registrations tool.";
-
-    public string ParametersSchema => """{"type":"object","properties":{"action":{"type":"string"}}}""";
-
-    public Task<string> ExecuteAsync(string argumentsJson, CancellationToken ct = default) =>
-        Task.FromResult("""{"error":"nyxid_channel_bots is deprecated and no longer works. Use the channel_registrations tool instead. Example: channel_registrations action=register platform=lark nyx_provider_slug=api-lark-bot"}""");
-}

tokens used
89,294
Completed `cluster-072-nyxid-channel-bots-deprecated-stub`.

Deleted and staged:
`src/Aevatar.AI.ToolProviders.NyxId/Tools/NyxIdChannelBotsDeprecatedStub.cs`

Verification passed:
- `rg -n "NyxIdChannelBotsDeprecatedStub" src test agents` returned no matches
- `dotnet build src/Aevatar.AI.ToolProviders.NyxId/Aevatar.AI.ToolProviders.NyxId.csproj --nologo`
- `dotnet test --nologo`
- `bash /Users/auric/aevatar/tools/ci/test_stability_guards.sh`
- `bash /Users/auric/aevatar/tools/ci/architecture_guards.sh`

Summary written:
`/Users/auric/aevatar/.refactor-loop/runs/implement-iter72-cluster-072-nyxid-stub.md`

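Auto-updates every 10 minutes; edited in place instead of stacking comments; this comment is deleted automatically once codex finishes (per Auric: "just delete it when it's done, otherwise it takes up too much space").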
🤖 controller progress reporter

loning commented May 24, 2026

📊 codex progress implement-iter71-cluster-071-identity-rebuild (⏳ in progress; running for 8 min)

-            "a remote-side revoke/self-heal must overwrite any stale active binding readmodel from the actor's empty state");
+            initialVersion,
+            "empty revoke must not append a projection-only no-op event");
     }
 
     [Fact]

tokens used
249,934
Implemented and staged, no commit.

Key changes:
- Removed identity/OAuth command-path no-op `ProjectionRebuildRequested` event emissions.
- Removed rebuild events from projection activation routing.
- Deleted the active proto event messages and regenerated via build.
- Updated identity tests to assert no version bump for idempotent no-op paths.
- Wrote artifact: `/Users/auric/aevatar/.refactor-loop/runs/implement-iter71-cluster-071-identity-rebuild.md`

Verification passed:
- `dotnet build agents/Aevatar.GAgents.Channel.Identity/Aevatar.GAgents.Channel.Identity.csproj --nologo`
- `dotnet test test/Aevatar.GAgents.ChannelRuntime.Tests/Aevatar.GAgents.ChannelRuntime.Tests.csproj --nologo --filter FullyQualifiedName~Identity`
- `dotnet test test/Aevatar.CQRS.Projection.Core.Tests/Aevatar.CQRS.Projection.Core.Tests.csproj --nologo`
- `bash /Users/auric/aevatar/tools/ci/test_stability_guards.sh`
- `bash /Users/auric/aevatar/tools/ci/architecture_guards.sh`

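Auto-updates every 10 minutes; edited in place instead of stacking comments; this comment is deleted automatically once codex finishes (per Auric: "just delete it when it's done, otherwise it takes up too much space").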
🤖 controller progress reporter

loning commented May 24, 2026

📊 codex progress implement-iter72-cluster-072-studio-legacy (⏳ in progress; running for 7 min)

-            Layout: null,
-            Findings: [],
-            draft.UpdatedAtUtc);
-    }
-    #pragma warning restore CS0618
 }

tokens used
101,907
Implemented and staged the legacy Studio workflow public surface removal.

Changed:
- Removed obsolete `/api/workspace/workflows*` controller actions.
- Removed legacy `WorkspaceService` wrappers.
- Removed legacy DTOs: `WorkflowSummary`, `WorkflowFileResponse`, `SaveWorkflowFileRequest`.
- Kept `workflow-drafts` surface unchanged.

Validation:
- `dotnet build src/Aevatar.Studio.Hosting/Aevatar.Studio.Hosting.csproj --nologo` passed.
- `dotnet test test/Aevatar.Studio.Tests/Aevatar.Studio.Tests.csproj --nologo` passed: 591/591.
- `test_stability_guards.sh` passed.
- `architecture_guards.sh` passed.
- Frontend `tsc` could not run because `apps/aevatar-console-web/node_modules` is missing and dependency install is forbidden.

`git add -A && git status` was run. Three files are staged; no commit/push was performed. Summary artifact written to `/Users/auric/aevatar/.refactor-loop/runs/implement-iter72-cluster-072-studio-legacy.md`.

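Auto-updates every 10 minutes; edited in place instead of stacking comments; this comment is deleted automatically once codex finishes (per Auric: "just delete it when it's done, otherwise it takes up too much space").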
🤖 controller progress reporter
