.NET: Persist ForeachExecutor iteration state across checkpoints#6051
Conversation
There was a problem hiding this comment.
Pull request overview
This PR fixes a .NET declarative workflow bug where ForeachExecutor lost its in-memory iteration cursor across checkpoint/restore (e.g., pausing inside the loop body and resuming in a new process), causing the loop to terminate after the first iteration.
Changes:
- Persist
ForeachExecutoriteration state (_index,_values,HasValue) during checkpointing and rehydrate it on checkpoint restore. - Add regression/unit tests covering checkpoint restore behavior, including empty sources and restoring with no saved state.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| dotnet/src/Microsoft.Agents.AI.Workflows.Declarative/ObjectModel/ForeachExecutor.cs | Adds checkpoint save/restore hooks to persist and restore foreach iteration state. |
| dotnet/tests/Microsoft.Agents.AI.Workflows.Declarative.UnitTests/ObjectModel/ForeachExecutorTest.cs | Adds regression tests and an in-memory workflow context to exercise checkpointing behavior. |
There was a problem hiding this comment.
Automated Code Review
Reviewers: 4 | Confidence: 90%
✓ Correctness
The implementation correctly persists and restores ForeachExecutor iteration state across checkpoints. The accessibility modifiers are correct (protected override of protected internal virtual from a different assembly without InternalsVisibleTo), the base call ordering follows existing patterns, the PortableValue roundtrip logic handles all expected value types including blanks, and the null guard on savedValues correctly handles the first-run/no-checkpoint case. The state keys are scoped to the executor via BindWorkflowContext(executor.Id) in the runtime, avoiding collision risk.
✓ Security Reliability
The checkpoint persistence implementation is well-designed from a security and reliability perspective. State is properly scoped per-executor via BoundWorkflowContext, null-safety is maintained through the savedValues gate, cross-assembly accessibility modifiers are correct, and base class calls are properly ordered. No security vulnerabilities (injection, secrets, unsafe deserialization) or significant reliability concerns were identified.
✓ Test Coverage
The new tests provide good coverage for the checkpoint/restore regression fix (GH-5009). Three test cases cover the core regression scenario (multi-item loop restored mid-iteration), the edge case of restoring with no saved state, and the edge case of an empty values array. The assertions on HasValue are meaningful since they directly verify the broken behavior (premature loop exit). One minor gap: no test verifies that the actual iterated VALUES survive the PortableValue roundtrip correctly — only the cursor position and HasValue flag are checked.
✗ Design Approach
The checkpoint persistence added here fixes the interpreted declarative
ForeachExecutor, but it does not cover the source-generated foreach path that uses the same private cursor fields and the sameHasValue-based routing. That leaves a class of checkpoint/resume scenarios still broken, so I would request changes before merging.
Flagged Issues
- The PR fixes only the interpreted
ObjectModel/ForeachExecutor, but source-generated foreach executors still keep_index,_values, andHasValuein private fields (CodeGen/ForeachTemplate.cs:39-42) and route onHasValueafterTakeNextAsync(Interpreter/WorkflowTemplateVisitor.cs:147-160). Because generated actions inheritExecutorthroughActionExecutor<TMessage>(Kit/ActionExecutor.cs:54-64) and participate in the same checkpoint lifecycle, a source-generated workflow that checkpoints inside a foreach body will still lose its iteration cursor on resume.
Automated review by peibekwe's agents
Motivation and Context
ForeachExecutorkept_index,_values, andHasValuein private fields that the checkpoint pipeline does not persist. When a workflow checkpoints inside a Foreach body (e.g. while a Question awaits input) and resumes in a new process,ExecutorBinding.CreateInstanceAsyncrebuilds the executor with defaults on the nextTakeNextAsync, so the loop exits after iteration 1.Fixes #5009
Description
Override
OnCheckpointingAsyncandOnCheckpointRestoredAsyncto track these values during checkpointing and rehydrate on resume.Contribution Checklist