feat(decider): run compiled wasm deciders in the production runtime#461
Conversation
yordis
commented
Jul 2, 2026
- PR feat(decider): compile deciders to WASM/WIT components #384 made decider purity enforceable by compiling deciders to zero-import WASM components, but those components could only run in the sim and the conformance CLI; production command execution still required a native Rust decider, so the sandboxing guarantees never reached the event store.
- Execution composes through the storage-neutral ports of trogon-decider-runtime instead of the typed Decider trait, because guest session state lives behind a wasmtime store handle and per-command write preconditions come from the module descriptor, shapes the trait cannot express without erasing its error types.
- Zero-import enforcement is structural (an empty component linker) rather than the sim's wasm-tools shell-out, because production hosts cannot depend on a dev-time binary to uphold a safety contract.
- The guest never folds decided events back into session state, so snapshots are captured only after re-evolving them; skipping that would make snapshot-resumed sessions silently diverge from full replay.
Signed-off-by: Yordis Prieto <yordis.prieto@gmail.com>
PR SummaryHigh Risk Overview Loading and routing: Components are compiled with wasmtime, rejected if they declare any import (empty linker), and probed once for Execution: CI: The wasm-decider workflow now runs Reviewed by Cursor Bugbot for commit 6444b37. Bugbot is set up for automated code reviews on this repo. Configure here. |
|
Warning Review limit reached
Next review available in: 35 minutes Enable usage-based reviews in Billing to review now. Otherwise, wait until the next included review is available. How can I continue?After more reviews become available, a review can be triggered using the To avoid repeated limits, reduce automatic review volume by pausing incremental auto-reviews earlier, using label-based review opt-in, excluding WIP or generated PR titles, or requesting reviews manually when the PR is ready. If your team needs uninterrupted high-volume reviews, an organization admin can enable usage-based reviews. How do review limits work?CodeRabbit enforces per-developer PR review limits for each organization. Most developers receive the normal plan review availability. For paid Pro and Pro+ PR reviews, CodeRabbit uses adaptive limits for sustained high-volume activity. When a developer's recent PR review activity reaches the 95th percentile or higher among CodeRabbit users, additional reviews become available more gradually as earlier reviews age out of the rolling window. Please refer docs for additional details. Review details⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (13)
WalkthroughAdds a new ChangesWASM Decider Runtime Crate
Estimated code review effort: 4 (Complex) | ~75 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 6
🧹 Nitpick comments (5)
rsworkspace/crates/trogon-decider-wasm-runtime/src/opaque_snapshot.rs (1)
55-63: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick winUnnecessary
thread_localfor a constant value.
snapshot_type()still.clone()s the cached value on every call (Line 35), so thethread_localonly avoids re-validating a fixed literal — a cost that's already negligible. A plain construction inline (or a process-wideLazyLockif reuse across many calls truly matters) is simpler and avoids the extra#[allow(clippy::unwrap_used)]indirection.♻️ Simplify by dropping the thread_local
impl SnapshotType for OpaqueSnapshotPayload { type Error = std::convert::Infallible; fn snapshot_type() -> Result<SnapshotTypeName, Self::Error> { - Ok(SNAPSHOT_TYPE_NAME.with(|name| name.clone())) + #[allow( + clippy::unwrap_used, + reason = "fixed non-empty ASCII literal without control characters; construction cannot fail" + )] + Ok(SnapshotTypeName::new("wasm-decider-opaque.v1").unwrap()) } } - -thread_local! { - static SNAPSHOT_TYPE_NAME: SnapshotTypeName = { - #[allow( - clippy::unwrap_used, - reason = "fixed non-empty ASCII literal without control characters; construction cannot fail" - )] - SnapshotTypeName::new("wasm-decider-opaque.v1").unwrap() - }; -}🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@rsworkspace/crates/trogon-decider-wasm-runtime/src/opaque_snapshot.rs` around lines 55 - 63, The `SNAPSHOT_TYPE_NAME` thread_local in `opaque_snapshot.rs` is unnecessary because `snapshot_type()` only clones a fixed value and the extra caching adds avoidable `clippy::unwrap_used` suppression. Simplify `snapshot_type()` by constructing `SnapshotTypeName::new("wasm-decider-opaque.v1")` directly where it is used, or switch to a `LazyLock` only if shared reuse is actually needed. Remove the `thread_local!` block and keep the logic centered around `snapshot_type()` and `SnapshotTypeName::new`.rsworkspace/crates/trogon-decider-wasm-runtime/Cargo.toml (1)
1-29: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low valueLGTM overall; minor inconsistency in
chronodependency.Every other dependency uses
{ workspace = true }, butchrono(Line 24) pins a bare version. Consider adding it to the workspace dependency table for consistent version management.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@rsworkspace/crates/trogon-decider-wasm-runtime/Cargo.toml` around lines 1 - 29, The dependency list in the wasm runtime manifest is inconsistent because `chrono` is pinned directly while the other shared crates use workspace-managed versions. Update the `chrono` entry to use the same workspace dependency pattern as the surrounding dependencies, and ensure the workspace dependency table includes `chrono` so version management stays centralized in the manifest for `trogon-decider-wasm-runtime`.rsworkspace/crates/trogon-decider-wasm-runtime/src/module_name.rs (1)
12-83: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick winDuplicate newtype pattern across three identifier types.
ModuleNamehere, along withModuleVersion(rsworkspace/crates/trogon-decider-wasm-runtime/src/module_version.rs) andCommandType, implement virtually identical validation (is_empty/is_controlchecks),FromStr/TryFrom<String>/TryFrom<&str>/Display/Borrow<str>/AsRef<str>impls, and error enums differing only by name/messages. Consider extracting a shared macro or generic validated-string wrapper to reduce duplication while preserving each type's distinct identity.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@rsworkspace/crates/trogon-decider-wasm-runtime/src/module_name.rs` around lines 12 - 83, The `ModuleName` newtype is duplicating the same validated-string pattern used by `ModuleVersion` and `CommandType`, including the `new` validation, `FromStr`/`TryFrom`/`Display`/`Borrow<str>`/`AsRef<str>` impls, and near-identical error enums. Extract the shared logic into a reusable macro or generic wrapper that centralizes the empty/control-character validation and common trait impls, then keep `ModuleName` as a thin type-specific wrapper with its own name and error type.rsworkspace/crates/trogon-decider-wasm-runtime/src/execution.rs (2)
227-231: 🩺 Stability & Availability | 🔵 Trivial | 💤 Low valueSession handle not dropped on error paths.
If
replay_eventsordecidereturnErr(lines 228-229, 294), the function returns early via?without callinghost::drop_session. Since a freshStoreis created perexecute()call (line 206/280) and dropped at function-scope end regardless of the return path, the guest resource table is torn down along with it either way, so this isn't a cross-call leak — but any guest-side resource destructor logic (if the WITsessionresource has one) never runs on these paths. Worth a defensive cleanup (e.g., a small RAII guard or explicit drop-on-error) if resource destructors gain side effects later.Also applies to: 292-296
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@rsworkspace/crates/trogon-decider-wasm-runtime/src/execution.rs` around lines 227 - 231, The session cleanup in execute is skipped when replay_events or decide return early, so the host::drop_session call in execution.rs only runs on the success path. Update the flow around create_session, replay_events, decide, and fold_decided_events to ensure the session is always released, such as by adding a small RAII guard or explicit cleanup on error before propagating the WasmCommandError. Keep the fix localized to the execute/session-handling path so the session handle is dropped consistently regardless of whether replay_events or decide fails.
199-251: 📐 Maintainability & Code Quality | 🔵 Trivial | 🏗️ Heavy liftSignificant duplication between the two
execute()implementations.The
WithoutSnapshotStoreandWithSnapshotStoreexecute()methods (lines 199-251 and 269-330) share nearly identical instantiate/stream-id/precondition/replay/decide/fold/encode/append orchestration, differing only in snapshot read/write. This duplication is exactly how the unnecessaryfold_decided_eventscall above ended up copy-pasted into the no-snapshot path. Consider factoring the shared pipeline (instantiate → stream id → replay → decide → precondition resolution → encode → append) into a shared private helper parameterized over an optional "post-decide, pre-append" hook (fold + snapshot capture), so behavior can't silently diverge between the two builder states again.Also applies to: 253-379
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@rsworkspace/crates/trogon-decider-wasm-runtime/src/execution.rs` around lines 199 - 251, The two `execute()` implementations in `WithoutSnapshotStore` and `WithSnapshotStore` duplicate the same instantiate/stream-id/read/replay/decide/append flow, which caused the stray `fold_decided_events` call to appear in the no-snapshot path. Refactor the shared pipeline in `execute()` into a private helper that handles the common orchestration and accepts an optional hook for snapshot-specific work between `decide` and `append`, so the `WithoutSnapshotStore` and `WithSnapshotStore` variants only supply their differing snapshot behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@rsworkspace/crates/trogon-decider-wasm-runtime/src/engine.rs`:
- Around line 82-88: WasmDeciderEngine::new currently leaks wasmtime::Error from
the constructor instead of using the crate’s typed error pattern. Update the
constructor in engine.rs so it returns a crate-local error type and wraps any
Engine::new failure into that error, matching the other constructor conventions
in this crate; use the WasmDeciderEngine::new and Engine::new symbols to find
the right spot.
In `@rsworkspace/crates/trogon-decider-wasm-runtime/src/execution.rs`:
- Around line 199-251: The `execute` method in `WithoutSnapshotStore`
unconditionally calls `fold_decided_events` after `decide`, even though the
folded state is never consumed before `host::drop_session`; this extra guest
`evolve` replay can waste fuel and trap valid no-snapshot commands. Update
`execute` so the fold step is only performed when a later snapshot actually
needs the session state, and keep the normal append path unchanged by relying on
the existing `create_session`, `decide`, `drop_session`, and `append_stream`
flow.
In `@rsworkspace/crates/trogon-decider-wasm-runtime/src/module.rs`:
- Around line 43-49: The guest name/version validation in the module descriptor
parsing is losing the real failure reason by mapping every `ModuleName::new` and
`ModuleVersion::new` error to
`InvalidDescriptorError::EmptyName`/`EmptyVersion`. Update the
`InvalidDescriptorError` enum in `module.rs` to carry the underlying source
error for these cases, and change the parsing logic that uses `map_err` so it
preserves and propagates the actual `ModuleName`/`ModuleVersion` validation
error instead of hardcoding “empty”.
- Around line 58-64: `WasmDeciderModule` is still storing the guest wire
`ModuleDescriptor`, so the module boundary is not holding a validated domain
object as intended. Update the `WasmDeciderModule` type and its construction
path to keep a typed descriptor or typed command list instead of the raw
descriptor, and make sure `registry.rs` no longer needs to re-parse each command
with `CommandType::new` when accessing module data.
In `@rsworkspace/crates/trogon-decider-wasm-runtime/src/registry.rs`:
- Around line 13-21: `DuplicateCommandType` is storing module identities as raw
`String` instead of the validated `ModuleName` value object. Update the
`DuplicateCommandType` variant in `registry.rs` to use `ModuleName` for both
`existing_module` and `new_module`, and adjust the places that construct this
error (including the routing conflict path around `duplicate` handling) to pass
the typed module names directly rather than calling `.name().to_string()`. Keep
`command_type` as `CommandType` so the error remains fully domain-typed and
consistent with the rest of the crate.
In `@rsworkspace/crates/trogon-decider-wasm-runtime/src/snapshot_id.rs`:
- Around line 16-28: The WasmSnapshotId::new renderer currently concatenates
ModuleName, ModuleVersion, and stream_id with “@” and “/”, which allows
collisions when those characters appear in the inputs. Update
WasmSnapshotId::new in snapshot_id.rs to escape or encode each segment before
building the snapshot id, and keep as_str unchanged for the final rendered
value. If stream_id still needs boundary guarantees, replace the raw &str
parameter with a validated identifier type so unique identifiers remain
unambiguous.
---
Nitpick comments:
In `@rsworkspace/crates/trogon-decider-wasm-runtime/Cargo.toml`:
- Around line 1-29: The dependency list in the wasm runtime manifest is
inconsistent because `chrono` is pinned directly while the other shared crates
use workspace-managed versions. Update the `chrono` entry to use the same
workspace dependency pattern as the surrounding dependencies, and ensure the
workspace dependency table includes `chrono` so version management stays
centralized in the manifest for `trogon-decider-wasm-runtime`.
In `@rsworkspace/crates/trogon-decider-wasm-runtime/src/execution.rs`:
- Around line 227-231: The session cleanup in execute is skipped when
replay_events or decide return early, so the host::drop_session call in
execution.rs only runs on the success path. Update the flow around
create_session, replay_events, decide, and fold_decided_events to ensure the
session is always released, such as by adding a small RAII guard or explicit
cleanup on error before propagating the WasmCommandError. Keep the fix localized
to the execute/session-handling path so the session handle is dropped
consistently regardless of whether replay_events or decide fails.
- Around line 199-251: The two `execute()` implementations in
`WithoutSnapshotStore` and `WithSnapshotStore` duplicate the same
instantiate/stream-id/read/replay/decide/append flow, which caused the stray
`fold_decided_events` call to appear in the no-snapshot path. Refactor the
shared pipeline in `execute()` into a private helper that handles the common
orchestration and accepts an optional hook for snapshot-specific work between
`decide` and `append`, so the `WithoutSnapshotStore` and `WithSnapshotStore`
variants only supply their differing snapshot behavior.
In `@rsworkspace/crates/trogon-decider-wasm-runtime/src/module_name.rs`:
- Around line 12-83: The `ModuleName` newtype is duplicating the same
validated-string pattern used by `ModuleVersion` and `CommandType`, including
the `new` validation, `FromStr`/`TryFrom`/`Display`/`Borrow<str>`/`AsRef<str>`
impls, and near-identical error enums. Extract the shared logic into a reusable
macro or generic wrapper that centralizes the empty/control-character validation
and common trait impls, then keep `ModuleName` as a thin type-specific wrapper
with its own name and error type.
In `@rsworkspace/crates/trogon-decider-wasm-runtime/src/opaque_snapshot.rs`:
- Around line 55-63: The `SNAPSHOT_TYPE_NAME` thread_local in
`opaque_snapshot.rs` is unnecessary because `snapshot_type()` only clones a
fixed value and the extra caching adds avoidable `clippy::unwrap_used`
suppression. Simplify `snapshot_type()` by constructing
`SnapshotTypeName::new("wasm-decider-opaque.v1")` directly where it is used, or
switch to a `LazyLock` only if shared reuse is actually needed. Remove the
`thread_local!` block and keep the logic centered around `snapshot_type()` and
`SnapshotTypeName::new`.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 1aee659b-6d9e-413a-b990-7bef8a72d408
⛔ Files ignored due to path filters (1)
rsworkspace/Cargo.lockis excluded by!**/*.lock
📒 Files selected for processing (27)
.github/workflows/ci-rust.ymlrsworkspace/Cargo.tomlrsworkspace/crates/trogon-decider-wasm-runtime/Cargo.tomlrsworkspace/crates/trogon-decider-wasm-runtime/src/command_type.rsrsworkspace/crates/trogon-decider-wasm-runtime/src/command_type/tests.rsrsworkspace/crates/trogon-decider-wasm-runtime/src/domain_error_detail.rsrsworkspace/crates/trogon-decider-wasm-runtime/src/domain_error_detail/tests.rsrsworkspace/crates/trogon-decider-wasm-runtime/src/engine.rsrsworkspace/crates/trogon-decider-wasm-runtime/src/engine/tests.rsrsworkspace/crates/trogon-decider-wasm-runtime/src/execution.rsrsworkspace/crates/trogon-decider-wasm-runtime/src/execution/tests.rsrsworkspace/crates/trogon-decider-wasm-runtime/src/lib.rsrsworkspace/crates/trogon-decider-wasm-runtime/src/module.rsrsworkspace/crates/trogon-decider-wasm-runtime/src/module/tests.rsrsworkspace/crates/trogon-decider-wasm-runtime/src/module_name.rsrsworkspace/crates/trogon-decider-wasm-runtime/src/module_name/tests.rsrsworkspace/crates/trogon-decider-wasm-runtime/src/module_version.rsrsworkspace/crates/trogon-decider-wasm-runtime/src/module_version/tests.rsrsworkspace/crates/trogon-decider-wasm-runtime/src/opaque_snapshot.rsrsworkspace/crates/trogon-decider-wasm-runtime/src/opaque_snapshot/tests.rsrsworkspace/crates/trogon-decider-wasm-runtime/src/registry.rsrsworkspace/crates/trogon-decider-wasm-runtime/src/registry/tests.rsrsworkspace/crates/trogon-decider-wasm-runtime/src/snapshot_id.rsrsworkspace/crates/trogon-decider-wasm-runtime/src/snapshot_id/tests.rsrsworkspace/crates/trogon-decider-wasm-runtime/src/test_fixture.rsrsworkspace/crates/trogon-decider-wasm-runtime/tests/schedules_execution.rsrsworkspace/crates/trogon-decider-wasm-runtime/tests/support/mod.rs
Code Coverage SummaryDetailsDiff against mainResults for commit: 6444b37 Minimum allowed coverage is ♻️ This comment has been updated with latest results |
Signed-off-by: Yordis Prieto <yordis.prieto@gmail.com>
Signed-off-by: Yordis Prieto <yordis.prieto@gmail.com>
Signed-off-by: Yordis Prieto <yordis.prieto@gmail.com>
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes using default effort and found 1 potential issue.
❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, have a team admin enable autofix in the Cursor dashboard.
Reviewed by Cursor Bugbot for commit cec9714. Configure here.
Signed-off-by: Yordis Prieto <yordis.prieto@gmail.com>
