2.3.0 rc #5114
NathanFlurry commented on May 28, 2026
- fix(rivetkit): exit pid1 after signal shutdown
- fix(rivetkit): use engine actor stop threshold for shutdown
- test(depot-client): stale vfs cache reads fail closed
- test(depot-client): head fence read poisons vfs
- test(depot-client): vfs stale page cache writer
- test(depot-client): delayed read ahead stale pages
- test(depot-client): startup preload stale pages
- test(rivetkit-core): sqlite lifecycle fuzz harness
- chore(kitchen-sink): agent load test
- test(depot-client): batch atomic cap repro
- test(depot-client): warm pidx stale read rmw repro
- test(depot-client): natural warm pidx repro
- test(depot-client): natural reopen warm pidx repro
- [SLOP(claude-opus-4-7)] feat(envoy-client): add observability metrics for ws transport and sqlite request lifecycle
- [SLOP(claude-opus-4-7)] fix(envoy-client): emit Stopped(Error) on lost-timeout to prevent silent destroy
- Fix actor lost on envoy-client
- DO NOT MERGE: serverless restart race condition
- fix(rivetkit): use engine actor stop threshold for shutdown
- test(kitchen-sink): sigterm sleep probe fixtures
- feat(kitchen-sink): rust counter-latency harness
- chore(kitchen-sink): refresh bench + smoke scripts
- chore(kitchen-sink): counter actor + sigterm probe tweaks
- chore(envoy-client): trace websocket backpressure
- feat(envoy-client): add EnvoyStatusHandle wrapper
- feat(rivetkit-core): wire EnvoyStatusHandle into dispatcher
- feat(rivetkit-core): expose envoy status through /metrics
- feat(rivetkit-napi): expose actorStopThresholdMs + envoy-aware health/metrics
- feat(rivetkit-core): record connection close reason + lifetime metrics
- feat(kitchen-sink): ws-ping fast-path on tunnel-stress + load-test-agent
- Add debugging
- fix(pegboard): add actor-scoped generation key for sqlite fencing
- Revert "fix(pegboard): add actor-scoped generation key for sqlite fencing"
- Cargo fmt
- Fix actor generation validation for sqlite
- [SLOP(claude-sonnet-4-5)] feat(metrics): add envoy lifecycle, stop reason, ws traffic, and js runtime metrics
- [SLOP(claude-sonnet-4-5)] chore(logs): promote actor stop logs to info
- [SLOP(claude-sonnet-4-5)] chore(logs): improve actor stop and envoy ping diagnostics
- Remove slop
- chore(kitchen-sink): add rivet cloud deploy workflow
- [SLOP(gpt-5)] fix(rivetkit): reject comma-joined serverless endpoint header
- [SLOP(gpt-5)] fix(rivetkit): disable cached serverless envoy by default
- [SLOP(gpt-5)] fix(rivetkit): warn on cached serverless envoy regional mismatch
- [SLOP(gpt-5)] docs(rivetkit): record performance audit notes
- [SLOP(gpt-5)] test(envoy-client): update SharedContext fixtures for websocket diagnostics
- [RIVETER(rivetkit-perf-fixes-4lv24k3r,#1,gpt-5.5)] chore: perf(envoy-client): convert StdMutex<HashMap> SharedContext fields to scc
- chore(kitchen-sink): update deployment diagnostics wiring
- [RIVETER(rivetkit-perf-fixes-4lv24k3r,#2,gpt-5.5)] chore: perf(envoy-client): replace ws_tx tokio Mutex with ArcSwapOption on hot path
- [RIVETER(rivetkit-perf-fixes-4lv24k3r,#3,gpt-5.5)] chore: perf(envoy-client): replace BufferMap String keys with u64/[u8;8]
- [RIVETER(rivetkit-perf-fixes-4lv24k3r,#4,gpt-5.5)] chore: perf(rivetkit-core): sample record_inbox_depths instead of every loop iteration
- [RIVETER(rivetkit-perf-fixes-4lv24k3r,#5,gpt-5.5)] chore: fix(rivetkit): repair setInterval missing-delay bug in actor-conn keepalive
- perf(rivetkit-core): tighten queue_metadata lock around enqueue
- perf(rivetkit-core, envoy-client): convert scc sync methods to async in async contexts
- perf(envoy-client, guard): enable TCP_NODELAY by default + expose ws_tx_depth metric
- Add gradual shutdown for load test
- Fix actor stopped restart
chore(envoy-client): init new backpressure tracking fields
🚅 Deployed to the rivet-pr-5114 environment in rivet-frontend
Code Review: PR #5114 - 2.3.0 rc
Large release candidate (128 files, +32k/-945) spanning observability, performance, correctness, and testing.
Overview
Issues
1. "DO NOT MERGE" commit in PR body
The PR description includes "DO NOT MERGE: serverless restart race condition" as one of the commits. Please confirm this was addressed or reverted before merging.
2. actor.rs - Lost-timeout sends SleepIntent; metrics will misattribute
The behavioral fix is correct: sending …
3. conn.rs - actor_started_at and actor_stop_meta grow unbounded
4. vfs.rs - tracing::debug promoted to tracing::info on hot VFS paths
The …
5. Duplicate stop_reason_label in actor_lifecycle.rs and actor.rs
Both files define an identical …
Non-blocking Observations
- WsTxMessage::Send struct expansion (context.rs): The …
- is_ping_healthy refactor (handle.rs): Clean refactoring to …
- PID1 finishShutdownSignal fix (registry/index.ts): …
- actorStopThresholdMs for shutdown grace (registry/index.ts): Grace period shifts from hardcoded 30s to the engine-provided …
- serverless_cache_envoy: false (registry/envoy_callbacks.rs): Correct default. The current protocol cannot authenticate per-request envoy reuse. The opt-in regional-mismatch warning is a good addition.
- BufferMap key String to [u8; 8] (utils.rs): Cleaner and faster than the cyrb53 hash string.
- scc::HashMap migration in SharedContext (context.rs, envoy.rs): Correct per project conventions. The …
- Preload limit increases (optimization_flags.rs): MAX_STARTUP_PRELOAD_MAX_BYTES 8 MB to 64 MB and MAX_STARTUP_PRELOAD_FIRST_PAGE_COUNT 256 to 16,384 are caps on user-configured values, not defaults. A 64 MB ceiling could still cause startup memory pressure; worth documenting the expected envelope.
Minor / Style
Summary: The correctness and performance work is solid. Three items need attention before merge: (1) audit the "DO NOT MERGE" commit, (2) misleading metric label for lost-timeout actors appearing as …
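On the preload limit observation: the distinction between a cap and a default can be illustrated with a minimal sketch. Only the two ceiling constants and their new values come from the review; the default values, the PreloadLimits struct, and the resolve_preload_limits function are hypothetical names introduced for illustration and are not taken from optimization_flags.rs.

```rust
// Ceilings quoted in the review (new values after this PR).
const MAX_STARTUP_PRELOAD_MAX_BYTES: usize = 64 * 1024 * 1024; // 64 MB
const MAX_STARTUP_PRELOAD_FIRST_PAGE_COUNT: usize = 16_384;

// Hypothetical defaults, chosen only for this sketch.
const DEFAULT_PRELOAD_MAX_BYTES: usize = 1024 * 1024;
const DEFAULT_PRELOAD_FIRST_PAGE_COUNT: usize = 64;

// Hypothetical container for the resolved limits.
#[derive(Debug, PartialEq)]
struct PreloadLimits {
    max_bytes: usize,
    first_page_count: usize,
}

// Resolve user-configured values: fall back to the default when unset,
// then clamp to the ceiling. Raising a ceiling changes nothing for users
// who never set a value; it only widens what an explicit setting may request.
fn resolve_preload_limits(
    user_max_bytes: Option<usize>,
    user_first_page_count: Option<usize>,
) -> PreloadLimits {
    PreloadLimits {
        max_bytes: user_max_bytes
            .unwrap_or(DEFAULT_PRELOAD_MAX_BYTES)
            .min(MAX_STARTUP_PRELOAD_MAX_BYTES),
        first_page_count: user_first_page_count
            .unwrap_or(DEFAULT_PRELOAD_FIRST_PAGE_COUNT)
            .min(MAX_STARTUP_PRELOAD_FIRST_PAGE_COUNT),
    }
}
```

Under these assumptions, resolve_preload_limits(None, None) returns the defaults unchanged, while an explicit request above a ceiling is clamped down to it. An actor configured at the ceiling may still preload up to 64 MB at startup, which is the memory-pressure case the review suggests documenting.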