test(drive-abci): assert check_tx never mutates committed grovedb state#3844
Conversation
CheckTx fee estimation runs validate_fees_of_event with transaction: None, so any eager GroveDB write on that path (e.g. inside a drive-op converter) commits straight to disk, diverges the on-disk root from the signed app hash and deterministically halts the chain — exactly what happened on devnet paloma at height 788 via the pre-#3823 shielded InsertNullifiers converter arm. #3823 removed the bug; these regression tests guard the class: - new test helper assert_committed_root_hash_unchanged that asserts the committed (transaction: None) root hash is byte-identical around a CheckTx-path call - targeted test driving validate_fees_of_event for a synthetic type-20 PaidFromShieldedPoolToNewIdentity event whose ops carry InsertNullifiers + InsertNote + UpdateTotalBalance (the exact shape that halted paloma, no Halo2 proving needed) - broad-sweep tests wrapping full check_tx (FirstTimeCheck + Recheck) for data contract create (Paid arm) and identity top up (PaidFromAssetLock arm) Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
|
✅ Review complete (commit 49f3993) |
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (18)
📝 WalkthroughWalkthroughAdds a test helper and test-only instrumentation plus multiple regression tests and test-guards asserting that CheckTx/fee-estimation paths do not mutate the committed GroveDB root hash across various transition flows. ChangesCheckTx Fee Estimation State Guard
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Warning Review ran into problems🔥 ProblemsGit: Failed to clone repository. Please run the Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
… test Fee estimation converts the AddNewIdentity op, which parses the key bytes — the dummy zeroed key the validate_state tests use fails with "unable to create pub key". Build the synthetic transition from a real random ECDSA master key instead. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## v3.1-dev #3844 +/- ##
=============================================
- Coverage 87.04% 70.73% -16.32%
=============================================
Files 2677 20 -2657
Lines 329918 2788 -327130
=============================================
- Hits 287182 1972 -285210
+ Misses 42736 816 -41920
🚀 New features to boost your workflow:
|
…tate transition types Makes the paloma h788 guard airtight: - cfg(test) auto-guards in the Platform::check_tx and validate_fees_of_event (transaction: None) dispatchers assert the committed grovedb root hash is byte-identical around every call, so every present and future test exercising any transition through CheckTx enforces the invariant automatically (zero production cost) - new assert_check_tx_valid_at_all_levels test helper runs FirstTimeCheck + Recheck and asserts validity so fee estimation genuinely executes for the canonical valid fixture of each type - wired the helper into the canonical success tests of every state transition type not already covered by existing check_tx tests: identity credit transfer/withdrawal, masternode vote (via the shared perform_vote helper, covering all vote tests), the six address-funded types, and the five shielded types (whose fixtures carry real Halo2 proofs, so full check_tx passes proof validation and reaches estimation) - execution::check_tx module raised to pub(crate) so the test helper can name CheckTxLevel Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
thepastaclaw
left a comment
There was a problem hiding this comment.
Code Review
Test-only PR adding a state_mutation_guard helper plus three regression tests that pin the CheckTx committed-state invariant (paloma height-788 halt class). The shipped helper and three tests are correct, but the broad invariant promised by the PR description and the in-file broad-sweep guard doc comment is not wired in: the helper is never installed inside Platform::check_tx or validate_fees_of_event under #[cfg(test)], and only three of the 21 state-transition variants are explicitly wrapped. Suggestions only, no consensus or production-code defects.
🟡 2 suggestion(s) | 💬 2 nitpick(s)
1 additional finding(s) omitted (not in diff).
🤖 Prompt for all review comments with AI agents
These findings are from an automated code review. Verify each finding against the current code and only fix it if needed.
In `packages/rs-drive-abci/src/execution/check_tx/v0/mod.rs`:
- [SUGGESTION] packages/rs-drive-abci/src/execution/check_tx/v0/mod.rs:157-184: CheckTx mutation guard is opt-in instead of wired into the dispatcher
`check_tx_v0` calls `state_transition_to_execution_event_for_check_tx` and `validate_fees_of_event(..., None, ...)` directly with no `#[cfg(test)]` invariant wrapper. The new `assert_committed_root_hash_unchanged` helper is invoked only at three call sites (this file lines 3909/3931/4079/4101 and `identity_create_from_shielded_pool/tests.rs:581`); every other existing or future check_tx / fee-estimation test (e.g. lines 2417, 2543, 2592, 2693, 2742, 2979, 3028, 3224, 3246, 3282, 3333, 4044) continues to invoke these entry points unwrapped.
The PR description and the doc comment at line 3824 frame this as a broad-sweep guard that protects any future eager-write regression suite-wide; in practice the invariant is only enforced where a contributor remembers to wrap the call. Wrap `Platform::check_tx` and `validate_fees_of_event` (when `transaction.is_none()`) in `assert_committed_root_hash_unchanged` under `#[cfg(test)]`, so the invariant is structural rather than per-test convention.
- [SUGGESTION] packages/rs-drive-abci/src/execution/check_tx/v0/mod.rs:3824-3837: Doc comment and PR description overstate the actual variant coverage
The doc comment calls this 'the broad-sweep guard for the 2026-06-10 devnet paloma chain halt' and the PR body lists explicit coverage across data contract create + identity top up + masternode vote + all six address-funded variants + all five shielded variants via an `assert_check_tx_valid_at_all_levels` helper that retrofits canonical valid-fixture tests. The shipped diff includes only three guard sites: this test, `check_tx_does_not_mutate_committed_state_identity_top_up` immediately below, and `check_tx_fee_estimation_does_not_mutate_committed_state` in `identity_create_from_shielded_pool/tests.rs`. The other address-funded fee-estimation arms (`PaidFromAddressInputs`, `PaidFromAssetLockToPool`), the remaining four shielded transition types, masternode vote, identity credit transfer/withdrawal, and the `assert_check_tx_valid_at_all_levels` helper are absent.
Either land the missing retrofits (preferably alongside the dispatcher guard from the previous finding) or trim the doc comment and PR body so future investigators do not over-trust a safety net that only covers three transitions.
| /// CheckTx must NEVER mutate committed GroveDB state — the broad-sweep guard for the | ||
| /// 2026-06-10 devnet paloma chain halt (height 788). | ||
| /// | ||
| /// CheckTx fee estimation runs `validate_fees_of_event(..., transaction: None, ...)` (see | ||
| /// `check_tx_v0` above), so an eager GroveDB write anywhere on the CheckTx path (e.g. inside | ||
| /// a drive-op converter, as the pre-#3823 shielded `InsertNullifiers` arm did with | ||
| /// `store_nullifiers_for_block`) commits straight to disk on every node that validates the | ||
| /// gossiped transition. The on-disk root then diverges from the signed app hash and every | ||
| /// proposer panics with "drive and platform state app hash mismatch" — a chain halt that | ||
| /// restarts cannot heal because the write is durable. This test covers the identity-paid | ||
| /// (`ExecutionEvent::Paid`) estimation arm; the asset-lock arm is covered by | ||
| /// `check_tx_does_not_mutate_committed_state_identity_top_up` below, and the exact type-20 | ||
| /// shape that halted paloma by `check_tx_fee_estimation_does_not_mutate_committed_state` in | ||
| /// `identity_create_from_shielded_pool/tests.rs`. |
There was a problem hiding this comment.
🟡 Suggestion: Doc comment and PR description overstate the actual variant coverage
The doc comment calls this 'the broad-sweep guard for the 2026-06-10 devnet paloma chain halt' and the PR body lists explicit coverage across data contract create + identity top up + masternode vote + all six address-funded variants + all five shielded variants via an assert_check_tx_valid_at_all_levels helper that retrofits canonical valid-fixture tests. The shipped diff includes only three guard sites: this test, check_tx_does_not_mutate_committed_state_identity_top_up immediately below, and check_tx_fee_estimation_does_not_mutate_committed_state in identity_create_from_shielded_pool/tests.rs. The other address-funded fee-estimation arms (PaidFromAddressInputs, PaidFromAssetLockToPool), the remaining four shielded transition types, masternode vote, identity credit transfer/withdrawal, and the assert_check_tx_valid_at_all_levels helper are absent.
Either land the missing retrofits (preferably alongside the dispatcher guard from the previous finding) or trim the doc comment and PR body so future investigators do not over-trust a safety net that only covers three transitions.
source: ['claude']
| assert!( | ||
| fee_result.data.is_some(), | ||
| "estimation must produce a fee result" | ||
| ); |
There was a problem hiding this comment.
💬 Nitpick: Assert fee_result.is_valid() rather than just data.is_some()
validate_fees_of_event returns a ConsensusValidationResult; the surrounding .expect("fee estimation must not error") only catches Err. A future change that turns the type-20 estimation path into a short-circuit producing a populated data with consensus errors would still satisfy fee_result.data.is_some() — yet the comment above promises a check that estimation 'actually runs every arm of the shielded low-level converter'. Stronger: assert fee_result.is_valid() (or fee_result.errors.is_empty()) so a silent weakening of the estimation path is caught here instead of slipping past.
source: ['claude']
| let recheck_result = assert_committed_root_hash_unchanged( | ||
| &platform.drive, | ||
| platform_version, | ||
| "check_tx Recheck (data contract create)", | ||
| || { | ||
| platform.check_tx( | ||
| serialized.as_slice(), | ||
| Recheck, | ||
| &platform_ref, | ||
| platform_version, | ||
| ) | ||
| }, | ||
| ) | ||
| .expect("expected to check tx"); | ||
| assert!(recheck_result.is_valid()); |
There was a problem hiding this comment.
💬 Nitpick: Recheck is exercised without an intervening block commit
Both check_tx_does_not_mutate_committed_state_data_contract_create (lines 3931-3945) and check_tx_does_not_mutate_committed_state_identity_top_up (lines 4101-4115) run the same serialized transition through FirstTimeCheck and then immediately through Recheck without a block commit between them. In production, Recheck only runs after a block commits — re-validating mempool transitions against the new state — so this sequence does not match how Recheck is actually reached. The root-hash invariant still holds, but the recheck arm here exercises an unrealistic state shape. Either drop the Recheck half (the FirstTimeCheck half already pins the invariant for these transition types) or commit a block between the two check_tx calls so the recheck path is realistic.
source: ['claude']
Issue being fixed or feature implemented
On 2026-06-10 devnet paloma chain-halted at height 788. Root cause: in pre-#3823 builds, the
ShieldedPoolOperationType::InsertNullifierslow-level converter arm eagerly calleddrive.store_nullifiers_for_block(...)— a direct write — with whateverTransactionArgit was handed. CheckTx fee estimation callsvalidate_fees_of_event(..., transaction: None, ...)(check_tx_v0), so the eager write committed DIRECTLY to GroveDB on every node that validated the gossiped type-20 transition. The on-disk root diverged from the signed app hash, every proposer panicked with "drive and platform state app hash mismatch" (prepare_proposal.rs), and restarts panic forever ininfo.rsbecause the write is durable.#3823 removed that subsystem (the converter is now pure), but nothing guarded the CLASS: anyone reintroducing an eager write inside an op converter (or anywhere on the CheckTx path) would re-create the halt. This PR makes that class of bug unable to slip through tests for ANY state transition type.
What was done?
Test-only change in
rs-drive-abci(the only production-code touch is acfg(test)block in two dispatchers and onepub(crate)visibility bump):1. Automatic invariant enforcement (
#[cfg(test)], zero production cost):Platform::check_txasserts the committed (transaction: None, on-disk) GroveDB root hash is byte-identical around every call.validate_fees_of_eventasserts the same whenever called withtransaction: None(the CheckTx estimation mode).2. Shared guard helpers (
test/helpers/state_mutation_guard.rs):assert_committed_root_hash_unchanged(drive, version, context, closure)— explicit wrapper used by the dedicated regression tests.assert_check_tx_valid_at_all_levels(platform, serialized, context)— runsFirstTimeCheck+Recheckand asserts BOTH return valid, so fee estimation genuinely executes (an early consensus rejection would leave the type's drive-op converters unexercised in estimation mode).3. Coverage of all 21
StateTransitionvariants:check_tx_fee_estimation_does_not_mutate_committed_statebuilds a syntheticIdentityCreateFromShieldedPoolTransitionAction(no Halo2 proving), asserts its event ops carryInsertNullifiers+InsertNote+UpdateTotalBalance, and drivesvalidate_fees_of_event(..., None, ...)under the guard.check_tx/v0/mod.rs).assert_check_tx_valid_at_all_levelswired into the canonical valid-fixture test of every type not already exercised through check_tx: identity credit transfer, identity credit withdrawal, masternode vote (inside the sharedperform_votehelper, so every vote test inherits it), all six address-funded types (IdentityCreditTransferToAddresses,IdentityCreateFromAddresses,IdentityTopUpFromAddresses,AddressFundsTransfer,AddressFundingFromAssetLock,AddressCreditWithdrawal), and all five shielded types (Shield,Unshield,ShieldedTransfer,ShieldedWithdrawal,ShieldFromAssetLock) — the shielded fixtures carry real Halo2 proofs, so full check_tx passes proof validation and reaches estimation.This covers all five fee-estimation arms that touch GroveDB (
Paid,PaidFromAssetLock,PaidFromAddressInputs,PaidFromAssetLockToPool,PaidFromShieldedPoolToNewIdentity) and every drive-op converter family reachable from CheckTx.How Has This Been Tested?
cargo test -p drive-abci --lib— full suite with the automatic guards firing in every check_tx/estimation callInsertNullifiersconverter arm makes the targeted test fail with the root-hash mismatch assertion, confirming the guard catches the paloma bug class (change reverted)cargo fmt --allandcargo check --workspace --all-targetsBreaking Changes
None — test-only behavior; the dispatcher guards compile only under
cfg(test).Checklist:
For repository code-owners and collaborators only
🤖 Generated with Claude Code