fix(dpp): return error instead of panicking on storage-fee refund div-by-zero #3799
QuantumExplorer wants to merge 1 commit
…-by-zero
`original_removed_credits_multiplier_from` computed `dec!(1) / ratio_used`. `FEE_DISTRIBUTION_TABLE` has exactly `PERPETUAL_STORAGE_ERAS` (50) entries, so once a refund's original storage epoch is at least that whole window behind the repayment epoch (`current_era >= 50`), every table era is `Less`, the iterator yields nothing, and `ratio_used` sums to zero.
`rust_decimal::Decimal`'s `/` operator panics on a zero divisor (unlike `f64` division, which yields infinity or NaN, and unlike its own `checked_div`, which returns `None`). On the consensus path (`process_block_fees_and_validate_sum_trees` at epoch change) this would abort every node simultaneously and halt the chain. Not reachable today, but a guaranteed liveness failure as the chain ages.
The function now returns `Result<Decimal, ProtocolError>`: it returns `ProtocolError::DivideByZero` when the window is fully elapsed and guards the `paid_epochs` subtraction with `checked_sub` (`ProtocolError::Overflow`) so a repayment epoch before the original storage epoch can't underflow either. The sole production caller (`restore_original_removed_credits_amount`) already returns a `Result` and propagates with `?`. Test call sites updated to `.expect(...)`; adds tests for both error paths.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
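To make the failure mode concrete, here is a minimal, std-only sketch of how an exhausted table leads to a zero divisor. It is an illustration under stated assumptions, not the crate's code: `TABLE_PPM` is a flat stand-in (50 equal shares in parts per million) rather than the real `FEE_DISTRIBUTION_TABLE`, `u64::checked_div` stands in for `Decimal::checked_div`, and the names `TABLE_PPM`, `ratio_used_ppm`, and `multiplier_ppm` are hypothetical.

```rust
/// Number of eras in the stand-in table (mirrors the 50 entries mentioned above).
const ERAS: usize = 50;

/// ASSUMPTION: flat per-era shares in parts per million. The real
/// `FEE_DISTRIBUTION_TABLE` holds different, non-uniform values.
const TABLE_PPM: [u64; ERAS] = [20_000; ERAS];

/// Sums the shares of every era at or after `current_era`.
/// Once `current_era >= ERAS`, the filter rejects every entry and the sum is 0.
fn ratio_used_ppm(current_era: usize) -> u64 {
    TABLE_PPM
        .iter()
        .enumerate()
        .filter(|(era, _)| *era >= current_era)
        .map(|(_, share)| *share)
        .sum()
}

/// Multiplier in parts per million, or `None` when the divisor is zero.
/// A plain `/` here would panic on a zero divisor instead.
fn multiplier_ppm(current_era: usize) -> Option<u64> {
    const PPM: u64 = 1_000_000;
    (PPM * PPM).checked_div(ratio_used_ppm(current_era))
}
```

With this stand-in, `multiplier_ppm(50)` is `None`, which is the case the checked form turns into a recoverable value rather than a panic.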
✅ Review complete (commit 34fe261)
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
packages/rs-dpp/src/fee/epoch/distribution.rs (1)
225-229: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win
Fix unchecked `u16` subtraction overflow in `refund_storage_fee_to_epochs_map`
`EpochIndex` is `u16`, and `refund_storage_fee_to_epochs_map` computes `start_era` via `skip_until_epoch_index - start_epoch_index` before reaching `restore_original_removed_credits_amount(...)`, so when `current_epoch_index + 1 < start_epoch_index` this can overflow (panic in debug / wrap in release) instead of returning a clean error. Add a checked subtraction for `start_era` and add a caller-level regression test through `subtract_refunds_from_epoch_credits_collection` for the invalid epoch ordering case.
🛠️ Suggested fix
- let start_era: u16 = (skip_until_epoch_index - start_epoch_index) / epochs_per_era;
+ let paid_epochs = skip_until_epoch_index
+     .checked_sub(start_epoch_index)
+     .ok_or(ProtocolError::Overflow(
+         "start repayment epoch is before the original storage epoch",
+     ))?;
+ let start_era: u16 = paid_epochs / epochs_per_era;
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/rs-dpp/src/fee/epoch/distribution.rs` around lines 225 - 229, refund_storage_fee_to_epochs_map performs subtraction on EpochIndex (u16) that can underflow when skip_until_epoch_index < start_epoch_index; change the subtraction that computes start_era to use a checked subtraction (e.g., checked_sub) and return the existing error type (or convert to the appropriate Err) when it returns None, so the function returns a clean error instead of panicking/wrapping; update the call sites if needed (e.g., where restore_original_removed_credits_amount / original_removed_credits_multiplier_from are used) to propagate the error, and add a caller-level regression test named subtract_refunds_from_epoch_credits_collection that supplies an invalid epoch ordering (current_epoch_index + 1 < start_epoch_index) and asserts the function returns the expected error.
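The checked-subtraction idea in this comment can be tried in isolation. The following is a small, std-only sketch, assuming a hypothetical `EraError` type in place of `ProtocolError` and a free function `start_era` that is not part of the crate; it also guards the division, which the suggested fix does not do.

```rust
/// Hypothetical error type standing in for `ProtocolError`.
#[derive(Debug, PartialEq)]
struct EraError(&'static str);

/// Computes the starting era without any panicking arithmetic.
/// Both the subtraction and the division return an error instead of panicking.
fn start_era(
    skip_until_epoch_index: u16,
    start_epoch_index: u16,
    epochs_per_era: u16,
) -> Result<u16, EraError> {
    // `checked_sub` yields `None` when the repayment epoch precedes the start epoch.
    let paid_epochs = skip_until_epoch_index
        .checked_sub(start_epoch_index)
        .ok_or(EraError(
            "start repayment epoch is before the original storage epoch",
        ))?;
    // `checked_div` yields `None` for a zero divisor (an extra guard in this sketch).
    paid_epochs
        .checked_div(epochs_per_era)
        .ok_or(EraError("epochs_per_era is zero"))
}
```

For comparison, the wrapping behaviour that an unchecked subtraction can show in a release build is visible through `5u16.wrapping_sub(10)`, which evaluates to `65531`.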
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@            Coverage Diff            @@
##           v3.1-dev    #3799   +/-   ##
==========================================
  Coverage     87.18%   87.19%
==========================================
  Files          2624     2624
  Lines        321014   321228   +214
==========================================
+ Hits         279892   280085   +193
- Misses        41122    41143    +21
thepastaclaw left a comment
Code Review
Targeted, well-scoped fix that converts a latent Decimal divide-by-zero panic in original_removed_credits_multiplier_from into a propagable ProtocolError::DivideByZero. The core fix is sound and consensus-safe. One suggestion: the companion defensive checked_sub for paid_epochs is bypassed by an unchecked subtraction one frame up in refund_storage_fee_to_epochs_map (line 311), so the underflow guard is unreachable on the production path. The deeper semantic question of whether aged-out refunds should be no-ops rather than errors is acknowledged as out-of-scope by the PR description.
🟡 1 suggestion(s)
1 additional finding(s) omitted (not in diff).
🤖 Prompt for all review comments with AI agents
These findings are from an automated code review. Verify each finding against the current code and only fix it if needed.
In `packages/rs-dpp/src/fee/epoch/distribution.rs`:
- [SUGGESTION] packages/rs-dpp/src/fee/epoch/distribution.rs:311: New `checked_sub` underflow guard is bypassed by an earlier unchecked subtraction on the same call path
The PR adds a `checked_sub` in `original_removed_credits_multiplier_from` (lines 174-178) explicitly to ensure corrupted/unexpected inputs return `ProtocolError::Overflow` rather than panicking. The sole production caller, `refund_storage_fee_to_epochs_map`, performs the same subtraction one frame up at line 311: `let start_era: u16 = (skip_until_epoch_index - start_epoch_index) / epochs_per_era;`. This runs before `restore_original_removed_credits_amount` (line 317) and therefore before the new guard ever sees the inputs. If `skip_until_epoch_index < start_epoch_index` ever occurred, line 311 would panic (debug) or wrap (release) first, so the guard is unreachable for the underflow case it's meant to defend. Either guard line 311 with `checked_sub` for a consistent defensive posture, or drop the underflow guard (and its test) since the invariant is structurally enforced. The divide-by-zero portion of the fix is unaffected by this and remains correct.
✅ DashSDKFFI.xcframework built for this PR.
SwiftPM (host the zip at a stable URL, then use):
.binaryTarget(
    name: "DashSDKFFI",
    url: "https://your.cdn.example/DashSDKFFI.xcframework.zip",
    checksum: "6986038bb0917700a4002b1c55a04e4d9ab8834e1fe2a17f5daf69153e2a452d"
)
Xcode manual integration:
Important
This PR removes the panic (good: a panic risks unwinding-across-FFI/abort and is caught only by the node's panic hook), but it does not restore liveness. Returning `Err` from `original_removed_credits_multiplier_from` propagates up the consensus path:
`restore_original_removed_credits_amount` → `refund_storage_fee_to_epochs_map` → `add_distribute_storage_fee_to_epochs_operations` → `process_block_fees_and_validate_sum_trees` → `run_block_proposal_v0` (the `?` returns the outer `Err`; it is never folded into `ValidationResult.errors`) → `process_proposal` (the `?` short-circuits before the graceful `if !run_result.is_valid()` reject path) → `ResponseException`.
Per the ABCI spec, CometBFT/Tenderdash panics on a `ResponseException`. So every validator still halts deterministically at the same block; the halt just moves from drive-abci to Tenderdash.
Proper fix (deferred, needs a fee-model semantics decision): for a fully-amortized refund (`current_era >= PERPETUAL_STORAGE_ERAS`, i.e. the storage cost has been entirely distributed over the full 50-era window), return a multiplier of `1` (refund the amount unchanged) so the block processes and the chain continues, instead of erroring. The near-boundary multiplier explosion (`current_era == 49` → ~32000×) should likely be clamped too.
Also missed: the same three functions contain five more unguarded divide-by-zeros if `epochs_per_era == 0` (`paid_epochs / epochs_per_era`, `paid_epochs % epochs_per_era`, and three `Decimal` divisions in `original_removed_credits_multiplier_from` / `refund_storage_fee_to_epochs_map` / `distribution_storage_fee_to_epochs_map`). `epochs_per_era` is a plain `u16` deserialized from config with no zero-check (no `NonZeroU16`), so a misconfigured node panics. These should be guarded (or `epochs_per_era` made a validated `NonZeroU16`).
The `checked_sub` / `is_zero()` changes here are still worth keeping (they remove the unguarded panic and the underflow), but this PR should be treated as a mitigation, not the liveness fix. The `is_zero()` check is correct (the empty table sum is exactly `Decimal::ZERO`, not a denormal).
Issue being fixed or feature implemented
Latent chain-halt panic in storage-fee refund accounting.
`original_removed_credits_multiplier_from` (`fee/epoch/distribution.rs`) computed `dec!(1) / ratio_used`. `FEE_DISTRIBUTION_TABLE` has exactly `PERPETUAL_STORAGE_ERAS` (50) entries, so once a refund's original storage epoch is at least that whole window behind the repayment epoch (`current_era >= 50`), every table era compares `Ordering::Less`, the iterator yields nothing, and `ratio_used` sums to zero.
`rust_decimal::Decimal`'s `/` operator panics on a zero divisor, unlike `f64` division (which yields infinity or NaN) and unlike its own `checked_div` (which returns `None`). This call runs on the consensus path (`process_block_fees_and_validate_sum_trees` at epoch change, via `restore_original_removed_credits_amount` → `refund_storage_fee_to_epochs_map`). A panic there aborts every node simultaneously → chain halt. Not reachable today (it needs the chain to have run ~50 eras with a document surviving the full perpetual-storage window), but a guaranteed liveness failure as the chain ages, with no graceful error path.
What was done?
`original_removed_credits_multiplier_from` now returns `Result<Decimal, ProtocolError>`:
- It returns `ProtocolError::DivideByZero` when `ratio_used == 0` (refund older than the entire perpetual-storage window) instead of panicking.
- It guards the `paid_epochs = start_repayment - start` subtraction with `checked_sub` → `ProtocolError::Overflow`, so a repayment epoch before the original storage epoch can't underflow (panic in debug / wrap in release) either.
The sole production caller `restore_original_removed_credits_amount` already returns a `Result` and now propagates with `?`. Test call sites updated to `.expect(...)`.
How Has This Been Tested?
`cargo test -p dpp --lib distribution`: 455 passed, 0 failed, including two new tests:
- `should_return_error_instead_of_panicking_when_window_fully_elapsed`: asserts `original_removed_credits_multiplier_from(0, 1000, 20)` (`current_era = 50`) returns `Err(DivideByZero)` and that `restore_original_removed_credits_amount` propagates it rather than panicking.
- `should_return_error_when_repayment_epoch_precedes_start`: asserts the underflow guard returns `Err(Overflow)`.
`cargo fmt --all` clean.
Breaking Changes
None. Internal fee-distribution helper signature only; behavior is unchanged for all in-window refunds (the only difference is a clean error instead of a panic in the out-of-window edge case).
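The shape of the change described in this PR (a helper that returns `Result`, and a caller that propagates with `?`) can be sketched without the real types. Everything below is a hypothetical stand-in: `FeeError` replaces `ProtocolError`, parts-per-million integers replace `Decimal`, the per-era share is assumed flat (the real table is not), and `epochs_per_era` is taken as a `NonZeroU16`, which is not something this PR does.

```rust
use std::num::NonZeroU16;

/// Hypothetical stand-in for the two `ProtocolError` variants named above.
#[derive(Debug, PartialEq)]
enum FeeError {
    Overflow(&'static str),
    DivideByZero(&'static str),
}

const PERPETUAL_STORAGE_ERAS: u16 = 50;
const PPM: u64 = 1_000_000;
/// ASSUMPTION: every era carries the same share; the real table is non-uniform.
const SHARE_PPM: u64 = PPM / PERPETUAL_STORAGE_ERAS as u64;

/// Multiplier in parts per million, with both error paths made explicit.
fn multiplier_ppm(
    start: u16,
    start_repayment: u16,
    epochs_per_era: NonZeroU16,
) -> Result<u64, FeeError> {
    // Error path 1: the repayment epoch precedes the original storage epoch.
    let paid_epochs = start_repayment
        .checked_sub(start)
        .ok_or(FeeError::Overflow("repayment epoch precedes start epoch"))?;
    // `NonZeroU16` rules out a zero divisor at the type level.
    let current_era = paid_epochs / epochs_per_era.get();
    let remaining_eras = PERPETUAL_STORAGE_ERAS.saturating_sub(current_era);
    let ratio_used = u64::from(remaining_eras) * SHARE_PPM;
    // Error path 2: the whole window has elapsed, so the divisor is zero.
    if ratio_used == 0 {
        return Err(FeeError::DivideByZero("storage window fully elapsed"));
    }
    Ok(PPM * PPM / ratio_used)
}

/// Caller that already returns `Result` and simply propagates with `?`.
fn restore_amount(
    refund: u64,
    start: u16,
    start_repayment: u16,
    epochs_per_era: NonZeroU16,
) -> Result<u64, FeeError> {
    let multiplier = multiplier_ppm(start, start_repayment, epochs_per_era)?;
    // Widen to u128 so the intermediate product cannot overflow.
    let restored = u128::from(refund) * u128::from(multiplier) / u128::from(PPM);
    if restored > u128::from(u64::MAX) {
        return Err(FeeError::Overflow("restored amount exceeds u64"));
    }
    Ok(restored as u64)
}
```

Under these assumptions, `multiplier_ppm(0, 1000, NonZeroU16::new(20).unwrap())` lands on era 50 and returns the `DivideByZero` variant, mirroring the new test's inputs, while `NonZeroU16::new(0)` is `None`, so a zero `epochs_per_era` would be rejected before any division runs.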