perf: optimize division to uint32 to make re-target-pow calculation much faster #7352

Open
knst wants to merge 2 commits into dashpay:develop from knst:perf-header-sync

Conversation

@knst

@knst knst commented Jun 9, 2026

Collaborator

Issue being fixed or feature implemented

Dash retargets difficulty every block via DarkGravityWave (24-block loop), so every header processed during sync runs several 256-bit divisions. Bitcoin only retargets once per 2016 blocks.

Every division in DGW is by a small integer: (nCountBlocks+1) * nTargetTimespan (3600).
However, the code routes through the only available overload, operator/=(const base_uint&), which does bit-by-bit long division: ~224 iterations, each doing a full-width div >>= 1.

Profiling with perf shows that during header sync up to 35% of CPU time is spent on this division in a release build:

  20.36%  d-msghand dash-qt [.] base_uint<256u>::operator>>=(unsigned int)
  15.00%  d-msghand dash-qt [.] base_uint<256u>::operator/=(base_uint<256u> const&)

What was done?

Implemented an optimized division path for unsigned divisors that fit in uint32_t.
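For readers who want to see the idea in isolation, here is a minimal sketch of word-wise division by a 32-bit divisor. It is not the code from this PR: `Limbs` and `div_small` are hypothetical names, and the eight-limb little-endian array is only an assumed stand-in for base_uint<256>.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <stdexcept>

// Hypothetical stand-in for base_uint<256>: eight 32-bit limbs, least significant first.
using Limbs = std::array<uint32_t, 8>;

// Divides `n` in place by `d` and returns the remainder.
// Illustrative helper only; the name and signature are assumptions.
uint32_t div_small(Limbs& n, uint32_t d)
{
    if (d == 0) throw std::domain_error("division by zero");
    uint64_t rem = 0;
    // Walk from the most significant limb down, carrying the remainder.
    for (int i = static_cast<int>(n.size()) - 1; i >= 0; --i) {
        // rem < d <= 2^32 - 1, so (rem << 32) | n[i] fits in 64 bits.
        const uint64_t cur = (rem << 32) | n[i];
        n[i] = static_cast<uint32_t>(cur / d); // quotient limb is < 2^32
        rem = cur % d;
        assert(rem < d); // loop invariant
    }
    return static_cast<uint32_t>(rem);
}
```

With this layout the loop runs once per limb, so a 256-bit value needs 8 native 64-bit divisions instead of one full-width shift per bit.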

How Has This Been Tested?

Built Guix binaries from this PR and from develop, then timed a reindex with each:

[develop] 5af9f5754e25/src/qt/dash-qt  -datadir=/tmp/dd -reindex 

real    7m42.152s
user    7m0.396s
sys     0m3.328s

[PR] 8608a3c3bbcd/src/qt/dash-qt  -datadir=/tmp/dd -reindex 

real    4m17.294s
user    2m55.474s
sys     0m2.731s

Perf stats changed as expected:

-    1.61%     0.06%  d-msghand dash-qt [.] base_uint<256u>::operator/=(base_uint<256u> const&)
   - 1.55% base_uint<256u>::operator/=(base_uint<256u> const&)
        0.92% base_uint<256u>::operator>>=(unsigned int)
+    1.48%     0.57%  d-msghand dash-qt [.] base_uint<256u>::operator>>=(unsigned int)

So total header sync takes about half the time. Note that header sync will be slightly slower due to #7320, so the overall win is smaller.

Breaking Changes

N/A

Checklist:

  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have added or updated relevant unit/integration/functional/e2e tests
  • I have made corresponding changes to the documentation
  • I have assigned this pull request to a milestone (for repository code-owners and collaborators only)

@knst knst added this to the 24 milestone Jun 9, 2026
@knst knst marked this pull request as ready for review June 9, 2026 13:42
@knst knst requested review from PastaPastaPasta and UdjinM6 June 9, 2026 13:42
@github-actions

github-actions Bot commented Jun 9, 2026


✅ No Merge Conflicts Detected

This PR currently has no conflicts with other open PRs.

@coderabbitai

coderabbitai Bot commented Jun 9, 2026


Walkthrough

This PR adds scalar division operator overloads to the base_uint<BITS> template for uint64_t divisors. The changes include a new operator/=(uint64_t) member function declaration and implementation, a non-member operator/(const base_uint&, uint64_t) friend function, and supporting test cases. The implementation validates input, delegates to existing big-integer division when needed, and performs optimized long division for values within uint32_t range.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title accurately describes the main optimization: adding a faster division path for small divisors (uint32_t) to improve re-target-pow calculation performance. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Description check | ✅ Passed | The PR description clearly explains the performance problem (DarkGravityWave divisions consuming ~35% CPU), the implemented optimization (uint32_t division overload), and provides concrete benchmark evidence showing significant improvement (7m42s → 4m17s). |


@coderabbitai coderabbitai Bot left a comment


🧹 Nitpick comments (2)
src/arith_uint256.cpp (1)

76-77: ⚡ Quick win

Parameter name b32 is misleading for uint64_t type.

The parameter is named b32 but the function accepts uint64_t. This inconsistency appears in both the declaration (line 161 of arith_uint256.h) and implementation. Consider renaming to b64 or b for clarity.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/arith_uint256.cpp` around lines 76 - 77, Rename the misleading parameter
b32 in the division operator to a name that matches its type and intent (e.g.,
b64 or simply b) in both the declaration and implementation of template<unsigned
int BITS> base_uint<BITS>::operator/=(uint64_t b32); update the parameter name
in the function signature and all uses inside the implementation so they remain
consistent (search for base_uint<BITS>::operator/= and the declaration in the
header to change both places).
src/arith_uint256.h (1)

161-161: ⚡ Quick win

Parameter name b32 is misleading for uint64_t type.

The parameter is named b32 but accepts uint64_t. While the implementation optimizes for values ≤ uint32_t max, the parameter type is 64-bit and the name could cause confusion. Consider renaming to b64 or simply b (as used in line 211) for clarity.

✏️ Suggested naming improvement
-    base_uint& operator/=(uint64_t b32);
+    base_uint& operator/=(uint64_t b);
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/arith_uint256.h` at line 161, Rename the misleading parameter `b32` in
the declaration of operator/= to a clearer name (e.g., `b` or `b64`) and update
the corresponding definition/implementation to use the same name; specifically
modify the signature `base_uint& operator/=(uint64_t b32);` to `base_uint&
operator/=(uint64_t b);` (or `b64`) and change the implementation function
parameter and any internal references to match, and then update any
callers/forward-declarations that reference the old name to keep declarations
and definitions consistent.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: bf974529-fc8b-450b-857a-fe88d01cd909

📥 Commits

Reviewing files that changed from the base of the PR and between 317917a and e8435d3.

📒 Files selected for processing (3)
  • src/arith_uint256.cpp
  • src/arith_uint256.h
  • src/test/arith_uint256_tests.cpp

@thepastaclaw

thepastaclaw commented Jun 10, 2026


✅ Review complete (commit e8435d3)

@UdjinM6 UdjinM6 left a comment


LGTM and it's much faster indeed, see f2d8de6 (140x on my machine)

utACK e8435d3

Comment thread src/arith_uint256.cpp
}

template <unsigned int BITS>
base_uint<BITS>& base_uint<BITS>::operator/=(uint64_t b32)


nit: is it a typo?

Suggested change
-base_uint<BITS>& base_uint<BITS>::operator/=(uint64_t b32)
+base_uint<BITS>& base_uint<BITS>::operator/=(uint64_t b64)

same in other places

@knst knst Jun 11, 2026

Collaborator Author


Not really a typo: the fast path is optimized for 32-bit divisors only. If you think that's confusing, I'll rename it.

It is supposed to be uint32_t b32, but the compiler won't say a word if it's called with a uint64_t, so I kept it mixed.
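To illustrate the concern, here is a small sketch of what happens with a parameter typed uint32_t, assuming default compiler warnings (no -Wconversion). Both function names are hypothetical and exist only for this example; neither is part of base_uint.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical signature that accepts only 32 bits.
// A wider argument is narrowed implicitly, so the high bits are lost.
uint32_t as_seen_by_u32_param(uint32_t d) { return d; }

// Hypothetical range check that a uint64_t overload can perform
// before choosing the 32-bit fast path.
bool fits_fast_path(uint64_t d)
{
    return d <= std::numeric_limits<uint32_t>::max();
}
```

Passing a uint64_t variable holding 2^32 + 5 to `as_seen_by_u32_param` yields 5, whereas `fits_fast_path` sees the full value and can route it to the general division instead.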

@thepastaclaw thepastaclaw left a comment


Code Review

Small, well-scoped performance PR: adds a uint64_t divisor overload to base_uint with a fast word-wise long-division path when the divisor fits in uint32_t, falling back to the existing base_uint divisor otherwise. The schoolbook radix-2^32 loop is correct (the running remainder is always < b32 ≤ 2^32, so (rem<<32)|pn[i] fits in uint64_t), zero is rejected, and the regression test asserts bit-identity against the widened arith_uint256 divisor across both code paths plus the zero-throw case. Both agents and CodeRabbit found no in-scope issues.
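The PR's regression test compares against the existing arith_uint256 divisor. As a different, self-contained sanity check, one can also multiply the quotient back and confirm that q * d + r reproduces the dividend. The sketch below assumes an eight-limb little-endian layout; `Limbs`, `div_small`, `mul_add_small` and `round_trips` are hypothetical names, not code from the PR.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for base_uint<256>: least significant limb first.
using Limbs = std::array<uint32_t, 8>;

// Word-wise division by a non-zero 32-bit divisor; returns the remainder.
uint32_t div_small(Limbs& n, uint32_t d)
{
    uint64_t rem = 0;
    for (int i = static_cast<int>(n.size()) - 1; i >= 0; --i) {
        const uint64_t cur = (rem << 32) | n[i]; // fits: rem < d < 2^32
        n[i] = static_cast<uint32_t>(cur / d);
        rem = cur % d;
    }
    return static_cast<uint32_t>(rem);
}

// Computes q * d + r limb by limb. Each step is at most
// (2^32 - 1)^2 + (2^32 - 1) < 2^64, so the 64-bit accumulator cannot overflow.
Limbs mul_add_small(const Limbs& q, uint32_t d, uint32_t r)
{
    Limbs out{};
    uint64_t carry = r;
    for (std::size_t i = 0; i < q.size(); ++i) {
        const uint64_t cur = static_cast<uint64_t>(q[i]) * d + carry;
        out[i] = static_cast<uint32_t>(cur); // low 32 bits
        carry = cur >> 32;                   // high 32 bits
    }
    return out;
}

// True when dividing and multiplying back reproduces the dividend exactly.
bool round_trips(const Limbs& n, uint32_t d)
{
    Limbs q = n;
    const uint32_t r = div_small(q, d);
    return r < d && mul_add_small(q, d, r) == n;
}
```

This checks the fast path without needing a second division routine, which makes it easy to run over many random dividends and divisors.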
