Skip to content

MBFF clock-tree power credit - promote higher banking ratios#10796

Open
dsengupta0628 wants to merge 12 commits into
The-OpenROAD-Project:masterfrom
The-OpenROAD-Project-staging:mbff_improve_banking
Open

MBFF clock-tree power credit - promote higher banking ratios#10796
dsengupta0628 wants to merge 12 commits into
The-OpenROAD-Project:masterfrom
The-OpenROAD-Project-staging:mbff_improve_banking

Conversation

@dsengupta0628

Copy link
Copy Markdown
Contributor

Summary

Adds an opt-in cluster_flops -clock_power_weight flag that lets the MBFF clustering ILP credit clock-tree power saved by banking, enabling higher banking ratios on designs where the current cost model stalls.

Problem

cluster_flops can't exceed a certain banking on some customer libraries, while commercial tools can reach almost double of that. Higher banking cuts CTS routing load and yields more accurate post-route power.

Root cause: the clustering ILP (MBFF::RunILP) only models in-cell clock-pin sharing. The dominant real benefit- collapsing N clock-tree sinks (buffers + clock-net wire) down to 1 - is external to the cell and uncounted. With the savings understated, the ILP stops banking early. ORFS libs already show a large per-bit tray-power drop so they're unaffected; customer libs with a modest drop gate hard.

One Solution

Model a per-sink clock-tree power P_ct = k · single_bit_power. An N-bit tray presents one sink regardless of N, so banking eliminates (N−1)·P_ct. Fold into per-tray cost, keeping the 1-bit baseline at 1.0:

tray_cost[N-bit] = (tray_total_power + P_ct) / (single_bit_power + P_ct)
tray_cost[1-bit] = 1.0

  • k = 0 (default) --> P_ct = 0 --> cost model bit-identical to today. ORFS/goldens untouched.
  • k > 0 --> ratio shrinks toward 1, large trays get relatively cheaper --> higher banking.
  • k < 0 --> rejected with [ERROR GPL-0115].

Single conceptual edit in MBFF::SetRatios; RunILP already reads norm_power_. No change to displacement/timing terms.

Interface

cluster_flops [-clock_power_weight k] # float, k >= 0, default 0.0

Type of Change

  • New feature
  • Documentation update

Impact

Improves banking ratio of multi bit flop clustering using the new feature

Verification

  • I have verified that the local build succeeds (./etc/Build.sh).
  • I have run the relevant tests and they pass.
  • My code follows the repository's formatting guidelines.
  • I have included tests to prevent regressions.
  • I have signed my commits (DCO).

Signed-off-by: dsengupta0628 <dsengupta@precisioninno.com>
Signed-off-by: dsengupta0628 <dsengupta@precisioninno.com>
Signed-off-by: dsengupta0628 <dsengupta@precisioninno.com>
(cherry picked from commit 320a848df4db3d7da0f4f71aa494612a15970a6a)
Signed-off-by: dsengupta0628 <dsengupta@precisioninno.com>
Signed-off-by: dsengupta0628 <dsengupta@precisioninno.com>
Signed-off-by: dsengupta0628 <dsengupta@precisioninno.com>
Signed-off-by: dsengupta0628 <dsengupta@precisioninno.com>
@dsengupta0628 dsengupta0628 self-assigned this Jul 1, 2026
@github-actions github-actions Bot added the size/M label Jul 1, 2026

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a new -clock_power_weight parameter to the multi-bit flip-flop (MBFF) clustering algorithm, allowing users to credit clock-tree power savings and promote higher banking ratios. The changes span the documentation, C++ implementation, TCL interface, and include new integration tests. The feedback suggests adding a defensive check in the C++ MBFF::Run entry point to validate that clock_power_weight is non-negative, preventing potential division-by-zero errors when the function is called directly from other APIs.

Comment thread src/gpl/src/mbff.cpp
const float beta,
const float clock_power_weight)
{
clock_power_weight_ = clock_power_weight;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

While clock_power_weight is validated to be non-negative in the TCL interface, it is not validated in the C++ MBFF::Run entry point. If MBFF::Run is called directly (e.g., via Python or other C++ APIs) with a negative clock_power_weight, it can lead to a division by zero in MBFF::SetRatios where single_bit_power_ + p_ct is used as a divisor. Adding a defensive check here using log_->error ensures robust input validation.

  if (clock_power_weight < 0.0f) {
    log_->error(GPL, 115, "-clock_power_weight must be non-negative.");
  }
  clock_power_weight_ = clock_power_weight;
References
  1. Calls to logger_->error are expected to terminate the program, so additional error handling like returning an error code or throwing an exception is not necessary.

@dsengupta0628 dsengupta0628 marked this pull request as ready for review July 1, 2026 19:00
@dsengupta0628 dsengupta0628 requested a review from a team as a code owner July 1, 2026 19:00
@dsengupta0628 dsengupta0628 requested a review from LucasYuki July 1, 2026 19:00
Signed-off-by: dsengupta0628 <dsengupta@precisioninno.com>
@dsengupta0628

Copy link
Copy Markdown
Contributor Author

@QuantamHD

Copy link
Copy Markdown
Collaborator

Do we need an opt-in flag for this behavior. To me MBFF's primary benefit is around power reduction and we should do all we can to reduce power of which clock power is one dimension. My worry here is that this flag just makes the default behavior worse.

@dsengupta0628

Copy link
Copy Markdown
Contributor Author

Do we need an opt-in flag for this behavior. To me MBFF's primary benefit is around power reduction and we should do all we can to reduce power of which clock power is one dimension. My worry here is that this flag just makes the default behavior worse.

The reason it's opt-in rather than on-by-default is that this change doesn't really measure clock-tree power - it models a proxy, P_ct = k · single_bit_power, where k is a hand-tuned weight with no physically-grounded value. That makes a safe default hard to justify:

  1. More banking isn't free - it increases displacement. Higher k pushes the ILP to bank harder, which drags flops further from their ideal locations. On timing-/wirelength-sensitive designs that added displacement can regress QoR.
  2. The timing counterweight is itself opt-in and off by default. The ILP objective does have a timing term - but it only sums
    over num_paths critical FF-pairs, and num_paths defaults to 0, so the term is empty unless the user supplies paths (and timing_weight defaults to 0.1). So a nonzero default k would push banking with no timing term opposing it in the default configuration. A default k + default-inert timing term is exactly the combination that risks over-banking critical paths.

So the flag isn't a decision to keep clock-power savings off - it's that turning an un-calibrated proxy on by default, while the timing safeguard is also default-off, isn't defensible without per-design justification.

My intention with this was to have a code now with k=0 default (safe, reviewable, unblocks customers who need higher banking today), and open a follow-up to make clock-power credit default via some kind of a measured estimate - deriving P_ct from the actual clock tree (buffer library + fanout + RC) instead of a user knob- haven't explored that yet. But since we did see this code helping with the customer banking ratio already, I figured this is a good representation of giving clock power credit instead of actual estimation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants