Skip to content

Latest STA changes 06/29#379

Open
dsengupta0628 wants to merge 37 commits into
The-OpenROAD-Project:masterfrom
The-OpenROAD-Project-staging:sta_latest_0629
Open

Latest STA changes 06/29#379
dsengupta0628 wants to merge 37 commits into
The-OpenROAD-Project:masterfrom
The-OpenROAD-Project-staging:sta_latest_0629

Conversation

@dsengupta0628

Copy link
Copy Markdown
Contributor

Resolved merge conflict in .github/workflows/ci.yml: pinned actions/checkout to v7 (9c091bb21b7c1c1d1991bb908d89e4e9dddfe3e0), keeping SHA-pin style over upstream's @v7 tag.

Replaced deprecated LibertyCell::hasSequentials() with isSequential() (identical impl) at 9 call sites across 5 test files: TestPower.cc, TestLibertyClasses.cc, TestLibertyStaCallbacks.cc, TestLibertyStaBasicsB.cc, TestLibertyStaBasics.cc. Clears all -Wdeprecated-declarations warnings from these files.

calewis and others added 30 commits June 18, 2026 13:52
…lver (#450)

This commit refactors the Dartu-Menezes-Pileggi (DMP) effective capacitance
algorithm solver to use the Eigen library, and implements a Determinant Guarded
solver to handle singular and near-singular Jacobians safely and efficiently.

Key changes:
1. Refactored the DMP solver to use Eigen stack-based data structures,
   replacing the custom Crout LU decomposition and solver.
2. Extracted the linear solver logic into a dedicated helper function
   DmpAlg::solveNewtonStep.
3. Implemented a "Determinant Guarded" solver:
   - Manually checks the determinant of the Jacobian (safety guard).
   - Throws a DmpError if the determinant is dangerously close to zero (|det| < 1e-12),
     safely triggering the lumped capacitance fallback.
   - Otherwise, solves using Eigen's highly optimized analytical inverse (fast path).
4. Added documentation in the comments on how to easily swap back to LU
   decomposition with partial pivoting (PartialPivLU) if any numerical
   precision issues arise in the future.

Benchmark Results (Optimized Release Build, 5,288 DMP calls):

| Test Case | Calls | 1. Original (Crout LU) | 2. Eigen (PartialPivLU) | 3. Eigen (Determinant Guarded) | Speedup (3 vs 1) | Speedup (3 vs 2) |
| :--- | :---: | :---: | :---: | :---: | :---: | :---: |
| power | 2,608 | 1.8822 ms | 1.6791 ms | 1.2702 ms | +32.5% | +24.4% |
| power_vcd | 2,608 | 1.8729 ms | 1.6704 ms | 1.2768 ms | +31.8% | +23.6% |
| spef_parasitics | 24 | 0.0193 ms | 0.0187 ms | 0.0140 ms | +27.5% | +25.1% |
| mcmm3 | 48 | 0.0728 ms | 0.0756 ms | 0.0817 ms | -12.2% | -8.1% |
| Total DMP Time | 5,288 | 3.8472 ms | 3.4438 ms | 2.6427 ms | +31.3% | +23.3% |
Signed-off-by: James Cherry <cherry@CerezoBook.local>
Signed-off-by: James Cherry <cherry@CerezoBook.local>
Signed-off-by: James Cherry <cherry@CerezoBook.local>
Signed-off-by: James Cherry <cherry@CerezoBook.local>
Signed-off-by: James Cherry <cherry@CerezoBook.local>
Signed-off-by: James Cherry <cherry@CerezoBook.local>
Signed-off-by: James Cherry <cherry@CerezoBook.local>
Signed-off-by: James Cherry <cherry@CerezoBook.local>
Signed-off-by: James Cherry <cherry@CerezoBook.local>
Signed-off-by: James Cherry <cherry@CerezoBook.local>
Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: James Cherry <cherry@CerezoBook.local>
Signed-off-by: James Cherry <cherry@CerezoBook.local>
Signed-off-by: James Cherry <cherry@CerezoBook.local>
Signed-off-by: James Cherry <cherry@CerezoBook.local>
Signed-off-by: James Cherry <cherry@CerezoBook.local>
Signed-off-by: James Cherry <cherry@CerezoBook.local>
Signed-off-by: James Cherry <cherry@CerezoBook.local>
Signed-off-by: James Cherry <cherry@CerezoBook.local>
Signed-off-by: James Cherry <cherry@CerezoBook.local>
Signed-off-by: James Cherry <cherry@CerezoBook.local>
Signed-off-by: James Cherry <cherry@CerezoBook.local>
Signed-off-by: James Cherry <cherry@CerezoBook.local>
Signed-off-by: James Cherry <cherry@CerezoBook.local>
Signed-off-by: James Cherry <cherry@CerezoBook.local>
Signed-off-by: James Cherry <cherry@CerezoBook.local>
Signed-off-by: dsengupta0628 <dsengupta@precisioninno.com>
Signed-off-by: James Cherry <cherry@CerezoBook.local>
Signed-off-by: James Cherry <cherry@CerezoBook.local>
James Cherry and others added 3 commits June 27, 2026 20:43
doc
Signed-off-by: James Cherry <cherry@CerezoBook.local>
Signed-off-by: dsengupta0628 <dsengupta@precisioninno.com>
Signed-off-by: dsengupta0628 <dsengupta@precisioninno.com>
@dsengupta0628 dsengupta0628 self-assigned this Jun 29, 2026
@CLAassistant

Copy link
Copy Markdown

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
1 out of 4 committers have signed the CLA.

✅ dsengupta0628
❌ James Cherry
❌ calewis
❌ dependabot[bot]


James Cherry seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account.
You have signed the CLA already but the status is still pending? Let us recheck it.

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request refactors the Newton-Raphson solver in DmpCeff to use Eigen, optimizes generated clock and latch arrival propagation by postponing latch evaluations, introduces new graph traversal helpers, and ensures input delay reference pin edges survive graph rebuilds. The review feedback identifies several critical issues: a potential stale value bug in DmpCeff when nr_order is not 3, an uninitialized pointer and double-free risk in PathEnumFaninVisitor due to an empty copy constructor, numerical instability in the determinant-based singularity check, potential null pointer dereferences in findDelays and visitFanouts, and a division-by-zero risk in Power::findActivity if the clock period is zero.

Comment thread dcalc/DmpCeff.cc
Comment on lines +141 to +143
Eigen::Vector3d x = Eigen::Vector3d::Zero();
if (nr_order_ == 3)
x_[DmpParam::ceff] = ceff;
x[DmpParam::ceff] = ceff;

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

When nr_order_ is not 3 (e.g., nr_order_ == 2 for DmpOnePole), ceff_ is never updated to the input ceff before newtonRaphson(x) is called. Since DmpOnePole::evalDmpEqns relies on ceff_, this leads to stale or uninitialized values being used during the Newton-Raphson iterations. Setting ceff_ = ceff; in the else branch resolves this issue.

Suggested change
Eigen::Vector3d x = Eigen::Vector3d::Zero();
if (nr_order_ == 3)
x_[DmpParam::ceff] = ceff;
x[DmpParam::ceff] = ceff;
Eigen::Vector3d x = Eigen::Vector3d::Zero();
if (nr_order_ == 3)
x[DmpParam::ceff] = ceff;
else
ceff_ = ceff;

Comment thread search/PathEnum.cc
Comment on lines 357 to +363
{
}

PathEnumFaninVisitor::~PathEnumFaninVisitor()
{
delete pred_;
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The custom copy constructor PathEnumFaninVisitor::PathEnumFaninVisitor(const PathEnumFaninVisitor &visitor) is empty and does not call the base class copy constructor. This causes the base class to be default-constructed, leaving pred_ uninitialized or null in the copied visitor, which will cause crashes when copy() is called. Additionally, manually deleting pred_ in the destructor will cause a double-free if the base class copy constructor is called and copies the pointer. To fix this robustly, consider passing true for the delete_pred parameter in the PathVisitor constructor so PathVisitor manages the lifetime of pred_, removing the custom destructor entirely, and removing the custom copy constructor entirely to let the compiler generate a correct member-wise copy constructor.

Comment thread dcalc/DmpCeff.cc
Comment on lines +970 to 985
if (nr_order_ == 2) {
double det = fjac.topLeftCorner<2, 2>().determinant();
if (std::abs(det) < 1e-12) {
throw DmpError("Jacobian is singular (order 2)");
}
p.head<2>() = fjac.topLeftCorner<2, 2>().inverse() * -fvec.head<2>();
return p;
}
}

// Solves fjac_ * x = p_ for x, assuming fjac_ is LU form from luDecomp.
// Solution overwrites p_.
void
DmpAlg::luSolve()
{
const int size = nr_order_;

// Transform p_ allowing for leading zeros.
int non_zero = -1;
for (int i = 0; i < size; i++) {
int iperm = index_[i];
double sum = p_[iperm];
p_[iperm] = p_[i];
if (non_zero != -1) {
for (int j = non_zero; j <= i - 1; j++)
sum -= fjac_[i][j] * p_[j];
}
else {
if (sum != 0.0)
non_zero = i;
}
p_[i] = sum;
}
// Backsubstitution.
for (int i = size - 1; i >= 0; i--) {
double sum = p_[i];
for (int j = i + 1; j < size; j++)
sum -= fjac_[i][j] * p_[j];
p_[i] = sum / fjac_[i][i];
double det = fjac.determinant();
if (std::abs(det) < 1e-12) {
throw DmpError("Jacobian is singular (order 3)");
}
p = fjac.inverse() * -fvec;
return p;
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The absolute determinant check std::abs(det) < 1e-12 is highly sensitive to the scaling of the matrix elements, which depend on physical units (e.g., Farads and seconds). If the units result in small values in fjac, the determinant can easily fall below 1e-12 even for a well-conditioned matrix, causing unexpected DmpError failures. Using a robust LU decomposition with partial pivoting (as documented in your comments) is much safer and numerically stable.

Suggested change
if (nr_order_ == 2) {
double det = fjac.topLeftCorner<2, 2>().determinant();
if (std::abs(det) < 1e-12) {
throw DmpError("Jacobian is singular (order 2)");
}
p.head<2>() = fjac.topLeftCorner<2, 2>().inverse() * -fvec.head<2>();
return p;
}
}
// Solves fjac_ * x = p_ for x, assuming fjac_ is LU form from luDecomp.
// Solution overwrites p_.
void
DmpAlg::luSolve()
{
const int size = nr_order_;
// Transform p_ allowing for leading zeros.
int non_zero = -1;
for (int i = 0; i < size; i++) {
int iperm = index_[i];
double sum = p_[iperm];
p_[iperm] = p_[i];
if (non_zero != -1) {
for (int j = non_zero; j <= i - 1; j++)
sum -= fjac_[i][j] * p_[j];
}
else {
if (sum != 0.0)
non_zero = i;
}
p_[i] = sum;
}
// Backsubstitution.
for (int i = size - 1; i >= 0; i--) {
double sum = p_[i];
for (int j = i + 1; j < size; j++)
sum -= fjac_[i][j] * p_[j];
p_[i] = sum / fjac_[i][i];
double det = fjac.determinant();
if (std::abs(det) < 1e-12) {
throw DmpError("Jacobian is singular (order 3)");
}
p = fjac.inverse() * -fvec;
return p;
}
if (nr_order_ == 2) {
auto lu = fjac.topLeftCorner<2, 2>().partialPivLu();
if (std::abs(lu.matrixLU().diagonal().prod()) < 1e-12) {
throw DmpError("Jacobian is singular (order 2)");
}
p.head<2>() = lu.solve(-fvec.head<2>());
return p;
}
auto lu = fjac.partialPivLu();
if (std::abs(lu.matrixLU().diagonal().prod()) < 1e-12) {
throw DmpError("Jacobian is singular (order 3)");
}
p = lu.solve(-fvec);
return p;

Comment thread dcalc/GraphDelayCalc.cc
iter_->ensureSize();
if (incremental_)
seedInvalidDelays();
Stats stats(debug_, report_);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Removing the if (arc_delay_calc_) guard in findDelays could lead to null pointer dereferences or unnecessary calculations if arc_delay_calc_ is null. It is safer to keep the guard at the beginning of the function.

  if (!arc_delay_calc_)
    return;
  Stats stats(debug_, report_);

Comment thread graph/Graph.cc
Comment on lines +578 to +583
void
Graph::visitFanouts(Vertex *vertex,
SearchPred *pred,
const VertexFn &fn)
{
if (pred->searchFrom(vertex)) {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The pred pointer is dereferenced directly without checking if it is null. To prevent potential null pointer dereferences, consider adding a null check or an assertion.

void
Graph::visitFanouts(Vertex *vertex,
                    SearchPred *pred,
                    const VertexFn &fn)
{
  if (pred && pred->searchFrom(vertex)) {

Comment thread power/Power.cc
Comment on lines +1593 to +1596
if (clk) {
float duty = clockDuty(clk);
return PwrActivity(2.0 / clk->period(), duty, PwrActivityOrigin::clock);
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

If clk->period() is 0.0, the expression 2.0 / clk->period() will result in a division by zero (producing infinity or NaN). It is safer to guard against a zero period.

    if (clk && clk->period() != 0.0) {
      float duty = clockDuty(clk);
      return PwrActivity(2.0 / clk->period(), duty, PwrActivityOrigin::clock);
    }

James Cherry and others added 4 commits June 30, 2026 09:10
Signed-off-by: James Cherry <cherry@CerezoBook.local>
* make mode/scene pass as tcl objects

Signed-off-by: dsengupta0628 <dsengupta@precisioninno.com>

* address review comments

Signed-off-by: dsengupta0628 <dsengupta@precisioninno.com>

* address review comments take 2

Signed-off-by: dsengupta0628 <dsengupta@precisioninno.com>

* revert to using cmd mode name

Signed-off-by: dsengupta0628 <dsengupta@precisioninno.com>

---------

Signed-off-by: dsengupta0628 <dsengupta@precisioninno.com>
Signed-off-by: dsengupta0628 <dsengupta@precisioninno.com>
@dsengupta0628

Copy link
Copy Markdown
Contributor Author

CI fails because we need OpenROAD side changes to accomodate these:

The-OpenROAD-Project/OpenROAD#10790

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants