Conversation
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior in the settings.
📝 Walkthrough

Added numerous concurrent-halt early-exit checks across barrier and dual-simplex code paths; changed some large solver/basis objects to heap allocation (std::unique_ptr) and added detached threads that capture/move solver state when returning due to concurrent-limit.
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes

🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (1 warning)
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
cpp/src/barrier/barrier.cu (1)
401-417: ⚠️ Potential issue | 🟠 Major: These halt exits still pay the full `iteration_data_t` teardown.

By this point the constructor has already allocated most of the barrier state, and `barrier_solver_t::solve` still owns `iteration_data_t data` on the stack at line 3448. A `return` here therefore still blocks on destroying `chol`, `device_*`, and the many `rmm::device_uvector`s before `solve()` can report `CONCURRENT_LIMIT`, which caps the latency win once allocation has happened.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/src/barrier/barrier.cu` around lines 401 - 417, The current inline returns when settings.concurrent_halt is set still occur after heavy allocations (chol, device_*, rmm::device_uvector) so teardown blocks reporting CONCURRENT_LIMIT; fix this by checking settings.concurrent_halt before you allocate those heavy resources and avoid late returns — move the concurrent_halt check to just after computing factorization_size (before the make_unique call that constructs sparse_cholesky_cudss_t) and remove the later scattered returns inside form_augmented/form_adat and analyze; instead, if a concurrent halt is observed after those operations, set the solver/iteration status to CONCURRENT_LIMIT (or an equivalent flag on iteration_data_t) and perform a minimal, lightweight early exit path (or jump to a small cleanup label) so you don't pay the full destruction cost of chol, device_* and rmm::device_uvector.
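The reordering this comment asks for can be sketched in isolation. The following is a minimal, hypothetical shape (the names `heavy_state_t`, `setup`, and `status_t` are illustrative stand-ins, not from the codebase): the halt flag is consulted before any heavy allocation, so an early exit never constructs state it would immediately tear down.

```cpp
#include <atomic>
#include <memory>
#include <vector>

enum class status_t { OK, CONCURRENT_LIMIT };

// Stand-in for the heavy barrier state (chol, device buffers, ...).
struct heavy_state_t {
  std::vector<double> buffer;
  explicit heavy_state_t(std::size_t n) : buffer(n) {}
};

// Check the halt flag *before* constructing heavy state, so an early
// exit never pays for allocation followed immediately by teardown.
status_t setup(const std::atomic<int>* concurrent_halt,
               std::unique_ptr<heavy_state_t>& out) {
  if (concurrent_halt != nullptr && concurrent_halt->load() != 0) {
    return status_t::CONCURRENT_LIMIT;  // nothing allocated yet
  }
  out = std::make_unique<heavy_state_t>(1 << 20);
  return status_t::OK;
}
```

With the check hoisted like this, the `CONCURRENT_LIMIT` path returns without ever owning the large buffers, which is the latency win the comment is after.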
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@cpp/src/dual_simplex/solve.cpp`:
- Around line 726-733: When handling lp_status_t::CONCURRENT_LIMIT in the
async-detach branch, preserve the iteration count by copying
lp_solution.iterations into the user-facing solution before returning (i.e.,
ensure the wrapper around solve_linear_program_advanced transfers
lp_solution.iterations to the returned/outer solution object). Also update the
phase‑1 CONCURRENT_LIMIT branch in solve_linear_program_with_advanced_basis to
set original_solution.iterations = iter so the phase‑1 abort reports actual
progress; locate the branches that check for lp_status_t::CONCURRENT_LIMIT and
add the iteration assignments using the existing symbols lp_solution.iterations,
solution (or original_solution), and iter.
---
Outside diff comments:
In `@cpp/src/barrier/barrier.cu`:
- Around line 401-417: The current inline returns when settings.concurrent_halt
is set still occur after heavy allocations (chol, device_*, rmm::device_uvector)
so teardown blocks reporting CONCURRENT_LIMIT; fix this by checking
settings.concurrent_halt before you allocate those heavy resources and avoid
late returns — move the concurrent_halt check to just after computing
factorization_size (before the make_unique call that constructs
sparse_cholesky_cudss_t) and remove the later scattered returns inside
form_augmented/form_adat and analyze; instead, if a concurrent halt is observed
after those operations, set the solver/iteration status to CONCURRENT_LIMIT (or
an equivalent flag on iteration_data_t) and perform a minimal, lightweight early
exit path (or jump to a small cleanup label) so you don't pay the full
destruction cost of chol, device_* and rmm::device_uvector.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: f593adba-f4c2-49c4-a2e0-52987737cf80
📒 Files selected for processing (4)
- cpp/src/barrier/barrier.cu
- cpp/src/dual_simplex/basis_updates.cpp
- cpp/src/dual_simplex/phase2.cpp
- cpp/src/dual_simplex/solve.cpp
rg20 left a comment:
Thanks for the improvements. I have minor comments.
Can you also please address the CodeRabbit reviews?
cpp/src/dual_simplex/phase2.cpp (Outdated)

```cpp
                                   iter,
                                   delta_y_steepest_edge,
                                   work_unit_context);
    if (result == dual::status_t::CONCURRENT_LIMIT) {
```
Can you please add comments here?
cpp/src/dual_simplex/solve.cpp (Outdated)

```diff
 if (phase1_status == dual::status_t::ITERATION_LIMIT) { return lp_status_t::ITERATION_LIMIT; }
-if (phase1_status == dual::status_t::CONCURRENT_LIMIT) { return lp_status_t::CONCURRENT_LIMIT; }
+if (phase1_status == dual::status_t::CONCURRENT_LIMIT) {
+  std::thread([plp = std::move(presolved_lp),
```
♻️ Duplicate comments (1)
cpp/src/dual_simplex/solve.cpp (1)
233-241: ⚠️ Potential issue | 🟡 Minor: Preserve the iteration count on phase-1 CONCURRENT_LIMIT.

The phase-1 early exit does not preserve the iteration count in `original_solution`. When concurrent-halt triggers during phase 1, the caller will see zero iterations instead of actual progress. This was flagged in a previous review.

Proposed fix

```diff
   if (phase1_status == dual::status_t::CONCURRENT_LIMIT) {
+    original_solution.iterations = iter;
     std::thread([plp = std::move(presolved_lp),
                  pi  = std::move(presolve_info),
                  lpp = std::move(lp),
                  cs  = std::move(column_scales),
                  p1  = std::move(phase1_problem),
                  p1v = std::move(phase1_vstatus),
                  p1s = std::move(phase1_solution)]() {}).detach();
     return lp_status_t::CONCURRENT_LIMIT;
   }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/src/dual_simplex/solve.cpp` around lines 233 - 241, When handling phase1_status == dual::status_t::CONCURRENT_LIMIT, preserve the iteration count from the phase-1 result into the returned/original solution before returning; specifically, copy the iteration counter (e.g. phase1_solution.iterations or the actual field/accessor used) into original_solution (or the solution object returned to caller) prior to detaching the background thread and returning lp_status_t::CONCURRENT_LIMIT, so the caller sees the progress done in phase 1. Ensure you perform this assignment before the std::thread([...])...detach() and return.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@cpp/src/dual_simplex/solve.cpp`:
- Around line 233-241: When handling phase1_status ==
dual::status_t::CONCURRENT_LIMIT, preserve the iteration count from the phase-1
result into the returned/original solution before returning; specifically, copy
the iteration counter (e.g. phase1_solution.iterations or the actual
field/accessor used) into original_solution (or the solution object returned to
caller) prior to detaching the background thread and returning
lp_status_t::CONCURRENT_LIMIT, so the caller sees the progress done in phase 1.
Ensure you perform this assignment before the std::thread([...])...detach() and
return.
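The fix both prompts describe reduces to copying the progress counter into the outward-facing solution before the early return. A minimal sketch, with hypothetical `solution_t` and `abort_phase1` stand-ins for the real types:

```cpp
enum class lp_status_t { OPTIMAL, CONCURRENT_LIMIT };

struct solution_t {
  int iterations = 0;  // progress counter surfaced to the caller
};

// Record how far phase 1 got before returning CONCURRENT_LIMIT, so the
// caller never sees a stale/zero iteration count on an early abort.
lp_status_t abort_phase1(solution_t& original_solution, int iter) {
  original_solution.iterations = iter;
  return lp_status_t::CONCURRENT_LIMIT;
}
```

The key constraint is ordering: the assignment must happen before any move of the solution state into a cleanup lambda, since the moved-from objects can no longer be touched safely.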
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: 2aa434a7-6a28-478c-aee8-464d27f54495
📒 Files selected for processing (1)
cpp/src/dual_simplex/solve.cpp
♻️ Duplicate comments (1)
cpp/src/dual_simplex/solve.cpp (1)
233-243: ⚠️ Potential issue | 🟡 Minor: Consider preserving iteration count on phase-1 CONCURRENT_LIMIT.

When returning CONCURRENT_LIMIT after phase-1, the iteration progress (`iter`) is not preserved in `original_solution.iterations`. While phase-2's CONCURRENT_LIMIT path (line 332) correctly sets this, the phase-1 path does not.

If callers use `iterations` to gauge solver progress, they would see stale/zero values for phase-1 aborts.

Suggested fix

```diff
   if (phase1_status == dual::status_t::CONCURRENT_LIMIT) {
+    original_solution.iterations = iter;
     // Keep phase-1 state alive while the concurrent solve continues asynchronously.
     std::thread([plp = std::move(presolved_lp),
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/src/dual_simplex/solve.cpp` around lines 233 - 243, The phase-1 CONCURRENT_LIMIT branch (when phase1_status == dual::status_t::CONCURRENT_LIMIT) currently returns without preserving the iteration count; before spawning the detached thread and returning lp_status_t::CONCURRENT_LIMIT, capture the current iter value and store it into original_solution.iterations so callers see phase-1 progress (i.e., set original_solution.iterations = iter), then proceed to move the large objects into the lambda and detach as before; update references around phase1_status/phase1_problem/phase1_solution to ensure original_solution is modified prior to the std::thread creation and return.
🧹 Nitpick comments (1)
cpp/src/dual_simplex/phase2.cpp (1)
2495-2518: Detached thread pattern looks correct for deferring teardown.

The `std::unique_ptr` for `ft` and the detached thread capturing moved state achieves the latency reduction goal. A few observations:

- `superbasic_list` (line 2494) is captured but never populated or used in `dual_phase2_with_advanced_basis`; it's always empty. Consider removing it from the capture to reduce noise:

```diff
   if (result == dual::status_t::CONCURRENT_LIMIT) {
     // Keep basis state alive while the concurrent solve continues asynchronously.
     std::thread([bl = std::move(basic_list),
                  nl = std::move(nonbasic_list),
-                 sl = std::move(superbasic_list),
                  f = std::move(ft)]() {}).detach();
   }
```

- The empty lambda body `{}` is intentional: the thread exists solely to extend object lifetimes until destruction. A brief comment clarifying this intent (already present at line 2512) is helpful.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/src/dual_simplex/phase2.cpp` around lines 2495 - 2518, The detached thread captures moved state to keep the basis alive, but it captures superbasic_list (symbol superbasic_list) which is never populated or used by dual_phase2_with_advanced_basis; remove sl = std::move(superbasic_list) from the capture list and only capture the actually used moved objects (bl = std::move(basic_list), nl = std::move(nonbasic_list), f = std::move(ft)), preserving the empty lambda body and its current comment so the detached thread still solely extends lifetimes.
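The lifetime-extension trick discussed above can be demonstrated standalone. This sketch (all names are illustrative, not from the PR) shows that an empty-body detached thread really does run the moved-in object's destructor off the calling thread:

```cpp
#include <future>
#include <memory>
#include <thread>

// A payload whose destructor reports which thread ran it.
struct tracked_t {
  std::promise<std::thread::id> done;
  ~tracked_t() { done.set_value(std::this_thread::get_id()); }
};

// Move the payload into an empty-body detached thread so its destructor
// runs off the calling thread; the returned future yields the id of the
// thread that actually performed the destruction.
std::future<std::thread::id> defer_destruction(std::unique_ptr<tracked_t> state) {
  auto fut = state->done.get_future();
  // The lambda body is intentionally empty: the captured unique_ptr is
  // destroyed when the lambda object is, on the detached thread.
  std::thread([s = std::move(state)]() {}).detach();
  return fut;
}
```

This also illustrates the maintenance cost the reviewers weigh below: once teardown is detached, nothing on the calling thread can observe when (or whether) it finished without extra synchronization like the future used here.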
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@cpp/src/dual_simplex/solve.cpp`:
- Around line 233-243: The phase-1 CONCURRENT_LIMIT branch (when phase1_status
== dual::status_t::CONCURRENT_LIMIT) currently returns without preserving the
iteration count; before spawning the detached thread and returning
lp_status_t::CONCURRENT_LIMIT, capture the current iter value and store it into
original_solution.iterations so callers see phase-1 progress (i.e., set
original_solution.iterations = iter), then proceed to move the large objects
into the lambda and detach as before; update references around
phase1_status/phase1_problem/phase1_solution to ensure original_solution is
modified prior to the std::thread creation and return.
---
Nitpick comments:
In `@cpp/src/dual_simplex/phase2.cpp`:
- Around line 2495-2518: The detached thread captures moved state to keep the
basis alive, but it captures superbasic_list (symbol superbasic_list) which is
never populated or used by dual_phase2_with_advanced_basis; remove sl =
std::move(superbasic_list) from the capture list and only capture the actually
used moved objects (bl = std::move(basic_list), nl = std::move(nonbasic_list), f
= std::move(ft)), preserving the empty lambda body and its current comment so
the detached thread still solely extends lifetimes.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: 5d603b5d-4833-4fe3-a2c8-bd5f5200b8ee
📒 Files selected for processing (2)
- cpp/src/dual_simplex/phase2.cpp
- cpp/src/dual_simplex/solve.cpp
chris-maes left a comment:
Thanks for putting up this PR @srib
The reduction here in latency is small (~5 seconds on one problem and ~3 seconds on another). However, introducing extra threads for destructors adds additional complexity that I'm worried will make the code difficult to debug and maintain. So I don't think it is worth it to include these threading changes.
The other changes, which add additional checks on the concurrent-halt pointer, are great.
Please remove the threading changes from the PR.
@chris-maes Done here: 8b896e6
chris-maes left a comment:
LGTM. One minor change would be good to make before merging.
/ok to test 7ce0feb
Description
These changes reduce concurrent-halt latency in the LP solve path by checking the halt flag at more points before long synchronous work and by moving expensive cleanup off the return path when we exit with `CONCURRENT_LIMIT`.

- Moved `basis_update_mpf_t` and `barrier_solver_t` from stack-owned temporaries to `std::unique_ptr` in the affected solve paths so their destruction can be deferred on `CONCURRENT_LIMIT`.
- On `CONCURRENT_LIMIT`, detached cleanup threads now take ownership of large temporary solver state so the main solve path can return sooner instead of blocking on teardown.

The intended behavior is unchanged aside from returning more quickly when a concurrent halt is requested, particularly in paths that previously spent significant time in setup or destruction before exiting.
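The two techniques the description lists can be sketched together. This is an illustrative reduction, not the PR's code (the `big_solver_t` name and `solve_step` shape are hypothetical, and the detached-thread portion was ultimately dropped from the PR per the review): the solver is heap-owned so its ownership can be moved into a detached cleanup thread on a halt, letting the solve path return immediately.

```cpp
#include <atomic>
#include <memory>
#include <thread>

enum class lp_status_t { OPTIMAL, CONCURRENT_LIMIT };

// Stand-in for a large solver object such as basis_update_mpf_t.
struct big_solver_t {
  // large factorizations, device buffers, ...
};

lp_status_t solve_step(const std::atomic<int>& concurrent_halt) {
  // Heap-owned (not a stack temporary) so ownership can be moved out.
  auto solver = std::make_unique<big_solver_t>();

  // ... long synchronous work, punctuated by halt checks ...
  if (concurrent_halt.load() != 0) {
    // Hand the expensive teardown to a detached thread; the solve path
    // returns immediately instead of blocking on the destructor.
    std::thread([s = std::move(solver)]() {}).detach();
    return lp_status_t::CONCURRENT_LIMIT;
  }
  return lp_status_t::OPTIMAL;
}
```

Had `solver` stayed a stack object, its destructor would necessarily run before `return`, which is exactly the teardown cost the PR set out to move off the critical path.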
(Description co-authored with Codex)
Results
@rg20 wanted me to note this here: with regards to moving the destructor to a separate thread, we think there is an underlying issue that we probably want to understand first, and fix any bug there. The changes moving the destructor to a detached thread have been removed (please see comments above). cc @chris-maes
Issue
Checklist