Many improvements, including the ability to compare two branches by name#217
Draft
tameware wants to merge 20 commits into
Draft
Many improvements, including the ability to compare two branches by name#217tameware wants to merge 20 commits into
tameware wants to merge 20 commits into
Conversation
Contributor
There was a problem hiding this comment.
Pull request overview
This PR bundles several incremental improvements across the solver and tooling, notably enhancing benchmark.sh so it can build and compare dtest binaries by git branch name (not just by path), alongside some cleanup/simplification in solver internals.
Changes:
- Extend
benchmark.shto support--branch NAME(once or twice) to builddtestfrom one or two git refs and compare results, with quieter default output and improved cleanup/restore behavior. - Simplify/correct rank/aggregate indexing by removing unnecessary casts in several hot-path areas.
- Adjust trump-void heuristic weighting logic for the “trumps led, second hand void” pitching case.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| library/src/quick_tricks.cpp | Removes redundant casts when indexing bit_map_rank from AbsRankType::rank. |
| library/src/heuristic_sorting/heuristic_sorting.cpp | Updates void-in-trump weighting (including a new lead_suit == trump pitch branch) and simplifies rel_rank indexing. |
| library/src/ab_search.cpp | Removes redundant casts when copying abs_rank winner/second-best into Pos. |
| benchmark.sh | Adds branch-name-based build/compare workflow, improves output behavior, and reports total elapsed time. |
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Combining --branch NAME with --compare PATH now builds NAME as the branch binary and compares it against PATH, ignoring the current branch dtest. Co-authored-by: Cursor <cursoragent@cursor.com>
A second --repeats silently overwrote the first; error out instead. Co-authored-by: Cursor <cursoragent@cursor.com>
The summary was only printed in --compare mode, so a single binary with --repeats > 1 cleared its transient per-run rows and showed nothing. Add a single-binary summary (avg user ms per solver/file plus total elapsed) for the non-compare case so a summary is always shown. Also show the per-run detail table only with --details. Without it the rows are transient progress on a tty and suppressed otherwise. Co-authored-by: Cursor <cursoragent@cursor.com>
The detail rows were shown as transient progress on a tty and then erased with cursor-up escapes. With many repeats the rows scroll off-screen, so the erase cannot reach them and they persist. Show per-run rows only with --details and drop the transient progress table; the summary is always shown. Co-authored-by: Cursor <cursoragent@cursor.com>
git checkout + bazel output is build noise. Capture it to a log and surface it only on failure (or with --details); the short "Building..." labels remain as progress markers. ANSI save/restore (DECSC/DECRC) cannot erase this output reliably: bazel drives its own cursor save/restore for its progress display, clobbering the saved position, so a later restore+clear erases nothing. Capturing to a log is robust and works whether or not stderr is a tty. Co-authored-by: Cursor <cursoragent@cursor.com>
Resolve a "." branch argument to the current branch (or HEAD when detached) before ref validation, so it can be compared without naming it explicitly. Co-authored-by: Cursor <cursoragent@cursor.com>
Display the script's per-run timing rows as progress while the benchmark runs. On a tty without --details they are shown on the alternate screen and discarded when we switch back to the main screen just before the summary, so they never clutter the final output regardless of how much scrolls. With --details they stay on the main screen; off a tty they are suppressed (summary only). dtest's own output is captured, not shown. Replaces the old transient per-run rows, whose ANSI line-clearing could not erase content that had scrolled off-screen. Co-authored-by: Cursor <cursoragent@cursor.com>
The cleanliness check for --branch used --untracked-files=no, so
untracked files passed the check but could still block git checkout
("would be overwritten"). Include untracked files so the check matches
the documented "clean tree" requirement.
Co-authored-by: Cursor <cursoragent@cursor.com>
Hard-coding /usr/bin/time reduces portability (some environments only provide time as a shell keyword or at a different path). Using command time -p keeps the portable format while avoiding an absolute path dependency. Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Comment on lines
+5
to
+9
| # (list100/1000/…/1), largest files first. Always prints a summary. Per-run | ||
| # timing rows and build (git/bazel) output are shown only with --details; | ||
| # otherwise build output is captured to a log (surfaced only on failure) and | ||
| # per-run rows are suppressed. Does not pass dtest options unless given after | ||
| # "--" (see below). |
| # | ||
| local out | ||
| if ! out="$("${cmd[@]}" 2>&1)"; then | ||
| if ! out="$(command time -p "${cmd[@]}" 2>&1)"; then |
Comment on lines
+578
to
+581
| if [[ "$ALT_SCREEN" == "1" ]]; then | ||
| printf '\033[?1049h\033[H\033[2J' >/dev/tty # enter alt screen, home, clear | ||
| ALT_SCREEN_ACTIVE=1 | ||
| fi |
Comment on lines
+627
to
+630
| if [[ "$ALT_SCREEN" == "1" ]]; then | ||
| printf '\033[?1049l' >/dev/tty | ||
| ALT_SCREEN_ACTIVE=0 | ||
| fi |
The --details per-run table showed the generic "branch"/"compare" labels in the ver column. Use the backing branch name when known (current git branch for the branch binary, --branch name for the compare binary), falling back to the generic labels. Widen the ver column to fit. Co-authored-by: Cursor <cursoragent@cursor.com>
Use the working directory the script is invoked from instead of the script location, so it targets whatever DDS checkout you run it in. Add an early guard that fails fast when the current directory is not a DDS checkout root (no MODULE.bazel / library/tests/BUILD.bazel). Co-authored-by: Cursor <cursoragent@cursor.com>
Both flags are now repeatable and can be combined; binaries are benchmarked in specification order with the first as baseline. With one spec the current checkout is still included as the baseline; with two or more explicit specs the checkout is not auto-included. The summary shows one average column per binary, adding a ratio and faster/slower note only for a two-binary comparison; with three or more binaries the note (and ratio) are dropped. Co-authored-by: Cursor <cursoragent@cursor.com>
Previously a lone --branch or --compare was compared against the current checkout. Now the checkout is included only when neither flag is specified; one or more specs benchmark exactly those binaries (first is baseline). --reverse now requires at least two binaries. Co-authored-by: Cursor <cursoragent@cursor.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
No description provided.