[DO NOT REVIEW] yosys 0.65 + bazel-orfs bump + sweep/yosys-idempotency README — investigation record#4231
[DO NOT REVIEW] yosys 0.65 + bazel-orfs bump + sweep/yosys-idempotency README — investigation record#4231oharboe wants to merge 10 commits into
Conversation
There was a problem hiding this comment.
Code Review
This pull request updates the bazel-orfs dependency to a newer commit, removes a local patch that has likely been upstreamed, and forces the yosys version to 0.65 to address memory corruption issues. Additionally, it expands the flow/README.md with detailed instructions for running full test sweeps and triaging test failures. A review comment identifies a security and stability concern regarding the use of a personal GitHub fork as a Bazel registry in .bazelrc, suggesting the use of git_override or archive_override as a more robust alternative.
| common --registry=https://raw.githubusercontent.com/oharboe/bazel-central-registry/yosys-0.65/ | ||
| common --registry=https://bcr.bazel.build/ |
There was a problem hiding this comment.
Using a raw GitHub URL from a personal fork as a Bazel registry is insecure and fragile. It bypasses the official BCR's trust model and makes the build dependent on the availability and state of a specific branch in a third-party repository. For testing a pending BCR update, a more robust and secure approach is to use git_override or archive_override (with an integrity hash) for the yosys module in MODULE.bazel. This keeps the override localized to the project and avoids the need for global registry modifications.
flow/README.md "Triaging a failing _test" → "Yosys-environment false
positive" already calls out that bazel-built yosys and make-built
yosys can produce different 1_2_yosys.v for the same RTL, drifting
QoR past rules-base.json thresholds even though OpenROAD itself is
bit-deterministic. Today the only way to spot the drift is to
re-run `@bazel-orfs//:make-yosys-netlist`; if a designer has a
freshly-failing _test in front of them, they have no way to see
"the yosys netlist changed" vs "a real regression".
Persist a fingerprint in rules-base.json instead so the next _test
just prints it:
- genMetrics.py: emit `synth__canonical_netlist__hash` (SHA-1 of
1_1_yosys_canonicalize.rtlil, the canonical RTLIL pre-ABC) and
`synth__netlist__hash` (SHA-1 of 1_2_yosys.v, post-ABC). Having
both lets the user see whether divergence is in the front-end
flow or downstream of ABC.
- genRuleFile.py: new `literal` mode passes the metric value
through verbatim (no padding / rounding / float coercion). The
two hash fields use it with `level: warning` + `compare: ==`,
so `_update` writes them into rules-base.json and `_test`
treats a hash mismatch as a diagnostic, not a failure.
- checkMetadata.py: a warning-level mismatch previously printed
"[WARN] field pass test: a == b" — confusing when a != b yet
"pass" implied a match. Say "differs" instead so the hash
mismatch reads naturally without changing the no-error contract
(still doesn't increment ERRORS).
rules-base.json files are unchanged in this commit; the next
`_update` per design adds the two hash fields automatically, and
existing _test runs are unaffected until that happens.
Signed-off-by: Øyvind Harboe <oyvind.harboe@zylin.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Øyvind Harboe <oyvind.harboe@zylin.com>
genElapsedTime.py used to pick exactly one of .odb / .rtlil / .v as "the result file" for a log and hash that. Synth produces both 1_2_yosys.v and 1_2_yosys.sdc; OpenROAD stages produce both an .odb and a .sdc. Folding all of them into one column means a divergent .sdc (an SDC-generator change) and a divergent .odb (a real flow change) look identical in the elapsed-time triage table. Replace `get_hash` with `get_hashes` that returns every result-file extension that exists, and emit one row per (stage, extension) with the elapsed time and peak memory on the stage's first row only. Column order is `.v / .rtlil / .odb / .sdc` so the primary data artifact comes first and the constraint file last. Added a regression test (`test_emits_one_row_per_result_extension`) to lock the two-row-per-stage behaviour in; existing tests cover the no-result-file fallback (single row with ext="" and hash=N/A) since the table now has an Ext column. Signed-off-by: Øyvind Harboe <oyvind.harboe@zylin.com> Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Signed-off-by: Øyvind Harboe <oyvind.harboe@zylin.com>
Restore the sweep recipe (per-platform --keep_going, big designs solo, small designs batched) and the triage decision tree for failing _tests. The yosys-environment false-positive check now points at @bazel-orfs//:make-yosys-netlist (and the reverse @bazel-orfs//:yosys-check), which moved upstream in 432edcd63; the full workflow lives in @bazel-orfs//:TESTING.md "Debugging OpenROAD determinism (bazel vs make)". Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Signed-off-by: Øyvind Harboe <oyvind.harboe@zylin.com>
Yosys 0.63 has a WRAPCELL/RTLIL-identifier memory corruption that
produces garbage characters in module names like
`'\\<random>_KOGGE_STONE'` during synthesis of designs that exercise
the $alu architectural-options path (cva6, ibex, riscv32i*,
swerv_wrapper, sky130hd/{ibex,microwatt} all hit this in the
asap7+sky130hd sweep). Upstream yosys fixed it; 0.65 produces clean
names like `ALU_64_0_64_0_64_KOGGE_STONE`.
bazelbuild/bazel-central-registry#8863 adds 0.65 to BCR. Until that
lands we point at oharboe's fork branch via an additional --registry
in .bazelrc, then force the root resolution to 0.65 via
single_version_override (bazel-orfs's MODULE.bazel pins to 0.63).
Verified: //flow/designs/asap7/cva6:cva6_synth now builds clean.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Øyvind Harboe <oyvind.harboe@zylin.com>
Upstream main now has commit c000cb3 "orfs_design/orfs_flow: add user_sources= for design-private path hooks" — the same change we were carrying as 0001-orfs_design-add-user_sources.patch. Drop the patch file and remove the patches= reference in MODULE.bazel. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Signed-off-by: Øyvind Harboe <oyvind.harboe@zylin.com>
Run `make DESIGN_CONFIG=designs/asap7/uart/config.mk metadata update_rules` to baseline `rules-base.json` against the make-built yosys netlist. The diff captures three groups of changes: * `synth__canonical_netlist__hash` and `synth__netlist__hash` are new warning-level rules introduced by a5839b8. This commit materializes them for the first time on a real design (make-yosys SHA-1 over `1_1_yosys_canonicalize.rtlil` and `1_2_yosys.v`). * `finish__timing__setup__ws` and `finish__timing__setup__tns` get tightened by `--tighten`: make-flow's slack is meaningfully better than the previous baseline; this records the new headroom. The point of the snapshot is to give a `bazelisk test` re-run a make-yosys reference against which the bazel-yosys netlist hash mismatches, so `checkMetadata.py` surfaces the warning-level `[WARN] synth__netlist__hash differs ...` diagnostic that the next README change documents. Signed-off-by: Øyvind Harboe <oyvind.harboe@zylin.com>
The previous bucket-#2 text walked the reader through the symptom ("bazel-built yosys and make-built yosys produce different `1_2_yosys.v`") before pointing at the diagnostic. Now that `rules-base.json` carries warning-level `synth__canonical_netlist__hash` and `synth__netlist__hash` entries (a5839b8), the diagnostic IS the symptom: `checkMetadata.py` emits [WARN] synth__canonical_netlist__hash differs test: <bazel> == <make> [WARN] synth__netlist__hash differs test: <bazel> == <make> right above the failing QoR row. Lead with that and keep `@bazel-orfs//:make-yosys-netlist` (+ the reverse-direction `@bazel-orfs//:yosys-check`) as the per-stage `.odb` SHA-matrix proof when the warning alone isn't enough. Drop the `SYNTH_NETLIST_FILES` and `BLOCKS=` parenthetical — that level of detail belongs in `@bazel-orfs//:TESTING.md`, which the link already points at. Dress-rehearsed on `//flow/designs/asap7/uart:uart_test`: after bumping `rules-base.json` from the classic make flow (ca657c9) the bazel `_test` fires both `[WARN] …__hash differs` lines exactly as described, alongside the pre-existing `detailedroute__route__wirelength` failure. Signed-off-by: Øyvind Harboe <oyvind.harboe@zylin.com>
`synth__canonical_netlist__hash` (1_1_yosys_canonicalize.rtlil,
post-front-end) and `synth__netlist__hash` (1_2_yosys.v, post-ABC)
already exist as warning-level entries in `rules-base.json`. On
asap7/uart both already DIFFER between bazel-pinned yosys and
make-pulled `tools/install` yosys (`cff062a8…` vs `6e8e16e4…`, and
`ab8b0b69…` vs `95652d90…` respectively), so the post-ABC hash on
its own can't tell us whether the drift is yosys-internal
non-determinism (read_verilog / hierarchy / opt_clean / techmap /
dfflibmap) or an ABC version mismatch.
Add a third fingerprint immediately before the main `abc` call in
synth.tcl:
[WARN] synth__canonical_netlist__hash differs ... # front-end snapshot
[WARN] synth__preabc_netlist__hash differs ... # mid-level synth
[WARN] synth__netlist__hash differs ... # post-ABC
The pre-ABC snapshot is computed by `write_rtlil` to a temp path,
`exec sha1sum`, file delete, then `puts` of the hash to the yosys
log. The full RTLIL is never shipped as a bazel artifact — only
the 40-char SHA in `1_2_yosys.log`, which is already collected by
the synth action. `extractTagFromFile` in `genMetrics.py` regex-
greps the line; the corresponding `level: warning` rule entry lives
alongside the other two in `genRuleFile.py`.
`flow/designs/asap7/uart/rules-base.json` is regenerated via
`make … metadata update_rules` so the third field is populated from
make-yosys. Re-running `bazelisk test //flow/designs/asap7/uart:
uart_test` then surfaces three warnings: `71b3e9b0…` vs `4ab7b670…`
for the new pre-ABC bucket, confirming drift is already present
mid-yosys (i.e., not only ABC).
Two supporting plumbing changes ride along:
* `MODULE.bazel` gains `patches =` on the bazel-orfs `git_override`,
pointing at the new `patches/bazel-orfs/` directory. The first
vendored patch teaches `orfs_generate_metadata` to pull synth's
outputs in via `data =` — without it the post-ABC
`synth__netlist__hash` lands as `N/A` because `1_2_yosys.v`
isn't otherwise in the metadata action's sandbox. Vendoring
here keeps small bazel-orfs fixes in this PR while we iterate;
when the patch lands upstream, drop the entry and bump
`BAZEL_ORFS_COMMIT`.
* `flow/README.md` "Triaging a failing `_test`" bucket #2 is
rewritten to document the three-way bisection.
Signed-off-by: Øyvind Harboe <oyvind.harboe@zylin.com>
…ersion compare
The previous three-way `synth__*_netlist__hash` diagnostic on
asap7/uart fired all three warnings, but the canonical-RTLIL
header diff turned out to be dominated by a yosys version skew:
`tools/install/yosys` reported `Yosys 0.63`, the bazel-pinned
yosys reported `Yosys 0.65` (BCR PR registry override), and the
`tools/yosys` submodule was at the v0.64 release commit — three
versions across three places. Without surfacing the versions in
`rules-base.json`, the version skew is invisible to
`checkMetadata.py` and easy to misread as non-determinism.
Two changes:
* Add `synth__yosys__version` and `synth__abc__version` literal
warning-level rule entries.
- yosys version: `extractTagFromFile` on `1_2_yosys.log` line 1
(the yosys banner), regex `^Yosys (\S+)`.
- abc version: `synth_preamble.tcl` runs
`yosys-abc -c "version; quit"` via `exec` using the
`YOSYS_EXE` env var that `flow/scripts/variables.mk` exports,
extracts the "UC Berkeley, ABC <ver>" line and `puts` it to
the same log; `genMetrics.py` extracts via
`extractTagFromFile`.
Now `[WARN] synth__yosys__version differs test: 0.65 == 0.63`
appears directly above the netlist-hash warnings whenever the
underlying binaries diverge.
* Bump `tools/yosys` submodule from `8449dd470` (v0.64 release)
to `b85cad634` (v0.65 release, fetched from
github.com/YosysHQ/yosys.git since The-OpenROAD-Project fork
hasn't synced 0.65 yet).
`flow/designs/asap7/uart/rules-base.json` is regenerated from
`make … metadata update_rules` against the fresh
`tools/install/yosys` 0.65 build, so the baseline hashes belong
to make-yosys-0.65 rather than the previous 0.63 build. Several
timing thresholds tighten by ~30% (`cts__timing__setup__tns`
-1190 → -826, `finish__timing__setup__ws` -46.9 → -35.8 etc.) —
that's yosys 0.65's better QoR over 0.63, not a regression.
Same-version verification: `tools/install/yosys/bin/yosys -V`
and `bazel-out/.../external/yosys+/yosys -V` both report
`Yosys 0.65`; `synth__yosys__version` and `synth__abc__version`
both pass on bazel `_test`. But the three netlist hashes
*still* differ — `cff062a8…` vs `51e1c96c…` on canonical, etc.
A line-level diff on the canonicalize RTLIL (`autoidx 5145` is
identical on both sides) shows a systematic `+3` offset on the
`$<N>y` wire-counter names, indicating the bazel build performs
~3 extra wire-create operations earlier in the front-end that
the make build does not. This is real bazel-vs-make build-
environment non-determinism in yosys 0.65, distinct from the
version skew that caused the earlier confusion. Next investigation
should narrow which canonicalize step
(`read_design_sources` / `hierarchy -check` / `opt_clean -purge`)
contributes the +3 wires.
Signed-off-by: Øyvind Harboe <oyvind.harboe@zylin.com>
The earlier three-way hash (`canonical`, `preabc`, `netlist`) showed
all three differing between bazel-yosys and make-yosys even at
matching versions (Yosys 0.65, ABC 1.01), but couldn't pinpoint
which yosys pass introduces the drift. This commit adds three
more hash points to bisect:
post_read_sources : after `read_design_sources` (verilog/slang frontend only)
post_hierarchy : after `hierarchy -check -top` (elaboration order)
canonical_netlist : after `opt_clean -purge` (existing)
post_synth_main : after `synth -run fine: -noabc` (main synth pipeline)
preabc_netlist : immediately before `abc` (existing)
netlist : post-ABC `1_2_yosys.v` (existing)
Mechanism: `synth_preamble.tcl` defines a `write_state_hash` tcl
proc that `setattr -unset src */mod` (so paths in src attributes
don't make the hash bazel-vs-make path-dependent), `write_rtlil`
to a temp, sha1sum, deletes the temp, and `puts` a
`<metric>: <sha>` line to the yosys log. `genMetrics.py` greps
the lines via `extractTagFromFile`. No new bazel artifacts ship;
the existing `1_1_yosys_canonicalize.log` / `1_2_yosys.log` carry
the SHAs.
The `setattr -unset src` strip is now unconditional (the
`SYNTH_REPEATABLE_BUILD`-gated block in `synth_canonicalize.tcl`
becomes redundant since `write_state_hash` strips on first call;
left in place as harmless idempotent backup). Src attributes are
debug metadata that no downstream pass relies on for decisions.
Verification on asap7/uart with same-version yosys/ABC on both
sides:
[INFO] synth__yosys__version pass test: 0.65 == 0.65
[INFO] synth__abc__version pass test: 1.01 == 1.01
[WARN] synth__post_read_sources__hash differs test: 39c975e8... == fc8ed9cc...
[WARN] synth__post_hierarchy__hash differs test: 39c975e8... == fc8ed9cc...
[WARN] synth__canonical_netlist__hash differs test: c1269aa7... == 579a132d...
[WARN] synth__post_synth_main__hash differs test: 0cb19660... == 022a854d...
[WARN] synth__preabc_netlist__hash differs test: 49c7bd07... == 6a40bd28...
[WARN] synth__netlist__hash differs test: 5ed342e4... == c3163e94...
Two findings fall out:
1. Non-determinism is born at `post_read_sources`. `hierarchy
-check` is a no-op for uart (slang has already elaborated the
top); `post_hierarchy` matches `post_read_sources` exactly on
both sides. Every subsequent pass (opt_clean, synth pipeline,
techmap+dfflibmap, ABC) faithfully propagates the entering
divergence but does not introduce new divergence between bazel
and make.
2. The slang frontend call is therefore the proximate site of the
drift. Mechanism is consistent with yosys' pointer-keyed
`IdString`/`dict` iteration: different memory layouts in the
bazel-built and make-built yosys binaries assign different
counter values to anonymous wires (visible as systematic `+3`
offsets in `$<N>y` wire names in the canonical RTLIL diff).
Next investigation: which yosys API call in `read_slang`'s
implementation allocates IdStrings/dicts/pools in pointer order?
That's the patch site.
Signed-off-by: Øyvind Harboe <oyvind.harboe@zylin.com>
Summary
Record of a sweep-and-triage session across asap7 + sky130hd to validate the
@bazel-orfs//:make-yosys-netlistworkflow and surface upstream-fixablebugs. Stops short of a clean classification because of a hierarchical-PDN
gap in bazel-orfs for
BLOCKS=designs (see "Open" below) — we hold furtherQoR/idempotency work until ABC catches up to yosys 0.65 upstream.
Commits
flow/README.md: document host-aware sweep + yosys-idempotency triage— restores the sweep recipe and the failing-
_testtriage decision treethat previously lived on an abandoned branch. References
@bazel-orfs//:yosys-check/@bazel-orfs//:make-yosys-netlistanddefers to
@bazel-orfs//:TESTING.mdfor the full classification matrix.bazel: bump yosys to 0.65 via BCR PR registry override— points.bazelrcat oharboe's fork branch forbazelbuild/bazel-central-registry#8863
(adds yosys 0.65 to BCR) and forces yosys resolution to 0.65 via
single_version_overrideso it overrides bazel-orfs's 0.63 pin.Fixes the WRAPCELL/RTLIL-identifier memory corruption on 0.63 that
produced garbage like
'\<random>_KOGGE_STONE'in cva6/ibex/riscv32isynthesis on the asap7 platform.
bazel: bump bazel-orfs to c000cb3 (user_sources= is upstream)—user_sources=is now in bazel-orfs main; drop the local patch.Verified during the session
cva6_synthbuilds clean(
ALU_64_0_64_0_64_KOGGE_STONE, etc. — no control characters).Open
[ERROR PDN-0233] Failed to generate full power gridvia bazel-orfs even with
abstract_stage=\"final\"for sub-blocks (testedand reverted in this session). Root: bazel's sub-block abstract has
OBS LAYER M5 ; RECT 0 0 9.144 9.144 ;covering the full macro, whilemake's leaves M5 channels open. Compounded by sub-block size delta
(9.144 vs 9.235 µm) driven by yosys-version cell-count differences.
This is a real bazel-orfs bug for hierarchical (
BLOCKS=) designs;diagnosis on hand, fix pending.
make-yosys-netlistphase-3 re-synth are queued at
The-OpenROAD-Project/bazel-orfs#734.
Phase-3 still doesn't complete end-to-end (SDC source files aren't in
orfs_final's runfiles) — needs a follow-up to bundle synth sourceinputs into downstream runfiles.
is the reason this PR doesn't try to classify the 7 QoR-failing designs
via
make-yosys-netlist— wait for ABC sync upstream before re-running.Resuming
Once ABC catches up upstream and the BLOCKS PDN gap is fixed:
the BLOCKS fix.
flow/README.mdrecipe) and the classification(
make-yosys-netlistper failing_test).Test plan
bazelisk test //flow/designs/asap7/cva6:cva6_synthbuilds cleanwith yosys 0.65 (was the canonical WRAPCELL-garbage repro on 0.63).
cd flow && make finish DESIGN_CONFIG=designs/asap7/aes-block/config.mkproduces full 6_final.{def,gds,odb,sdc,spef,v} chain — confirms the
bazel-orfs PDN failure is build-infrastructure, not design.
🤖 Generated with Claude Code