feat(genesis): canonicalize hash so per-node connection_url doesn't diverge#855
Conversation
…iverge Problem ======= Two peers running the same consensus configuration but with their own identity listed as `localhost` in `data/genesis.json validators[]` produced different `genesisDataHash` values. `peerBootstrap` rejected the pairing as a hash mismatch even though no consensus-significant field differed. `connection_url` is network topology, not consensus state. The active validator set, stakes, statuses, and activation windows are consensus- significant; how a peer is reached on the wire is not. Fix === New module `src/libs/blockchain/genesis/normalizeGenesisForHash.ts` exposing `hashGenesisData(...)`. It: 1. Deep-clones the input (no mutation of caller's object). 2. Strips `connection_url` from every `validators[]` entry. 3. Sorts validators by `address` (insertion-order independent). 4. Stably stringifies with lex-sorted keys at every depth. 5. SHA-256s the result. Two genesis files that differ ONLY in `validators[*].connection_url` (or in the array order, or in object-key authoring order) now produce the same hash. Call sites converted: - `src/libs/peer/routines/peerBootstrap.ts` (baseline + post-fetch re-hash) - `src/libs/network/handlers/blockHandlers.ts:getGenesisDataHash` (RPC response to peers) Note: `connection_url` is still kept as an authoring/seeding aid (operators may pre-populate per-node URLs) and is still readable from the DB — `seedValidators` writes whatever the file says. The canonical form is only used for cross-node hashing. Tests ===== `testing/genesis/normalizeGenesisForHash.test.ts` — 16 cases: - connection_url stripped - validators sorted by address regardless of input order - input object never mutated - missing/non-array/null/primitive input tolerated - non-validators top-level fields preserved verbatim - stableStringify lex-sort at every depth + array element-wise - two divergent connection_url sets produce same hash - consensus-significant changes (stake, currency) DO change hash - determinism across repeated calls - output is 64 lowercase hex chars Full suite: 120 pass / 9 skip / 0 fail (was 104, +16 new tests).
Qodo reviews are paused for this user.Troubleshooting steps vary by plan Learn more → On a Teams plan? Using GitHub Enterprise Server, GitLab Self-Managed, or Bitbucket Data Center? |
|
Caution Review failedPull request was closed or merged during review WalkthroughThis PR introduces deterministic genesis-data hashing by extracting consensus-relevant fields into a canonical form, applying stable JSON serialization with lexicographically sorted keys, then computing SHA-256 hashes. The new utilities are integrated into existing block handler and peer bootstrap code paths and are covered by comprehensive tests. ChangesDeterministic Genesis Hashing and Integration
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related issues
Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Greptile SummaryIntroduces a canonical genesis-data normalizer (
Confidence Score: 3/5The genesis normalization logic is correct and well-tested, but the hash scheme change is a hard cut-over that breaks peering between nodes on different versions of this code. The new hashGenesisData function correctly solves the localhost divergence problem and is well-tested. However, the switch from raw JSON.stringify+SHA-256 to the normalized hash is not backward-compatible: any node still running the old binary will compute a different digest and be rejected by nodes running the new code. A partial rollout or a single lagging node would reproduce the exact peering failure the PR is fixing, just for a different reason. src/libs/peer/routines/peerBootstrap.ts — contains the deployment-incompatible hash cut-over and the unnecessary file re-read in the auto-heal path. Important Files Changed
Sequence DiagramsequenceDiagram
participant N2 as Node2 (new code)
participant N3 as Node3 (new code)
Note over N2: peerBootstrap()<br/>hashGenesisData(genesisData)<br/>strips connection_url<br/>sorts validators<br/>stableStringify + SHA-256
N2->>N3: RPC getGenesisDataHash
N3-->>N2: hashGenesisData(genesisData) normalized
Note over N2,N3: Hashes match even though<br/>each node has localhost in its own entry
N2->>N3: sayHelloToPeer peers paired
Note over N2: Auto-heal path if hashes differ<br/>downloads /genesis from N3<br/>hashGenesisData(res.data) compare
Reviews (1): Last reviewed commit: "feat(genesis): canonicalize hash so per-..." | Re-trigger Greptile |
| const genesisFile = "data/genesis.json" | ||
| const genesisData = JSON.parse(fs.readFileSync(genesisFile, "utf8")) | ||
| ourGenesisDataHash = Hashing.sha256(JSON.stringify(genesisData)) | ||
| ourGenesisDataHash = hashGenesisData(genesisData) |
There was a problem hiding this comment.
Breaking change for mixed-version deployments
Nodes running the old code compute ourGenesisDataHash as Hashing.sha256(JSON.stringify(genesisData)). Nodes running this PR compute it via hashGenesisData(genesisData). The two schemes produce different hex digests for the same genesis file, so any peer running the old binary will fail the hash equality check at line 62 against any peer running the new binary. During a rolling upgrade, every node that hasn't been updated yet will be rejected with "Genesis data hash does not match" — reintroducing the exact peering failure this PR is meant to fix. All nodes in the cluster must be stopped and updated together; a partial rollout leaves the network non-functional.
| const ourNewGenesisDataHash = hashGenesisData( | ||
| JSON.parse(fs.readFileSync(genesisFile, "utf8")), | ||
| ) |
There was a problem hiding this comment.
Unnecessary disk round-trip in auto-heal: the file is written on line 81 and immediately re-read on line 86 purely to compute a hash. If a write error or a concurrent file access occurs between the two syscalls, the hash will reflect a stale or corrupt file. Passing
res.data (which is already the parsed object) directly to hashGenesisData removes the extra I/O and the TOCTOU window.
| const ourNewGenesisDataHash = hashGenesisData( | |
| JSON.parse(fs.readFileSync(genesisFile, "utf8")), | |
| ) | |
| const ourNewGenesisDataHash = hashGenesisData(res.data) |
| export function stableStringify(value: unknown): string { | ||
| if (value === null || typeof value !== "object") { | ||
| return JSON.stringify(value) | ||
| } | ||
|
|
||
| if (Array.isArray(value)) { | ||
| const parts = value.map(v => stableStringify(v)) | ||
| return "[" + parts.join(",") + "]" | ||
| } | ||
|
|
||
| const obj = value as Record<string, unknown> | ||
| const keys = Object.keys(obj).sort() | ||
| const parts = keys.map(k => JSON.stringify(k) + ":" + stableStringify(obj[k])) | ||
| return "{" + parts.join(",") + "}" |
There was a problem hiding this comment.
stableStringify drops undefined-valued keys non-deterministically
JSON.stringify(undefined) returns the JS value undefined (not the string "undefined"). When an object key holds undefined, the parts.map line produces "\"key\":undefined" (invalid JSON). Standard JSON.stringify silently omits those keys. Genesis data from JSON.parse will never carry undefined, so this is dormant today, but any TypeScript caller passing an object with optional fields set to undefined will receive a non-parseable digest. Adding if (obj[k] === undefined) continue before pushing to parts aligns behaviour with JSON.stringify.
Problem
Two peers running the same consensus configuration but with their own identity listed as
localhostindata/genesis.json validators[]produced differentgenesisDataHashvalues.peerBootstraprejected the pairing as a hash mismatch even though no consensus-significant field differed.Empirically reproduced on devnet between node2 and node3:
http://localhost:53552, node3's URL ashttp://dev.node3.demos.sh:53550http://localhost:53552, node2's URL ashttp://dev.node2.demos.sh:53552connection_urldiffers.connection_urlis network topology metadata, not consensus state. The active validator set, stakes, statuses, and activation windows ARE consensus-significant; how a peer is reached on the wire is not.Fix
New module
src/libs/blockchain/genesis/normalizeGenesisForHash.tsexposinghashGenesisData(...). It:connection_urlfrom everyvalidators[]entry.address(insertion-order independent).Two genesis files that differ ONLY in
validators[*].connection_url, validator array order, or object-key authoring order now produce the same hash.Call sites converted
src/libs/peer/routines/peerBootstrap.ts— baselineourGenesisDataHash+ post-fetch re-hash after auto-heal.src/libs/network/handlers/blockHandlers.ts:getGenesisDataHash— RPC response to peers.What is NOT changed
connection_urlis still kept in the JSON file as an authoring/seeding aid; operators may still pre-populate per-node URLs.seedValidatorscontinues to writeconnection_urlinto the DB verbatim.Tests
testing/genesis/normalizeGenesisForHash.test.ts— 16 cases covering: stripping, sorting, deep-clone safety, missing/non-array/null/primitive tolerance, top-level preservation, lex-sort at every depth, hash equivalence across divergent URLs, consensus-significant changes still detected, determinism, output shape.Full suite: 120 pass / 9 skip / 0 fail (was 104, +16 new).
Why this approach over removing connection_url from
data/genesis.jsonentirelyAlternative considered: set all
connection_urltonullin every node's genesis file. Equally fixes the hash divergence with zero code. Drawback: operators lose the convenient genesis-driven seeding ofvalidators.connection_urlin the DB (currently the only DB-write path for that field at boot before any stake tx lands). Pushes URL maintenance entirely intodemos_peerlist.json, which is a separate file with separate sync ergonomics.The normalizer approach keeps the convenience for operators (URLs in genesis are still useful for
validatorstable seeding + later RPC introspection via/peerlist) while making consensus hashing robust to the per-node-localhost reality.Test plan
bun test testing/genesis/reports 16 pass.validators[*].connection_urldiffering only in localhost-vs-public-host produce identical hashes fromhashGenesisData.validators[*].staked_amount,address, orstatusDOES change the hash.bun test testing/state-snapshot/ testing/forks/restore/ testing/consensus/ testing/genesis/= 120 pass / 9 skip / 0 fail.data/genesis.json validators[], peerBootstrap pairs cleanly, noGenesis data hash does not matchERROR in logs.Summary by CodeRabbit
Release Notes
New Features
Improvements
Tests