Implement full UUID instance/node IDs (#129)#238
Conversation
pinodeca
left a comment
There was a problem hiding this comment.
We have not yet released v0.2.3, so my primary request is to reverse the version bump in Cargo.toml and to fold the 0.2.3->0.2.4 upgrade script into 0.2.2->0.2.3
Opus 4.7 also suggested these changes, which we can discuss individually and potentially address as follow-ups:
Minor suggestions (non-blocking)
-
explain.rs UUID detection is too permissive. explain.rs uses
uuid::Uuid::parse_str(trimmed).is_ok(), which accepts simple (32-char no hyphens), hyphenated, braced ({...}), and URN (urn:uuid:...) forms. Only canonical hyphenated UUIDs ever appear indf.instances.id, so non-canonical forms will silently fall through to "not found". Tightening to require length 36 +parse_strwould makedf.explain('garbage32hexcharsthatlookslikeauuid')behave as a DSL-parse error rather than an instance lookup miss. Trivial impact. -
Schema snapshot doesn't compare CHECK bodies. Per the PR's own constraint-drift note, test-upgrade.sh doesn't pin
pg_get_constraintdef()/convalidated. The new id-format regex is duplicated byte-for-byte in lib.rs and sql/pg_durable--0.2.3--0.2.4.sql; a future drift in one but not the other would only be caught by pgspot + the functional B1/B2 path. Hardening the snapshot to includepg_get_constraintdef()is filed implicitly in the docs — worth a follow-up issue. -
No CHANGELOG.md entry. CHANGELOG.md exists at repo root but isn't in the diff. The upgrade-testing.md "v0.2.3 → v0.2.4" section is great, but a CHANGELOG line for the version bump is conventional.
-
full_uuid()could just beUuid::new_v4().to_string()inline. It's a one-liner used in exactly one place (new_id). The doc comment is valuable, but the wrapper is borderline. Optional. -
Existing-row revalidation. The upgrade re-adds all six format CHECKs as
NOT VALIDto skip the table scan, which is appropriate for a maintenance-window upgrade. Worth a brief operator-doc line that anyone wanting full constraint validation later can runALTER TABLE … VALIDATE CONSTRAINT …(rows already conform since 8-char IDs match the relaxed regex).
Suggested follow-ups (separate PRs)
- Tighten explain.rs UUID detection to canonical form only.
- Extend test-upgrade.sh schema snapshot to include
pg_get_constraintdef()andconvalidated. - Add a CHANGELOG.md entry
cc4f375 to
74ad427
Compare
|
Thanks for the review, @pinodeca! Addressed the blocking item plus the cheap niceties. Blocking — version fold
Minor suggestions (numbered as in your review)
Re-ran the full CONTRIBUTING dev workflow locally: |
Replace the collision-prone 8-character short IDs (32 bits of entropy, ~50% PK-collision probability near 65k instances) with full canonical UUIDs (122 bits) for df.instances and df.nodes IDs. - types.rs: add full_uuid(); retain short_id() for the not-yet-upgraded pre-0.2.3 schema - dsl.rs: gate ID format on full_uuid_ids_enabled() (extversion >= 0.2.3) and thread the choice through start()/insert_nodes() so node IDs and node references also become UUIDs - lib.rs: widen the six id columns to TEXT and relax the *_format_chk regexes to accept a legacy 8-char id or a canonical UUID; add a node-id format test - explain.rs: classify both legacy 8-char and canonical 36-char UUID instance IDs - load_function_graph.rs: cast the id columns to text in the BGW's cached SELECTs so an ALTER EXTENSION UPDATE that widens them to TEXT under a live pooled connection cannot trip "cached plan must not change result type" - fold the UUID-widening DDL (drop FKs/checks, ALTER COLUMN TYPE text, re-add checks/FKs NOT VALID) into the existing, still-unreleased sql/pg_durable--0.2.2--0.2.3.sql upgrade script instead of minting a new version; no Cargo version bump - docs: USER_GUIDE, ARCHITECTURE, upgrade-testing (incl. an operator note on running VALIDATE CONSTRAINT later) and a CHANGELOG entry Backward compatible: the 0.2.3 .so keeps emitting 8-char IDs against the not-yet-upgraded 0.2.2 schema until ALTER EXTENSION UPDATE runs (B1); existing 8-char rows stay valid and coexist with UUIDs (B2). Verified locally: unit (171), e2e, and upgrade (Scenario A + B1 + B2) suites pass; pgspot clean.
74ad427 to
12b50e4
Compare
CI fix: hardened e2e test
|
Bring the issue microsoft#129 full-UUID branch up to date with upstream main (b4579ae, v0.2.3 changelog). Resolve the CHANGELOG.md conflict by folding the Full UUID instance/node IDs entry into the dated 0.2.3 section's Changed list.
Fixes #129.
Summary
Replaces the collision-prone 8-character short IDs (32 bits of entropy, ~50% PK-collision probability near 65k instances) with full canonical UUIDs (122 bits) for
df.instancesanddf.nodesIDs.Changes
full_uuid(); retainshort_id()for pre-0.2.3 schemasfull_uuid_ids_enabled()(installedextversion >= 0.2.3) and thread the choice throughstart()/insert_nodes()so node IDs and node references also become UUIDsTEXTand relax the*_format_chkregexes to accept a legacy 8-char id or a canonical UUID; add a node-id format testtextin the BGW's cachedSELECTs so anALTER EXTENSION UPDATEthat widens them toTEXTunder a live pooled connection cannot tripcached plan must not change result typeALTER COLUMN ... TYPE text, re-add checks/FKsNOT VALID) into the existing, still-unreleased upgrade scriptsql/pg_durable--0.2.2--0.2.3.sql; no extension version bumpBackward compatibility
The 0.2.3
.sokeeps emitting 8-char IDs against pre-0.2.3 schemas untilALTER EXTENSION UPDATEruns (Scenario B1); existing 8-char rows stay valid and coexist with UUIDs after the upgrade (Scenario B2). The gate flips to UUIDs only inside the same transaction that widens the columns.Testing
cargo fmt -p pg_durable -- --check: cleancargo clippy --features pg17: clean (only the pre-existingextract_hostdead-code warning)cargo pgrx test pg17): 171 passed / 0 failedscripts/test-e2e-local.sh): 33/33scripts/test-upgrade.sh): 20/20 — schema comparison + binary backward compatibility