fix(diffing): strip session-local sdBlockId from diff fingerprint (SD-3279)#3527
Conversation
…-3279) `editor.doc.diff.apply` checks that the editor it runs against has the same canonical fingerprint as the editor that produced the diff. Two SuperDoc editor instances loaded from the same DOCX assign different `sdBlockId` UUIDs at editor startup, so their canonical states diverge and the cross-editor handoff fails with PRECONDITION_FAILED. The attribute-diff path already treats `sdBlockId` as non-semantic (IGNORED_ATTRIBUTE_KEYS in attributes-diffing.ts), and the table-cell NO_OP comparator already lists `sdBlockId` + `sdBlockRev` + paraId + textId + rsid* as identity attrs. The bug was internal drift: the diff fingerprint normalization only stripped paraId / textId / rsid* from paragraph attrs and ignored sdBlockId / sdBlockRev entirely. Hoist the identity-attr list into a shared constant (`NON_SEMANTIC_BLOCK_ATTRS` in `extensions/diffing/algorithm/identity-attrs.ts`) and use it from both consumers. Apply the strip to every block node, not just paragraphs — `sdBlockId` lives on tables, rows, cells, and sections too. Tracked sdBlockRev as adjacent hardening (same identity-attr category, already in the table comparator's list); the diagnostic only observed sdBlockId divergence in the two-editor snapshot payloads. Tests: - Canonicalization: `normalizeDocJSON` now strips identity attrs from non-paragraph block nodes; two doc trees that differ only in sdBlockId values normalize to the same shape. - Fingerprint: `buildCanonicalDiffableState` produces the same fingerprint for body trees that differ only in identity attrs. - Cross-editor handoff: a diff produced by one editor instance applies cleanly to a second editor instance with the same content (the customer's preview-pane pattern from IT-1116). - Same-editor regression guard: capture / mutate / compare / apply on one editor still works. - Export → reimport regression marker: covers the second canonical divergence layer beyond sdBlockId. Currently asserts the known failure mode; flip to a success assertion when SD-3282 is fixed. Verified: - 252/252 diffing tests pass. - `pnpm check:types` clean. - 11 image-move conformance failures pre-existed on main; confirmed unrelated to this change by stashing and re-running.
Codecov Report✅ All modified and coverable lines are covered by tests. 📢 Thoughts on this report? Let us know! |
Locks in the public Document API surface for the customer's preview-pane pattern (IT-1116). Three sessions, base and preview sharing identical content; target produces the snapshot; main session computes the diff; the diff is applied to the *separately-opened* preview session as tracked changes. Pre-SD-3279 this throws PRECONDITION_FAILED because the two sessions' canonical fingerprints diverge on session-local sdBlockId values. The service-level integration test in diff-service.test.ts already covers this; the story sits at the SDK/CLI dispatch tier that customers actually use, so a regression in the public path would be caught even if the internal helper kept passing. Verified end-to-end via vitest run against the rebuilt CLI + SDK on the fix branch.
The earlier story commit verified apply result + tracked changes total. Adds the customer-visible side: the preview session's body text now contains the target paragraph after the diff is applied. Same single test, one extra assertion. Catches a class of regression where the apply succeeds in tracking-changes accounting but never lands the underlying content.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 9d643338c2
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
There was a problem hiding this comment.
cubic analysis
No issues found across 7 files
Linked issue analysis
Linked issue: SD-3279: diff.apply can fingerprint-mismatch across editor instances loaded from the same DOCX
| Status | Acceptance criteria | Notes |
|---|---|---|
| ✅ | Strip session-local sdBlockId (and sdBlockRev) from the diff fingerprint normalization so fingerprint does not vary across editor sessions. | A shared NON_SEMANTIC_BLOCK_ATTRS constant now includes sdBlockId and sdBlockRev and is used by the normalization code; unit tests assert normalized output/fingerprint is identical when only sdBlockId/sdBlockRev differ. |
| ✅ | Strip identity-only attributes from every block node (paragraphs, tables, rows, cells, sections) rather than only paragraphs so fingerprints are stable for all block types. | normalizeDocNodeJSON now applies block-attr normalization to container nodes and tests verify identity attrs are stripped from table/row/cell/section nodes. |
| ✅ | Unify the identity-attr list into one shared constant consumed by both the diff fingerprint normalization and the cell-content NO_OP comparator to prevent drift. | New NON_SEMANTIC_BLOCK_ATTRS constant is added and consumed from identity-attrs.ts; table adapter now imports that set instead of duplicating the list. |
| ✅ | Cross-editor handoff: a diff produced by one editor instance can be applied to a second editor instance (same DOCX content) in tracked-change mode and produces tracked-change operations. | The PR adds tests that exercise capture/compare/apply across separate editor instances and the PR description and tests indicate the cross-editor handoff test passes post-fix. |
| ✅ | Preserve same-editor capture/compare/apply behavior (regression guard). | Tests explicitly include a same-editor regression guard and the PR notes same-editor path still passes; test asserts tracked changes are applied and content matches target. |
The previous SD-3279 commit changed the canonical normalization (stripped sdBlockId / sdBlockRev) without bumping the snapshot/payload version. Any DiffSnapshot persisted by a customer under the pre-fix algorithm would now be rejected as tampered when re-derivation runs through the new normalizer. Same risk for DiffPayloads in same-session reapply. Add a parallel legacy normalization path that reproduces the pre-fix behavior exactly (paragraph-only attribute stripping over the original 7-element set, no stripping on structural containers, image originalAttributes stripping unchanged). compareToSnapshot and applyDiffPayload now accept either the new or the legacy fingerprint: re-derive with the current normalizer first; on mismatch, retry with the legacy normalizer. Genuinely tampered artifacts fail both checks and still throw. Scope of recovery: - DiffSnapshot persisted under old algorithm: fully recovered. This is the customer's revision-history workflow (capture and store). - DiffPayload persisted under old algorithm: recovered only for same-editor reapply where the editor still holds the original sdBlockIds. Cross-session reapply of old payloads was already broken by per-session sdBlockId divergence — that's the exact bug SD-3279 fixes for new snapshots; the legacy fallback can't retroactively rescue it. Tests added: - diff-service.test.ts: legacy snapshot accepted by compareToSnapshot; legacy same-editor payload accepted by applyDiffPayload; genuinely tampered snapshot still rejected. - fingerprint.test.ts: legacy normalizer produces a different fingerprint than the current normalizer for docs with sdBlockId (otherwise the fallback would be dead code). - diff-service.test.ts: SD-3282 ticket reference added to the existing export → reimport regression marker comment. Verified: 257/257 diffing tests pass; pnpm check:types clean.
Replay previously called setNodeMarkup with the diff's full source attrs, overwriting the recipient editor's session-local sdBlockId / sdBlockRev with the originator's values. Merge the AttributesDiff onto the recipient node's existing attrs so any path the diff does not touch is preserved verbatim. Also pass NON_SEMANTIC_BLOCK_ATTRS as ignoreKeys for non-paragraph block attr diffs; paragraphs were already covered via normalizeParagraphAttrs.
There was a problem hiding this comment.
cubic analysis
No issues found across 16 files
Linked issue analysis
Linked issue: SD-3279: diff.apply can fingerprint-mismatch across editor instances loaded from the same DOCX
| Status | Acceptance criteria | Notes |
|---|---|---|
| ✅ | Remove session-local sdBlockId/sdBlockRev (and related identity attrs) from the diff/fingerprint normalization for all block nodes (paragraphs, tables, rows, cells, sections). | Normalization was extended to strip the non-semantic block attrs from every block node and tests validate normalized outputs are stable across differing sdBlockId values. |
| ✅ | Hoist the identity-attr list into a single shared constant and consume it from both the fingerprint normalizer and NO_OP comparators to prevent drift. | A single NON_SEMANTIC_BLOCK_ATTRS constant was added and imported where needed; previous duplicated lists were removed/replaced. |
| ✅ | Ensure replay/apply does not overwrite the recipient editor's session-local identity attrs when applying attribute diffs (preserve recipient sdBlockId/sdBlockRev). | An applyAttrsDiff helper was implemented and replay paths now merge diffs onto the recipient node attrs instead of replacing them; unit tests assert recipient sdBlockId is preserved. |
| ✅ | Fix the customer-visible bug: a diff produced by one editor instance can be applied to a separately-opened editor instance loaded from the same DOCX (cross-editor capture → compare → apply works). | End-to-end tests exercising cross-editor handoff were added and assert that a diff captured from one editor can be applied to another editor with the same content; this is the primary reported failure and the test passes per PR verification. |
| ✅ | Preserve backward-compatibility: accept snapshots and same-editor diff payloads captured under the pre-fix algorithm via a legacy normalizer fallback (without changing forward path behavior). | Legacy normalization functions and a legacy canonical-state builder were added; compareToSnapshot and applyDiffPayload retry fingerprint validation with the legacy canonicalizer before rejecting; tests cover accepting legacy snapshots and same-editor legacy payloads and rejecting tampered snapshots. |
| ✅ | Add a regression marker for the known export → reimport fingerprint divergence (document as SD-3282) without changing scope of this PR. | A regression marker test was intentionally added expecting the export→reimport path to still throw a fingerprint mismatch; the test documents the known limitation and is present in the test suite. |
Two SuperDoc editor instances loaded from the same DOCX assign different
sdBlockIdUUIDs at editor startup, so the diff fingerprint differs across instances andeditor.doc.diff.applyfrom one editor to another throwsPRECONDITION_FAILED. This blocks the cross-editor preview pattern that customer revision-history UIs want to build (IT-1116).The attribute-diff path already treats
sdBlockIdas non-semantic (IGNORED_ATTRIBUTE_KEYS), and the table-cell NO_OP comparator already listssdBlockId/sdBlockRev/paraId/textId/rsid*as identity attrs. The bug was internal drift: the fingerprint normalization only stripped paragraph-levelparaId/textId/rsid*and never touchedsdBlockId/sdBlockRev, on paragraphs or anywhere else.NON_SEMANTIC_BLOCK_ATTRSinextensions/diffing/algorithm/identity-attrs.ts), consumed by both the diff fingerprint normalization and the table-cell NO_OP comparator. One source of truth so the lists cannot drift again.sdBlockIdlives on all of those.Backward compatibility for persisted artifacts
The normalization change would invalidate any DiffSnapshot or DiffPayload already persisted under the pre-SD-3279 algorithm (the customer's revision-history workflow stores DiffSnapshot blobs). To avoid breaking existing artifacts:
compareToSnapshotandapplyDiffPayloadre-derive with the current normalizer first, then fall back to a legacy normalizer that reproduces the pre-fix algorithm exactly. Either match accepts the artifact. Genuinely tampered snapshots fail both checks.sdBlockIdvalues are still intact; cross-session reapply of old payloads was already broken pre-fix and the fallback can't retroactively rescue it.Verified: 257/257 diffing tests pass.
pnpm check:typesclean. Cross-editor handoff test fails pre-fix withfingerprint mismatch, passes post-fix. Same-editor regression guard still passes. Legacy snapshot accepted bycompareToSnapshot; legacy same-editor diff payload accepted byapplyDiffPayload; genuinely tampered snapshot still rejected. The 11 image-move conformance failures incontract-conformance.test.tspre-existed on main; confirmed unrelated to this change. Added doc-api story covering cross-session diff handoff throughdoc.diff.capture/compare/apply.