
Build searchable-driven search-doc generation + parity harness + baseline #5351

Open
habdelra wants to merge 9 commits into cs-11721-searchable-field-option-api-cache-into-fielddefinition from cs-11722-build-searchable-driven-search-doc-generation-parity-harness


@habdelra


Builds the searchable-driven search-doc generator and the parity tooling, without making it authoritative — store-driven searchDoc stays in charge of production indexing until the cutover (CS-11724). Stacked on #5350 (CS-11721).

Generator — searchDocFromFields(instance) (packages/base/card-api.gts)

Parallel to searchDoc. Derives link depth from the explicit searchable annotation instead of from what the render loaded, and loads the named link targets itself (targeted loading) rather than relying on store residency.

  • Routes are dotted paths rooted at the indexed card's link fields. Depth is governed entirely by the annotations on the card being indexed — a card pulled in as a link target does not re-consult its own searchable. true makes the immediate ("self") link searchable; a dotted path makes a deeper (n+1) link searchable; arrays combine.
  • The owner's store is threaded through the recursion (a contained FieldDef value may not be store-associated, but a link reached through it must still load against the owner's store). Targeted loading degrades a missing/broken/unloadable target to { id }.
  • Cycle clipping, { id } for unfollowed/broken/not-loaded links, linksToMany id normalization, and the query-backed-field skip are preserved from the store-driven path. Link targets enumerate their declared type, dropping unqueryable polymorphic-subtype bloat.
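
The routing rules above can be illustrated with a toy in-memory model. This is a minimal sketch, not the code in card-api.gts: `ToyCard`, `ToyLink`, `routesBelow`, `expand`, and `searchDocSketch` are hypothetical names, loading is a synchronous `Map` lookup rather than targeted loading against a store, and it assumes a dotted path on a link field names further link fields on the target and implies that the link itself is expanded.

```typescript
// Toy model only; none of these names come from card-api.
type Searchable = true | string | string[];

interface ToyLink {
  target: string; // id of the linked card
  searchable?: Searchable; // annotation on the link field
}

interface ToyCard {
  id: string;
  fields: Record<string, string>; // contained scalar data
  links: Record<string, ToyLink>;
}

type Doc = { id: string; [key: string]: unknown };

// Routes below one link: `true` follows only the link itself (no deeper
// segments); each dotted path names further link fields on the target.
function routesBelow(searchable: Searchable): string[][] {
  if (searchable === true) return [];
  const paths = Array.isArray(searchable) ? searchable : [searchable];
  return paths.map((path) => path.split('.'));
}

function expand(
  store: Map<string, ToyCard>,
  card: ToyCard,
  routes: Map<string, string[][]>, // link name -> remaining routes below it
  ancestors: Set<string>, // ids on the current expansion path
): Doc {
  const doc: Doc = { id: card.id, ...card.fields };
  for (const [name, link] of Object.entries(card.links)) {
    const below = routes.get(name);
    const target = store.get(link.target);
    // Unfollowed, missing, or cyclic links all degrade to { id }.
    if (below === undefined || target === undefined || ancestors.has(target.id)) {
      doc[name] = { id: link.target };
      continue;
    }
    // Group the remaining routes by their first segment. The target's own
    // `searchable` annotations are never consulted here.
    const next = new Map<string, string[][]>();
    for (const [head, ...rest] of below) {
      if (head === undefined) continue; // route already fully consumed
      next.set(head, [...(next.get(head) ?? []), rest]);
    }
    doc[name] = expand(store, target, next, new Set(ancestors).add(target.id));
  }
  return doc;
}

// Routes are seeded only from the annotations on the card being indexed.
function searchDocSketch(store: Map<string, ToyCard>, id: string): Doc {
  const card = store.get(id);
  if (card === undefined) return { id };
  const routes = new Map<string, string[][]>();
  for (const [name, link] of Object.entries(card.links)) {
    if (link.searchable !== undefined) {
      routes.set(name, routesBelow(link.searchable));
    }
  }
  return expand(store, card, routes, new Set([card.id]));
}
```

In this model, a card whose `author` link is annotated `true` gets the author's contained data but only `{ id }` for the author's own links, even if the author card annotates them; annotating `author` with `'friend'` on the indexed card is what expands `author.friend`.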

Parity

  • Differential test (packages/host/tests/integration/searchable-search-doc-test.gts): the searchable-driven path follows a searchable link to the same target, with the same contained data, that the store-driven render loaded. Whole-doc byte-equality is intentionally not asserted here: the new spec keeps { id } for every relationship while the store-driven path omits unused links via usedLinksToFieldsOnly. Reconciling that to an identical doc is the cutover's gate, after the migration (CS-11723) reproduces today's depth.
  • searchable-parity-diff.ts (packages/realm-server/scripts): the realm-scale post-migration validator — diffs a realm's live store-driven search docs (boxel_index) against the searchable-driven output per card, ignoring _cardType and (optionally) the intended shallow-link difference. Data-gathering documented inline.
  • Baseline: ctse/military-pigeon render-vs-searchDoc split captured from staging (≈ 933:1) and recorded on the ticket for the follow-on (prerender off the hot path) to measure against.

Tests (host integration, run in CI)

searchable-search-doc-test.gts — 9 cases: the routes-come-only-from-the-indexed-card trio (target's own searchable dormant when pulled in; honored only when itself indexed; a dotted route on the indexer drives the deeper expansion), self link, n+1 route, contains-routing through a contained value, cycle clip → { id }, missing/broken link → { id }, unannotated link → { id }, declared-type enumeration drops subtype bloat, and the differential expansion-parity check.

Verified: glint + tsc clean (base/host/runtime-common/realm-server), eslint clean, and all 9 integration tests green on a live test-services:host stack.

habdelra and others added 3 commits June 27, 2026 12:13
`searchDocFromFields(instance)` derives search-doc link depth from the explicit
`searchable` field annotation instead of from what the render loaded, and loads
the named link targets itself (targeted loading) rather than relying on store
residency. Parallel to `searchDoc`, which stays authoritative until the cutover.

Routes are dotted paths rooted at the indexed card's link fields; depth is
governed entirely by the annotations on the card being indexed — a card pulled
in as a link target does not re-consult its own `searchable`. `true` makes the
self link searchable, a dotted path makes a deeper link searchable, arrays
combine. Cycle clipping, `{ id }` for unfollowed / broken / not-loaded links,
linksToMany id normalization, and the query-backed-field skip are preserved
from the store-driven path; link targets enumerate their declared type, which
drops the unqueryable polymorphic-subtype bloat.

Integration tests verify the routes-come-only-from-the-indexed-card semantic
(a target's own searchable is dormant when pulled in; honored only when the
target is itself indexed; a dotted route on the indexer expands the deeper
link; an unannotated link stays `{ id }`).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ontains-routing

Thread the owner's store through `searchableQueryableValue` instead of
re-deriving it per value: a contained FieldDef value may not be
store-associated, but a link reached through it must still load against the
owner's store. Guard targeted loading against thrown rejections (not just
returned CardErrors) so a missing / unloadable target degrades to `{ id }`.

Tests add the cycle-clip, missing-target `{ id }`, and contains-routing cases
alongside the existing routes-only-from-the-indexed-card coverage.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
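
The guard described in the commit above (handling thrown rejections as well as returned errors) can be sketched as follows. This is an illustration under assumptions, not the real targeted-loading code: `Loader`, `LoadError`, `isLoadError`, and `loadOrIdOnly` are hypothetical names, and the real `CardError` shape is not shown in this PR text.

```typescript
type LinkDoc = { id: string; [key: string]: unknown };

// Hypothetical stand-in for an error value that a loader returns
// instead of throwing.
interface LoadError {
  isCardError: true;
  message: string;
}

type Loader = (id: string) => Promise<LinkDoc | LoadError>;

function isLoadError(value: LinkDoc | LoadError): value is LoadError {
  return (value as LoadError).isCardError === true;
}

// Both failure modes (a returned error value and a thrown rejection)
// degrade to the shallow `{ id }` reference.
async function loadOrIdOnly(load: Loader, id: string): Promise<LinkDoc> {
  try {
    const result = await load(id);
    return isLoadError(result) ? { id } : result;
  } catch {
    return { id };
  }
}
```

Catching inside the helper keeps one unloadable target from failing the whole document; the caller only ever sees an expanded doc or `{ id }`.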
…y-diff tool

Differential parity test: the searchable-driven path follows a searchable link
to the same target, with the same contained data, that the store-driven render
loaded. Whole-doc byte-equality is intentionally not asserted yet — the new
spec keeps `{ id }` for every relationship while the store-driven path omits
unused links via `usedLinksToFieldsOnly`; reconciling that to an identical doc
is the cutover's gate, after the migration reproduces today's depth.

Declared-type test: a `linksTo(SimpleAuthor)` whose instance is a FancyAuthor
subtype drops the subtype-only field — the generator enumerates the declared
target type.

searchable-parity-diff.ts: the realm-scale before/after validator — diffs a
realm's live store-driven search docs against the searchable-driven output per
card, ignoring `_cardType` and (optionally) the intended shallow-link
difference. Meaningful post-migration; the data-gathering is documented inline.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 1b054e4c39


Comment thread packages/base/card-api.gts Outdated
Comment thread packages/realm-server/scripts/searchable-parity-diff.ts
@github-actions

github-actions Bot commented Jun 27, 2026


Preview deployments

Host Test Results

    1 files      1 suites   2h 32m 37s ⏱️
3 319 tests 3 304 ✅ 15 💤 0 ❌
3 338 runs  3 323 ✅ 15 💤 0 ❌

Results for commit 0f3ce37.

Realm Server Test Results

    1 files  ±0      1 suites  ±0   9m 48s ⏱️ +32s
1 666 tests ±0  1 666 ✅ ±0  0 💤 ±0  0 ❌ ±0 
1 745 runs  ±0  1 745 ✅ ±0  0 💤 ±0  0 ❌ ±0 

Results for commit 0f3ce37. ± Comparison against earlier commit 41f434e.

habdelra and others added 2 commits June 27, 2026 13:49
… links as ignorable

Resolve a not-loaded link's reference against the owner's relativeTo before the
targeted load/lookup (matching the lazy link getter): a relative `links.self`
like `./hassan` can't be `toURL`'d by the store, which would otherwise degrade
an expandable searchable link to `{ id }`.

In the staging parity-diff tool, recognize an array whose elements are all
shallow (and the empty plural) as a shallow link, so `--ignore-shallow-links`
also covers unrendered `linksToMany` relationships.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
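
The shallow-link rule from the second paragraph of the commit above could look roughly like this. It is a sketch under an assumption: a "shallow" value is taken to be an object whose only key is a string `id`, which the PR text implies but does not spell out. `isShallowRef` and `isShallowLink` are hypothetical names, not the tool's exports.

```typescript
// Assumed definition: a shallow reference is exactly `{ id: string }`.
function isShallowRef(value: unknown): boolean {
  return (
    typeof value === 'object' &&
    value !== null &&
    !Array.isArray(value) &&
    Object.keys(value).length === 1 &&
    typeof (value as { id?: unknown }).id === 'string'
  );
}

// A plural relationship counts as shallow when every element is shallow.
// `every` is true for an empty array, which covers the empty plural.
function isShallowLink(value: unknown): boolean {
  if (Array.isArray(value)) {
    return value.every((element) => isShallowRef(element));
  }
  return isShallowRef(value);
}
```

Treating the empty array as shallow falls out of `Array.prototype.every` being vacuously true, so no separate branch is needed for it.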
Covers the relative-`links.self` path — without resolving the reference first
the targeted load fails and the link degrades to { id }.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
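
Resolving a relative `links.self` before the load can be sketched with the standard WHATWG `URL` constructor. `resolveReference` is a hypothetical helper and the example URLs are made up; the real code resolves against the owner's `relativeTo`, whose exact type is not shown here.

```typescript
// Resolve a possibly-relative reference against the owner's base URL.
// If it cannot be resolved (relative reference, no base), return it
// unchanged so the caller can degrade the link to { id }.
function resolveReference(reference: string, relativeTo?: URL): string {
  try {
    return new URL(reference, relativeTo).href;
  } catch {
    return reference;
  }
}

const owner = new URL('https://example.test/realm/Person/mango');
const resolved = resolveReference('./hassan', owner);
// resolved === 'https://example.test/realm/Person/hassan'
```

An already-absolute reference passes through unchanged, because `new URL` ignores the base when the first argument is absolute.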

Copilot AI left a comment


Pull request overview

Adds a new searchable-driven search-doc generator (searchDocFromFields) alongside a parity harness (host integration tests + realm-scale diff script) to validate behavior before the eventual cutover from the existing store-driven searchDoc.

Changes:

  • Introduces searchDocFromFields(instance) in packages/base/card-api.gts, generating search docs by following field.searchable routes and targeted-loading link targets.
  • Adds a host integration test suite covering route semantics, cycle clipping, missing/broken links, declared-type link enumeration, and a differential parity check.
  • Adds a realm-server script to diff live store-driven docs against generated docs, with an option to ignore known shallow-link differences.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| packages/base/card-api.gts | Adds searchDocFromFields and the searchable-driven recursion + targeted loading logic |
| packages/host/tests/integration/searchable-search-doc-test.gts | Adds integration coverage for searchable-driven search-doc generation and parity expectations |
| packages/host/tests/helpers/base-realm.ts | Exposes searchDoc and searchDocFromFields from the base realm test helper |
| packages/realm-server/scripts/searchable-parity-diff.ts | Adds a CLI script to diff live boxel_index.search_doc output against generated docs |


Comment thread packages/base/card-api.gts Outdated
Comment thread packages/base/card-api.gts Outdated
Comment thread packages/realm-server/scripts/searchable-parity-diff.ts
habdelra and others added 2 commits June 27, 2026 15:11
Postgres jsonb and JS object construction can emit the same data with different
key order; a plain JSON.stringify comparison reported those as false
divergences. Serialize with sorted keys at every level before comparing (array
order preserved).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
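
Sorted-key serialization as described in the commit above can be sketched with a `JSON.stringify` replacer. This hand-rolled `stableStringify` is a hypothetical illustration, not the script's actual implementation.

```typescript
// Serialize with object keys sorted at every level; array order is kept.
function stableStringify(value: unknown): string {
  return JSON.stringify(value, (_key, v) => {
    if (v !== null && typeof v === 'object' && !Array.isArray(v)) {
      const source = v as Record<string, unknown>;
      const sorted: Record<string, unknown> = {};
      for (const key of Object.keys(source).sort()) {
        sorted[key] = source[key];
      }
      return sorted; // JSON.stringify then recurses into the sorted copy
    }
    return v;
  });
}

// Same data, different key order: equal after stable serialization.
const fromDb = { title: 'Hello', author: { name: 'Alice', id: 'a' } };
const fromJs = { author: { id: 'a', name: 'Alice' }, title: 'Hello' };
const sameDoc = stableStringify(fromDb) === stableStringify(fromJs);
```

Because the replacer is applied to each nested value in turn, nested objects are sorted too, while arrays fall through untouched and keep their order.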
Completes the generator unit matrix — an array `searchable` on a link expands
each named route on the target.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Comment thread packages/host/tests/integration/searchable-search-doc-test.gts Outdated
Comment thread packages/host/tests/integration/searchable-search-doc-test.gts
Comment thread packages/host/tests/integration/searchable-search-doc-test.gts
Comment thread packages/host/tests/integration/searchable-search-doc-test.gts Outdated
Comment thread packages/host/tests/integration/searchable-search-doc-test.gts Outdated
Comment thread packages/host/tests/integration/searchable-search-doc-test.gts
Comment thread packages/host/tests/integration/searchable-search-doc-test.gts Outdated
Comment thread packages/realm-server/scripts/searchable-parity-diff.ts
Comment thread packages/host/tests/integration/searchable-search-doc-test.gts
Comment thread packages/host/tests/integration/searchable-search-doc-test.gts
Comment thread packages/host/tests/integration/searchable-search-doc-test.gts Outdated
Comment thread packages/host/tests/integration/searchable-search-doc-test.gts
Comment thread packages/realm-server/scripts/searchable-parity-diff.ts Outdated
Comment thread packages/realm-server/scripts/searchable-parity-diff.ts Outdated
habdelra and others added 2 commits June 27, 2026 17:57
- Move searchDocFromFields + helpers out of card-api.gts into a dedicated
  packages/base/searchable.ts module; card-api re-exports it.
- Accept `searchable: false` (explicit "not searchable"); harden both the
  generator's route seeding and the definition-build validator against
  malformed annotations (false / null / non-string array entry / empty string
  / empty array) so they degrade to {id} instead of throwing.
- Rebuild the integration suite into 39 cases: the four contains/containsMany ×
  linksTo/linksToMany combinations, multi-segment and shared-ancestor routes,
  linksToMany broken/missing/empty/deep/cycle, multi-card cycles, query-backed
  skip, declared-type enumeration, and malformed/impossible paths.
- searchable-parity-diff: use safe-stable-stringify; still report a changed
  reference id under --ignore-shallow-links (only omit-vs-keep-{id} is
  suppressed); export the differ functions + add a unit test; guard main() so
  importing it for tests runs no file I/O.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01CXfbXyUtT29nsUhSLtMQVX
- Remove the searchDocFromFields re-export from card-api. It pulled the
  non-authoritative generator into every card's dependency closure (caught by
  realm-indexing's dep-closure assertion). The generator is indexer-side
  tooling; consumers import it directly from ./searchable, which also removes
  the card-api↔searchable import cycle.
- Revert the searchable option to `true | string | string[]` — there is no
  "not searchable" value. `false` is treated purely as bad input: route
  seeding and the definition-build validator both degrade it to {id} without
  throwing.
- Test helper loads searchDocFromFields from the searchable module directly.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01CXfbXyUtT29nsUhSLtMQVX
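
The "bad input degrades to {id}" rule from the two commits above can be sketched as a normalizer over the `true | string | string[]` option. `normalizeSearchable` is a hypothetical name, and one detail is an assumption: the sketch rejects the whole array when any entry is malformed, whereas the real code might drop only the bad entry.

```typescript
// Returns the list of dotted routes for a link annotation, or `undefined`
// when the link should stay `{ id }`. An empty list means "self link only".
function normalizeSearchable(raw: unknown): string[] | undefined {
  if (raw === true) return [];
  if (typeof raw === 'string') {
    return raw.length > 0 ? [raw] : undefined; // empty string is malformed
  }
  if (Array.isArray(raw)) {
    const valid =
      raw.length > 0 &&
      raw.every((entry) => typeof entry === 'string' && entry.length > 0);
    // Assumption: any malformed entry invalidates the whole annotation.
    return valid ? (raw as string[]).slice() : undefined;
  }
  // false, null, undefined, numbers, objects: never throw, just no routes.
  return undefined;
}
```

Accepting `unknown` rather than the declared option type is deliberate in this sketch: the point of the hardening is that values outside the type still reach this code at runtime.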
@habdelra habdelra requested a review from a team June 27, 2026 22:45

2 participants