Skip to content

fix(db): canonicalize inArray value order in normalizeExpressionPaths#1566

Draft
v-anton wants to merge 1 commit into
TanStack:mainfrom
v-anton:fix/canonicalize-in-values
Draft

fix(db): canonicalize inArray value order in normalizeExpressionPaths#1566
v-anton wants to merge 1 commit into
TanStack:mainfrom
v-anton:fix/canonicalize-in-values

Conversation

@v-anton
Copy link
Copy Markdown

@v-anton v-anton commented Jun 2, 2026

Problem

inArray is set membership, but the predicate handed to a collection's loadSubset was not canonical in value order. normalizeExpressionPaths — the pass that canonicalizes predicates for subscriptions — only normalized ref paths, not inArray value order. And the join lazy-load predicate (inArray(joinKey, [...]), built in joins.ts) was passed to requestSnapshot without going through normalizeExpressionPaths at all.

So the same value set arriving in a different order — a join re-emitting its lazy-load keys across pipeline batches, or two subscriptions for the same data — serialized to a distinct predicate. Adapters that dedupe by serialized predicate / queryKey (e.g. @tanstack/query-db-collection) then treat it as a new subset and refetch identical data (and miss CDN cache cells keyed on the request URL).

Fix

  • normalizeExpressionPaths now also canonicalizes set-membership value order — it sorts in value arrays. in is unordered, and isPredicateSubset / index lookups already treat it order-independently, so sorting is safe.
  • The join lazy-load predicate is routed through normalizeExpressionPaths in joins.ts (using the target.alias already in scope), so its keys are canonicalized like every other predicate — fixing the inconsistency where it bypassed normalization entirely.

Tests

  • Added tests/query/canonicalize-in-values.test.ts: an on-demand collection queried with an out-of-order inArray literal asserts loadSubset receives the values sorted. Verified it fails without the change (loadSubset gets [3, 1, 2]) and passes with it.
  • Full @tanstack/db suite green (2372 tests).

Changeset included (patch).

Summary by CodeRabbit

  • Bug Fixes

    • Canonicalized in-filter value ordering so identical value sets produce the same query/cache keys, including for join lazy-load queries—eliminates duplicate queries and improves cache consistency.
  • Tests

    • Added tests verifying in filter values are consistently sorted in query requests.
  • Documentation

    • Clarified that in predicates are canonicalized by sorting their value lists.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Jun 2, 2026

Review Change Stack

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 0df06b98-f1a6-4431-95ec-b416a07e4d1a

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Sorts values inside in expressions during normalization and routes join lazy-load inArray predicates through that normalization so equivalent value sets produce identical serialized predicates and loadSubset keys.

Changes

Predicate canonicalization for in-array queries

Layer / File(s) Summary
in-value canonicalization in normalizeExpressionPaths
packages/db/src/query/compiler/expressions.ts, .changeset/fix-canonicalize-in-values.md
JSDoc expanded and implementation added to detect in Func expressions, replace array-valued val args with sorted copies, and rebuild the Func. Changeset documents the fix.
Join lazy-load predicate normalization
packages/db/src/query/compiler/joins.ts
Import for normalizeExpressionPaths added; join lazy-loading now passes inArray(lazyJoinRef, joinKeys) through normalizeExpressionPaths(..., target.alias) when requesting snapshots so join-key filters are canonicalized.
Canonicalization test verification
packages/db/tests/query/canonicalize-in-values.test.ts
Adds tests that capture loadSubset calls from a collection whose sync seeds sample data; live queries using out-of-order inArray([3,1,2]) are preloaded and assertions verify the captured in filter values are sorted to [1,2,3].

🎯 2 (Simple) | ⏱️ ~12 minutes

🐰 I sorted the carrots in a row,
Now predicates match wherever they go,
No more duplicates chasing tails,
One canonical order on the trails,
Hooray — consistent queries hop along!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 66.67% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title 'fix(db): canonicalize inArray value order in normalizeExpressionPaths' is clear, specific, and directly describes the primary change made in this PR.
Description check ✅ Passed The PR description comprehensively covers the problem, fix, and tests, but the author did not fill out the template checklist sections (Changes section uses custom format instead of template, and Checklist/Release Impact sections are absent).
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
packages/db/src/collection/subscription.ts (1)

492-495: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Canonicalize the paginated loadSubset path as well.

requestLimitedSnapshot() still forwards this.options.whereExpression unchanged into loadSubset. Any ordered/limited query with an inArray predicate can therefore keep producing different serialized predicates for the same set when the value order changes, so the duplicate-cache-key bug remains on this path.

Suggested fix
-    const where = this.options.whereExpression
+    const where = this.options.whereExpression
+    const canonicalWhere = where ? canonicalizeInValues(where) : undefined
     const whereFilterFn = where
       ? createFilterFunctionFromExpression(where)
       : undefined
@@
     const loadOptions: LoadSubsetOptions = {
-      where, // Main filter only, no cursor
+      where: canonicalWhere, // Main filter only, no cursor
       limit,
       orderBy,
       cursor: cursorExpressions, // Cursor expressions passed separately
       offset: offset ?? currentOffset, // Use provided offset, or auto-tracked offset
       subscription: this,

Also applies to: 646-653

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/db/src/collection/subscription.ts` around lines 492 - 495,
requestLimitedSnapshot currently passes this.options.whereExpression unchanged
into loadSubset causing non-canonical predicates (e.g., inArray order
variations) to produce different cache keys; canonicalize the where expression
before creating whereFilterFn and before passing into loadSubset by normalizing
any set-like predicates (such as inArray) to a deterministic order. Update the
logic around where / whereFilterFn (and the similar block later near the other
loadSubset call) to call a canonicalize function on this.options.whereExpression
(or produce a canonicalized copy) and use that canonical expression when
creating createFilterFunctionFromExpression and when calling
loadSubset/requestLimitedSnapshot so identical sets always serialize the same
way.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@packages/db/src/collection/subscription.ts`:
- Around line 492-495: requestLimitedSnapshot currently passes
this.options.whereExpression unchanged into loadSubset causing non-canonical
predicates (e.g., inArray order variations) to produce different cache keys;
canonicalize the where expression before creating whereFilterFn and before
passing into loadSubset by normalizing any set-like predicates (such as inArray)
to a deterministic order. Update the logic around where / whereFilterFn (and the
similar block later near the other loadSubset call) to call a canonicalize
function on this.options.whereExpression (or produce a canonicalized copy) and
use that canonical expression when creating createFilterFunctionFromExpression
and when calling loadSubset/requestLimitedSnapshot so identical sets always
serialize the same way.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 2bfabfb6-b54b-4d60-b199-95d7ad2411eb

📥 Commits

Reviewing files that changed from the base of the PR and between be656be and 8013a4d.

📒 Files selected for processing (3)
  • .changeset/fix-canonicalize-in-values.md
  • packages/db/src/collection/subscription.ts
  • packages/db/tests/query/canonicalize-in-values.test.ts

@v-anton v-anton force-pushed the fix/canonicalize-in-values branch from 8013a4d to 361c719 Compare June 2, 2026 12:27
@v-anton v-anton changed the title fix(db): canonicalize inArray value order in loadSubset predicates fix(db): canonicalize inArray value order in normalizeExpressionPaths Jun 2, 2026
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
packages/db/src/query/compiler/joins.ts (1)

313-319: ⚡ Quick win

Clarify target.path is collection-level; normalization alias-stripping only applies in rare name-collision cases.

getLazyLoadTargets constructs LazyLoadTarget.path by destructuring ref.path as [alias, ...path] and, for collectionRef, returning { alias: source.alias, collection: source.collection, path }, so the stored path excludes the ref alias and starts with the collection field name. That means normalizeExpressionPaths(..., target.alias) should not normally hit the path[0] === collectionAlias alias-stripping branch—only a rare collision is possible if a collection field name exactly equals target.alias (and the ref path has length > 1), in which case the stripping could incorrectly alter the ref used for loadSubset.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/db/src/query/compiler/joins.ts` around lines 313 - 319, The code is
mistakenly set up to potentially strip a leading alias from collection-level
paths: getLazyLoadTargets stores LazyLoadTarget.path without the ref alias
(collection-level), so normalizeExpressionPaths(..., target.alias) should not
perform the alias-stripping branch; update the call or function to avoid
removing the first segment for collection-level paths—either add a boolean
parameter (e.g., skipAliasStrip / pathIsCollectionLevel) to
normalizeExpressionPaths and pass it from the getLazyLoadTargets usage, or
change the call site to pass the correct alias/flag so the alias-stripping
branch is skipped; ensure the unchanged path is then used by loadSubset and
related code (references: getLazyLoadTargets, LazyLoadTarget.path,
normalizeExpressionPaths, loadSubset).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@packages/db/src/query/compiler/joins.ts`:
- Around line 313-319: The code is mistakenly set up to potentially strip a
leading alias from collection-level paths: getLazyLoadTargets stores
LazyLoadTarget.path without the ref alias (collection-level), so
normalizeExpressionPaths(..., target.alias) should not perform the
alias-stripping branch; update the call or function to avoid removing the first
segment for collection-level paths—either add a boolean parameter (e.g.,
skipAliasStrip / pathIsCollectionLevel) to normalizeExpressionPaths and pass it
from the getLazyLoadTargets usage, or change the call site to pass the correct
alias/flag so the alias-stripping branch is skipped; ensure the unchanged path
is then used by loadSubset and related code (references: getLazyLoadTargets,
LazyLoadTarget.path, normalizeExpressionPaths, loadSubset).

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 13611c1d-d833-414a-b199-1962d5b95c7c

📥 Commits

Reviewing files that changed from the base of the PR and between 8013a4d and 361c719.

📒 Files selected for processing (4)
  • .changeset/fix-canonicalize-in-values.md
  • packages/db/src/query/compiler/expressions.ts
  • packages/db/src/query/compiler/joins.ts
  • packages/db/tests/query/canonicalize-in-values.test.ts
✅ Files skipped from review due to trivial changes (1)
  • .changeset/fix-canonicalize-in-values.md
🚧 Files skipped from review as they are similar to previous changes (1)
  • packages/db/tests/query/canonicalize-in-values.test.ts

@v-anton
Copy link
Copy Markdown
Author

v-anton commented Jun 2, 2026

Thanks @coderabbitai — the suggested diff references canonicalizeInValues, which was from an earlier revision; this PR moved the canonicalization into normalizeExpressionPaths instead. That means requestLimitedSnapshot is already covered: it forwards this.options.whereExpression, which is canonicalized at subscription creation (the in-value sort now happens inside normalizeExpressionPaths), so no per-call step is needed on the ordered/limited path. Added a regression test (sorts inArray values for ordered/limited queries too) that exercises the orderBy+limit path and asserts loadSubset receives sorted values.

@v-anton v-anton force-pushed the fix/canonicalize-in-values branch from 361c719 to 75ec05f Compare June 2, 2026 12:41
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Jun 2, 2026

Tip

For best results, initiate chat on the files or code changes.

@v-anton Thanks for the clear explanation — that makes complete sense. Canonicalizing inside normalizeExpressionPaths at subscription creation is the right level: every downstream consumer (requestSnapshot, requestLimitedSnapshot, coverage checks, index optimizations) gets stable predicate order automatically, with no per-call step needed. The regression test covering the orderBy+limit path is a great addition that closes the gap I was flagging. All good here!

🐇 ✨

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (2)
packages/db/tests/query/canonicalize-in-values.test.ts (1)

47-91: ⚡ Quick win

Extract common assertion logic to reduce duplication.

Both test cases share identical assertion logic (extract filter, find in operator, assert sorted values). Consider extracting this into a helper function to improve maintainability.

♻️ Proposed refactor
+ function assertInArraySorted(
+   loadSubsetCalls: Array<LoadSubsetOptions>,
+   expectedValues: Array<number>
+ ) {
+   expect(loadSubsetCalls.length).toBeGreaterThan(0)
+   const filters = extractSimpleComparisons(loadSubsetCalls[0]!.where)
+   const inFilter = filters.find((f) => f.operator === `in`)
+   expect(inFilter).toBeDefined()
+   expect(inFilter!.value).toEqual(expectedValues)
+ }

  it(`sorts inArray values in the predicate passed to loadSubset`, async () => {
    const { collection: children, loadSubsetCalls } =
      createChildrenCollectionWithTracking()

    const liveQuery = createLiveQueryCollection((q) =>
      q.from({ c: children }).where(({ c }) => inArray(c.parentId, [3, 1, 2])),
    )

    await liveQuery.preload()

-   expect(loadSubsetCalls.length).toBeGreaterThan(0)
-
-   const filters = extractSimpleComparisons(loadSubsetCalls[0]!.where)
-   const inFilter = filters.find((f) => f.operator === `in`)
-
-   expect(inFilter).toBeDefined()
-   expect(inFilter!.value).toEqual([1, 2, 3])
+   assertInArraySorted(loadSubsetCalls, [1, 2, 3])
  })

  it(`sorts inArray values for ordered/limited queries too`, async () => {
    const { collection: children, loadSubsetCalls } =
      createChildrenCollectionWithTracking()

    const liveQuery = createLiveQueryCollection((q) =>
      q
        .from({ c: children })
        .where(({ c }) => inArray(c.parentId, [3, 1, 2]))
        .orderBy(({ c }) => c.id)
        .limit(2),
    )

    await liveQuery.preload()

-   expect(loadSubsetCalls.length).toBeGreaterThan(0)
-
-   const filters = extractSimpleComparisons(loadSubsetCalls[0]!.where)
-   const inFilter = filters.find((f) => f.operator === `in`)
-
-   expect(inFilter).toBeDefined()
-   expect(inFilter!.value).toEqual([1, 2, 3])
+   assertInArraySorted(loadSubsetCalls, [1, 2, 3])
  })

As per coding guidelines: "Extract common logic into utility functions when identical or near-identical code blocks appear in multiple places."

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/db/tests/query/canonicalize-in-values.test.ts` around lines 47 - 91,
Extract the repeated assertion logic (calling extractSimpleComparisons on
loadSubsetCalls[0]!.where, finding the filter with operator `in`, and asserting
its value equals [1,2,3]) into a small helper function (e.g.,
assertInFilterSorted) and call it from both tests; update both tests (`sorts
inArray values in the predicate passed to loadSubset` and `sorts inArray values
for ordered/limited queries too`) to call that helper with the relevant
loadSubsetCalls array, keeping references to extractSimpleComparisons and the
`in` operator lookup so the helper encapsulates the filter extraction and
expectation steps.
packages/db/src/query/compiler/expressions.ts (1)

53-60: 💤 Low value

Consider edge cases for empty and single-element arrays.

While empty arrays ([]) and single-element arrays ([x]) sort correctly, add a guard to skip sorting when unnecessary for clarity and to avoid allocating a new array for trivial cases.

⚡ Optimization for trivial cases
  if (
    whereClause.name === `in` &&
    args.length === 2 &&
    args[1]?.type === `val` &&
-   Array.isArray(args[1].value)
+   Array.isArray(args[1].value) &&
+   args[1].value.length > 1
  ) {
    args[1] = new Value([...args[1].value].sort(...))
  }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/db/src/query/compiler/expressions.ts` around lines 53 - 60, The
current branch that normalizes IN-list values blindly creates a new sorted array
even for empty or single-element lists; update the condition in the
whereClause.name === `in` branch (the check that inspects args and
args[1].value) to only perform the allocation and sorting when args[1].value is
an array with length > 1, otherwise leave args[1] untouched, so use
Array.isArray(args[1].value) && args[1].value.length > 1 before replacing
args[1] with new Value([...args[1].value].sort()).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@packages/db/src/query/compiler/expressions.ts`:
- Around line 53-60: The current canonicalization uses Array.prototype.sort()
which sorts lexicographically and breaks numeric ordering; update the branch
that checks whereClause.name === `in` to sort the args[1].value with a
type-aware comparator before wrapping in new Value: if all items are numbers
sort numerically (a - b), if all are strings sort lexically (localeCompare),
otherwise convert items to a consistent string representation and sort
deterministically; use a copied array (e.g., [...args[1].value]) and then pass
the sorted array into new Value(...) so the canonical form for inArray (handled
whereClause.name/args and Value) is stable for numeric IDs and mixed types.

In `@packages/db/tests/query/canonicalize-in-values.test.ts`:
- Around line 16-92: Add tests covering empty and single-element inArray edge
cases inside the existing describe(`canonicalize inArray value order`) block:
use createChildrenCollectionWithTracking() and createLiveQueryCollection((q) =>
q.from({ c: children }).where(({ c }) => inArray(c.parentId, []))) and similarly
with [1], call liveQuery.preload(), assert loadSubsetCalls.length > 0, then use
extractSimpleComparisons(loadSubsetCalls[0]!.where) to find the `in` filter and
assert its value equals [] for the empty case and [1] for the single-element
case; duplicate both assertions for the ordered/limited path (add .orderBy(({ c
}) => c.id).limit(2)) to cover the requestLimitedSnapshot branch.
- Around line 47-66: The test currently only uses single-digit values so it
misses a numeric-sorting bug; add a second assertion case that calls
createLiveQueryCollection(... where inArray(c.parentId, [10, 2, 1])) (or similar
multi-digit set like [12,3,4]), preload it, extract the in-filter from
loadSubsetCalls (using extractSimpleComparisons and find where operator ===
'in') and assert that inFilter!.value equals the numerically sorted array
[1,2,10] (or [3,4,12]); this will catch the bug in expressions.ts where .sort()
is used without a numeric comparator while still using the existing helpers
loadSubset, createLiveQueryCollection, and inArray.

---

Nitpick comments:
In `@packages/db/src/query/compiler/expressions.ts`:
- Around line 53-60: The current branch that normalizes IN-list values blindly
creates a new sorted array even for empty or single-element lists; update the
condition in the whereClause.name === `in` branch (the check that inspects args
and args[1].value) to only perform the allocation and sorting when args[1].value
is an array with length > 1, otherwise leave args[1] untouched, so use
Array.isArray(args[1].value) && args[1].value.length > 1 before replacing
args[1] with new Value([...args[1].value].sort()).

In `@packages/db/tests/query/canonicalize-in-values.test.ts`:
- Around line 47-91: Extract the repeated assertion logic (calling
extractSimpleComparisons on loadSubsetCalls[0]!.where, finding the filter with
operator `in`, and asserting its value equals [1,2,3]) into a small helper
function (e.g., assertInFilterSorted) and call it from both tests; update both
tests (`sorts inArray values in the predicate passed to loadSubset` and `sorts
inArray values for ordered/limited queries too`) to call that helper with the
relevant loadSubsetCalls array, keeping references to extractSimpleComparisons
and the `in` operator lookup so the helper encapsulates the filter extraction
and expectation steps.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: f932c254-b9ca-4070-a2c3-fa6c8d0edad3

📥 Commits

Reviewing files that changed from the base of the PR and between 361c719 and 75ec05f.

📒 Files selected for processing (4)
  • .changeset/fix-canonicalize-in-values.md
  • packages/db/src/query/compiler/expressions.ts
  • packages/db/src/query/compiler/joins.ts
  • packages/db/tests/query/canonicalize-in-values.test.ts
✅ Files skipped from review due to trivial changes (1)
  • .changeset/fix-canonicalize-in-values.md
🚧 Files skipped from review as they are similar to previous changes (1)
  • packages/db/src/query/compiler/joins.ts

Comment on lines +53 to +60
if (
whereClause.name === `in` &&
args.length === 2 &&
args[1]?.type === `val` &&
Array.isArray(args[1].value)
) {
args[1] = new Value([...args[1].value].sort())
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Critical: Default .sort() breaks canonicalization for multi-digit numbers.

JavaScript's default .sort() uses lexicographic (string) comparison, which produces incorrect ordering for numbers. For example, [10, 2, 1].sort() yields [1, 10, 2] instead of [1, 2, 10]. Since inArray is commonly used with numeric IDs and foreign keys, two semantically identical predicates inArray(id, [1, 2, 10]) and inArray(id, [1, 10, 2]) would canonicalize to different orderings, defeating the entire purpose of this PR (cache hits, deduplication).

🔧 Proposed fix with type-aware comparison
  if (
    whereClause.name === `in` &&
    args.length === 2 &&
    args[1]?.type === `val` &&
    Array.isArray(args[1].value)
  ) {
-   args[1] = new Value([...args[1].value].sort())
+   args[1] = new Value([...args[1].value].sort((a, b) => {
+     // Handle primitive types with type-aware comparison
+     if (typeof a === 'number' && typeof b === 'number') {
+       return a - b
+     }
+     if (typeof a === 'string' && typeof b === 'string') {
+       return a.localeCompare(b)
+     }
+     // Fallback: stringify for comparison (matches serialization behavior)
+     const aStr = String(a)
+     const bStr = String(b)
+     return aStr < bStr ? -1 : aStr > bStr ? 1 : 0
+   }))
  }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/db/src/query/compiler/expressions.ts` around lines 53 - 60, The
current canonicalization uses Array.prototype.sort() which sorts
lexicographically and breaks numeric ordering; update the branch that checks
whereClause.name === `in` to sort the args[1].value with a type-aware comparator
before wrapping in new Value: if all items are numbers sort numerically (a - b),
if all are strings sort lexically (localeCompare), otherwise convert items to a
consistent string representation and sort deterministically; use a copied array
(e.g., [...args[1].value]) and then pass the sorted array into new Value(...) so
the canonical form for inArray (handled whereClause.name/args and Value) is
stable for numeric IDs and mixed types.

Comment on lines +16 to +92
describe(`canonicalize inArray value order`, () => {
function createChildrenCollectionWithTracking() {
const loadSubsetCalls: Array<LoadSubsetOptions> = []

const collection = createCollection<Child>({
id: `canon-children`,
getKey: (child) => child.id,
syncMode: `on-demand`,
autoIndex: `eager`,
defaultIndexType: BasicIndex,
sync: {
sync: ({ begin, write, commit, markReady }) => {
begin()
for (const child of sampleChildren) {
write({ type: `insert`, value: child })
}
commit()
markReady()
return {
loadSubset: vi.fn((options: LoadSubsetOptions) => {
loadSubsetCalls.push(options)
return Promise.resolve()
}),
}
},
},
})

return { collection, loadSubsetCalls }
}

it(`sorts inArray values in the predicate passed to loadSubset`, async () => {
const { collection: children, loadSubsetCalls } =
createChildrenCollectionWithTracking()

// Out-of-order literal — survives normalizeExpressionPaths unsorted, so
// without canonicalization loadSubset would receive [3, 1, 2].
const liveQuery = createLiveQueryCollection((q) =>
q.from({ c: children }).where(({ c }) => inArray(c.parentId, [3, 1, 2])),
)

await liveQuery.preload()

expect(loadSubsetCalls.length).toBeGreaterThan(0)

const filters = extractSimpleComparisons(loadSubsetCalls[0]!.where)
const inFilter = filters.find((f) => f.operator === `in`)

expect(inFilter).toBeDefined()
expect(inFilter!.value).toEqual([1, 2, 3])
})

it(`sorts inArray values for ordered/limited queries too`, async () => {
const { collection: children, loadSubsetCalls } =
createChildrenCollectionWithTracking()

// orderBy + limit takes the requestLimitedSnapshot path; the where is still
// canonicalized because it is normalized at subscription creation.
const liveQuery = createLiveQueryCollection((q) =>
q
.from({ c: children })
.where(({ c }) => inArray(c.parentId, [3, 1, 2]))
.orderBy(({ c }) => c.id)
.limit(2),
)

await liveQuery.preload()

expect(loadSubsetCalls.length).toBeGreaterThan(0)

const filters = extractSimpleComparisons(loadSubsetCalls[0]!.where)
const inFilter = filters.find((f) => f.operator === `in`)

expect(inFilter).toBeDefined()
expect(inFilter!.value).toEqual([1, 2, 3])
})
})
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win

Add edge case coverage for empty and single-element arrays.

The test suite is missing corner cases that should be verified per coding guidelines: empty arrays (inArray(c.parentId, [])), single-element arrays (inArray(c.parentId, [1])), and potentially string values if applicable.

🧪 Proposed edge case tests
+ it(`handles empty inArray values`, async () => {
+   const { collection: children, loadSubsetCalls } =
+     createChildrenCollectionWithTracking()
+
+   const liveQuery = createLiveQueryCollection((q) =>
+     q.from({ c: children }).where(({ c }) => inArray(c.parentId, [])),
+   )
+
+   await liveQuery.preload()
+
+   expect(loadSubsetCalls.length).toBeGreaterThan(0)
+
+   const filters = extractSimpleComparisons(loadSubsetCalls[0]!.where)
+   const inFilter = filters.find((f) => f.operator === `in`)
+
+   expect(inFilter).toBeDefined()
+   expect(inFilter!.value).toEqual([])
+ })
+
+ it(`handles single-element inArray values`, async () => {
+   const { collection: children, loadSubsetCalls } =
+     createChildrenCollectionWithTracking()
+
+   const liveQuery = createLiveQueryCollection((q) =>
+     q.from({ c: children }).where(({ c }) => inArray(c.parentId, [2])),
+   )
+
+   await liveQuery.preload()
+
+   expect(loadSubsetCalls.length).toBeGreaterThan(0)
+
+   const filters = extractSimpleComparisons(loadSubsetCalls[0]!.where)
+   const inFilter = filters.find((f) => f.operator === `in`)
+
+   expect(inFilter).toBeDefined()
+   expect(inFilter!.value).toEqual([2])
+ })

As per coding guidelines: "Test corner cases including: empty arrays/sets, single-element collections."

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/db/tests/query/canonicalize-in-values.test.ts` around lines 16 - 92,
Add tests covering empty and single-element inArray edge cases inside the
existing describe(`canonicalize inArray value order`) block: use
createChildrenCollectionWithTracking() and createLiveQueryCollection((q) =>
q.from({ c: children }).where(({ c }) => inArray(c.parentId, []))) and similarly
with [1], call liveQuery.preload(), assert loadSubsetCalls.length > 0, then use
extractSimpleComparisons(loadSubsetCalls[0]!.where) to find the `in` filter and
assert its value equals [] for the empty case and [1] for the single-element
case; duplicate both assertions for the ordered/limited path (add .orderBy(({ c
}) => c.id).limit(2)) to cover the requestLimitedSnapshot branch.

Comment on lines +47 to +66
it(`sorts inArray values in the predicate passed to loadSubset`, async () => {
const { collection: children, loadSubsetCalls } =
createChildrenCollectionWithTracking()

// Out-of-order literal — survives normalizeExpressionPaths unsorted, so
// without canonicalization loadSubset would receive [3, 1, 2].
const liveQuery = createLiveQueryCollection((q) =>
q.from({ c: children }).where(({ c }) => inArray(c.parentId, [3, 1, 2])),
)

await liveQuery.preload()

expect(loadSubsetCalls.length).toBeGreaterThan(0)

const filters = extractSimpleComparisons(loadSubsetCalls[0]!.where)
const inFilter = filters.find((f) => f.operator === `in`)

expect(inFilter).toBeDefined()
expect(inFilter!.value).toEqual([1, 2, 3])
})
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Critical: Test data doesn't catch multi-digit number sorting bug.

The test uses single-digit values [3, 1, 2], which sort correctly even with JavaScript's default lexicographic sort. However, the implementation in expressions.ts uses .sort() without a comparator, which fails for multi-digit numbers (e.g., [10, 2, 1] would sort to [1, 10, 2] instead of [1, 2, 10]). Add a test case with multi-digit numeric IDs to catch this critical bug.

🧪 Proposed additional test case
+ it(`sorts multi-digit numeric inArray values correctly`, async () => {
+   const { collection: children, loadSubsetCalls } =
+     createChildrenCollectionWithTracking()
+
+   // Multi-digit numbers expose lexicographic vs numeric sort bugs
+   const liveQuery = createLiveQueryCollection((q) =>
+     q.from({ c: children }).where(({ c }) => inArray(c.parentId, [100, 20, 3])),
+   )
+
+   await liveQuery.preload()
+
+   expect(loadSubsetCalls.length).toBeGreaterThan(0)
+
+   const filters = extractSimpleComparisons(loadSubsetCalls[0]!.where)
+   const inFilter = filters.find((f) => f.operator === `in`)
+
+   expect(inFilter).toBeDefined()
+   expect(inFilter!.value).toEqual([3, 20, 100])
+ })

As per coding guidelines: "Test corner cases including: empty arrays/sets, single-element collections, undefined vs null values, resolved promises, async race conditions, and limit/offset edge cases."

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
it(`sorts inArray values in the predicate passed to loadSubset`, async () => {
const { collection: children, loadSubsetCalls } =
createChildrenCollectionWithTracking()
// Out-of-order literal — survives normalizeExpressionPaths unsorted, so
// without canonicalization loadSubset would receive [3, 1, 2].
const liveQuery = createLiveQueryCollection((q) =>
q.from({ c: children }).where(({ c }) => inArray(c.parentId, [3, 1, 2])),
)
await liveQuery.preload()
expect(loadSubsetCalls.length).toBeGreaterThan(0)
const filters = extractSimpleComparisons(loadSubsetCalls[0]!.where)
const inFilter = filters.find((f) => f.operator === `in`)
expect(inFilter).toBeDefined()
expect(inFilter!.value).toEqual([1, 2, 3])
})
it(`sorts inArray values in the predicate passed to loadSubset`, async () => {
const { collection: children, loadSubsetCalls } =
createChildrenCollectionWithTracking()
// Out-of-order literal — survives normalizeExpressionPaths unsorted, so
// without canonicalization loadSubset would receive [3, 1, 2].
const liveQuery = createLiveQueryCollection((q) =>
q.from({ c: children }).where(({ c }) => inArray(c.parentId, [3, 1, 2])),
)
await liveQuery.preload()
expect(loadSubsetCalls.length).toBeGreaterThan(0)
const filters = extractSimpleComparisons(loadSubsetCalls[0]!.where)
const inFilter = filters.find((f) => f.operator === `in`)
expect(inFilter).toBeDefined()
expect(inFilter!.value).toEqual([1, 2, 3])
})
it(`sorts multi-digit numeric inArray values correctly`, async () => {
const { collection: children, loadSubsetCalls } =
createChildrenCollectionWithTracking()
// Multi-digit numbers expose lexicographic vs numeric sort bugs
const liveQuery = createLiveQueryCollection((q) =>
q.from({ c: children }).where(({ c }) => inArray(c.parentId, [100, 20, 3])),
)
await liveQuery.preload()
expect(loadSubsetCalls.length).toBeGreaterThan(0)
const filters = extractSimpleComparisons(loadSubsetCalls[0]!.where)
const inFilter = filters.find((f) => f.operator === `in`)
expect(inFilter).toBeDefined()
expect(inFilter!.value).toEqual([3, 20, 100])
})
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/db/tests/query/canonicalize-in-values.test.ts` around lines 47 - 66,
The test currently only uses single-digit values so it misses a numeric-sorting
bug; add a second assertion case that calls createLiveQueryCollection(... where
inArray(c.parentId, [10, 2, 1])) (or similar multi-digit set like [12,3,4]),
preload it, extract the in-filter from loadSubsetCalls (using
extractSimpleComparisons and find where operator === 'in') and assert that
inFilter!.value equals the numerically sorted array [1,2,10] (or [3,4,12]); this
will catch the bug in expressions.ts where .sort() is used without a numeric
comparator while still using the existing helpers loadSubset,
createLiveQueryCollection, and inArray.

inArray is set membership, but normalizeExpressionPaths only canonicalized ref
paths, not value order, so the same value set in a different order produced a
distinct serialized predicate (and loadSubset queryKey/cache key for adapters
that key by it) and refetched identical data. normalizeExpressionPaths now also
sorts in value arrays, and the join lazy-load predicate — previously built and
combined without normalization — is routed through it so its keys are
canonicalized too.
@v-anton v-anton marked this pull request as draft June 2, 2026 12:57
@v-anton v-anton force-pushed the fix/canonicalize-in-values branch from 75ec05f to 08b140b Compare June 2, 2026 12:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant