Fixed smart count false positive on subquery JOINs #534

Open
kevinansfield wants to merge 3 commits into main from fix/smarter-counts-subquery-join

Conversation

@kevinansfield
Member

@kevinansfield kevinansfield commented Apr 14, 2026

Summary

  • hasMultiTableSource matched /\bjoin\b/i against the fully compiled SQL, so a where id in (select … inner join …) forced count(distinct posts.id) even when the outer query was still single-table. On a 267k-post Ghost site filtered by author this was a measured ~20% regression on the count aggregate (316ms vs 257ms) for zero correctness benefit — the weedout step already guaranteed distinct rows.
  • Switched detection to walk knex's query-builder AST (_statements + _single.table) instead of regex-matching compiled SQL. Joins, unions, derived tables and comma-separated FROM lists are all read as structured data, and JOINs nested in WHERE id IN (…) subqueries live under a where grouping and are correctly ignored.
  • Since bookshelf is built on knex, the builder always exposes _statements / _single — no SQL fallback needed. The whole check collapses to a ~30-line AST walk: no regexes, no toSQL() call, no paren-stripping, no derived helpers.
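
The detection described in the bullets above can be sketched as follows. This is a minimal illustration, not the plugin's actual source: the function name `hasMultiTableSourceSketch` and the hand-built objects are assumptions, and the `_statements` / `_single.table` shapes are taken from the description in this PR rather than from knex's documented API.

```javascript
// Sketch only: mirrors the checks described in this PR, not the plugin's real code.
// `builder` is any object shaped like the knex builder internals named above.
function hasMultiTableSourceSketch(builder) {
    for (const statement of builder._statements) {
        // Top-level joins and unions change the outer row set.
        if (statement.grouping === 'join' || statement.grouping === 'union') {
            return true;
        }
    }

    const table = builder._single.table;

    // A non-string table is a derived or raw FROM source.
    if (typeof table !== 'string') {
        return true;
    }

    // A "posts, tags" style comma-separated FROM list.
    return table.includes(',');
}

// Hand-built stand-ins for builder internals (assumed shapes).
const subqueryInWhere = {
    _single: {table: 'posts'},
    // The nested JOIN belongs to the subquery, which sits under a `where` grouping.
    _statements: [{grouping: 'where', type: 'whereIn', column: 'id'}]
};
const outerJoin = {
    _single: {table: 'posts'},
    _statements: [{grouping: 'join', joinType: 'inner', table: 'posts_tags'}]
};

console.log(hasMultiTableSourceSketch(subqueryInWhere)); // false
console.log(hasMultiTableSourceSketch(outerJoin)); // true
```

Because the walk only looks at top-level groupings, a JOIN inside a `WHERE id IN (…)` subquery is never visited, which is what removes the false positive.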

Tests

Added bookshelf@1.2.0, knex@3.2.9 and sqlite3@5.1.7 as devDependencies so the plugin can be tested against real knex/bookshelf instances instead of hand-rolled stubs. 25 tests total, 100% line/branch/function coverage on lib/bookshelf-pagination.js.

  • 14 stub tests (test/pagination.test.js) — pagination metadata math, limit='all', transacting passthrough, error wrapping (NotFound / BadRequest / unknown), and the parseOptions / formatResponse utilities. None of these need a real DB.
  • 9 direct hasMultiTableSource tests (test/pagination-integration.test.js) — call the function against real knex QueryBuilders (no DB connection) and cover every branch: plain single-table, subquery-in-WHERE, innerJoin, leftJoin/joinRaw, UNION, derived table in FROM, fromRaw with multiple tables, comma-string table, and CTE-only. These pin the plugin's assumptions about knex's AST shape to the version of knex actually installed, and cover edge cases that can't be driven through fetchPage end-to-end (UNION + aggregate is semantically weird; a comma-string table fails at sqlite execution).
  • 2 end-to-end fetchPage tests — the only two scenarios that carry unique weight on top of the direct suite:
    1. An inner join that duplicates rows (p1 × two tags → 3 physical rows for 2 published posts). This is the only shape where count(*) and count(distinct) actually produce different totals, so it's the only test that catches a "we picked the wrong aggregate" bug at the result level.
    2. A subquery JOIN in WHERE — the named regression this PR is about, kept as an explicit end-to-end guard that the fix reaches production via fetchPage.
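
To see why the duplicating inner join is the one shape where the two aggregates disagree, here is a small in-memory illustration. The fixture (ids `p1`, `p2`, `t1`, `t2` and the `posts_tags` link rows) is hypothetical and only chosen to match the "3 physical rows for 2 published posts" description above; no database is involved.

```javascript
// Hypothetical fixture: two published posts; p1 carries two tags, p2 carries one.
const posts = [{id: 'p1'}, {id: 'p2'}];
const postsTags = [
    {post_id: 'p1', tag_id: 't1'},
    {post_id: 'p1', tag_id: 't2'},
    {post_id: 'p2', tag_id: 't1'}
];

// Emulates `posts INNER JOIN posts_tags ON posts.id = posts_tags.post_id`.
const joinedRows = posts.flatMap(post => postsTags
    .filter(link => link.post_id === post.id)
    .map(link => ({...post, ...link})));

// count(*) counts physical rows; count(distinct posts.id) counts posts.
const countStar = joinedRows.length;
const countDistinct = new Set(joinedRows.map(row => row.id)).size;

console.log(countStar); // 3
console.log(countDistinct); // 2
```

On a single-table query the two numbers are always equal, so only a join-shaped fixture like this one can expose a wrongly chosen aggregate through `pagination.total`.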

Each e2e test listens on knex's query event to assert which count aggregate the plugin chose, and asserts pagination.total against a value that's only correct if that aggregate was picked — so "we picked count(*)" and "the result is right" are verified together.
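
The capture pattern described above might look roughly like this. It is a sketch under assumptions: a Node `EventEmitter` stands in for the knex instance, the helper name `captureCountQueries` is hypothetical, and the only knex behaviour relied on is that `query` events carry the compiled SQL in a `sql` field. A real test would await `fetchPage` inside the callback instead of emitting events by hand.

```javascript
const {EventEmitter} = require('events');

// Stand-in for a knex instance (assumption: real tests would pass the knex object).
const fakeKnex = new EventEmitter();

// Collects the SQL of every count query emitted while `run` executes.
function captureCountQueries(emitter, run) {
    const seen = [];
    const listener = (data) => {
        if (/\bcount\(/i.test(data.sql)) {
            seen.push(data.sql);
        }
    };

    emitter.on('query', listener);
    try {
        run();
    } finally {
        // Always detach so later tests do not see stale listeners.
        emitter.removeListener('query', listener);
    }
    return seen;
}

const captured = captureCountQueries(fakeKnex, () => {
    // Emitted by hand here; fetchPage would trigger these in a real test.
    fakeKnex.emit('query', {sql: 'select count(*) as aggregate from `posts`'});
    fakeKnex.emit('query', {sql: 'select `posts`.* from `posts` limit ?'});
});

console.log(captured.length); // 1
console.log(captured[0].includes('count(*)')); // true
```

Pairing this capture with an assertion on `pagination.total` is what ties "which aggregate ran" to "the total is correct" in a single test.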

Notes on upstream consumers

This is strictly safe for anyone already using useSmartCount: true: every case the old code flagged as multi-table is still flagged by the AST walk. The change only removes false positives (subquery JOINs) and adds one previously-missed case (derived tables in FROM). useBasicCount and the default distinct-count behaviour are untouched.

@coderabbitai

coderabbitai bot commented Apr 14, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Walkthrough

Replaces SQL-text regex parsing for multi-table detection with inspection of Knex query-builder internals. hasMultiTableSource(queryBuilder) now checks queryBuilder._statements for statements with grouping 'join' or 'union', and inspects queryBuilder._single.table for non-string sources or comma-containing table strings. Removes the prior compiled-SQL regex helpers. Exports paginationUtils.hasMultiTableSource. Tests updated: unit tests stub builder internals; a new integration suite uses an in-memory SQLite DB to exercise detection and count-query selection. Dev dependencies updated for Bookshelf/Knex/SQLite.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Detailed analysis

  • Production code:
    • Removed SQL-regex helpers and compiled-SQL parsing.
    • Implemented AST/internal inspection via _statements and _single.table.
    • Exported paginationUtils.hasMultiTableSource.
    • Adjusted fetchPage to branch count-query behavior using the new detection.
  • Tests:
    • Unit tests changed to stub _statements/_single instead of overriding toSQL().
    • Added integration/E2E tests using an in-memory SQLite DB (Bookshelf + Knex) that capture compiled SQL and validate detection and count-selection behavior across many query shapes (joins, unions, derived FROMs, joinRaw, comma FROM, subqueries, CTEs).
    • Removed previous assertions tied to compiled-SQL regex patterns.
  • Package metadata:
    • Added devDependencies: bookshelf, knex, sqlite3.
  • Review focus:
    • Validate that all realistic Knex query-builder shapes (joins, unions, derived/raw/QueryBuilder FROMs, comma-separated table strings, CTE-only queries) are correctly detected by the new checks.
    • Confirm exported API change is intended and documented.
    • Ensure tests accurately model Knex internals and that integration tests are deterministic and performant in CI.
    • Verify no remaining code paths rely on compiled SQL parsing.
🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 16.67%, which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title accurately and concisely describes the main change: fixing a false positive in smart count detection when JOINs are nested in subqueries. |
| Description check | ✅ Passed | The pull request description clearly explains the problem (false positives on subquery JOINs causing performance regressions), the solution (switching from regex-based SQL inspection to AST-based detection using knex internals), and provides concrete examples and test coverage details. |

@kevinansfield kevinansfield changed the title from "Fix bookshelf-pagination smart count false positive on subquery JOINs" to "Fixed smart count false positive on subquery JOINs" Apr 14, 2026
@codecov-commenter

codecov-commenter commented Apr 14, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 99.37%. Comparing base (00d3124) to head (a7b3832).

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #534      +/-   ##
==========================================
+ Coverage   99.23%   99.37%   +0.14%     
==========================================
  Files         127      127              
  Lines        6404     6406       +2     
  Branches     1227     1231       +4     
==========================================
+ Hits         6355     6366      +11     
+ Misses         49       40       -9     

@coderabbitai coderabbitai bot left a comment

🧹 Nitpick comments (1)
packages/bookshelf-pagination/test/pagination.test.js (1)

364-381: Clarify why CTE-only queries don't require distinct count.

The test correctly expects count(*) for a CTE-only query, but the reasoning may not be obvious to future maintainers. CTEs (WITH clauses) define named subqueries but don't inherently duplicate rows in the outer query—the outer query still selects from a single table.

Consider adding a brief comment explaining this behavior, similar to other tests.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/bookshelf-pagination/test/pagination.test.js` around lines 364 -
381, Add a brief inline comment in the test "useSmartCount uses count(*) when
AST has only a CTE (with) grouping" explaining that a CTE (WITH clause) defines
a named subquery but does not duplicate rows in the outer query, so the outer
query still selects from a single table and a simple count(*) is sufficient;
place this comment near the model.query override or right before the fetchPage
call (references: model.query override, fetchPage({..., useSmartCount: true}),
and the final assertion modelState.rawCalls[0] === 'count(*) as aggregate') so
future maintainers understand why distinct counting isn't needed.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 0dd597c9-0443-4307-8bb3-7b8e7020ac53

📥 Commits

Reviewing files that changed from the base of the PR and between 00d3124 and 0ea99b3.

📒 Files selected for processing (2)
  • packages/bookshelf-pagination/lib/bookshelf-pagination.js
  • packages/bookshelf-pagination/test/pagination.test.js

hasMultiTableSource ran a JOIN regex against the full compiled SQL, so
any query with `where id in (select ... inner join ...)` fell back to
`count(distinct posts.id)` even though the outer query was still single
table. On a 267k-post Ghost site filtered by author this caused a ~20%
regression in the count aggregate for no benefit — the weedout step
already guaranteed distinct rows.
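
The false positive is easy to reproduce in isolation. The snippet below is illustrative only: the SQL string, including the `posts_authors` and `users` table names, is a hypothetical stand-in for what a "posts by author" filter might compile to, and the `outerOnly` slice works for this particular string, not as a general parser.

```javascript
// Hypothetical compiled SQL: the outer query reads from a single table.
const compiledSql = [
    'select `posts`.* from `posts`',
    'where `posts`.`id` in (',
    'select `posts_authors`.`post_id` from `posts_authors`',
    'inner join `users` on `users`.`id` = `posts_authors`.`author_id`',
    'where `users`.`slug` = ?)'
].join(' ');

// The old-style check: any JOIN keyword anywhere in the text counts.
const regexSaysMultiTable = /\bjoin\b/i.test(compiledSql);

// In this string, everything before the first "(" is the outer query.
const outerOnly = compiledSql.slice(0, compiledSql.indexOf('('));
const outerHasJoin = /\bjoin\b/i.test(outerOnly);

console.log(regexSaysMultiTable); // true
console.log(outerHasJoin); // false
```

The regex cannot tell which nesting level a keyword belongs to, which is why the fix reads the builder's structure instead of its compiled text.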

Switched the detection to walk knex's query-builder AST (`_statements`
and `_single.table`) so joins, unions, derived/raw FROM sources and
comma-separated FROM lists are read as structured data. Subquery JOINs
live under a `where` grouping and are correctly ignored.

Since bookshelf is built on knex, the builder always exposes
`_statements` / `_single`, so no SQL fallback is needed: the whole
check collapses to a ~30-line AST walk with no regexes, no toSQL() call
and no paren-stripping.
@kevinansfield kevinansfield force-pushed the fix/smarter-counts-subquery-join branch from 47695a6 to 8179605 Compare April 14, 2026 15:39

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (1)
packages/bookshelf-pagination/test/pagination.test.js (1)

80-97: Add one real query-builder smoke test alongside this stub helper.

These tests only emulate _statements / _single with plain objects, so they will not catch a Knex/Bookshelf upgrade that changes the actual private builder shape hasMultiTableSource() depends on. One smoke test built from a real builder would make this suite much less brittle.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/bookshelf-pagination/test/pagination.test.js` around lines 80 - 97,
Add a real query-builder smoke test (in addition to tests using stubCountQuery)
that constructs an actual Knex/Bookshelf query with a multi-table source (e.g.,
call model.query().from('tableA').join('tableB', ...) or otherwise produce
multiple sources) and then invoke the function under test (the helper that
checks multi-table sources, e.g., hasMultiTableSource or the pagination
entrypoint that forks count query) to assert it returns the expected
boolean/behavior; do not modify stubCountQuery—instead write a separate test
that does not replace model.query(), uses the real builder returned by
model.query(), builds a multi-table query, and asserts the correct detection of
multi-table sources so upgrades to Knex/Bookshelf will be caught.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@packages/bookshelf-pagination/lib/bookshelf-pagination.js`:
- Around line 38-47: The code detects non-string/derived FROM sources via
queryBuilder._single.table (variable table) but later still emits count(distinct
${tableName}.${idAttribute}) in the useSmartCount branch, which produces invalid
SQL for derived tables; modify the useSmartCount logic so that when
queryBuilder._single.table is not a plain string (or contains commas) it does
not choose the distinct-by-table form—i.e., treat that case as multi-source and
instead fall back to the safe simple count path (disable useSmartCount or avoid
referencing tableName/idAttribute) so count SQL never references out-of-scope
columns for QueryBuilder/Raw FROMs.
- Around line 25-43: The hasMultiTableSource function assumes internals exist
and directly dereferences queryBuilder._statements and
queryBuilder._single.table which can throw; guard these accesses by first
checking that queryBuilder and queryBuilder._statements are arrays before
iterating and that queryBuilder._single exists (and has a table property) before
reading table, falling back to the safe multi-source result (i.e., treat as
multi-table) when those internals are missing or of unexpected types so the code
will then use the count(distinct ...) path instead; update checks around the
symbols queryBuilder._statements and queryBuilder._single.table in
hasMultiTableSource accordingly.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 19cefd58-febf-4215-972c-39cddbe1a1ee

📥 Commits

Reviewing files that changed from the base of the PR and between 80b7894 and 47695a6.

📒 Files selected for processing (2)
  • packages/bookshelf-pagination/lib/bookshelf-pagination.js
  • packages/bookshelf-pagination/test/pagination.test.js

@coderabbitai coderabbitai bot left a comment

♻️ Duplicate comments (2)
packages/bookshelf-pagination/lib/bookshelf-pagination.js (2)

25-43: ⚠️ Potential issue | 🟠 Major

Guard private Knex fields before dereferencing them.

Line 26 and Line 42 assume _statements / _single always exist. If model.query() returns a wrapped builder, a partial test stub, or a future Knex version reshapes these internals, fetchPage() now throws synchronously before the existing promise error handling runs. Default this helper to the conservative path instead of iterating private fields unconditionally.

🛡️ Minimal hardening
 function hasMultiTableSource(queryBuilder) {
+    if (!queryBuilder || !Array.isArray(queryBuilder._statements) || !queryBuilder._single) {
+        return true;
+    }
+
     for (const statement of queryBuilder._statements) {
         // Any outer join duplicates the base row set.
         if (statement.grouping === 'join') {
             return true;
         }
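
The hardening diff above is truncated, so here is one way the complete guarded helper could look. This is a sketch of the reviewer's suggestion, not code from the PR: the name `hasMultiTableSourceGuarded` is hypothetical, and the rest of the body follows the join/union/table checks described in the PR summary.

```javascript
// Sketch: returns true (the conservative answer) whenever the private
// builder fields are missing or have an unexpected shape.
function hasMultiTableSourceGuarded(queryBuilder) {
    const statements = queryBuilder && queryBuilder._statements;
    const single = queryBuilder && queryBuilder._single;

    if (!Array.isArray(statements) || !single) {
        return true;
    }

    const hasJoinOrUnion = statements.some(statement => (
        Boolean(statement) && (statement.grouping === 'join' || statement.grouping === 'union')
    ));
    if (hasJoinOrUnion) {
        return true;
    }

    const table = single.table;
    // Non-string (derived/raw/missing) or comma-separated sources are multi-source.
    return typeof table !== 'string' || table.includes(',');
}

console.log(hasMultiTableSourceGuarded(undefined)); // true
console.log(hasMultiTableSourceGuarded({})); // true
console.log(hasMultiTableSourceGuarded({_statements: [], _single: {table: 'posts'}})); // false
```

Returning `true` on unknown shapes keeps the slower distinct count rather than throwing synchronously before `fetchPage`'s promise error handling runs.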
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/bookshelf-pagination/lib/bookshelf-pagination.js` around lines 25 -
43, The helper hasMultiTableSource currently dereferences private fields
queryBuilder._statements and queryBuilder._single without checks; modify
hasMultiTableSource to first verify that queryBuilder._statements is an array
before looping (if not, treat as multi-source and return true), and similarly
check that queryBuilder._single exists and that queryBuilder._single.table is
present and a string before using it (otherwise return true). Update the logic
in hasMultiTableSource to take the conservative path (return true) whenever
those private internals are missing or of unexpected types so we avoid
synchronous crashes.

38-47: ⚠️ Potential issue | 🟠 Major

Derived FROM sources need a different count strategy than joins/unions.

After Line 43 returns true, Line 237 still falls back to count(distinct ${tableName}.${idAttribute}). That works for normal joins, but FROM (subquery) AS alias usually puts ${tableName} out of scope, so useSmartCount can turn a valid query into a SQL error/BadRequestError. This helper needs to distinguish “count(*) is unsafe” from “the base table is still addressable” instead of collapsing both into one boolean.
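
One possible shape for the richer signal suggested here is sketched below. Everything in it is hypothetical: `classifySource` and `chooseCountExpression` are illustrative names, and returning `null` for derived sources is just one way to tell the caller that no table-scoped aggregate is safe and the query needs another strategy (for example, wrapping it).

```javascript
// Hypothetical: classify the outer FROM source instead of returning one boolean.
function classifySource(queryBuilder) {
    const statements = Array.isArray(queryBuilder._statements) ? queryBuilder._statements : [];
    const table = queryBuilder._single ? queryBuilder._single.table : undefined;

    // A non-string table means a derived or raw FROM source.
    const derivedFrom = typeof table !== 'string';
    const multiSource = derivedFrom
        || table.includes(',')
        || statements.some(s => s.grouping === 'join' || s.grouping === 'union');

    return {multiSource, derivedFrom};
}

// Hypothetical: pick the aggregate expression, or null when none is safe.
function chooseCountExpression(queryBuilder, tableName, idAttribute) {
    const {multiSource, derivedFrom} = classifySource(queryBuilder);

    if (!multiSource) {
        return 'count(*)';
    }
    if (derivedFrom) {
        // `${tableName}.${idAttribute}` may be out of scope here.
        return null;
    }
    return `count(distinct ${tableName}.${idAttribute})`;
}

const joined = {_single: {table: 'posts'}, _statements: [{grouping: 'join'}]};
const derived = {_single: {table: {isSubquery: true}}, _statements: []};

console.log(chooseCountExpression(joined, 'posts', 'id')); // count(distinct posts.id)
console.log(chooseCountExpression(derived, 'posts', 'id')); // null
```

Separating "count(*) is unsafe" from "the base table is addressable" lets joins keep the distinct-id count while derived sources avoid emitting a column reference that may not resolve.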

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/bookshelf-pagination/lib/bookshelf-pagination.js` around lines 38 -
47, The current helper sets a single boolean when the FROM is non-standard
(queryBuilder._single.table) which later causes useSmartCount logic to emit
count(distinct ${tableName}.${idAttribute}) even for derived FROMs (FROM
(subquery) AS alias) where ${tableName} is out of scope; change the helper to
return a richer signal (e.g., { multiSource: true, derivedFrom: true } or two
booleans) or add a separate function/isDerivedFrom flag that specifically
detects a derived FROM (table is non-string or a Raw that wraps a subquery),
then update the downstream logic in useSmartCount (and the code path that builds
count(distinct ${tableName}.${idAttribute})) to check the derivedFrom flag and
avoid emitting table-scoped counts for derived FROMs (use count(*) or wrap the
query instead) while still allowing distinct id counts for multi-source/joins
when derivedFrom is false.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 82cc28a9-a75e-4ea2-b0f7-bbfc65ee0fc3

📥 Commits

Reviewing files that changed from the base of the PR and between 47695a6 and 8179605.

📒 Files selected for processing (2)
  • packages/bookshelf-pagination/lib/bookshelf-pagination.js
  • packages/bookshelf-pagination/test/pagination.test.js
🚧 Files skipped from review as they are similar to previous changes (1)
  • packages/bookshelf-pagination/test/pagination.test.js

The stub-based smart-count tests were testing our assumptions about
knex's internal AST (`_statements` / `_single.table`), not knex itself —
if knex shipped a version that reorganised those fields, the tests
would keep passing and production would break. This replaces the 7
AST-stub tests with two layers of real-knex coverage:

1. Direct `hasMultiTableSource` tests against real knex QueryBuilders
   (no DB connection) — cover every branch including edge cases like
   UNION and a comma-containing `_single.table` string that can't be
   exercised cleanly end-to-end.
2. End-to-end `fetchPage` tests against a real bookshelf model backed
   by an in-memory sqlite database. Each test asserts both the chosen
   count aggregate (via knex's `query` event) and that the resulting
   total matches what the fetch actually returns — so "we picked
   count(*)" and "count(*) gives the right answer on this shape" are
   verified together.

`hasMultiTableSource` is now exposed on `paginationUtils` so the direct
tests can call it; it's used the same way `parseOptions` and
`formatResponse` already are.

Adds `bookshelf@1.2.0`, `knex@3.2.9` (matches the monorepo) and
`sqlite3@5.1.7` as devDependencies. 31 tests, 100% line/branch/function
coverage on `lib/bookshelf-pagination.js`.

@coderabbitai coderabbitai bot left a comment

♻️ Duplicate comments (2)
packages/bookshelf-pagination/lib/bookshelf-pagination.js (2)

38-47: ⚠️ Potential issue | 🟠 Major

Derived/raw FROM sources still assume the outer source is named posts.

Marking every non-string table as multi-source sends these queries down the count(distinct posts.id) path, but that identifier is only valid when the derived/raw source exposes the model table name. The new integration case uses .as('posts'), so it doesn't cover .as('sub') or raw aliases where posts.id is out of scope.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/bookshelf-pagination/lib/bookshelf-pagination.js` around lines 38 -
47, The code currently treats any non-string queryBuilder._single.table as
multi-source and later emits count(distinct posts.id) which wrongly hardcodes
the outer source name; instead, when table is a derived/raw object check whether
it exposes an alias matching the model's table name (use the model's tableName,
e.g. modelClass.prototype.tableName or queryBuilder._single.model &&
queryBuilder._single.model.prototype.tableName) and only treat it as
multi-source if no such alias exists — otherwise use count(distinct
<model.tableName>.id) dynamically; update the logic around
queryBuilder._single.table and the count-generation to reference the model's
actual tableName rather than the literal "posts".

25-47: ⚠️ Potential issue | 🟠 Major

Harden hasMultiTableSource() against non-Knex queryBuilder objects.

hasMultiTableSource() unconditionally dereferences _statements and _single.table before the useBasicCount branch is evaluated. Callers who explicitly set useBasicCount: true to skip smart-count logic will still fail if the queryBuilder doesn't expose these Knex internals (e.g., stubs, mocks, or custom query objects).

Use optional chaining and defensive checks. If either _statements or _single.table cannot be safely accessed, return true to treat the query as multi-source, which is the safer default:

Suggested fix
 function hasMultiTableSource(queryBuilder) {
-    for (const statement of queryBuilder._statements) {
+    const statements = Array.isArray(queryBuilder?._statements) ? queryBuilder._statements : null;
+    if (!statements) {
+        return true;
+    }
+
+    for (const statement of statements) {
         // Any outer join duplicates the base row set.
         if (statement.grouping === 'join') {
             return true;
         }
         // UNION combines multiple SELECTs — the outer row set comes
@@
-    const table = queryBuilder._single.table;
+    const table = queryBuilder?._single?.table;
+    if (typeof table === 'undefined') {
+        return true;
+    }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/bookshelf-pagination/lib/bookshelf-pagination.js` around lines 25 -
47, The hasMultiTableSource function dereferences queryBuilder._statements and
queryBuilder._single.table without guarding against non-Knex objects; modify
hasMultiTableSource to first check that queryBuilder._statements is an array
before iterating (if not, return true) and that queryBuilder._single exists and
queryBuilder._single.table is safely readable (if not, return true), and use
optional chaining (or explicit checks) so callers who set useBasicCount: true
but pass mocks/stubs won't throw — keep the original join/union and "table
includes comma" logic but only run them after these defensive checks.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 19b57a72-92b3-4597-9a5a-db4528e077e4

📥 Commits

Reviewing files that changed from the base of the PR and between 8179605 and f44a861.

⛔ Files ignored due to path filters (1)
  • yarn.lock is excluded by !**/yarn.lock, !**/*.lock
📒 Files selected for processing (4)
  • packages/bookshelf-pagination/lib/bookshelf-pagination.js
  • packages/bookshelf-pagination/package.json
  • packages/bookshelf-pagination/test/pagination-integration.test.js
  • packages/bookshelf-pagination/test/pagination.test.js
✅ Files skipped from review due to trivial changes (1)
  • packages/bookshelf-pagination/package.json

The direct hasMultiTableSource describe block already exercises every
query shape against real knex — the e2e suite was 1-for-1 duplication
for six of its eight tests. Kept only the two that carry unique weight:

1. An inner join that duplicates rows (p1 × two tags → 3 physical rows
   for 2 published posts). This is the only shape where count(*) and
   count(distinct) actually produce different totals, so it's the only
   e2e test that would catch a "we picked the wrong aggregate" bug at
   the result level.
2. A subquery JOIN in WHERE — the named regression this PR is about,
   kept as an explicit end-to-end guard.

Dropped the plain-filter, joinRaw, derived-table, CTE, useBasicCount
and default-distinct cases from the e2e suite; all are already covered
more thoroughly by the direct hasMultiTableSource tests above.
@kevinansfield kevinansfield requested a review from 9larsons April 14, 2026 16:24