Fix subqueries in WHERE clause causing parse error #31

Merged
nicosuave merged 4 commits into main from nicosuave/fix-subquery-where
Apr 11, 2026
@nicosuave
Member

Closes #26

Summary

  • The fallback qualifier functions (qualify_where_for_inner_fallback, qualify_where_for_outer_fallback, qualify_where_for_inner_with_dimensions) had no subquery awareness, so WHERE year IN (SELECT year FROM sales) was rewritten as _inner.year IN (_inner.select _inner.year _inner.from _inner.sales)
  • Added try_consume_subquery_parens() helper that detects when a ( opens a subquery (next token is SELECT or WITH) and passes the entire subquery through verbatim
  • All three fallback qualifier functions now use this helper
  • Added tests covering IN (SELECT ...), IN (WITH ...), nested subqueries, and mixed conditions with subqueries
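
The detection and pass-through behaviour described in the summary could look roughly like the sketch below. The function name `consume_subquery_sketch` and its signature are assumptions made for illustration; the actual signature of try_consume_subquery_parens is not shown in this PR page. The sketch covers only keyword detection and nested parentheses, not quotes or comments.

```rust
use std::iter::Peekable;
use std::str::Chars;

/// Hypothetical stand-in for the PR's helper. Call it right after the caller
/// has consumed an opening '('. Returns the subquery text (without the outer
/// parentheses) when the paren opens a subquery. Returns None and leaves
/// `chars` untouched when the first token is not SELECT or WITH.
fn consume_subquery_sketch(chars: &mut Peekable<Chars<'_>>) -> Option<String> {
    // Look ahead on a clone so nothing is consumed on a non-match.
    let mut lookahead = chars.clone();
    while lookahead.peek().map_or(false, |c| c.is_whitespace()) {
        lookahead.next();
    }
    let mut first = String::new();
    while let Some(&c) = lookahead.peek() {
        if c.is_alphanumeric() || c == '_' {
            first.push(c);
            lookahead.next();
        } else {
            break;
        }
    }
    if !first.eq_ignore_ascii_case("select") && !first.eq_ignore_ascii_case("with") {
        return None;
    }

    // Copy everything up to the matching ')' verbatim.
    let mut depth = 1usize;
    let mut content = String::new();
    while let Some(c) = chars.next() {
        match c {
            '(' => depth += 1,
            ')' => {
                depth -= 1;
                if depth == 0 {
                    return Some(content);
                }
            }
            _ => {}
        }
        content.push(c);
    }
    None // unbalanced input: the iterator is exhausted at this point
}
```

Given the text after the opening paren of `year IN (SELECT year FROM sales) AND x = 1`, the sketch returns `SELECT year FROM sales` and leaves ` AND x = 1` for the caller to qualify as usual.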

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 6cd2a5452d

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

// Collect first token
let mut first_token = String::new();
while let Some(&c) = lookahead.peek() {
    if c.is_alphabetic() || c == '_' {
P2 Badge Read full identifier before classifying subquery parentheses

In try_consume_subquery_parens, the lookahead token stops at the first non-letter (is_alphabetic/_), so an expression like (select1 > 0) is read as first token SELECT and incorrectly treated as a subquery. On the fallback path (the same path this patch targets when parser_ffi::qualify_expression fails), that causes the entire parenthesized expression to bypass qualification, leaving select1/with1 unqualified and producing wrong or failing SQL where _inner.select1 (or outer alias) is required.
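
The misclassification can be reproduced in isolation. The sketch below contrasts a letters-only lookahead with one that reads the whole identifier. All three function names are hypothetical and exist only for this illustration.

```rust
/// Letters-only variant: stops at the first character that is not a letter
/// or '_', so "select1" yields "select".
fn first_token_letters_only(s: &str) -> String {
    s.trim_start()
        .chars()
        .take_while(|c| c.is_alphabetic() || *c == '_')
        .collect()
}

/// Variant that reads the whole identifier, digits included.
fn first_token_full(s: &str) -> String {
    s.trim_start()
        .chars()
        .take_while(|c| c.is_alphanumeric() || *c == '_')
        .collect()
}

/// True when the token is one of the two keywords that open a subquery.
fn is_subquery_keyword(token: &str) -> bool {
    token.eq_ignore_ascii_case("select") || token.eq_ignore_ascii_case("with")
}
```

With the letters-only variant, the text `select1 > 0)` is classified as a subquery. With the full-identifier variant the token is `select1`, which matches neither keyword, so the expression goes through normal qualification.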

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: e1a4ad6d4b

Comment on lines +3432 to +3436
'\'' => {
    // Handle string literals inside subquery
    while let Some(next) = chars.next() {
        content.push(next);
        if next == '\'' {
P2 Badge Handle quoted identifiers when scanning subquery parentheses

The new fallback subquery scanner tracks (/) depth but only skips single-quoted literals, so parentheses inside double-quoted identifiers are treated as structural. In fallback mode (when parser_ffi::qualify_expression fails), an expression like id in (select "a)" from t) AND region = 'US' can terminate the subquery early and then qualify the remaining text as outer WHERE content, yielding malformed SQL or misqualified predicates. This should also ignore double-quoted identifiers (and ideally other quoted/comment forms) while counting depth.
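
One way to treat quoted regions as opaque while counting depth is sketched below. The function `find_matching_paren` is a hypothetical, index-based simplification, not the PR's code. It assumes the opening paren has already been consumed and does not handle comments.

```rust
/// Returns the byte index of the ')' that matches an already-consumed '(',
/// treating text inside '...' and "..." as opaque.
fn find_matching_paren(s: &str) -> Option<usize> {
    let mut depth = 1usize;
    let mut chars = s.char_indices();
    while let Some((i, c)) = chars.next() {
        match c {
            '\'' | '"' => {
                // Skip to the closing quote of the same kind. A doubled quote
                // ('' or "") closes and immediately reopens a quoted region,
                // which the outer loop handles on its next iteration.
                for (_, q) in chars.by_ref() {
                    if q == c {
                        break;
                    }
                }
            }
            '(' => depth += 1,
            ')' => {
                depth -= 1;
                if depth == 0 {
                    return Some(i);
                }
            }
            _ => {}
        }
    }
    None // no matching ')' found
}
```

For the reviewer's example, the text after the opening paren is `select "a)" from t) AND region = 'US'`. The sketch skips the `)` inside `"a)"` and reports the paren after `t`, so ` AND region = 'US'` remains outer WHERE content.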

The fallback qualifier functions (qualify_where_for_inner_fallback,
qualify_where_for_outer_fallback, qualify_where_for_inner_with_dimensions)
process SQL text character-by-character to prefix column references with
_inner./_outer. aliases. They had no concept of parenthesis depth or
subquery scope, so tokens inside subqueries like
`year IN (SELECT year FROM sales)` were incorrectly prefixed, producing
`_inner.year IN (_inner.select _inner.year _inner.from _inner.sales)`.

Add try_consume_subquery_parens() helper that detects when an opening
paren starts a subquery (next token is SELECT or WITH) and consumes the
entire subquery verbatim, preserving nested parens and string literals.
All three fallback functions now use this helper.

The lookahead in try_consume_subquery_parens used is_alphabetic to
collect the first token, so "select1" was read as "select" (stopping
at the digit) and falsely matched the SELECT keyword check. Changed
to is_alphanumeric so the full identifier is collected before
comparison. Added tests for this edge case.

Double-quoted identifiers like "a)" could cause the subquery depth
tracker to prematurely close, misidentifying the end of the subquery.
Now try_consume_subquery_parens skips both single-quoted strings and
double-quoted identifiers when tracking parenthesis depth.
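
How a character-by-character qualifier might hand subqueries to such a helper, as the commit notes above describe, is sketched here. Both `qualify_where_sketch` and `subquery_end` are hypothetical and heavily simplified: the keyword list is an assumption, function calls would be prefixed like columns, and quoted identifiers and comments are ignored.

```rust
/// Hypothetical fallback qualifier: prefixes bare identifiers with `alias.`
/// but copies subqueries and string literals through unchanged.
fn qualify_where_sketch(sql: &str, alias: &str) -> String {
    // Assumed keyword list, for illustration only.
    const KEYWORDS: [&str; 6] = ["and", "or", "not", "in", "is", "null"];
    let chars: Vec<char> = sql.chars().collect();
    let mut out = String::new();
    let mut i = 0;
    while i < chars.len() {
        let c = chars[i];
        if c == '(' {
            if let Some(end) = subquery_end(&chars, i) {
                out.extend(&chars[i..=end]); // whole subquery, verbatim
                i = end + 1;
                continue;
            }
            out.push(c);
            i += 1;
        } else if c == '\'' {
            // Copy a string literal through to its closing quote.
            let start = i;
            i += 1;
            while i < chars.len() && chars[i] != '\'' {
                i += 1;
            }
            i = (i + 1).min(chars.len());
            out.extend(&chars[start..i]);
        } else if c.is_alphabetic() || c == '_' {
            let start = i;
            while i < chars.len() && (chars[i].is_alphanumeric() || chars[i] == '_') {
                i += 1;
            }
            let word: String = chars[start..i].iter().collect();
            if !KEYWORDS.iter().any(|k| word.eq_ignore_ascii_case(k)) {
                out.push_str(alias);
                out.push('.');
            }
            out.push_str(&word);
        } else {
            out.push(c);
            i += 1;
        }
    }
    out
}

/// Index of the ')' closing a subquery whose '(' is at `open`, or None when
/// the paren does not open a subquery (first token is not SELECT or WITH).
fn subquery_end(chars: &[char], open: usize) -> Option<usize> {
    let rest: String = chars[open + 1..].iter().collect();
    let first: String = rest
        .trim_start()
        .chars()
        .take_while(|c| c.is_alphanumeric() || *c == '_')
        .collect();
    if !first.eq_ignore_ascii_case("select") && !first.eq_ignore_ascii_case("with") {
        return None;
    }
    let mut depth = 0usize;
    for (j, &c) in chars.iter().enumerate().skip(open) {
        match c {
            '(' => depth += 1,
            ')' => {
                depth -= 1;
                if depth == 0 {
                    return Some(j);
                }
            }
            _ => {}
        }
    }
    None
}
```

Under these assumptions, `year IN (SELECT year FROM sales) AND region = 'US'` with alias `_inner` becomes `_inner.year IN (SELECT year FROM sales) AND _inner.region = 'US'`, while `(select1 > 0)` is still qualified as `(_inner.select1 > 0)`.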
nicosuave force-pushed the nicosuave/fix-subquery-where branch from e1a4ad6 to 0e293df on April 11, 2026 02:19
@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 0e293df068

Comment on lines +3468 to +3472
'(' => depth += 1,
')' => {
    depth -= 1;
    if depth == 0 {
        return Some(content);
P2 Badge Skip comments when tracking subquery parenthesis depth

In fallback mode (qualify_where_for_*_fallback), try_consume_subquery_parens decrements depth on every ) but never skips -- ... or /* ... */ comments, so a ) inside a comment can terminate the subquery scan early. For example, id in (select id /* ) */ from t) AND region = 'US' will return from the helper too soon and the remainder is then re-qualified as outer WHERE text, producing malformed or misqualified SQL.
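
Skipping comments during the depth scan might be sketched as follows. The function `find_matching_paren_skip_comments` is hypothetical and index-based, assumes the opening paren has already been consumed, assumes block comments do not nest, and leaves out quote handling for brevity.

```rust
/// Returns the byte index of the ')' that matches an already-consumed '(',
/// ignoring parentheses inside `-- ...` and `/* ... */` comments.
fn find_matching_paren_skip_comments(s: &str) -> Option<usize> {
    let mut depth = 1usize;
    let mut chars = s.char_indices().peekable();
    while let Some((i, c)) = chars.next() {
        match c {
            '-' if matches!(chars.peek(), Some(&(_, '-'))) => {
                // Line comment: runs to the end of the line or of the input.
                for (_, n) in chars.by_ref() {
                    if n == '\n' {
                        break;
                    }
                }
            }
            '/' if matches!(chars.peek(), Some(&(_, '*'))) => {
                chars.next(); // consume the '*' that opens the comment
                // Block comment: runs to the next "*/" (no nesting assumed).
                let mut prev = '\0';
                for (_, n) in chars.by_ref() {
                    if prev == '*' && n == '/' {
                        break;
                    }
                    prev = n;
                }
            }
            '(' => depth += 1,
            ')' => {
                depth -= 1;
                if depth == 0 {
                    return Some(i);
                }
            }
            _ => {}
        }
    }
    None // no matching ')' found
}
```

For the reviewer's example, the text after the opening paren is `select id /* ) */ from t) AND region = 'US'`. The sketch ignores the `)` inside the block comment and stops at the paren after `t`. A lone `-` or `/`, as in `a - b`, is treated as an ordinary operator.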

A closing paren inside a line comment (-- ...) or block comment
(/* ... */) could prematurely terminate the subquery scan in
try_consume_subquery_parens. Now both comment forms are consumed
without affecting depth tracking.
@nicosuave nicosuave merged commit fbb77a7 into main Apr 11, 2026
26 checks passed

Development

Successfully merging this pull request may close these issues.

Subqueries in where clause cause error

1 participant