Fix subqueries in WHERE clause causing parse error#31
Conversation
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 6cd2a5452d
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
yardstick-rs/src/sql/measures.rs
Outdated
| // Collect first token | ||
| let mut first_token = String::new(); | ||
| while let Some(&c) = lookahead.peek() { | ||
| if c.is_alphabetic() || c == '_' { |
There was a problem hiding this comment.
Read full identifier before classifying subquery parentheses
In try_consume_subquery_parens, the lookahead token stops at the first non-letter (is_alphabetic/_), so an expression like (select1 > 0) is read as first token SELECT and incorrectly treated as a subquery. On the fallback path (the same path this patch targets when parser_ffi::qualify_expression fails), that causes the entire parenthesized expression to bypass qualification, leaving select1/with1 unqualified and producing wrong or failing SQL where _inner.select1 (or outer alias) is required.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: e1a4ad6d4b
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| '\'' => { | ||
| // Handle string literals inside subquery | ||
| while let Some(next) = chars.next() { | ||
| content.push(next); | ||
| if next == '\'' { |
There was a problem hiding this comment.
Handle quoted identifiers when scanning subquery parentheses
The new fallback subquery scanner tracks (/) depth but only skips single-quoted literals, so parentheses inside double-quoted identifiers are treated as structural. In fallback mode (when parser_ffi::qualify_expression fails), an expression like id in (select "a)" from t) AND region = 'US' can terminate the subquery early and then qualify the remaining text as outer WHERE content, yielding malformed SQL or misqualified predicates. This should also ignore double-quoted identifiers (and ideally other quoted/comment forms) while counting depth.
Useful? React with 👍 / 👎.
The fallback qualifier functions (qualify_where_for_inner_fallback, qualify_where_for_outer_fallback, qualify_where_for_inner_with_dimensions) process SQL text character-by-character to prefix column references with _inner./_outer. aliases. They had no concept of parenthesis depth or subquery scope, so tokens inside subqueries like `year IN (SELECT year FROM sales)` were incorrectly prefixed, producing `_inner.year IN (_inner.select _inner.year _inner.from _inner.sales)`. Add try_consume_subquery_parens() helper that detects when an opening paren starts a subquery (next token is SELECT or WITH) and consumes the entire subquery verbatim, preserving nested parens and string literals. All three fallback functions now use this helper.
The lookahead in try_consume_subquery_parens used is_alphabetic to collect the first token, so "select1" was read as "select" (stopping at the digit) and falsely matched the SELECT keyword check. Changed to is_alphanumeric so the full identifier is collected before comparison. Added tests for this edge case.
Double-quoted identifiers like "a)" could cause the subquery depth tracker to prematurely close, misidentifying the end of the subquery. Now try_consume_subquery_parens skips both single-quoted strings and double-quoted identifiers when tracking parenthesis depth.
e1a4ad6 to
0e293df
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 0e293df068
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| '(' => depth += 1, | ||
| ')' => { | ||
| depth -= 1; | ||
| if depth == 0 { | ||
| return Some(content); |
There was a problem hiding this comment.
Skip comments when tracking subquery parenthesis depth
In fallback mode (qualify_where_for_*_fallback), try_consume_subquery_parens decrements depth on every ) but never skips -- ... or /* ... */ comments, so a ) inside a comment can terminate the subquery scan early. For example, id in (select id /* ) */ from t) AND region = 'US' will return from the helper too soon and the remainder is then re-qualified as outer WHERE text, producing malformed or misqualified SQL.
Useful? React with 👍 / 👎.
A closing paren inside a line comment (-- ...) or block comment (/* ... */) could prematurely terminate the subquery scan in try_consume_subquery_parens. Now both comment forms are consumed without affecting depth tracking.
Closes #26
Summary
qualify_where_for_inner_fallback,qualify_where_for_outer_fallback,qualify_where_for_inner_with_dimensions) had no subquery awareness, soWHERE year IN (SELECT year FROM sales)was rewritten as_inner.year IN (_inner.select _inner.year _inner.from _inner.sales)try_consume_subquery_parens()helper that detects when a(opens a subquery (next token isSELECTorWITH) and passes the entire subquery through verbatimIN (SELECT ...),IN (WITH ...), nested subqueries, and mixed conditions with subqueries