Skip to content

[CALCITE-7505] RelToSqlConverter produces duplicate FROM aliases for correlated subqueries when hasImplicitTableAlias() is true#5036

Open
Dwrite wants to merge 3 commits into
apache:mainfrom
Dwrite:CALCITE-7505
Open

[CALCITE-7505] RelToSqlConverter produces duplicate FROM aliases for correlated subqueries when hasImplicitTableAlias() is true#5036
Dwrite wants to merge 3 commits into
apache:mainfrom
Dwrite:CALCITE-7505

Conversation

@Dwrite

@Dwrite Dwrite commented Jun 21, 2026

Copy link
Copy Markdown

Jira Link
CALCITE-7505

This PR fixes incorrect SQL generation for correlated sub-queries in a Filter's WHERE clause.

The root cause is that Filter did not preserve the correlation variables it binds, so RelToSqlConverter could not reliably determine when the current node was the binding point of a correlation variable. As a result, correlation alias handling could be applied at the wrong level, leading to incorrect correlated references in generated SQL.

The fix adds variablesSet to Filter / LogicalFilter and only resets the correlation alias when the current Filter / Project actually binds the variable, the input row type matches, and the input is not a BiRel.

A follow-up change will cover similar correlated cases in Join conditions.

…correlated subqueries when hasImplicitTableAlias() is true
/** Test case for
* <a href="https://issues.apache.org/jira/browse/CALCITE-7505">[CALCITE-7505]
* RelToSqlConverter produces duplicate FROM aliases for correlated subqueries</a>. */
@Test void testExistsSubQueryAliasConflict() {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this using hasImplicitTableAlias()?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pointing this out — I’ve updated the PR description. The previous wording was inaccurate: this change is not really about hasImplicitTableAlias(). The actual issue is that Filter did not preserve the correlation variables it binds, so RelToSqlConverter could not correctly determine the binding scope for correlated references.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you need to change the JIRA issue title and commit/PR message as well?

}

@Override public Void visitCorrelVariable(RexCorrelVariable v) {
if (correlIds.contains(v.id)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The JavaDoc above says: "a correlated variable defined by the given input"
Where is this check validating this fact?
(I am not sure how "defined by an input" is defined)

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, that's my english fault, The wording is inaccurate. The method does not actually verify that the correlation variable is defined by the given input; it only checks for a correlated reference with a matching row type. I'll update the Javadoc to reflect what it actually does.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There can be many variables with the same type, please explain why your approach is correct.

@Dwrite Dwrite Jun 28, 2026

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @mihaibudiu

Thanks for the question. During our test case execution, we observed that planner rules like FilterIntoJoinRule( test case with testFilterCorrelateMissingVariableCor) can reposition or push down a Filter. This optimization often causes the Filter's variablesSet to become stale, meaning the variable's rowType no longer matches the schema of the new underlying input.

Therefore, relying solely on correlIds.contains(v.id) is insufficient. We need the rowType equality check to ensure that this Filter remains the true binding point of the correlation variable post-optimization. If the types mismatch, it indicates the Filter has been relocated, and we should not trigger resetAliasForCorrelation.

Regarding the structure check, rather than a generic BiRel, I specifically restrict this validation to Join and LogicalCorrelate operators. These two operators represent the exact semantic boundaries where correlation aliases are defined and resolved. Checking for Join and LogicalCorrelate allows us to correctly determine whether the alias needs to be appended, ensuring correctness without interfering with other relational nodes.

Let me know if this clarifies the design choice!

@sonarqubecloud

Copy link
Copy Markdown

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants