[CALCITE-6104] Aggregate function that references outer column should be evaluated in outer query by xuzifu666 · Pull Request #5052 · apache/calcite

xuzifu666 · 2026-06-26T13:49:57Z

jira: https://issues.apache.org/jira/browse/CALCITE-6104

… be evaluated in outer query

mihaibudiu · 2026-06-26T17:03:48Z


 !ok

+# [CALCITE-6104] Aggregate function that references outer column should be evaluated in outer query


Please add a comment that these are validated using an independent database.

Can we also have a test with 2 outer aggregates?
Showing the plan could also help.

Thanks for the reminder. I have added a comment about it, and the test case has been added.
However, to check in PostgreSQL the numerical results that agg.iq expects after the Calcite fix, the original SQL must first be rewritten into standard SQL that PostgreSQL supports.
I validated the rewritten SQL at https://onecompiler.com/postgresql/44tgmwq6p and the results are as expected.

mihaibudiu · 2026-06-26T17:31:15Z

+
+!ok
+
+SELECT (SELECT sum(sal) FROM dept) AS sum_sal


The blog post quoted by @julianhyde has more examples, please add all of the relevant ones

Yes, I reviewed the whole blog post and added new cases from it. The correspondence is as follows:

SELECT (SELECT sum(1) FROM xx LIMIT 1) FROM aa

Aggregate with no column references aggregates at the innermost level.

SELECT (SELECT sum(a) FROM xx LIMIT 1) FROM aa

Aggregate function that references outer column should be evaluated in outer query.

SELECT (SELECT sum(x) FROM xx LIMIT 1) FROM aa

Aggregate referencing only inner column aggregates at the inner level.
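The three cases above amount to a rule for choosing the query level that evaluates an aggregate. The following is a minimal Python sketch of that rule, not Calcite's implementation: the function name `aggregate_level` and the representation of scopes as sets of column names are hypothetical. The thread does not say what happens when an aggregate mixes inner and outer columns; the sketch assumes the innermost referenced level wins.

```python
def aggregate_level(referenced_columns, scopes):
    """Return the index of the query level that should evaluate an aggregate.

    `scopes` is ordered innermost first; each entry is the set of column
    names provided by that query level's FROM clause. Hypothetical model.
    """
    if not referenced_columns:
        # e.g. sum(1): no column to resolve, so aggregate at the innermost level.
        return 0
    levels = []
    for column in referenced_columns:
        for depth, columns in enumerate(scopes):
            if column in columns:
                levels.append(depth)  # innermost scope that defines the column
                break
        else:
            raise LookupError(f"unknown column: {column}")
    # Assumption: if inner and outer columns are mixed, the inner level wins.
    return min(levels)


# xx(x) is the sub-query (level 0); aa(a) is the outer query (level 1).
scopes = [{"x"}, {"a"}]
print(aggregate_level(set(), scopes))  # sum(1)
print(aggregate_level({"a"}, scopes))  # sum(a)
print(aggregate_level({"x"}, scopes))  # sum(x)
```

Under these assumptions, `sum(1)` and `sum(x)` resolve to level 0 (the sub-query) and `sum(a)` resolves to level 1 (the outer query).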

mihaibudiu · 2026-06-26T17:32:15Z

+   * <p>For example,
+   * <blockquote><pre>SELECT (SELECT sum(a) FROM t LIMIT 1) FROM aa</pre></blockquote>
+   * is rewritten to
+   * <blockquote><pre>SELECT sum(a) * (SELECT 1 FROM t LIMIT 1) FROM aa</pre></blockquote>


From this example it's not obvious to me at all what the general rewrite rule is.
Can you describe the algorithm?

Okay, I’ve added the logic for the overall process; I hope that clarifies things.
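To make the Javadoc example in this thread concrete, here is a small sketch that runs the rewritten form on toy data using Python's built-in `sqlite3` module. It only exercises the standard-SQL rewritten query, not Calcite's rewriting code; the data (`aa(a)` = 1, 2, 3 and `t(x)` = 10, 20, 30) is an assumption chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Toy data (assumed): a comes from aa, x comes from t.
cte = ("WITH aa(a) AS (VALUES (1), (2), (3)), "
       "t(x) AS (VALUES (10), (20), (30)) ")

# Rewritten form from the Javadoc: the aggregate runs once, in the outer query,
# and the scalar sub-query only contributes a factor of 1 (or NULL if t is empty).
rewritten = conn.execute(
    cte + "SELECT sum(a) * (SELECT 1 FROM t LIMIT 1) FROM aa").fetchall()

# For contrast: an aggregate over an inner column stays inside the sub-query,
# so the outer query still returns one row per row of aa.
inner = conn.execute(
    cte + "SELECT (SELECT sum(x) FROM t) FROM aa").fetchall()

print(rewritten)
print(inner)
```

The rewritten query yields a single row because the outer query becomes an aggregate query, whereas the inner-aggregate query keeps one row per row of `aa`.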

mihaibudiu · 2026-06-27T02:29:05Z

+   * </ol>
+   *
+   * <p>For example,
+   * <blockquote><pre>SELECT (SELECT sum(a) FROM t LIMIT 1) FROM aa</pre></blockquote>


We cannot read this example because we don't know where column a is.
Can you also add tests where one or both tables are empty or contain only nulls in the aggregated column?

OK. The Javadoc examples for rewriteOuterAggregatesInSelectList have been improved and now state explicitly that a comes from aa and x comes from t.
Boundary tests have been added to agg.iq:
Inner table empty → 1 row, NULL
Outer table empty → 1 row, NULL (because the outer aggregation has no input rows)
The PostgreSQL test has also been updated, and its results match agg.iq. Empty results for VALUES are not currently supported, so I am using the SQL statement SELECT 1 FROM emp WHERE 1 = 0 as a substitute: https://onecompiler.com/postgresql/44tgmwq6p
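The boundary cases discussed in this thread can be checked against the rewritten form `sum(a) * (SELECT 1 FROM t LIMIT 1)` with a short sketch using Python's `sqlite3`. This is an illustration of the rewritten query's NULL behaviour on assumed toy tables, not a reproduction of the agg.iq tests; the helper `run` and the all-NULL case are additions made here for illustration.

```python
import sqlite3

REWRITTEN = "SELECT sum(a) * (SELECT 1 FROM t LIMIT 1) FROM aa"


def run(aa_rows, t_rows):
    """Load toy rows into aa(a) and t(x), then run the rewritten query."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE aa (a INTEGER)")
    conn.execute("CREATE TABLE t (x INTEGER)")
    conn.executemany("INSERT INTO aa VALUES (?)", [(v,) for v in aa_rows])
    conn.executemany("INSERT INTO t VALUES (?)", [(v,) for v in t_rows])
    return conn.execute(REWRITTEN).fetchall()


# Inner table empty: the scalar sub-query is NULL, so the product is NULL.
inner_empty = run([1, 2, 3], [])
# Outer table empty: sum(a) has no input rows, so it is NULL.
outer_empty = run([], [10, 20, 30])
# Aggregated column contains only NULLs: sum(a) is again NULL.
all_null = run([None, None], [10])

print(inner_empty, outer_empty, all_null)
```

In each case the query still returns exactly one row, because an aggregate query without GROUP BY always produces one row.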

mihaibudiu · 2026-06-27T03:46:01Z

 !ok

+# [CALCITE-6104] Aggregate function that references outer column should be evaluated in outer query
+# The expected results below have been verified in PostgreSQL by running


I am confused, I thought these queries are standard SQL, but, indeed, I cannot run them on postgres or other databases I tried. Isn't this PR supposed to implement a standard SQL feature? Why can't these queries use standard SQL then?

Yes, this is indeed a confusing point. These correlated-aggregate queries are not standard SQL, and CALCITE-6104 does not implement a standard SQL feature. Instead, it makes Calcite support a non-standard extension that some databases already implement.

The SQL standard does not allow aggregate functions to reference outer columns
In standard SQL, an aggregate function like sum(a) can only reference columns from the SELECT level where it appears. So a query like:

SELECT (SELECT sum(a) FROM xx LIMIT 1) FROM aa;

will fail on PostgreSQL, Oracle, and MySQL/MariaDB because a belongs to the outer table aa, not the sub-query xx.

But PostgreSQL and others support this extension.
These databases allow an aggregate to "climb up" to the nearest SELECT that contains its free variables. Calcite previously handled this pattern incorrectly: it aggregated inside the sub-query, grouped by the outer rows. The fix makes Calcite behave consistently with those databases.
The test provided in Jira (julianhyde@3f4cab5) is supported, at least with the current PR applied; those queries are not standard SQL either.
Additionally, I tested converting these SQL statements to standard SQL; they passed even without this PR, which is why I hold the view described above.

Thank you for the clarification. Maybe this SQL should not be accepted by default? There are these conformance modes, I think only modes like LENIENT or BABEL should accept this construct.

You say that postgres accepts these queries, then why do you have to modify the queries to run them on postgres? I couldn't run them as written in Postgres.

Yes, thanks for pointing this out. The current comment is incorrect; it should state: "These SQL queries are non-standard and their successful execution is only guaranteed within Calcite for now".

Regarding the restriction that correlated-aggregate rewriting be available only under modes like LENIENT or BABEL: could we log a separate Jira to resolve this later? (If that is OK, I will create a new Jira.) That way we can track these changes independently and easily gather any additional feedback. From my perspective, this restriction has little to do with the current Jira ticket. I'd like to hear whether you agree.

I think this will require 3 lines of code and should be done as part of this PR.

Okay, I'll implement this restriction directly in this PR.

mihaibudiu · 2026-06-27T22:47:12Z

+    // Rewrite scalar sub-queries whose single select item is an aggregate
+    // over outer columns. The aggregate belongs to the outer query per SQL
+    // standard.
+    rewriteOuterAggregatesInSelectList(select);


Call this handleOuterAggregate, and make the function reject the construct if not in a suitable conformance mode.

Yes, the latest commit implements this check.
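The gating requested in this thread can be pictured with a small Python sketch. Everything in it is hypothetical: the `Conformance` enum is a stand-in for a conformance setting, the set of permissive modes is taken from the reviewer's suggestion (LENIENT and BABEL), and the function is a model of the requested behaviour rather than the code in this PR.

```python
from enum import Enum


class Conformance(Enum):
    """Hypothetical stand-in for a SQL conformance setting."""
    DEFAULT = "DEFAULT"
    LENIENT = "LENIENT"
    BABEL = "BABEL"


# Assumption: only the permissive modes named in the review accept the construct.
MODES_ALLOWING_OUTER_AGGREGATES = {Conformance.LENIENT, Conformance.BABEL}


def handle_outer_aggregate(conformance, references_only_outer_columns):
    """Return True if the sub-query should be rewritten, False if left alone.

    Raises ValueError when the construct is present but the mode forbids it.
    """
    if not references_only_outer_columns:
        return False  # ordinary sub-query: nothing to rewrite
    if conformance not in MODES_ALLOWING_OUTER_AGGREGATES:
        raise ValueError(
            "aggregate over outer columns is not allowed under "
            f"{conformance.name} conformance")
    return True


print(handle_outer_aggregate(Conformance.BABEL, True))
print(handle_outer_aggregate(Conformance.DEFAULT, False))
```

Rejecting inside the handler, rather than at each call site, keeps the conformance decision in one place, which is what the review comment asks for.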

mihaibudiu · 2026-06-27T22:48:09Z

 !ok

+# [CALCITE-6104] Aggregate function that references outer column should be evaluated in outer query
+# The expected results below have been verified in PostgreSQL by running


You say that postgres accepts these queries, then why do you have to modify the queries to run them on postgres? I couldn't run them as written in Postgres.

mihaibudiu · 2026-06-27T22:49:08Z

+   * <blockquote><pre>
+   * WITH aa(a) AS (VALUES 1, 2, 3),
+   *      t(x) AS (VALUES 10, 20, 30)
+   * SELECT (SELECT sum(a) FROM t LIMIT 1) FROM aa


Is the LIMIT 1 needed here?

It is not needed; I have updated it.

mihaibudiu · 2026-06-27T22:52:32Z

+   * <blockquote><pre>
+   * WITH aa(a) AS (VALUES 1, 2, 3),
+   *      t(x) AS (VALUES 10, 20, 30)
+   * SELECT sum(a) * (SELECT 1 FROM t LIMIT 1) FROM aa


This rewrite will only work for numeric aggregates on types which have a multiplication operation.

I've added this caveat to the comments for now.

sonarqubecloud · 2026-06-28T00:17:53Z

Quality Gate passed

Issues
1 New issue
0 Accepted issues

Measures
0 Security Hotspots
80.8% Coverage on New Code
0.0% Duplication on New Code

[CALCITE-6104] Aggregate function that references outer column should…

106b216

… be evaluated in outer query

xuzifu666 force-pushed the calcite-6104 branch from d2ff7dd to 106b216 · June 26, 2026 14:10

xuzifu666 added 2 commits June 26, 2026 22:41

Addressed

cd8346a

Addressed

47e152f

mihaibudiu reviewed Jun 26, 2026

xuzifu666 added 2 commits June 27, 2026 09:55

Addressed

7555c70

Addressed

e3c977e

mihaibudiu reviewed Jun 27, 2026

xuzifu666 added 2 commits June 27, 2026 11:19

Addressed

c9fe712

Addressed

7f81a12

mihaibudiu reviewed Jun 27, 2026

Addressed

d92b678

mihaibudiu reviewed Jun 27, 2026

xuzifu666 added 2 commits June 28, 2026 07:42

Addressed

fb6ac70

Addressed

3b449e4

