Add DecomposeAggregate optimizer to rewrite AVG as SUM/COUNT#21613
Add DecomposeAggregate optimizer to rewrite AVG as SUM/COUNT#21613Dandandan wants to merge 8 commits intoapache:mainfrom
Conversation
|
run benchmarks |
|
🤖 Benchmark running (GKE) | trigger CPU Details (lscpu)Comparing decompose-avg-optimization (faea6b4) to 29c5dd5 (merge-base) diff using: clickbench_partitioned File an issue against this benchmark runner |
|
🤖 Benchmark running (GKE) | trigger CPU Details (lscpu)Comparing decompose-avg-optimization (faea6b4) to 29c5dd5 (merge-base) diff using: tpcds File an issue against this benchmark runner |
|
🤖 Benchmark running (GKE) | trigger CPU Details (lscpu)Comparing decompose-avg-optimization (faea6b4) to 29c5dd5 (merge-base) diff using: tpch File an issue against this benchmark runner |
Rewrites AVG(x) into CAST(SUM(x) AS Float64) / CAST(COUNT(*) AS Float64) to reduce accumulator overhead. The AVG accumulator stores both sum and count per group internally; splitting into separate SUM and COUNT accumulators is more efficient and enables sharing with existing COUNT(*) aggregates in the same query via CommonSubexprEliminate. Uses COUNT(*) when the AVG argument is non-nullable (the common case for integer/float columns), falling back to COUNT(x) for nullable arguments to preserve correct NULL semantics. Only applies to AVG with Float64 return type; skips DISTINCT, filtered, ordered, and decimal/duration AVGs. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
ROLLUP/CUBE/GROUPING SETS expand group_expr into more schema fields than group_expr.len(), which breaks the index arithmetic used to locate aggregate field types. Skip these cases to avoid FieldNotFound errors in downstream optimizer rules like optimize_projections. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
bbe9058 to
4b81d40
Compare
|
🤖 Benchmark completed (GKE) | trigger Instance: CPU Details (lscpu)Details
Resource Usageclickbench_partitioned — base (merge-base)
clickbench_partitioned — branch
File an issue against this benchmark runner |
|
🤖 Benchmark completed (GKE) | trigger Instance: CPU Details (lscpu)Details
Resource Usagetpcds — base (merge-base)
tpcds — branch
File an issue against this benchmark runner |
New test from upstream that includes AVG — update expected EXPLAIN output to reflect the AVG decomposition. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
run benchmark tpch |
|
run benchmarks |
|
🤖 Benchmark running (GKE) | trigger CPU Details (lscpu)Comparing decompose-avg-optimization (b32f34d) to 6db3899 (merge-base) diff using: tpcds File an issue against this benchmark runner |
|
🤖 Benchmark running (GKE) | trigger CPU Details (lscpu)Comparing decompose-avg-optimization (b32f34d) to 6db3899 (merge-base) diff using: tpch File an issue against this benchmark runner |
|
🤖 Benchmark running (GKE) | trigger CPU Details (lscpu)Comparing decompose-avg-optimization (b32f34d) to 6db3899 (merge-base) diff using: clickbench_partitioned File an issue against this benchmark runner |
|
🤖 Benchmark running (GKE) | trigger CPU Details (lscpu)Comparing decompose-avg-optimization (b32f34d) to 6db3899 (merge-base) diff using: tpch File an issue against this benchmark runner |
|
🤖 Benchmark completed (GKE) | trigger Instance: CPU Details (lscpu)Details
Resource Usageclickbench_partitioned — base (merge-base)
clickbench_partitioned — branch
File an issue against this benchmark runner |
|
🤖 Benchmark completed (GKE) | trigger Instance: CPU Details (lscpu)Details
Resource Usagetpcds — base (merge-base)
tpcds — branch
File an issue against this benchmark runner |
|
run benchmark tpch tpch10 |
|
🤖 Benchmark running (GKE) | trigger CPU Details (lscpu)Comparing decompose-avg-optimization (b32f34d) to 6db3899 (merge-base) diff using: tpch File an issue against this benchmark runner |
|
🤖 Benchmark running (GKE) | trigger CPU Details (lscpu)Comparing decompose-avg-optimization (b32f34d) to 6db3899 (merge-base) diff using: tpch10 File an issue against this benchmark runner |
|
run benchmark clickbench_partitioned |
|
🤖 Benchmark running (GKE) | trigger CPU Details (lscpu)Comparing decompose-avg-optimization (b32f34d) to 6db3899 (merge-base) diff using: clickbench_partitioned File an issue against this benchmark runner |
|
🤖 Benchmark completed (GKE) | trigger Instance: CPU Details (lscpu)Details
Resource Usagetpch10 — base (merge-base)
tpch10 — branch
File an issue against this benchmark runner |
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
🤖 Benchmark completed (GKE) | trigger Instance: CPU Details (lscpu)Details
Resource Usageclickbench_partitioned — base (merge-base)
clickbench_partitioned — branch
File an issue against this benchmark runner |
Instead of SUM(CAST(x AS Float64)), cast to SUM's native types: Int8/Int16/Int32 → CAST(x AS Int64), UInt8/UInt16/UInt32 → CAST(x AS UInt64), Int64/UInt64/Float64 → no cast. This avoids Float64 arithmetic overhead during aggregation and produces more precise results for large integers. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Instead of manually mapping each integer type to SUM's native type, just strip the CAST(x AS Float64) added by AVG's type coercion and let SUM apply its own coercion. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Remove cast-stripping logic from the optimizer rule and pass AVG's args (with their existing Float64 cast) directly to SUM. Add coerce_types to SUM so it handles small integer/float types via the normal planning path. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Which issue does this close?
Closes: #19637
Summary
DecomposeAggregateoptimizer rule that rewritesAVG(x)intoCAST(SUM(x) AS Float64) / CAST(COUNT(*) AS Float64)COUNT(*)for non-nullable args (enables sharing with existingCOUNT(*)via CSE), falls back toCOUNT(x)for nullable argsTest plan
cargo fmtandcargo clippyclean🤖 Generated with Claude Code