perf(optimizer): EliminateCrossJoin fast-path for join-free plans#22612
perf(optimizer): EliminateCrossJoin fast-path for join-free plans#22612zhuqi-lucas wants to merge 3 commits into
Conversation
Closes apache#22583. The rule's body only does real work when the root (or its Filter child) is an inner Join; in every other case it falls through to rewrite_children, which recurses into the plan, processes uncorrelated subqueries, and rewrites every direct child via map_children (clone-on-write), then calls recompute_schema on the way back. This is paid by every query in the logical optimizer pipeline — including simple point queries with no joins anywhere in the tree. Add a read-only apply scan at the top of rewrite that bails out with Transformed::no(plan) when no LogicalPlan::Join node exists in the tree. Mirrors the plan_has_subqueries fast-path landed in apache#22298 — same shape (cheap up-front scan, early stop on first match) and follows the precedent the reviewers approved there. Four unit tests cover: - plan_has_joins detects a root Join - plan_has_joins detects a nested Join - plan_has_joins returns false on join-free plans - rewrite short-circuits with Transformed::no on join-free plans
There was a problem hiding this comment.
Pull request overview
This PR adds a fast-path to the EliminateCrossJoin logical optimizer rule to avoid a full recursive rewrite walk when the plan contains no joins, reducing overhead for join-free queries in the optimizer pipeline.
Changes:
- Added a
plan_has_joins(&LogicalPlan) -> boolhelper to detect whether a plan contains anyLogicalPlan::Joinnodes. - Added an early return in
EliminateCrossJoin::rewriteto short-circuit withTransformed::no(plan)when no joins are present. - Added unit tests for the helper and for the short-circuit behavior.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Copilot review found a correctness gap: `LogicalPlan::apply` does not descend into the subquery plans referenced from `Expr::ScalarSubquery` / `Expr::InSubquery` / `Expr::Exists` / `Expr::SetComparison`. A query whose outer plan has no Join but contains `IN (SELECT ... FROM a CROSS JOIN b)`-style predicates would short-circuit and skip optimizing the subquery's join. `rewrite_children` reaches those subqueries via `map_uncorrelated_subqueries`, so `plan_has_joins` has to consider them too. Extend the helper to recurse into each subquery plan through `apply_expressions` + an `Expr::apply` walk. New regression test `plan_has_joins_detects_join_inside_subquery` exercises the `IN (SELECT ... CROSS JOIN ...)` shape.
|
Filed #22616 to track the deeper "generic in-place mutable rewrite" direction @neilconway and @alamb raised on #22583. Per-rule fast-paths like this one and #22298 become redundant once that lands; until then they're contained, cheap wins on the common no-match path. Happy to scope #22616 in a follow-up PR. |
alamb
left a comment
There was a problem hiding this comment.
Looks good to me -- thanks @zhuqi-lucas
| if found { | ||
| return Ok(TreeNodeRecursion::Stop); | ||
| } | ||
| let _ = expr.apply(|e| { |
There was a problem hiding this comment.
I am surprised there isn'y already soem tranversal code that recurses into subqueries
Perhaps this one?
https://docs.rs/datafusion/latest/datafusion/logical_expr/enum.LogicalPlan.html#method.apply_with_subqueries
(though maybe you can't use that because it is for LogicalPlans not Exprs 🤔 )
|
run benchmarks sql_planner |
|
🤖 Criterion benchmark running (GKE) | trigger CPU Details (lscpu)Comparing perf/eliminate-cross-join-fast-path (0b4e30c) to a7c2f7d (merge-base) diff File an issue against this benchmark runner |
|
🤖 Criterion benchmark completed (GKE) | trigger Instance: CPU Details (lscpu)Details
Resource Usagebase (merge-base)
branch
File an issue against this benchmark runner |
|
This PR makes sense intuitively, but considering that this PR adds overhead in some cases and improves efficiency in other cases, I wonder if there's a way to quantify the expected win more precisely. The sql_planner benchmark would be an obvious candidate, but it seems that this PR is a win for some cases but a loss for others, and overall slightly slower (although probably within the noise). |
Which issue does this PR close?
Closes #22583.
Rationale for this change
EliminateCrossJoin::rewriteis called on every plan during logical optimization. The rule's body only does real work when the root (or itsFilterchild) is an innerJoin; in every other case it falls through torewrite_children, which recurses into the plan, processes uncorrelated subqueries, and rewrites every direct child viamap_children(clone-on-write), then callsrecompute_schemaon the way back.This is paid by every query in the logical optimizer pipeline — including simple point queries with no joins anywhere in the tree.
Discussion on the issue
@neilconway raised the valid concern that a fast-path scan still does some up-front work in the case where the rewrite does fire, and that the deeper fix is mutable tree rewrites (avoiding the clone-on-write of
TreeNode::rewriteentirely). @alamb agreed and pointed at the in-placemap_children_mut/plan_has_subqueriesinfrastructure adriangb landed in #22298 as the existing precedent.This PR follows that precedent directly:
plan_has_subqueries— a read-onlyapplyscan, early-stops on the first matching node, allocates nothing.map_childrenclone-on-write the rewrite would otherwise do.rewrite_childrenhere recurses viaoptimizer.rewrite(input, config)per child — a different shape frommap_children_mut's&muttraversal. Adapting that is a larger refactor and worth its own follow-up; this PR doesn't block it.What changes are included in this PR?
plan_has_joins(&LogicalPlan) -> boolhelper ineliminate_cross_join.rs—applywalk that returnstrueon the firstLogicalPlan::Joinit sees.EliminateCrossJoin::rewrite:if !plan_has_joins(&plan) { return Ok(Transformed::no(plan)); }. Everything else is unchanged.Are these changes tested?
Four new unit tests in the existing
mod tests:plan_has_joins_detects_root_joinplan_has_joins_detects_nested_join(Join under Filter/Projection)plan_has_joins_returns_false_for_join_free_planrewrite_short_circuits_when_plan_has_no_joins— end-to-end: rule'srewritereturnsTransformed::noand the plan comes back identical (schema + display) on join-free input.The existing 20
EliminateCrossJointests + the full 708-testdatafusion-optimizer --libsuite still pass.cargo clippy -p datafusion-optimizer --all-targets -- -D warningsclean.Are there any user-facing changes?
No semantic change. Pure perf optimization, no new config knobs.
Follow-ups
map_children_mutpattern from Optimize logical optimizer: skip map_subqueries + in-place rewriting #22298) is worth considering — see issue perf(optimizer): EliminateCrossJoin walks the full plan tree even when there are no joins #22583 comments for discussion. Out of scope here.