Centralize relation-scope binding and enforce duplicate alias validation in SQL planner #21617
Open
kosiew wants to merge 4 commits into apache:main
… management
- Added a private relation binding scope in `planner.rs` to catch duplicate relation aliases/names at SQL planning time with a planner diagnostic.
- Wrapped each `FROM` list in a fresh relation scope in `select.rs`.
- Registered relation bindings for base and joined relations in `join.rs`.
- Extracted alias/table bindings while preserving full `TableReference` display for unaliased tables in `relation/mod.rs`.
- Cleared inherited relation scopes for nested query planning in `query.rs`.
- Added regression tests for duplicate explicit join aliases, distinct aliases, and nested subquery scope isolation in `sql_integration.rs`.
- Used `HashMap::entry` and `Default` for relation-scope insertion in `planner.rs`.
- Simplified relation alias binding with a private helper and removed repetitive `display_name.clone()` calls in `relation/mod.rs`.
- Replaced the nested-join planning branch with a private helper in `relation/mod.rs`.
- Consolidated duplicate alias tests with a small assertion helper in `sql_integration.rs`.
- Trimmed `RelationBinding` to store only the span.
- Updated the duplicate diagnostic to use the occupied map key for the prior binding name.
- Renamed the nested-join helper flag from `has_alias` to `is_aliased_nested_join`.
- Added regression tests for:
  - An unaliased relation colliding with an alias: `person JOIN orders person`.
  - Qualified relation names with the same leaf not colliding: `public.orders JOIN other.orders`.
…nner-level error for DataFusion
- Updated expectations to reflect the new error message: "DataFusion error: Error during planning: duplicate relation alias or name 't1'"
Which issue does this PR close?
Rationale for this change
The SQL planner previously performed duplicate relation alias validation inconsistently across different code paths.
This led to confusing or missing errors for users and inconsistent behavior depending on query structure.
This PR introduces a centralized relation-scope binding mechanism within the planner to ensure consistent validation and improved diagnostics.
What changes are included in this PR?
- Introduced `RelationScope` and `RelationBinding` to track relation names and aliases within scoped `FROM` clauses
- Added relation scope management to `PlannerContext`, including:
  - `with_new_relation_scope`
  - `clear_relation_scopes`
  - `insert_relation_binding`
- Implemented duplicate relation detection using scoped bindings with span-aware diagnostics
- Registered relation bindings during planning for base and joined relations
- Ensured nested joins and subqueries do not leak relation scopes
- Updated query planning to reset relation scopes for subqueries
- Refactored nested join handling to respect scope boundaries
- Added detailed diagnostics with source span references for duplicate alias errors
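For readers who want a mental model of the scoping mechanism described above, here is a minimal, self-contained sketch. It is not the code in this PR: the type and method names mirror the ones listed above, but the fields, the `Span` and `DuplicateRelation` types, and the behavior when no scope is active are all assumptions made for illustration.

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;
use std::fmt;

/// Hypothetical stand-in for a source span (byte offsets into the SQL text).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Span {
    pub start: usize,
    pub end: usize,
}

/// One binding per visible relation name; only the span is stored.
#[derive(Debug, Clone, Copy)]
pub struct RelationBinding {
    pub span: Span,
}

/// Names bound by a single FROM list.
#[derive(Debug, Default)]
pub struct RelationScope {
    bindings: HashMap<String, RelationBinding>,
}

/// Hypothetical error carrying both spans so a diagnostic can point at each.
#[derive(Debug, PartialEq, Eq)]
pub struct DuplicateRelation {
    pub name: String,
    pub first: Span,
    pub second: Span,
}

impl fmt::Display for DuplicateRelation {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "duplicate relation alias or name '{}'", self.name)
    }
}

/// Simplified planner context: just a stack of relation scopes.
#[derive(Debug, Default)]
pub struct PlannerContext {
    relation_scopes: Vec<RelationScope>,
}

impl PlannerContext {
    /// Runs `f` with a fresh scope pushed, then pops it so nothing leaks out.
    pub fn with_new_relation_scope<T>(&mut self, f: impl FnOnce(&mut Self) -> T) -> T {
        self.relation_scopes.push(RelationScope::default());
        let out = f(self);
        self.relation_scopes.pop();
        out
    }

    /// Binds `name` in the innermost scope, rejecting a second binding of the same name.
    pub fn insert_relation_binding(
        &mut self,
        name: &str,
        span: Span,
    ) -> Result<(), DuplicateRelation> {
        // Assumption: with no active scope there is nothing to validate against.
        let Some(scope) = self.relation_scopes.last_mut() else {
            return Ok(());
        };
        match scope.bindings.entry(name.to_string()) {
            // The occupied entry supplies both the prior name and its span.
            Entry::Occupied(prev) => Err(DuplicateRelation {
                name: prev.key().clone(),
                first: prev.get().span,
                second: span,
            }),
            Entry::Vacant(slot) => {
                slot.insert(RelationBinding { span });
                Ok(())
            }
        }
    }
}
```

Because only the innermost scope is consulted, a nested scope can reuse a name from an outer one without triggering the error, while a second binding in the same scope is rejected. The sketch does not model `clear_relation_scopes` or identifier normalization.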
Are these changes tested?
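Yes. This PR includes regression tests covering:
- Duplicate explicit join aliases
- Distinct aliases
- Nested subquery scope isolation
- An unaliased relation colliding with an alias (`person JOIN orders person`)
- Qualified relation names with the same leaf not colliding (`public.orders JOIN other.orders`)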
These tests ensure both correctness and regression coverage for the new behavior.
Are there any user-facing changes?
Yes.
There are no breaking API changes, but stricter validation may cause previously accepted invalid queries to fail.
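To illustrate which queries the stricter validation would affect, the sketch below models the naming rule implied by the regression tests: a relation is bound under its alias when it has one, and otherwise under its full table reference as written. The `Relation` struct and both functions are hypothetical and simplified (plain strings instead of `TableReference`, no identifier normalization); they are not part of DataFusion.

```rust
use std::collections::HashSet;

/// Hypothetical, simplified view of one relation in a FROM list.
pub struct Relation<'a> {
    /// Table reference as written, e.g. "public.orders".
    pub table: &'a str,
    /// Optional alias, e.g. the `o` in `orders o`.
    pub alias: Option<&'a str>,
}

/// Name the relation is visible under: the alias if present, else the full reference.
pub fn binding_name<'a>(rel: &Relation<'a>) -> &'a str {
    rel.alias.unwrap_or(rel.table)
}

/// Returns the first name that is bound twice in one FROM list, if any.
pub fn first_duplicate<'a>(from: &[Relation<'a>]) -> Option<&'a str> {
    let mut seen = HashSet::new();
    from.iter()
        .map(|rel| binding_name(rel))
        // `insert` returns false when the name was already present.
        .find(|name| !seen.insert(*name))
}
```

Under this rule, `person JOIN orders person` is rejected because the alias `person` collides with the unaliased table, while `public.orders JOIN other.orders` is accepted because the full qualified names differ.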
LLM-generated code disclosure
This PR includes LLM-generated code and comments. All LLM-generated content has been manually reviewed and tested.
Additional Notes
This change lays the groundwork for further centralization of name resolution and validation logic in the SQL planner, potentially extending to CTEs and other relation sources in future work.