Allow safe Cypher reserved keywords in alias positions (#2355) #2415
Pull request overview
This PR expands the Cypher parser to allow a curated set of “safe” reserved keywords (the existing safe_keywords set) to be used as aliases only in RETURN, WITH, YIELD, and UNWIND ... AS <alias> positions, addressing #2355 (RETURN 1 AS count previously errored).
Changes:
- Introduces a new `var_name_alias` grammar rule and uses it at alias-binding sites (RETURN/WITH/YIELD/UNWIND).
- Adds a new regression test (`reserved_keyword_alias`) plus expected output to validate accepted/rejected alias cases.
- Registers the new regression test in the top-level `Makefile`'s `REGRESS` list.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| src/backend/parser/cypher_gram.y | Adds var_name_alias and switches alias-binding productions to accept safe_keywords in ... AS <alias> positions. |
| regress/sql/reserved_keyword_alias.sql | New regression test covering keyword aliases in RETURN/WITH/UNWIND plus negative cases. |
| regress/expected/reserved_keyword_alias.out | Expected output for the new regression test. |
| Makefile | Adds reserved_keyword_alias to REGRESS test ordering. |
@crprashant Please address these minor issues - Recommendation
@jrgemignani thanks for the careful review — pushed 49a424e addressing all four points:
Validation re-run on the amended commit:
Happy to keep iterating if anything else stands out.
Pull request overview
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
@crprashant Can you address the Copilot messages above about CRLF? These files are for Linux and need to conform to that standard.
Cypher productions in `cypher_gram.y` that bind an alias via the AS
keyword (RETURN/WITH/YIELD ... AS x and UNWIND ... AS x) only accepted
plain identifiers. As a result, completely valid Cypher such as

    SELECT * FROM cypher('g', $$ RETURN 1 AS count $$) AS (a agtype);

failed with `syntax error at or near "count"`, even though `count` is
already accepted in other identifier positions (it appears in the
existing `safe_keywords` list and is permitted in `func_name`;
`schema_name` accepts the broader `reserved_keyword` set).

This patch introduces a dedicated `var_name_alias` non-terminal used
only in the three alias-binding sites (yield_item, return_item,
unwind). It accepts everything `var_name` accepts, plus the entire
`safe_keywords` set, so the 49 non-conflicting reserved keywords
(count, exists, coalesce, match, return, where, order, limit, distinct,
optional, detach, contains, starts, ends, in, is, not, ...) are now
usable as aliases.

The change is intentionally scoped to alias positions:

* `var_name` itself (used by pattern-variable bindings like
  `(x:Label)`, edge bindings, named paths, and `expr_var` references)
  is unchanged. Allowing safe_keywords there triggers 156 shift/reduce
  conflicts because keyword tokens collide with their roles inside
  expressions and patterns.
* `conflicted_keywords` (END, NULL, TRUE, FALSE) remain rejected in
  every position; they are genuinely ambiguous with literal/CASE
  productions.

Reading a keyword-named alias back through `expr_var` still fails (e.g.
`WITH 1 AS count RETURN count`) because `expr_var` reads through
`var_name`. That asymmetry is captured as a known limitation in the
regression suite and tracked separately in apache#2416.

Regression coverage lives in `regress/sql/reserved_keyword_alias.sql`
and `regress/expected/reserved_keyword_alias.out`, exercising:

* the original repro,
* representative safe_keywords across RETURN/WITH/UNWIND,
* multiple keyword aliases in one projection,
* a backtick-quoted alias positive case,
* the known read-back limitation as a negative test, and
* explicit negatives proving END/NULL/TRUE/FALSE and pattern-position
  keywords still error out.

Closes apache#2355.
@jrgemignani good catch — thanks! Pushed f686e93 with both new regression files normalized to LF-only line endings, matching the rest of the Verified: ( The expected output was regenerated from a fresh |
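The verification command in the comment above did not survive extraction. As a rough illustration only, the following sketch shows one way to detect and strip CRLF line endings with POSIX tools; the helper names `has_crlf` and `to_lf` are hypothetical and are not part of the AGE tree or of this PR.

```shell
#!/bin/sh
# Sketch only: has_crlf and to_lf are hypothetical helper names.

# has_crlf FILE: exit 0 if FILE contains at least one carriage return.
has_crlf() {
    cr=$(printf '\r')
    grep -q "$cr" "$1"
}

# to_lf FILE: rewrite FILE in place, stripping one trailing CR per line.
to_lf() {
    cr=$(printf '\r')
    tmp=$(mktemp) || return 1
    # Write to a temp file first, then copy back so the original
    # file keeps its permissions.
    sed "s/${cr}\$//" "$1" > "$tmp" && cat "$tmp" > "$1"
    status=$?
    rm -f "$tmp"
    return $status
}
```

A typical use would be to call `has_crlf` on each of the two new regression files and run `to_lf` on any that match. `git ls-files --eol` reports the same information for tracked files without a helper.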
PR: Allow safe Cypher reserved keywords in alias positions (#2355)
Branch: crprashant:fix/2355-reserved-keywords-as-aliases → apache:master
Open URL: https://github.com/crprashant/age/pull/new/fix/2355-reserved-keywords-as-aliases
Closes: #2355
Summary
`RETURN 1 AS count` (and equivalents using any of 49 non-conflicting reserved keywords) failed with `syntax error at or near "count"`. This PR fixes it by introducing a tightly-scoped `var_name_alias` grammar rule used only at the alias-binding sites of `RETURN`, `WITH`, `YIELD`, and `UNWIND`. The set of keywords newly accepted as aliases is exactly `safe_keywords`: the same set already accepted in `func_name` and, through the broader `reserved_keyword` set, in `schema_name`.

Reproducer (from the issue)
Before this PR, `RETURN 1 AS count` fails with `ERROR: syntax error at or near "count"`. With this PR, the same query returns `1` (1 row).

Root cause
In `src/backend/parser/cypher_gram.y` the alias-binding productions (`yield_item`, `return_item`, `unwind`) referenced `var_name`, which expands only to `symbolic_name` (i.e. plain `IDENTIFIER`). Reserved keywords, even those that pose no parsing ambiguity, were therefore rejected as aliases, although `schema_name` already includes `reserved_keyword` and `func_name` already includes `safe_keywords`.

Fix
A new non-terminal `var_name_alias` is introduced and used only in the three alias-binding sites above:

    var_name_alias:
        var_name
        | safe_keywords
            {
                $$ = pstrdup((char *) $1);
            }
        ;

`var_name` itself is not broadened, intentionally. Allowing `safe_keywords` everywhere `var_name` is used produces 156 shift/reduce conflicts in bison (verified locally) because keyword tokens collide with their syntactic roles inside expressions and patterns. Restricting the broadening to `AS`-bound alias positions removes that ambiguity entirely; the build is conflict-free.

`conflicted_keywords` (`END`, `NULL`, `TRUE`, `FALSE`) remain rejected everywhere; they are genuinely ambiguous with literal / CASE productions.
Behavior matrix
- `RETURN 1 AS count` (`count` ∈ `safe_keywords`)
- `RETURN 1 AS exists / coalesce / match / where / order / limit / distinct / optional / detach / contains / starts / ends / in / is / not / yield / call / ...`
- `RETURN 1 AS count, 2 AS exists, 3 AS where`
- `WITH 1 AS count RETURN 1 AS x`
- `UNWIND [1,2,3] AS row RETURN 1 AS x`
- `RETURN 1 AS null / true / false / end` (still rejected: `conflicted_keywords`)
- `MATCH (count) RETURN 1 AS x` (still rejected: pattern positions use the unchanged `var_name`)
- `RETURN 1 AS my_alias`

Known limitation (intentional, scope-bounded)
Even with this PR, referencing an alias whose name is a keyword (e.g. `WITH 1 AS count RETURN count`) still fails, because `expr_var` reads through `var_name`, which is unchanged. Broadening `expr_var` to accept keywords re-introduces the 156 grammar conflicts. The literal repro from the issue (`RETURN 1 AS count`) is resolved; reading keyword-named aliases back is a deeper grammar refactor and out of scope here. This is documented inline in the regression test.
Testing
- `make PG_CONFIG=/usr/lib/postgresql/18/bin/pg_config -j$(nproc)`: clean, no new bison warnings, no shift/reduce conflicts.
- `make installcheck` on PostgreSQL 18. Result: 32/34 tests pass, including the new `reserved_keyword_alias` test. The remaining failure (`age_upgrade`) is pre-existing on master and unrelated to this change.
- PG-version-agnostic. The fix lives entirely in `cypher_gram.y` and the generated `cypher_gram.c`; no PG-version-specific APIs are touched. The reporter's environment was AGE 1.16.0 on PostgreSQL 16.13. Building `master` against PG 16 currently fails for unrelated reasons (`cypher_set.c` / `age.c` use PG18-only APIs such as `index_close` and `age_shmem_startup_hook`); this is independent of #2355 and is not addressed here.
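The "no shift/reduce conflicts" claim can be spot-checked from a captured build log. The sketch below is an illustration under stated assumptions, not part of the PR: `no_bison_conflicts` is a hypothetical helper, and it assumes bison reports conflicts with wording such as `156 shift/reduce conflicts [-Wconflicts-sr]`.

```shell
#!/bin/sh
# Sketch only: no_bison_conflicts is a hypothetical helper name.
#
# Possible usage, reusing the build command from the Testing notes:
#   make PG_CONFIG=/usr/lib/postgresql/18/bin/pg_config -j"$(nproc)" 2>&1 | tee build.log
#   no_bison_conflicts build.log && echo "grammar is conflict-free"

# no_bison_conflicts LOG:
#   exit 0 if LOG mentions no shift/reduce or reduce/reduce conflict,
#   exit 1 if it does, exit 2 if LOG cannot be read.
no_bison_conflicts() {
    [ -r "$1" ] || return 2
    if grep -E -q '(shift|reduce)/reduce conflicts?' "$1"; then
        return 1
    fi
    return 0
}
```

The check is only meaningful when bison actually ran during the logged build. If the generated `cypher_gram.c` was already up to date, the log contains no bison output at all, so a clean rebuild is the safer basis for the check.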
New regression test
`regress/sql/reserved_keyword_alias.sql` covers:
- `RETURN ... AS <kw>`.
- `WITH ... AS <kw>` alias binding.
- `UNWIND ... AS <kw>` alias binding.
- `END / NULL / TRUE / FALSE` (must still error).

Risk analysis
- No new grammar conflicts (the build had no `-Wconflicts-sr` warning).
- Only three productions use `var_name_alias` (yield_item, return_item, unwind). All other 6 use sites of `var_name` (named paths, pattern variables, edge bindings, `expr_var`, `var_name_opt`) are untouched.
- The change only adds to the accepted language. Accepted-before queries continue to behave identically.
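Files changed
| File | Description |
|---|---|
| `src/backend/parser/cypher_gram.y` | `var_name_alias` rule; alias-binding productions switched to it. |
| `Makefile` | `reserved_keyword_alias` added to `REGRESS` between `security` and `drop`. |
| `regress/sql/reserved_keyword_alias.sql` | New regression test. |
| `regress/expected/reserved_keyword_alias.out` | Expected output for the new regression test. |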
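
For spot-checking further keywords beyond those in the regression file, queries in the shape of the original repro can be generated mechanically. This is a minimal sketch, not how the PR's test file was produced: `emit_alias_queries` is a hypothetical helper, and the graph name `g` and column name `a` are taken from the repro in the commit message.

```shell
#!/bin/sh
# Sketch only: emit_alias_queries is a hypothetical helper name.

# emit_alias_queries GRAPH KEYWORD...
#   Print one "RETURN 1 AS <keyword>" query per keyword, wrapped in the
#   cypher() call used by the original repro.
emit_alias_queries() {
    graph=$1
    shift
    for kw in "$@"; do
        # \$\$ inside double quotes yields the literal $$ dollar-quote.
        printf "SELECT * FROM cypher('%s', \$\$ RETURN 1 AS %s \$\$) AS (a agtype);\n" "$graph" "$kw"
    done
}
```

Calling `emit_alias_queries g count exists where` prints three statements that could be pasted into a psql session where AGE is loaded and the graph exists. Names from `conflicted_keywords` such as `end` or `null` would be expected to produce a syntax error instead.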