feat(cysql): support CREATE and UNWIND#62

Open
zinic wants to merge 6 commits into SpecterOps:main from zinic:unwind

Conversation

@zinic
Contributor

@zinic zinic commented Apr 16, 2026

Description

Implement Cypher-to-PgSQL translation for UNWIND, mapping it to PostgreSQL's unnest() function. This supports UNWIND as a reading clause in both multi-part queries (preceded by WITH/MATCH) and standalone single-part queries.

  • Add UnwindClause model and tracking on QueryPart
  • Add prepareUnwind/translateUnwind handlers in the AST translator
  • Emit unnest() FROM clauses in both inline and tail projections
  • Handle nil reference frames in projection and WITH translation for standalone UNWIND variables that have no preceding CTE
  • Lazily push a scope frame in prepareUnwind when none exists to prevent nil pointer dereference on standalone UNWIND queries
  • Add translation test cases covering UNWIND with WITH, MATCH, WHERE, ORDER BY, LIMIT, DISTINCT, aggregation, and standalone usage
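
As a rough illustration of the mapping described above, here is a minimal Go sketch that renders one `unnest(...)` FROM entry per UNWIND clause. The `unwindClause` type and `renderSelect` helper are hypothetical, string-based stand-ins and not the PR's actual `UnwindClause` model or projection builder; the aliases (`s0`, `i0`, `i1`) only mirror the naming seen in the translation fixtures.

```go
package main

import (
	"fmt"
	"strings"
)

// unwindClause is a hypothetical stand-in for the PR's UnwindClause model:
// the array expression being unwound and the alias bound to each element.
type unwindClause struct {
	Expression string // already-rendered SQL array expression
	Binding    string // alias for the unnested element
}

// renderSelect emits a SELECT whose FROM list holds any prior sources
// followed by one unnest(...) entry per UNWIND clause.
func renderSelect(projection string, sources []string, unwinds []unwindClause) string {
	from := append([]string{}, sources...)
	for _, u := range unwinds {
		from = append(from, fmt.Sprintf("unnest(%s) as %s", u.Expression, u.Binding))
	}
	return fmt.Sprintf("select %s from %s;", projection, strings.Join(from, ", "))
}

func main() {
	// Standalone UNWIND: there is no preceding CTE, so unnest is the only source.
	fmt.Println(renderSelect("i0 as x", nil, []unwindClause{{"array[1, 2, 3]", "i0"}}))
	// UNWIND after WITH: the prior CTE s0 supplies the array column i0.
	fmt.Println(renderSelect("i1 as x", []string{"s0"}, []unwindClause{{"i0", "i1"}}))
}
```

The standalone case is the one that needs the nil reference frame handling: with no prior source, the FROM list consists of the `unnest` entry alone.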

Resolves: <TICKET_OR_ISSUE_NUMBER>

Type of Change

  • Chore (a change that does not modify the application functionality)
  • Bug fix (a change that fixes an issue)
  • New feature / enhancement (a change that adds new functionality)
  • Refactor (no behaviour change)
  • Test coverage
  • Build / CI / tooling
  • Documentation

Testing

  • Unit tests added / updated
  • Integration tests added / updated
  • Manual integration tests run (go test -tags manual_integration ./integration/...)

Driver Impact

  • PostgreSQL driver (drivers/pg)
  • Neo4j driver (drivers/neo4j)

Checklist

  • Code is formatted
  • All existing tests pass
  • go.mod / go.sum are up to date if dependencies changed

Summary by CodeRabbit

  • New Features

    • Added CREATE support for graph nodes and edges, including returning created entities and paths.
    • Full UNWIND support enabling array iteration into query flows (filtering, ordering, distinct, counting, and feeding matches).
  • Database Schema

    • Reworked path/frontier routines and frontier-swap functions for reduced scans and clearer frontier accounting.
  • Tests

    • Many new SQL and integration tests covering CREATE and UNWIND scenarios.
  • Style

    • Minor formatting and whitespace cleanups.

@coderabbitai

coderabbitai bot commented Apr 16, 2026

Warning

Rate limit exceeded

@zinic has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 10 minutes and 54 seconds before requesting another review.

Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 10 minutes and 54 seconds.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 5c347805-49fb-4fa6-b73a-3b9325c95f10

📥 Commits

Reviewing files that changed from the base of the PR and between 2a417c1 and 5fdae0a.

📒 Files selected for processing (26)
  • cmd/benchmark/main.go
  • cypher/models/pgsql/format/format.go
  • cypher/models/pgsql/model.go
  • cypher/models/pgsql/test/query_test.go
  • cypher/models/pgsql/test/testcase.go
  • cypher/models/pgsql/test/translation_cases/create.sql
  • cypher/models/pgsql/test/translation_cases/multipart.sql
  • cypher/models/pgsql/test/translation_cases/nodes.sql
  • cypher/models/pgsql/test/translation_cases/stepwise_traversal.sql
  • cypher/models/pgsql/test/translation_cases/unwind.sql
  • cypher/models/pgsql/translate/create.go
  • cypher/models/pgsql/translate/format.go
  • cypher/models/pgsql/translate/model.go
  • cypher/models/pgsql/translate/node.go
  • cypher/models/pgsql/translate/pattern.go
  • cypher/models/pgsql/translate/projection.go
  • cypher/models/pgsql/translate/relationship.go
  • cypher/models/pgsql/translate/tracking.go
  • cypher/models/pgsql/translate/translator.go
  • cypher/models/pgsql/translate/unwind.go
  • cypher/models/pgsql/translate/with.go
  • cypher/models/pgsql/visualization/visualizer_test.go
  • drivers/pg/transaction.go
  • integration/cypher_test.go
  • integration/testdata/cases/create_inline.json
  • integration/testdata/cases/unwind_inline.json

Walkthrough

Adds Cypher CREATE and UNWIND translation into PostgreSQL IR, projection and scope handling for UNWIND, node/edge creation CTE emission, schema/index and frontier optimizations, and new SQL and integration tests for UNWIND and CREATE. Minor formatting and formatter/model interface tweaks included.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Benchmark formatting: cmd/benchmark/main.go | Whitespace alignment adjusted on flag variable declarations; no behavior change. |
| UNWIND SQL tests: cypher/models/pgsql/test/translation_cases/unwind.sql | New translation test file with multiple UNWIND cases exercising aliasing, aggregation, filtering, ordering, dedupe, counts, and UNWIND→MATCH flows. |
| CREATE SQL tests & integration cases: cypher/models/pgsql/test/translation_cases/create.sql, integration/testdata/cases/create_inline.json | New comprehensive CREATE translation tests and integration case definitions for node/edge creation scenarios. |
| Model: query/mutations & from-clause builder: cypher/models/pgsql/translate/model.go | Extended QueryPart with unwind state and creation flag; added NodeCreate/EdgeCreate types; mutations now track Creations and EdgeCreations; added FromClauseBuilder to dedupe FROM entries. |
| Translator: visitor changes: cypher/models/pgsql/translate/translator.go, cypher/models/pgsql/translate/unwind.go, cypher/models/pgsql/translate/create.go | Translator now prepares/translates UNWIND (prepareUnwind/translateUnwind) and handles CREATE via translateCreate; new logic creates/exports bindings, appends UnwindClause, and emits node/edge INSERT CTEs. |
| Projection & WITH adjustments: cypher/models/pgsql/translate/projection.go, cypher/models/pgsql/translate/with.go | Projection builders now incorporate unnest(...) FROM clauses per unwind clause; reference-frame qualification adjusted so UNWIND variables avoid CTE/frame qualification; WITH projection handling distinguishes UNWIND variables (no LastProjection). |
| CREATE collection in pattern translation: cypher/models/pgsql/translate/node.go, cypher/models/pgsql/translate/relationship.go | When in creation mode, node/relationship translation records NodeCreate/EdgeCreate mutations, populates traversal steps appropriately, and defers created-edge right-node resolution; pattern binding dependencies updated. |
| PGSQL model & formatter tweaks: cypher/models/pgsql/model.go, cypher/models/pgsql/format/format.go | Insert now implements Expression and SetExpression interfaces; formatter accepts pgsql.Insert in set-expression formatting. |
| Schema & pathfinding optimizations: drivers/pg/query/sql/schema_up.sql | Refactored edges_to_path CTE and volatility, changed swap_forward/backward_front to return row counts, removed/adjusted indexes/PKs, added per-harness visited tables and dedup/materialized CTEs, and depth-aware frontier meeting logic. |
| Minor cleanup: integration/cypher_test.go | Trailing blank line removed; no behavioral change. |

Sequence Diagram

sequenceDiagram
    participant Visitor as Cypher AST Visitor
    participant Translator as Translator
    participant Scope as Scope Frame
    participant QueryPart as QueryPart
    participant Projection as Projection Builder

    Visitor->>Translator: Enter(*cypher.Unwind)
    Translator->>Scope: prepareUnwind(create unset binding, attach frame)
    Scope-->>QueryPart: ensure frame present / declare binding

    Visitor->>Translator: Exit(*cypher.Unwind)
    Translator->>Translator: translateUnwind(pop var & array expr)
    Translator->>Scope: resolve binding
    Translator->>QueryPart: append UnwindClause(expression, binding)
    Translator->>Scope: export binding from frame

    Translator->>Translator: translateCreate(collect NodeCreate/EdgeCreate)
    Translator->>QueryPart: add INSERT CTEs for nodes and edges (materialize & export)

    Projection->>QueryPart: buildProjection / buildInlineProjection
    loop for each UnwindClause
      Projection->>Projection: add unnest(unwind.Expression) AS unwind.Binding to FROM
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

🐰 I hopped through binds and unnest streams,
I stitched new creates into CTE dreams,
Unwind and insert in tidy array,
Projections dance, frontiers trim their way,
A rabbit's patch of SQL magic gleams.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 27.27% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description check | ✅ Passed | The PR description is well-structured following the template, includes all major sections, and provides comprehensive details about the changes and testing. |
| Title check | ✅ Passed | The title 'feat(cysql): support CREATE and UNWIND' accurately reflects the main changes in the PR, which implement translation support for both Cypher CREATE and UNWIND clauses to PostgreSQL. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 4

🧹 Nitpick comments (1)
drivers/pg/query/sql/schema_up.sql (1)

461-464: Prefer bigint counters for frontier sizes.

count(*) is bigint in PostgreSQL; storing in int4 can overflow on large traversals. Consider widening remaining, forward_front_size, and backward_front_size to int8.

Also applies to: 493-496, 700-702
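
To make the range concern concrete, here is a small Go sketch, not taken from the PR, that checks whether a bigint-sized count still fits in a 32-bit column. The `fitsInt4` helper is hypothetical. PostgreSQL's int4 tops out at 2,147,483,647, and assigning a larger bigint value to an int4 variable fails with an "integer out of range" error rather than silently wrapping.

```go
package main

import (
	"fmt"
	"math"
)

// fitsInt4 reports whether a bigint-sized count can be stored in a
// PostgreSQL int4 (a signed 32-bit integer) without going out of range.
func fitsInt4(count int64) bool {
	return count >= math.MinInt32 && count <= math.MaxInt32
}

func main() {
	for _, count := range []int64{1_000_000, math.MaxInt32, math.MaxInt32 + 1} {
		fmt.Printf("count=%d fits int4: %t\n", count, fitsInt4(count))
	}
}
```

Declaring the counters as int8 removes the need for any such check, since count(*) already produces a bigint.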

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@drivers/pg/query/sql/schema_up.sql` around lines 461 - 464, The
procedure/function currently declares and returns 32-bit ints (e.g. "returns
int4 as" and variables remaining, forward_front_size, backward_front_size) which
can overflow for large counts; change the function return type from int4 to int8
and widen the local variables remaining, forward_front_size, and
backward_front_size to int8 (bigint) and update any related casts/usages of
count(*) to use bigint to match; apply the same change to the other occurrences
noted (the other function blocks around the other ranges).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@cypher/models/pgsql/test/translation_cases/unwind.sql`:
- Around line 32-33: The test's expected SQL preserves duplicates because it
uses "select i1 as x from s0, unnest(i0) as i1" but the Cypher was "return
distinct x"; update the expected SQL to enforce distinctness by using SELECT
DISTINCT (e.g., "select distinct i1 as x ...") so the fixture asserts
duplicate-removal; locate the clause around s0 / i0 / i1 and change the SELECT
to DISTINCT to match the original "return distinct x" semantics.

In `@cypher/models/pgsql/translate/projection.go`:
- Around line 443-453: The unnest/UNWIND sources are being appended to
sqlSelect.From only at projection time (in the block that uses
part.unwindClauses, pgsql.FunctionUnnest, pgsql.AliasedExpression and
unwind.Binding.Identifier), which is too late for downstream clause translation;
instead, materialize those unnest sources into the query part's FROM pipeline so
later MATCH/WHERE/SUBPART translation can bind to the new aliases. Move the
logic that converts part.unwindClauses -> pgsql.FromClause (the
pgsql.FunctionCall(FunctionUnnest) wrapped in pgsql.AliasedExpression with
unwind.Binding.Identifier) into the earlier FROM-building stage for the query
part (the same pipeline that produces sqlSelect.From for other sources), and
apply the same change for the other occurrence noted (the similar block around
lines 493-503) so UNWINDs are present before subsequent clause translation.

In `@cypher/models/pgsql/translate/unwind.go`:
- Around line 18-26: The current PushFrame call in unwind.go creates a real
Frame that downstream tail-projection treats as having a FROM source; change the
synthetic UNWIND bookkeeping frame so it cannot be mistaken for a real SQL
source: when you create the frame in the s.scope.PushFrame() branch, mark it as
synthetic/bookkeeping (e.g., set a Frame.Synthetic or Frame.IsBookkeeping flag
or keep its Source/Relation nil) and update the tail-projection logic (the code
that inspects s.query.CurrentPart().Frame to decide to synthesize a FROM/CTE) to
skip creating a relation when that flag is set (or when Frame.Source is nil).
Ensure references include the PushFrame call site and
s.query.CurrentPart().Frame so the synthetic frame is recognized and not emitted
as a SQL source.
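
One possible shape for this suggestion, sketched in Go under stated assumptions: `Frame`, its `Synthetic` field, and `tailFromSources` are hypothetical simplifications written for this example and do not reflect the translator's real types or signatures.

```go
package main

import "fmt"

// Frame is a hypothetical, simplified scope frame. Synthetic marks a frame
// that exists only for binding bookkeeping and has no CTE behind it.
type Frame struct {
	Name      string
	Synthetic bool
}

// tailFromSources returns the FROM entries for a tail projection: the
// frame's CTE name if the frame is real, followed by any unnest sources.
func tailFromSources(frame *Frame, unnests []string) []string {
	var from []string
	if frame != nil && !frame.Synthetic {
		from = append(from, frame.Name)
	}
	return append(from, unnests...)
}

func main() {
	unnests := []string{"unnest(i0) as i1"}
	// A real frame contributes its CTE as a source.
	fmt.Println(tailFromSources(&Frame{Name: "s0"}, unnests))
	// A synthetic frame pushed for a standalone UNWIND contributes nothing.
	fmt.Println(tailFromSources(&Frame{Name: "s0", Synthetic: true}, unnests))
}
```

The same guard also covers the nil-frame case, so a standalone UNWIND never references a relation that was never emitted.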

In `@drivers/pg/query/sql/schema_up.sql`:
- Around line 577-583: The deduplication and visited-tracking currently collapse
paths across different roots because deduped uses DISTINCT ON
(next_front.next_id) and visited is keyed by next_id alone; modify the dedupe
and all visited table usages to include root_id so they operate on (root_id,
next_id) pairs: update the CTE deduped to use DISTINCT ON (next_front.root_id,
next_front.next_id) and adjust all references/insertions/SELECTs against the
visited table and any joins with next_front (the visited-marking logic around
visited, and any EXISTS/NOT EXISTS checks) to include root_id alongside next_id
so visitation and pruning are scoped per root_id. Ensure any ORDER BY or GROUP
BY that relied on next_id is updated to include root_id where appropriate.
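
The difference between the two dedupe keys can be shown with a small in-memory analogy in Go. This is only a sketch of the semantics, not the schema's SQL: `frontierRow` and both helpers are hypothetical, and they keep the first row seen per key, whereas DISTINCT ON without an ORDER BY makes no such guarantee.

```go
package main

import "fmt"

// frontierRow is a hypothetical in-memory stand-in for a next_front row.
type frontierRow struct{ RootID, NextID int64 }

// dedupeByNext mimics DISTINCT ON (next_id): one row per node across all roots.
func dedupeByNext(rows []frontierRow) []frontierRow {
	seen := map[int64]bool{}
	var out []frontierRow
	for _, r := range rows {
		if !seen[r.NextID] {
			seen[r.NextID] = true
			out = append(out, r)
		}
	}
	return out
}

// dedupeByRootAndNext mimics DISTINCT ON (root_id, next_id): one row per
// node for each root, so separate traversals do not prune each other.
func dedupeByRootAndNext(rows []frontierRow) []frontierRow {
	seen := map[frontierRow]bool{}
	var out []frontierRow
	for _, r := range rows {
		if !seen[r] {
			seen[r] = true
			out = append(out, r)
		}
	}
	return out
}

func main() {
	// Roots 1 and 2 both reach node 10; root 1 reaches it twice.
	rows := []frontierRow{{1, 10}, {2, 10}, {1, 10}}
	fmt.Println(len(dedupeByNext(rows)))        // 1
	fmt.Println(len(dedupeByRootAndNext(rows))) // 2
}
```

Keyed on next_id alone, root 2's path to node 10 is dropped; keyed on the pair, only the true duplicate for root 1 is removed.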

---

Nitpick comments:
In `@drivers/pg/query/sql/schema_up.sql`:
- Around line 461-464: The procedure/function currently declares and returns
32-bit ints (e.g. "returns int4 as" and variables remaining, forward_front_size,
backward_front_size) which can overflow for large counts; change the function
return type from int4 to int8 and widen the local variables remaining,
forward_front_size, and backward_front_size to int8 (bigint) and update any
related casts/usages of count(*) to use bigint to match; apply the same change
to the other occurrences noted (the other function blocks around the other
ranges).
ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 10ed1a64-19ad-4bfb-ade7-f60dec38327c

📥 Commits

Reviewing files that changed from the base of the PR and between 2a417c1 and f6c737e.

📒 Files selected for processing (9)
  • cmd/benchmark/main.go
  • cypher/models/pgsql/test/translation_cases/unwind.sql
  • cypher/models/pgsql/translate/model.go
  • cypher/models/pgsql/translate/projection.go
  • cypher/models/pgsql/translate/translator.go
  • cypher/models/pgsql/translate/unwind.go
  • cypher/models/pgsql/translate/with.go
  • drivers/pg/query/sql/schema_up.sql
  • integration/cypher_test.go
💤 Files with no reviewable changes (1)
  • integration/cypher_test.go

Comment thread cypher/models/pgsql/test/translation_cases/unwind.sql Outdated
Comment thread cypher/models/pgsql/translate/projection.go Outdated
Comment thread cypher/models/pgsql/translate/unwind.go
Comment thread drivers/pg/query/sql/schema_up.sql Outdated

@coderabbitai coderabbitai bot left a comment

🧹 Nitpick comments (1)
cypher/models/pgsql/translate/model.go (1)

683-697: Minor typo in method name: AddIdentifer should be AddIdentifier.

The method name has a typo (Identifer instead of Identifier). While this doesn't affect functionality, it could cause confusion and inconsistency with standard naming.

✏️ Suggested fix
-// AddIdentifer appends a from clause for frameID if it has not been seen before.
-func (s *FromClauseBuilder) AddIdentifer(frameID pgsql.Identifier) {
+// AddIdentifier appends a from clause for frameID if it has not been seen before.
+func (s *FromClauseBuilder) AddIdentifier(frameID pgsql.Identifier) {

Note: Also update the call site in AddBinding (line 702) to use AddIdentifier.
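
For context on what the renamed method does, here is a minimal Go sketch of a deduplicating FROM builder using the corrected name. It is a string-based approximation for illustration only: the real `FromClauseBuilder` takes a `pgsql.Identifier` and builds `pgsql.FromClause` values, and the fields shown here are assumptions.

```go
package main

import "fmt"

// fromClauseBuilder is a hypothetical, string-based stand-in for the PR's
// FromClauseBuilder; it records each frame identifier at most once.
type fromClauseBuilder struct {
	seen    map[string]struct{}
	clauses []string
}

func newFromClauseBuilder() *fromClauseBuilder {
	return &fromClauseBuilder{seen: map[string]struct{}{}}
}

// AddIdentifier appends a from clause for frameID if it has not been seen before.
func (b *fromClauseBuilder) AddIdentifier(frameID string) {
	if _, ok := b.seen[frameID]; ok {
		return
	}
	b.seen[frameID] = struct{}{}
	b.clauses = append(b.clauses, frameID)
}

func main() {
	b := newFromClauseBuilder()
	for _, id := range []string{"s0", "s1", "s0"} {
		b.AddIdentifier(id)
	}
	fmt.Println(b.clauses) // [s0 s1]
}
```

Because the rename is purely mechanical, compiling after the change is enough to surface any call site that still uses the old spelling.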

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cypher/models/pgsql/translate/model.go` around lines 683 - 697, Rename the
method AddIdentifer to AddIdentifier to fix the typo: update the function
declaration on the FromClauseBuilder receiver and all call sites (notably the
call from AddBinding) to use the new AddIdentifier name; ensure the identifier's
visibility/signature remains the same (receiver type, parameter
pgsql.Identifier) so no other behavior changes are introduced and run
tests/compile to catch any missed references.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: c66defab-d361-4923-9f44-e766b8bf889d

📥 Commits

Reviewing files that changed from the base of the PR and between f6c737e and 9bdbac2.

📒 Files selected for processing (10)
  • cypher/models/pgsql/format/format.go
  • cypher/models/pgsql/model.go
  • cypher/models/pgsql/test/translation_cases/create.sql
  • cypher/models/pgsql/translate/create.go
  • cypher/models/pgsql/translate/model.go
  • cypher/models/pgsql/translate/node.go
  • cypher/models/pgsql/translate/projection.go
  • cypher/models/pgsql/translate/relationship.go
  • cypher/models/pgsql/translate/translator.go
  • integration/testdata/cases/create_inline.json
✅ Files skipped from review due to trivial changes (1)
  • integration/testdata/cases/create_inline.json
🚧 Files skipped from review as they are similar to previous changes (1)
  • cypher/models/pgsql/translate/projection.go

@zinic zinic changed the title from "feat(cysql): support UNWIND" to "feat(cysql): support CREATE and UNWIND" Apr 16, 2026
@zinic
Contributor Author

zinic commented Apr 16, 2026

@CodeRabbit yo, blast me and review this again

@coderabbitai

coderabbitai bot commented Apr 16, 2026

@zinic: Sure thing! Kicking off a full review of this PR now. 🚀

✅ Actions performed

Full review triggered.
