feat(cysql): support CREATE and UNWIND by zinic · Pull Request #62 · SpecterOps/DAWGS

zinic · 2026-04-16T01:09:37Z

Description

Implement Cypher-to-PgSQL translation for UNWIND, mapping it to PostgreSQL's unnest() function. This supports UNWIND as a reading clause in both multi-part queries (preceded by WITH/MATCH) and standalone single-part queries.

- Add UnwindClause model and tracking on QueryPart
- Add prepareUnwind/translateUnwind handlers in the AST translator
- Emit unnest() FROM clauses in both inline and tail projections
- Handle nil reference frames in projection and WITH translation for standalone UNWIND variables that have no preceding CTE
- Lazily push a scope frame in prepareUnwind when none exists to prevent nil pointer dereference on standalone UNWIND queries
- Add translation test cases covering UNWIND with WITH, MATCH, WHERE, ORDER BY, LIMIT, DISTINCT, aggregation, and standalone usage
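
As a rough illustration of the mapping described above, the sketch below renders a SELECT whose FROM list gains one `unnest(...)` entry per UNWIND clause. The `unwindClause` type and `renderSelect` function are hypothetical stand-ins written for this example; they are not the PR's actual `UnwindClause` model or projection code, which operates on a PgSQL AST rather than strings.

```go
package main

import (
	"fmt"
	"strings"
)

// unwindClause is a hypothetical stand-in for the PR's UnwindClause model:
// an array-valued SQL expression plus the alias bound to each element.
type unwindClause struct {
	Expression string // e.g. "array[1, 2, 3]"
	Binding    string // e.g. "i0"
}

// renderSelect builds a SELECT whose FROM list holds the existing sources
// followed by one unnest(...) entry per UNWIND clause. It assumes at least
// one source or unwind clause is supplied.
func renderSelect(projection string, sources []string, unwinds []unwindClause) string {
	from := append([]string{}, sources...)
	for _, u := range unwinds {
		from = append(from, fmt.Sprintf("unnest(%s) as %s", u.Expression, u.Binding))
	}
	return fmt.Sprintf("select %s from %s;", projection, strings.Join(from, ", "))
}

func main() {
	// Standalone query: UNWIND [1, 2, 3] AS x RETURN x (no preceding CTE).
	fmt.Println(renderSelect("i0 as x", nil,
		[]unwindClause{{Expression: "array[1, 2, 3]", Binding: "i0"}}))

	// Multi-part query: a prior CTE s0 exposes i0, and UNWIND binds i1.
	fmt.Println(renderSelect("i1 as x", []string{"s0"},
		[]unwindClause{{Expression: "i0", Binding: "i1"}}))
}
```

The standalone case has no CTE to select from, which is why the description calls out nil reference frames: the `unnest(...)` entry is the only FROM source.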

Resolves: <TICKET_OR_ISSUE_NUMBER>

Type of Change

Chore (a change that does not modify the application functionality)
Bug fix (a change that fixes an issue)
New feature / enhancement (a change that adds new functionality)
Refactor (no behaviour change)
Test coverage
Build / CI / tooling
Documentation

Testing

Unit tests added / updated
Integration tests added / updated
Manual integration tests run (go test -tags manual_integration ./integration/...)

Driver Impact

PostgreSQL driver (drivers/pg)
Neo4j driver (drivers/neo4j)

Checklist

Code is formatted
All existing tests pass
go.mod / go.sum are up to date if dependencies changed

Summary by CodeRabbit

New Features
- Added CREATE support for graph nodes and edges, including returning created entities and paths.
- Full UNWIND support enabling array iteration into query flows (filtering, ordering, distinct, counting, and feeding matches).
Database Schema
- Reworked path/frontier routines and frontier-swap functions for reduced scans and clearer frontier accounting.
Tests
- Many new SQL and integration tests covering CREATE and UNWIND scenarios.
Style
- Minor formatting and whitespace cleanups.

coderabbitai · 2026-04-16T01:09:49Z

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 5c347805-49fb-4fa6-b73a-3b9325c95f10

📥 Commits

Reviewing files that changed from the base of the PR and between 2a417c1 and 5fdae0a.

📒 Files selected for processing (26)

cmd/benchmark/main.go
cypher/models/pgsql/format/format.go
cypher/models/pgsql/model.go
cypher/models/pgsql/test/query_test.go
cypher/models/pgsql/test/testcase.go
cypher/models/pgsql/test/translation_cases/create.sql
cypher/models/pgsql/test/translation_cases/multipart.sql
cypher/models/pgsql/test/translation_cases/nodes.sql
cypher/models/pgsql/test/translation_cases/stepwise_traversal.sql
cypher/models/pgsql/test/translation_cases/unwind.sql
cypher/models/pgsql/translate/create.go
cypher/models/pgsql/translate/format.go
cypher/models/pgsql/translate/model.go
cypher/models/pgsql/translate/node.go
cypher/models/pgsql/translate/pattern.go
cypher/models/pgsql/translate/projection.go
cypher/models/pgsql/translate/relationship.go
cypher/models/pgsql/translate/tracking.go
cypher/models/pgsql/translate/translator.go
cypher/models/pgsql/translate/unwind.go
cypher/models/pgsql/translate/with.go
cypher/models/pgsql/visualization/visualizer_test.go
drivers/pg/transaction.go
integration/cypher_test.go
integration/testdata/cases/create_inline.json
integration/testdata/cases/unwind_inline.json

Walkthrough

Adds Cypher CREATE and UNWIND translation into PostgreSQL IR, projection and scope handling for UNWIND, node/edge creation CTE emission, schema/index and frontier optimizations, and new SQL and integration tests for UNWIND and CREATE. Minor formatting and formatter/model interface tweaks included.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Benchmark formatting `cmd/benchmark/main.go` | Whitespace alignment adjusted on flag variable declarations; no behavior change. |
| UNWIND SQL tests `cypher/models/pgsql/test/translation_cases/unwind.sql` | New translation test file with multiple UNWIND cases exercising aliasing, aggregation, filtering, ordering, dedupe, counts, and UNWIND→MATCH flows. |
| CREATE SQL tests & integration cases `cypher/models/pgsql/test/translation_cases/create.sql`, `integration/testdata/cases/create_inline.json` | New comprehensive CREATE translation tests and integration case definitions for node/edge creation scenarios. |
| Model: query/mutations & from-clause builder `cypher/models/pgsql/translate/model.go` | Extended QueryPart with unwind state and creation flag; added NodeCreate/EdgeCreate types; mutations now track Creations and EdgeCreations; added FromClauseBuilder to dedupe FROM entries. |
| Translator: visitor changes `cypher/models/pgsql/translate/translator.go`, `cypher/models/pgsql/translate/unwind.go`, `cypher/models/pgsql/translate/create.go` | Translator now prepares/translates `UNWIND` (`prepareUnwind`/`translateUnwind`) and handles `CREATE` via `translateCreate`; new logic creates/exports bindings, appends UnwindClause, and emits node/edge INSERT CTEs. |
| Projection & WITH adjustments `cypher/models/pgsql/translate/projection.go`, `cypher/models/pgsql/translate/with.go` | Projection builders now incorporate unnest(...) FROM clauses per unwind clause; reference-frame qualification adjusted so UNWIND variables avoid CTE/frame qualification; WITH projection handling distinguishes UNWIND variables (no LastProjection). |
| CREATE collection in pattern translation `cypher/models/pgsql/translate/node.go`, `cypher/models/pgsql/translate/relationship.go` | When in creation mode, node/relationship translation records NodeCreate/EdgeCreate mutations, populates traversal steps appropriately, and defers created-edge right-node resolution; pattern binding dependencies updated. |
| PGSQL model & formatter tweaks `cypher/models/pgsql/model.go`, `cypher/models/pgsql/format/format.go` | `Insert` now implements Expression and SetExpression interfaces; formatter accepts `pgsql.Insert` in set-expression formatting. |
| Schema & pathfinding optimizations `drivers/pg/query/sql/schema_up.sql` | Refactored `edges_to_path` CTE and volatility, changed swap_forward/backward_front to return row counts, removed/adjusted indexes/PKs, added per-harness visited tables and dedup/materialized CTEs, and depth-aware frontier meeting logic. |
| Minor cleanup `integration/cypher_test.go` | Trailing blank line removed; no behavioral change. |
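
The model row above mentions a FromClauseBuilder that dedupes FROM entries. A minimal sketch of that idea follows, assuming a set keyed by frame identifier with first-seen ordering preserved. The type and method names here are illustrative only and use plain strings; the PR's builder works with `pgsql.Identifier` and real FROM clause nodes.

```go
package main

import "fmt"

// fromClauseBuilder is a hypothetical sketch: it collects FROM entries in
// first-seen order and silently skips identifiers it has already recorded.
type fromClauseBuilder struct {
	seen    map[string]struct{}
	clauses []string
}

func newFromClauseBuilder() *fromClauseBuilder {
	return &fromClauseBuilder{seen: map[string]struct{}{}}
}

// add appends frameID unless it was added before. It reports whether the
// identifier was newly appended.
func (b *fromClauseBuilder) add(frameID string) bool {
	if _, ok := b.seen[frameID]; ok {
		return false
	}
	b.seen[frameID] = struct{}{}
	b.clauses = append(b.clauses, frameID)
	return true
}

func main() {
	b := newFromClauseBuilder()
	for _, id := range []string{"s0", "s1", "s0"} {
		b.add(id)
	}
	fmt.Println(b.clauses) // prints [s0 s1]
}
```

Keeping a slice alongside the set preserves a deterministic clause order, which matters when expected SQL is compared textually in translation fixtures.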

Sequence Diagram

sequenceDiagram
    participant Visitor as Cypher AST Visitor
    participant Translator as Translator
    participant Scope as Scope Frame
    participant QueryPart as QueryPart
    participant Projection as Projection Builder

    Visitor->>Translator: Enter(*cypher.Unwind)
    Translator->>Scope: prepareUnwind(create unset binding, attach frame)
    Scope-->>QueryPart: ensure frame present / declare binding

    Visitor->>Translator: Exit(*cypher.Unwind)
    Translator->>Translator: translateUnwind(pop var & array expr)
    Translator->>Scope: resolve binding
    Translator->>QueryPart: append UnwindClause(expression, binding)
    Translator->>Scope: export binding from frame

    Translator->>Translator: translateCreate(collect NodeCreate/EdgeCreate)
    Translator->>QueryPart: add INSERT CTEs for nodes and edges (materialize & export)

    Projection->>QueryPart: buildProjection / buildInlineProjection
    loop for each UnwindClause
      Projection->>Projection: add unnest(unwind.Expression) AS unwind.Binding to FROM
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

🐰 I hopped through binds and unnest streams,
I stitched new creates into CTE dreams,
Unwind and insert in tidy array,
Projections dance, frontiers trim their way,
A rabbit's patch of SQL magic gleams.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 27.27% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description check | ✅ Passed | The PR description is well-structured following the template, includes all major sections, and provides comprehensive details about the changes and testing. |
| Title check | ✅ Passed | The title 'feat(cysql): support CREATE and UNWIND' accurately reflects the main changes in the PR, which implement translation support for both Cypher CREATE and UNWIND clauses to PostgreSQL. |

coderabbitai

Actionable comments posted: 4

🧹 Nitpick comments (1)

drivers/pg/query/sql/schema_up.sql (1)
461-464: Prefer bigint counters for frontier sizes.

count(*) is bigint in PostgreSQL; storing in int4 can overflow on large traversals. Consider widening remaining, forward_front_size, and backward_front_size to int8.

Also applies to: 493-496, 700-702
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@drivers/pg/query/sql/schema_up.sql` around lines 461 - 464, The
procedure/function currently declares and returns 32-bit ints (e.g. "returns
int4 as" and variables remaining, forward_front_size, backward_front_size) which
can overflow for large counts; change the function return type from int4 to int8
and widen the local variables remaining, forward_front_size, and
backward_front_size to int8 (bigint) and update any related casts/usages of
count(*) to use bigint to match; apply the same change to the other occurrences
noted (the other function blocks around the other ranges).

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@cypher/models/pgsql/test/translation_cases/unwind.sql`:
- Around line 32-33: The test's expected SQL preserves duplicates because it
uses "select i1 as x from s0, unnest(i0) as i1" but the Cypher was "return
distinct x"; update the expected SQL to enforce distinctness by using SELECT
DISTINCT (e.g., "select distinct i1 as x ...") so the fixture asserts
duplicate-removal; locate the clause around s0 / i0 / i1 and change the SELECT
to DISTINCT to match the original "return distinct x" semantics.
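
The difference the fixture needs to assert can be sketched outside SQL. The example below, written for illustration only, contrasts the rows a plain unnest projection yields with what `return distinct x` should yield. The first-seen ordering is an artifact of this sketch; SQL `SELECT DISTINCT` gives no ordering guarantee without ORDER BY.

```go
package main

import "fmt"

// distinct returns the input values with duplicates removed, keeping the
// first occurrence of each value.
func distinct(values []int) []int {
	seen := make(map[int]struct{}, len(values))
	out := make([]int, 0, len(values))
	for _, v := range values {
		if _, ok := seen[v]; ok {
			continue
		}
		seen[v] = struct{}{}
		out = append(out, v)
	}
	return out
}

func main() {
	unwound := []int{1, 1, 2, 3, 3}
	fmt.Println(unwound)           // rows from "select i1 as x ...": [1 1 2 3 3]
	fmt.Println(distinct(unwound)) // rows "return distinct x" expects: [1 2 3]
}
```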

In `@cypher/models/pgsql/translate/projection.go`:
- Around line 443-453: The unnest/UNWIND sources are being appended to
sqlSelect.From only at projection time (in the block that uses
part.unwindClauses, pgsql.FunctionUnnest, pgsql.AliasedExpression and
unwind.Binding.Identifier), which is too late for downstream clause translation;
instead, materialize those unnest sources into the query part's FROM pipeline so
later MATCH/WHERE/SUBPART translation can bind to the new aliases. Move the
logic that converts part.unwindClauses -> pgsql.FromClause (the
pgsql.FunctionCall(FunctionUnnest) wrapped in pgsql.AliasedExpression with
unwind.Binding.Identifier) into the earlier FROM-building stage for the query
part (the same pipeline that produces sqlSelect.From for other sources), and
apply the same change for the other occurrence noted (the similar block around
lines 493-503) so UNWINDs are present before subsequent clause translation.

In `@cypher/models/pgsql/translate/unwind.go`:
- Around line 18-26: The current PushFrame call in unwind.go creates a real
Frame that downstream tail-projection treats as having a FROM source; change the
synthetic UNWIND bookkeeping frame so it cannot be mistaken for a real SQL
source: when you create the frame in the s.scope.PushFrame() branch, mark it as
synthetic/bookkeeping (e.g., set a Frame.Synthetic or Frame.IsBookkeeping flag
or keep its Source/Relation nil) and update the tail-projection logic (the code
that inspects s.query.CurrentPart().Frame to decide to synthesize a FROM/CTE) to
skip creating a relation when that flag is set (or when Frame.Source is nil).
Ensure references include the PushFrame call site and
s.query.CurrentPart().Frame so the synthetic frame is recognized and not emitted
as a SQL source.
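
One way to picture the suggested flag is the sketch below. The `frame` type, its `Synthetic` field, and `fromSources` are hypothetical names chosen for this example; the reviewer's comment offers `Frame.Synthetic` or `Frame.IsBookkeeping` only as possibilities, and the actual fix in the PR may differ.

```go
package main

import "fmt"

// frame is a hypothetical scope frame. Synthetic marks frames that exist
// only for binding bookkeeping and have no backing CTE or relation.
type frame struct {
	Identifier string
	Synthetic  bool
}

// fromSources returns the identifiers that should be emitted as FROM
// sources, skipping nil frames and frames marked synthetic.
func fromSources(frames ...*frame) []string {
	var sources []string
	for _, f := range frames {
		if f == nil || f.Synthetic {
			continue
		}
		sources = append(sources, f.Identifier)
	}
	return sources
}

func main() {
	cte := &frame{Identifier: "s0"}
	bookkeeping := &frame{Identifier: "s1", Synthetic: true}
	fmt.Println(fromSources(cte, bookkeeping, nil)) // prints [s0]
}
```

Treating nil and synthetic frames the same way at the emission site keeps the standalone UNWIND path from referencing a relation that was never defined.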

In `@drivers/pg/query/sql/schema_up.sql`:
- Around line 577-583: The deduplication and visited-tracking currently collapse
paths across different roots because deduped uses DISTINCT ON
(next_front.next_id) and visited is keyed by next_id alone; modify the dedupe
and all visited table usages to include root_id so they operate on (root_id,
next_id) pairs: update the CTE deduped to use DISTINCT ON (next_front.root_id,
next_front.next_id) and adjust all references/insertions/SELECTs against the
visited table and any joins with next_front (the visited-marking logic around
visited, and any EXISTS/NOT EXISTS checks) to include root_id alongside next_id
so visitation and pruning are scoped per root_id. Ensure any ORDER BY or GROUP
BY that relied on next_id is updated to include root_id where appropriate.
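
The effect of the dedupe key can be shown with a small sketch. The `frontierRow` type and both helper functions are assumptions made for this example and only mimic the `DISTINCT ON` behaviour in memory; they are not taken from `schema_up.sql`.

```go
package main

import "fmt"

// frontierRow is a hypothetical frontier entry: the traversal root and the
// node reached in the next expansion step.
type frontierRow struct {
	RootID int64
	NextID int64
}

// dedupeByNext keeps one row per NextID, mirroring DISTINCT ON (next_id).
// Rows from different roots that reach the same node collapse into one.
func dedupeByNext(rows []frontierRow) []frontierRow {
	seen := map[int64]struct{}{}
	var out []frontierRow
	for _, r := range rows {
		if _, ok := seen[r.NextID]; ok {
			continue
		}
		seen[r.NextID] = struct{}{}
		out = append(out, r)
	}
	return out
}

// dedupeByRootAndNext keeps one row per (RootID, NextID) pair, mirroring
// DISTINCT ON (root_id, next_id), so each root keeps its own frontier.
func dedupeByRootAndNext(rows []frontierRow) []frontierRow {
	seen := map[frontierRow]struct{}{}
	var out []frontierRow
	for _, r := range rows {
		if _, ok := seen[r]; ok {
			continue
		}
		seen[r] = struct{}{}
		out = append(out, r)
	}
	return out
}

func main() {
	rows := []frontierRow{{RootID: 1, NextID: 10}, {RootID: 2, NextID: 10}, {RootID: 1, NextID: 10}}
	fmt.Println(len(dedupeByNext(rows)))        // prints 1: root 2 loses its path to node 10
	fmt.Println(len(dedupeByRootAndNext(rows))) // prints 2: one row per root
}
```

With the narrower key, root 2 never expands through node 10, which is the cross-root path loss the comment describes.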

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 10ed1a64-19ad-4bfb-ade7-f60dec38327c

📥 Commits

Reviewing files that changed from the base of the PR and between 2a417c1 and f6c737e.

📒 Files selected for processing (9)

cmd/benchmark/main.go
cypher/models/pgsql/test/translation_cases/unwind.sql
cypher/models/pgsql/translate/model.go
cypher/models/pgsql/translate/projection.go
cypher/models/pgsql/translate/translator.go
cypher/models/pgsql/translate/unwind.go
cypher/models/pgsql/translate/with.go
drivers/pg/query/sql/schema_up.sql
integration/cypher_test.go

💤 Files with no reviewable changes (1)

integration/cypher_test.go

coderabbitai

🧹 Nitpick comments (1)

cypher/models/pgsql/translate/model.go (1)
683-697: Minor typo in method name: AddIdentifer should be AddIdentifier.

The method name has a typo (Identifer instead of Identifier). While this doesn't affect functionality, it could cause confusion and inconsistency with standard naming.
✏️ Suggested fix
-// AddIdentifer appends a from clause for frameID if it has not been seen before.
-func (s *FromClauseBuilder) AddIdentifer(frameID pgsql.Identifier) {
+// AddIdentifier appends a from clause for frameID if it has not been seen before.
+func (s *FromClauseBuilder) AddIdentifier(frameID pgsql.Identifier) {
Note: Also update the call site in AddBinding (line 702) to use AddIdentifier.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cypher/models/pgsql/translate/model.go` around lines 683 - 697, Rename the
method AddIdentifer to AddIdentifier to fix the typo: update the function
declaration on the FromClauseBuilder receiver and all call sites (notably the
call from AddBinding) to use the new AddIdentifier name; ensure the identifier's
visibility/signature remains the same (receiver type, parameter
pgsql.Identifier) so no other behavior changes are introduced and run
tests/compile to catch any missed references.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: c66defab-d361-4923-9f44-e766b8bf889d

📥 Commits

Reviewing files that changed from the base of the PR and between f6c737e and 9bdbac2.

📒 Files selected for processing (10)

cypher/models/pgsql/format/format.go
cypher/models/pgsql/model.go
cypher/models/pgsql/test/translation_cases/create.sql
cypher/models/pgsql/translate/create.go
cypher/models/pgsql/translate/model.go
cypher/models/pgsql/translate/node.go
cypher/models/pgsql/translate/projection.go
cypher/models/pgsql/translate/relationship.go
cypher/models/pgsql/translate/translator.go
integration/testdata/cases/create_inline.json

✅ Files skipped from review due to trivial changes (1)

integration/testdata/cases/create_inline.json

🚧 Files skipped from review as they are similar to previous changes (1)

cypher/models/pgsql/translate/projection.go

zinic · 2026-04-16T14:46:47Z

@CodeRabbit yo, blast me and review this again

coderabbitai · 2026-04-16T14:47:01Z

@zinic: Sure thing! Kicking off a full review of this PR now. 🚀

✅ Actions performed

Full review triggered.

coderabbitai bot reviewed Apr 16, 2026

Comment thread cypher/models/pgsql/test/translation_cases/unwind.sql Outdated

Comment thread cypher/models/pgsql/translate/projection.go Outdated

Comment thread cypher/models/pgsql/translate/unwind.go

Comment thread drivers/pg/query/sql/schema_up.sql Outdated

coderabbitai bot reviewed Apr 16, 2026

zinic changed the title from "feat(cysql): support UNWIND" to "feat(cysql): support CREATE and UNWIND" on Apr 16, 2026

zinic added 5 commits April 16, 2026 07:21

feat(cysql): add Cypher UNWIND clause translation support

8180ff9

feat(cysql): support cypher create

75cbcde

fix (cysql): avoid materializing synthetic frames in FROM statements

b143c47

fix (cysql): materialize unwind before downstream clauses

95c17c6

fix (cysql): rectify distinc propagation

be84f36

zinic force-pushed the unwind branch from 5fd8159 to 264416b Compare April 16, 2026 14:36

fix (cysql): pull through graph_id down to CREATE statements

5fdae0a

zinic force-pushed the unwind branch from 264416b to 5fdae0a Compare April 16, 2026 14:44

zinic wants to merge 6 commits into SpecterOps:main from zinic:unwind
