[CALCITE-7628] Interpreter gives wrong result for query with MINUS or INTERSECT with 3 or more inputs #5055
Open
julianhyde wants to merge 8 commits into
…with 3 or more inputs return wrong result Add test cases.
… inputs return wrong result. In the interpreter, a query where MINUS or INTERSECT has 3 or more inputs previously returned the wrong result, because SetOpNode evaluated only the first two inputs. It now evaluates all inputs. We add tests in a new file, `interpreter.iq`, which runs SQL queries using the interpreter.
SetOpNode evaluates EXCEPT ALL by counting occurrences in a Map<Row, Integer>: each row from the first input increments the count for its value, each row from a later input decrements it, and a value whose count reaches zero is removed from the map. After all inputs have been read, each value remaining in the map is emitted as many times as its surviving count. The later inputs are streamed rather than buffered. testInterpretMinusAll no longer asserts row order, which the map-based output does not preserve. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
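The counting scheme in the commit message above can be illustrated roughly as follows. This is a minimal sketch, not the actual `SetOpNode` code: the class `ExceptAllSketch`, the method `exceptAll`, and the use of plain `Iterable` inputs are all hypothetical, and it assumes rows have usable `equals`/`hashCode`.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only; names are hypothetical and not taken from Calcite. */
final class ExceptAllSketch {
  private ExceptAllSketch() {}

  /** Computes first EXCEPT ALL rest[0] EXCEPT ALL rest[1] ... */
  static <R> List<R> exceptAll(Iterable<R> first,
      List<? extends Iterable<R>> rest) {
    // Each row of the first input increments the count for its value.
    Map<R, Integer> counts = new HashMap<>();
    for (R row : first) {
      counts.merge(row, 1, Integer::sum);
    }
    // Later inputs are streamed: each row decrements the count, and a
    // value whose count reaches zero is removed (a null result from the
    // remapping function deletes the entry).
    for (Iterable<R> input : rest) {
      for (R row : input) {
        counts.computeIfPresent(row, (k, c) -> c == 1 ? null : c - 1);
      }
    }
    // Emit each surviving value as many times as its remaining count.
    // HashMap iteration order is unspecified, so output order is too.
    List<R> out = new ArrayList<>();
    counts.forEach((row, c) -> {
      for (int i = 0; i < c; i++) {
        out.add(row);
      }
    });
    return out;
  }
}
```

Because the output is driven by map iteration, a caller (or a test) should compare results as multisets rather than as ordered lists, which matches the change to testInterpretMinusAll noted above.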
Switch on setOp.kind to a method per operation (union, intersect, minus), with a shared helper that buffers the inputs for intersect and minus. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…terpreter Implement INTERSECT ALL in SetOpNode with a Map<Row, CountPair> that tracks, per value, the running minimum multiplicity and its count in the current input, reducing both in a single pass per intermediate input and emitting min(min, current) copies for values present in the last input. Use a mutable Count holder for the EXCEPT ALL occurrence counts. testBindableIntersect no longer asserts row order, which INTERSECT ALL does not define. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
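As a rough illustration of the min-multiplicity idea, here is a simplified sketch. It is an assumption-laden variant, not the code in this PR: it folds `current` into `min` with an extra sweep over the map after each input, whereas the commit describes doing the reduction in a single pass per input. The names `IntersectAllSketch`, `intersectAll`, and the fields of `CountPair` are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

/** Illustrative only; a simplified variant, not the Calcite code. */
final class IntersectAllSketch {
  private IntersectAllSketch() {}

  /** Hypothetical mutable holder for the two counts kept per value. */
  private static final class CountPair {
    int min;      // smallest multiplicity seen in the inputs read so far
    int current;  // multiplicity in the input currently being read
    CountPair(int min) {
      this.min = min;
    }
  }

  static <R> List<R> intersectAll(List<? extends Iterable<R>> inputs) {
    if (inputs.size() < 2) {
      throw new IllegalArgumentException("need at least two inputs");
    }
    // The first input seeds the running minimum for each value.
    Map<R, CountPair> counts = new HashMap<>();
    for (R row : inputs.get(0)) {
      CountPair p = counts.get(row);
      if (p == null) {
        counts.put(row, new CountPair(1));
      } else {
        p.min++;
      }
    }
    for (int i = 1; i < inputs.size(); i++) {
      // Count occurrences in this input, ignoring values already ruled out.
      for (R row : inputs.get(i)) {
        CountPair p = counts.get(row);
        if (p != null) {
          p.current++;
        }
      }
      // Fold the per-input count into the minimum; drop values at zero.
      Iterator<CountPair> it = counts.values().iterator();
      while (it.hasNext()) {
        CountPair p = it.next();
        p.min = Math.min(p.min, p.current);
        p.current = 0;
        if (p.min == 0) {
          it.remove();
        }
      }
    }
    // Emit each value min-many times; order is unspecified.
    List<R> out = new ArrayList<>();
    counts.forEach((row, p) -> {
      for (int k = 0; k < p.min; k++) {
        out.add(row);
      }
    });
    return out;
  }
}
```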
Avoid materializing every input into a List<Set<Row>>. EXCEPT DISTINCT streams each later input directly into Set.remove on the running result; INTERSECT DISTINCT processes one input at a time, streaming it into a new set of the rows it shares with the result so far. Peak memory is now bounded by the result set plus one input rather than the sum of all inputs. Replace readInputs() with a single-source read(Source) helper. Require at least two inputs in the constructor. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
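The two streaming strategies described above might look roughly like this. It is a sketch under assumptions, not the PR's code: `DistinctSetOpSketch` and its methods are hypothetical, inputs are modeled as plain `Iterable`s rather than interpreter sources, and at least two inputs are assumed, as the constructor now requires.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Illustrative only; names are hypothetical and not taken from Calcite. */
final class DistinctSetOpSketch {
  private DistinctSetOpSketch() {}

  /** EXCEPT DISTINCT: stream every later input into Set.remove. */
  static <R> Set<R> exceptDistinct(List<? extends Iterable<R>> inputs) {
    Set<R> result = new HashSet<>();
    for (R row : inputs.get(0)) {
      result.add(row);
    }
    for (int i = 1; i < inputs.size(); i++) {
      for (R row : inputs.get(i)) {
        result.remove(row); // the later input is never buffered
      }
    }
    return result;
  }

  /** INTERSECT DISTINCT: one input at a time, keeping only shared rows. */
  static <R> Set<R> intersectDistinct(List<? extends Iterable<R>> inputs) {
    Set<R> result = new HashSet<>();
    for (R row : inputs.get(0)) {
      result.add(row);
    }
    for (int i = 1; i < inputs.size(); i++) {
      // Build a new set of the rows this input shares with the result so
      // far; it can never be larger than the previous result.
      Set<R> shared = new HashSet<>();
      for (R row : inputs.get(i)) {
        if (result.contains(row)) {
          shared.add(row);
        }
      }
      result = shared;
    }
    return result;
  }
}
```

With three inputs `[1,2,3,4,5]`, `[2]`, `[4,6]`, `exceptDistinct` yields `{1, 3, 5}`; stopping after the second input would wrongly leave `4` in the result, which is the class of bug this PR addresses.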
…empty When the running result (for the distinct paths) or the count map (for the ALL paths) becomes empty, the answer can no longer change, so stop reading the remaining inputs. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
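The short-circuit can be sketched for the INTERSECT DISTINCT case as below. This is only an illustration: `EarlyExitSketch`, `intersectDistinct`, and the `inputsRead` out-parameter are hypothetical, the last one existing purely so the example can show how many inputs were consumed.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Illustrative only; names are hypothetical and not taken from Calcite. */
final class EarlyExitSketch {
  private EarlyExitSketch() {}

  /**
   * Intersects all inputs, stopping once the running result is empty.
   * inputsRead[0] receives the number of inputs that were actually read.
   */
  static <R> Set<R> intersectDistinct(List<? extends Iterable<R>> inputs,
      int[] inputsRead) {
    Set<R> result = new HashSet<>();
    for (R row : inputs.get(0)) {
      result.add(row);
    }
    inputsRead[0] = 1;
    for (int i = 1; i < inputs.size(); i++) {
      if (result.isEmpty()) {
        break; // nothing can be added back, so the answer is final
      }
      Set<R> shared = new HashSet<>();
      for (R row : inputs.get(i)) {
        if (result.contains(row)) {
          shared.add(row);
        }
      }
      result = shared;
      inputsRead[0]++;
    }
    return result;
  }
}
```

The same check applies to EXCEPT and to the count maps of the ALL variants, since none of these operations can add a row once the running state is empty.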
xiedeyantu approved these changes on Jun 29, 2026

See CALCITE-7628.