Skip to content

Aggregating bounded min/max values has to not ignore null values#8169

Open
robert3005 wants to merge 1 commit into
developfrom
rk/fuzzer
Open

Aggregating bounded min/max values has to not ignore null values#8169
robert3005 wants to merge 1 commit into
developfrom
rk/fuzzer

Conversation

@robert3005
Copy link
Copy Markdown
Contributor

@robert3005 robert3005 commented May 31, 2026

Propagating bounded min/max is not the same as propagating nulls. In the case of
bounded min/max we have to conclude that null value could have come from trying
to produce the bounded value which means that there's no valid bounded value while there's still valid values in the array

handle #8166

Signed-off-by: Robert Kruszewski <github@robertk.io>
@robert3005 robert3005 added the changelog/fix A bug fix label May 31, 2026
@robert3005 robert3005 requested review from gatesn and onursatici May 31, 2026 23:28
Copy link
Copy Markdown
Contributor

@gatesn gatesn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like a very odd behaviour for truncated min/max to me, just leaving a request changes so I can read thoroughly on Monday.

@codspeed-hq
Copy link
Copy Markdown

codspeed-hq Bot commented May 31, 2026

Merging this PR will degrade performance by 15.76%

⚠️ Unknown Walltime execution environment detected

Using the Walltime instrument on standard Hosted Runners will lead to inconsistent data.

For the most accurate results, we recommend using CodSpeed Macro Runners: bare-metal machines fine-tuned for performance measurement consistency.

❌ 2 regressed benchmarks
✅ 1273 untouched benchmarks

Warning

Please fix the performance issues or acknowledge them on CodSpeed.

Performance Changes

Mode Benchmark BASE HEAD Efficiency
Simulation chunked_varbinview_opt_canonical_into[(1000, 10)] 187.7 µs 225 µs -16.56%
WallTime cuda/bitpacked_u8/unpack/3bw[100M] 299.6 µs 352.3 µs -14.94%

Tip

Investigate this regression by commenting @codspeedbot fix this regression on this PR, or directly use the CodSpeed MCP with your agent.


Comparing rk/fuzzer (944755b) with develop (57e1784)

Open in CodSpeed

@robert3005
Copy link
Copy Markdown
Contributor Author

This comes from the fact that bounded min/max is not transitive if you exclude nulls. The problem we run into is when producing a bounded max returns null because the bound exceeds the limits we have defined which then is not included in the min/max of the file which then can lead to skipped values.

@robert3005
Copy link
Copy Markdown
Contributor Author

robert3005 commented Jun 1, 2026

talked to @joseph-isaacs. There's a simpler fix, when we overflow on max we store truncated_len + 1 field which would be bigger than any of the existing values this doesn't work, we need to increment

@joseph-isaacs
Copy link
Copy Markdown
Contributor

what is the counterexample?

@robert3005
Copy link
Copy Markdown
Contributor Author

take upper_bound([FF, FF, FF, 0], 2) then [FF, FF, FF] is not >= [FF, FF, FF, 0]

@joseph-isaacs
Copy link
Copy Markdown
Contributor

But 0xFF > 0xF?

@robert3005
Copy link
Copy Markdown
Contributor Author

yes but I don't follow

@joseph-isaacs
Copy link
Copy Markdown
Contributor

joseph-isaacs commented Jun 1, 2026

Okay we need to store a flag to specify MAX_VALUE, such that for x. x <= MAX_VALUE

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

changelog/fix A bug fix

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants