Aggregating bounded min/max values has to not ignore null values #8169
robert3005 wants to merge 1 commit into
Signed-off-by: Robert Kruszewski <github@robertk.io>
gatesn
left a comment
This seems like very odd behaviour for truncated min/max to me. I'm just leaving a "request changes" so I can read this thoroughly on Monday.
Merging this PR will degrade performance by 15.76%
| | Mode | Benchmark | BASE | HEAD | Efficiency |
|---|---|---|---|---|---|
| ❌ | Simulation | chunked_varbinview_opt_canonical_into[(1000, 10)] | 187.7 µs | 225 µs | -16.56% |
| ❌ | WallTime | cuda/bitpacked_u8/unpack/3bw[100M] | 299.6 µs | 352.3 µs | -14.94% |
Comparing rk/fuzzer (944755b) with develop (57e1784)
This comes from the fact that bounded min/max is not transitive if you exclude nulls. The problem we run into is this: producing a bounded max can return null because the bound exceeds the limits we have defined. That null is then not included in the min/max of the file, which can lead to skipped values.
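For illustration, here is a minimal sketch of how a bounded (truncated) max can come back as null even though the array holds valid values. It assumes one common truncation scheme for byte strings (keep a prefix, then increment its last byte that is not `0xFF`); `truncated_upper_bound` and `max_len` are hypothetical names, and this is not necessarily how the codebase computes its bounds.

```python
from typing import Optional


def truncated_upper_bound(value: bytes, max_len: int) -> Optional[bytes]:
    """Return a byte string b with len(b) <= max_len and b >= value.

    Returns None when no such b exists. Hypothetical helper, assuming
    plain lexicographic byte ordering.
    """
    if len(value) <= max_len:
        # Already short enough: the value is its own upper bound.
        return value

    prefix = bytearray(value[:max_len])
    # Trailing 0xFF bytes cannot be incremented, so drop them.
    while prefix and prefix[-1] == 0xFF:
        prefix.pop()
    if not prefix:
        # Every kept byte was 0xFF: nothing of length <= max_len
        # sorts at or above the original value.
        return None

    # Bump the last remaining byte so the prefix sorts above the value.
    prefix[-1] += 1
    return bytes(prefix)


ok = truncated_upper_bound(b"abcdef", 3)               # a valid 3-byte bound
no_bound = truncated_upper_bound(b"\xff\xff\xff\xff", 2)  # None, yet the value is valid
```

The second call is the interesting case: the `None` does not mean "no values", it means "no representable bound".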
talked to @joseph-isaacs.
what is the counterexample?
take
But
yes but I don't follow
Okay we need to store a flag to specify
Propagating bounded min/max is not the same as propagating nulls. In the case of
bounded min/max, we have to assume that a null value could have come from a failed
attempt to produce the bounded value. That means there is no valid bounded value
even though the array still contains valid values.
handle #8166
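
The description above can be sketched as follows. This is a simplified model, not the code in this PR: per-chunk bounded maxes are plain `Optional[bytes]`, `None` is treated conservatively as "no bound could be produced", and the function names (`merge_max_ignoring_nulls`, `merge_max_keeping_nulls`, `can_skip_ge`) are hypothetical.

```python
from typing import Iterable, Optional


def merge_max_ignoring_nulls(chunk_maxes: Iterable[Optional[bytes]]) -> Optional[bytes]:
    """Aggregate by dropping None entries, as ordinary null handling would."""
    present = [m for m in chunk_maxes if m is not None]
    return max(present) if present else None


def merge_max_keeping_nulls(chunk_maxes: Iterable[Optional[bytes]]) -> Optional[bytes]:
    """Aggregate conservatively: one missing bound makes the result unknown."""
    result: Optional[bytes] = None
    for m in chunk_maxes:
        if m is None:
            # Assumption: None may mean "bound did not fit", so the
            # chunk can still hold arbitrarily large valid values.
            return None
        result = m if result is None else max(result, m)
    return result


def can_skip_ge(file_max: Optional[bytes], lower: bytes) -> bool:
    """Can a `value >= lower` filter skip the file? Only with a known max."""
    return file_max is not None and file_max < lower


# The second chunk holds b"\xff\xff\xff\xff", whose bounded max was None.
chunk_maxes = [b"abd", None]

wrong = merge_max_ignoring_nulls(chunk_maxes)  # b"abd", understates the real max
safe = merge_max_keeping_nulls(chunk_maxes)    # None, file max is unknown

skipped_wrongly = can_skip_ge(wrong, b"zz")    # True: the matching value is lost
skipped_safely = can_skip_ge(safe, b"zz")      # False: the file is still scanned
```

Treating every `None` as unknown is the coarse version of the fix. A flag that distinguishes "no values" from "no representable bound" would allow pruning to stay precise for all-null chunks.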