perf: correctly try execute parent in the iterative child execute loop #7386
joseph-isaacs merged 11 commits into develop
I believe this is a regression. Filter(Slice(Ree)) is a fairly common pattern, and the filter would eagerly canonicalize its child, preventing execute_parent kernels such as RunEnd's FilterKernel from firing. The same issue affects dict encoding. This commit executes the child one step at a time so that execute_parent kernels get a chance to match.
Signed-off-by: Alfonso Subiotto Marques <alfonso.subiotto@polarsignals.com>
Benchmarks: PolarSignals Profiling
Vortex (geomean): 1.029x ➖
datafusion / vortex-file-compressed (1.029x ➖, 0↑ 1↓)
File Sizes: PolarSignals Profiling
No file size changes detected.
Benchmarks: TPC-H SF=1 on NVME
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (0.997x ➖, 0↑ 0↓)
datafusion / vortex-compact (1.002x ➖, 0↑ 0↓)
datafusion / parquet (0.980x ➖, 1↑ 1↓)
datafusion / arrow (1.014x ➖, 0↑ 1↓)
duckdb / vortex-file-compressed (1.003x ➖, 0↑ 0↓)
duckdb / vortex-compact (1.001x ➖, 0↑ 0↓)
duckdb / parquet (0.999x ➖, 1↑ 1↓)
duckdb / duckdb (1.000x ➖, 0↑ 0↓)
File Sizes: TPC-H SF=1 on NVME
No file size changes detected.
Benchmarks: FineWeb NVMe
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (1.080x ➖, 0↑ 3↓)
datafusion / vortex-compact (1.043x ➖, 0↑ 1↓)
datafusion / parquet (1.068x ➖, 0↑ 1↓)
duckdb / vortex-file-compressed (1.028x ➖, 1↑ 3↓)
duckdb / vortex-compact (1.064x ➖, 0↑ 2↓)
duckdb / parquet (1.080x ➖, 0↑ 2↓)
File Sizes: FineWeb NVMe
No file size changes detected.
Benchmarks: TPC-DS SF=1 on NVME
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (1.024x ➖, 0↑ 2↓)
datafusion / vortex-compact (1.024x ➖, 0↑ 1↓)
datafusion / parquet (1.024x ➖, 0↑ 2↓)
duckdb / vortex-file-compressed (1.019x ➖, 1↑ 3↓)
duckdb / vortex-compact (1.022x ➖, 1↑ 4↓)
duckdb / parquet (1.013x ➖, 0↑ 1↓)
duckdb / duckdb (1.015x ➖, 0↑ 5↓)
File Sizes: TPC-DS SF=1 on NVME
No file size changes detected.
Benchmarks: TPC-H SF=10 on NVME
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (0.993x ➖, 0↑ 0↓)
datafusion / vortex-compact (0.995x ➖, 0↑ 0↓)
datafusion / parquet (0.993x ➖, 0↑ 0↓)
datafusion / arrow (0.990x ➖, 0↑ 0↓)
duckdb / vortex-file-compressed (0.995x ➖, 0↑ 0↓)
duckdb / vortex-compact (0.997x ➖, 0↑ 0↓)
duckdb / parquet (0.985x ➖, 0↑ 0↓)
duckdb / duckdb (1.002x ➖, 0↑ 0↓)
File Sizes: TPC-H SF=10 on NVME
No file size changes detected.
Benchmarks: TPC-H SF=1 on S3
Verdict: No clear signal (environment too noisy confidence)
datafusion / vortex-file-compressed (0.910x ➖, 2↑ 1↓)
datafusion / vortex-compact (1.010x ➖, 0↑ 0↓)
datafusion / parquet (0.902x ➖, 2↑ 1↓)
duckdb / vortex-file-compressed (0.993x ➖, 0↑ 0↓)
duckdb / vortex-compact (0.970x ➖, 0↑ 0↓)
duckdb / parquet (1.007x ➖, 0↑ 0↓)
Benchmarks: FineWeb S3
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (0.883x ➖, 1↑ 0↓)
datafusion / vortex-compact (0.988x ➖, 0↑ 0↓)
datafusion / parquet (0.957x ➖, 0↑ 0↓)
duckdb / vortex-file-compressed (1.043x ➖, 0↑ 0↓)
duckdb / vortex-compact (1.113x ➖, 0↑ 2↓)
duckdb / parquet (0.990x ➖, 0↑ 0↓)
Benchmarks: Random Access
Vortex (geomean): 0.859x ✅
unknown / unknown (0.945x ➖, 8↑ 0↓)
Benchmarks: Statistical and Population Genetics
Verdict: No clear signal (low confidence)
duckdb / vortex-file-compressed (0.976x ➖, 1↑ 0↓)
duckdb / vortex-compact (0.954x ➖, 1↑ 0↓)
duckdb / parquet (0.994x ➖, 0↑ 0↓)
File Sizes: Statistical and Population Genetics
No file size changes detected.
Benchmarks: TPC-H SF=10 on S3
Verdict: No clear signal (environment too noisy confidence)
datafusion / vortex-file-compressed (0.886x ➖, 2↑ 0↓)
datafusion / vortex-compact (0.979x ➖, 0↑ 0↓)
datafusion / parquet (0.984x ➖, 0↑ 0↓)
duckdb / vortex-file-compressed (0.996x ➖, 0↑ 0↓)
duckdb / vortex-compact (1.018x ➖, 0↑ 0↓)
duckdb / parquet (0.956x ➖, 0↑ 0↓)
Benchmarks: Clickbench on NVME
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (0.998x ➖, 1↑ 1↓)
datafusion / parquet (0.996x ➖, 0↑ 0↓)
duckdb / vortex-file-compressed (0.999x ➖, 6↑ 6↓)
duckdb / parquet (1.000x ➖, 2↑ 0↓)
duckdb / duckdb (1.029x ➖, 0↑ 4↓)
File Sizes: Clickbench on NVME
File Size Changes (1 files changed, -0.0% overall, 0↑ 1↓)
BENCHMARK FAILED
Signed-off-by: Joe Isaacs <joe.isaacs@live.co.uk>
Merging this PR will improve performance by 19.72%
Signed-off-by: Joe Isaacs <joe.isaacs@live.co.uk>
# Conflicts:
# vortex-array/src/executor.rs
5a1b340 to bb09b8c
We use iterative execution to execute (decompress) arrays.
This PR adds an execute_parent call when executing a child iteratively.
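To make the idea concrete, here is a minimal toy sketch of the loop described in this PR. It is not Vortex code: the `Array` enum and the `filter_parent_kernel`, `step`, `execute_filter`, and `demo` functions are hypothetical names invented for illustration, and the real executor in vortex-array/src/executor.rs is likely structured differently. The sketch assumes a filter over Slice(RunEnd): rather than decoding the child all the way to a primitive array, the loop executes the child one step at a time and retries the parent kernel after each step.

```rust
// Toy model only: these types and functions are hypothetical, not Vortex's API.
#[derive(Debug, Clone, PartialEq)]
enum Array {
    Primitive(Vec<i64>),
    RunEnd { ends: Vec<usize>, values: Vec<i64> },
    Slice { child: Box<Array>, start: usize, end: usize },
}

/// Stand-in for an execute_parent kernel: filters a run-end array
/// without decoding it. Returns None when the child is not run-end encoded.
fn filter_parent_kernel(child: &Array, mask: &[bool]) -> Option<Array> {
    let Array::RunEnd { ends, values } = child else { return None };
    let (mut new_ends, mut new_values) = (Vec::new(), Vec::new());
    let (mut start, mut kept) = (0, 0);
    for (&end, &value) in ends.iter().zip(values) {
        // Count the surviving rows inside this run.
        let n = mask[start..end].iter().filter(|&&keep| keep).count();
        if n > 0 {
            kept += n;
            new_ends.push(kept);
            new_values.push(value);
        }
        start = end;
    }
    Some(Array::RunEnd { ends: new_ends, values: new_values })
}

/// Executes an array by a single step toward its canonical (primitive) form.
fn step(array: Array) -> Array {
    match array {
        Array::Primitive(values) => Array::Primitive(values),
        Array::RunEnd { ends, values } => {
            // Decode the runs into a flat vector.
            let mut out = Vec::new();
            let mut start = 0;
            for (end, value) in ends.into_iter().zip(values) {
                out.extend(std::iter::repeat(value).take(end - start));
                start = end;
            }
            Array::Primitive(out)
        }
        Array::Slice { child, start, end } => match *child {
            Array::Primitive(v) => Array::Primitive(v[start..end].to_vec()),
            Array::RunEnd { ends, values } => {
                // Push the slice into the run ends; the result stays run-end encoded.
                let (mut new_ends, mut new_values) = (Vec::new(), Vec::new());
                let mut run_start = 0;
                for (run_end, value) in ends.into_iter().zip(values) {
                    if run_end > start && run_start < end {
                        new_ends.push(run_end.min(end) - start);
                        new_values.push(value);
                    }
                    run_start = run_end;
                }
                Array::RunEnd { ends: new_ends, values: new_values }
            }
            other => Array::Slice { child: Box::new(step(other)), start, end },
        },
    }
}

/// Iterative filter execution: try the parent kernel before every child step.
/// The returned flag reports whether the parent kernel fired.
fn execute_filter(mut child: Array, mask: &[bool]) -> (Array, bool) {
    loop {
        if let Some(result) = filter_parent_kernel(&child, mask) {
            return (result, true);
        }
        if let Array::Primitive(values) = &child {
            // Canonical fallback: filter the decoded values directly.
            let kept = values.iter().zip(mask).filter(|(_, keep)| **keep).map(|(v, _)| *v).collect();
            return (Array::Primitive(kept), false);
        }
        child = step(child);
    }
}

/// Filter(Slice(RunEnd)) from the PR description, on made-up data.
fn demo() -> (Array, bool) {
    let ree = Array::RunEnd { ends: vec![2, 5, 6], values: vec![10, 20, 30] };
    let slice = Array::Slice { child: Box::new(ree), start: 1, end: 5 };
    execute_filter(slice, &[true, false, true, true])
}
```

In this toy, `demo()` first fails to match the kernel on the Slice, steps the child once into a run-end array, and then the kernel fires, so the result stays run-end encoded. An executor that canonicalized the child eagerly would instead reach the primitive fallback branch and never try the kernel on the intermediate form.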