Rich t kid/dictionary encoding hash optmize#21589
Rich t kid/dictionary encoding hash optmize#21589Rich-T-kid wants to merge 7 commits intoapache:mainfrom
Conversation
|
Currenly the CI is breaking because my current emit implementations returns the values dirrectly instead of returning a dictionary encoded column. & im not handling nulls. |
8ea71fd to
cc18a8f
Compare
|
run benchmarks |
|
Hi @Rich-T-kid, thanks for the request (#21589 (comment)). Only whitelisted users can trigger benchmarks. Allowed users: Dandandan, Fokko, Jefffrey, Omega359, adriangb, alamb, asubiotto, brunal, buraksenn, cetra3, codephage2020, comphead, erenavsarogullari, etseidl, friendlymatthew, gabotechs, geoffreyclaude, grtlr, haohuaijin, jonathanc-n, kevinjqliu, klion26, kosiew, kumarUjjawal, kunalsinghdadhwal, liamzwbao, mbutrovich, mzabaluev, neilconway, rluvaton, sdf-jkl, timsaucer, xudong963, zhuqi-lucas. File an issue against this benchmark runner |
|
run benchmarks |
|
🤖 Benchmark running (GKE) | trigger CPU Details (lscpu)Comparing rich-t-kid/Dictionary-encoding-Hash-optmize (aa69892) to 37cd3de (merge-base) diff using: clickbench_partitioned File an issue against this benchmark runner |
|
🤖 Benchmark running (GKE) | trigger CPU Details (lscpu)Comparing rich-t-kid/Dictionary-encoding-Hash-optmize (aa69892) to 37cd3de (merge-base) diff using: tpch File an issue against this benchmark runner |
|
🤖 Benchmark running (GKE) | trigger CPU Details (lscpu)Comparing rich-t-kid/Dictionary-encoding-Hash-optmize (aa69892) to 37cd3de (merge-base) diff using: tpcds File an issue against this benchmark runner |
|
🤖 Benchmark completed (GKE) | trigger Instance: CPU Details (lscpu)Details
Resource Usagetpch — base (merge-base)
tpch — branch
File an issue against this benchmark runner |
|
🤖 Benchmark completed (GKE) | trigger Instance: CPU Details (lscpu)Details
Resource Usagetpcds — base (merge-base)
tpcds — branch
File an issue against this benchmark runner |
|
🤖 Benchmark completed (GKE) | trigger Instance: CPU Details (lscpu)Details
Resource Usageclickbench_partitioned — base (merge-base)
clickbench_partitioned — branch
File an issue against this benchmark runner |
|
Wrote some criterion benchmarks and they reflect what the query benchmarks show. I think storing Scalar values directly in the hash map is causing performance issues. Need to profile to see heap allocations. |
Which issue does this PR close?
Rationale for this change
What changes are included in this PR?
Are these changes tested?
Are there any user-facing changes?