Coalesce NDChunkCache chunk fills into one compute call#298

Open
yfukai wants to merge 7 commits into royerlab:main from yfukai:perf/chunk-coalescing

@yfukai yfukai commented May 31, 2026

Human summary

This is an optimization that may solve performance bottlenecks in real use. I'll investigate a bit more to see if this is appropriate.

Claude benchmark results

Aggregate the not-ready chunks intersecting a get() request into a single bounding-box slice and call compute_func once. For the GraphArrayView path this collapses N SQL queries (and N mask decompressions per mask) per cold frame into one.
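The coalescing step described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the helper name `not_ready_bounding_box`, the boolean `ready` grid, and the `chunk_shape` argument are all hypothetical, and the sketch assumes the array shape is an exact multiple of the chunk shape and that the request slices have explicit `start`/`stop` values.

```python
import itertools

import numpy as np


def not_ready_bounding_box(ready, request, chunk_shape):
    """Return slices (array coordinates) covering every not-ready chunk
    touched by `request`, or None if all touched chunks are ready.

    ready       -- boolean ndarray over the chunk grid (hypothetical layout)
    request     -- tuple of slices with explicit start/stop
    chunk_shape -- chunk size along each axis
    """
    # Range of chunk indices touched by the request on each axis.
    chunk_ranges = [
        range(s.start // c, (s.stop - 1) // c + 1)
        for s, c in zip(request, chunk_shape)
    ]
    # Collect the chunk indices that still need to be filled.
    missing = [idx for idx in itertools.product(*chunk_ranges) if not ready[idx]]
    if not missing:
        return None
    # Smallest axis-aligned box (in chunk units) containing all missing chunks.
    lo = np.min(missing, axis=0)
    hi = np.max(missing, axis=0) + 1
    # Convert chunk indices back to array coordinates.
    return tuple(
        slice(int(l * c), int(h * c)) for l, h, c in zip(lo, hi, chunk_shape)
    )
```

Note that in this sketch the box can also cover chunks that are already ready: with 15 of 16 chunks missing in a 4x4 grid, the bounding box is the whole frame. How the actual change treats such chunks is not shown here.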

Benchmark (benchmarks/perf_chunk_coalescing.py, 4x4 chunk grid per 512x512 frame, 50 frames x 50 nodes x 32x32 masks):

                                  before              after
  cold whole frame t=10           16 exec 122.4 ms    1 exec  13.4 ms
  cached whole frame t=10          0 exec   0.4 ms    0 exec   0.3 ms
  cold one chunk t=11              1 exec  10.0 ms    1 exec   9.1 ms
  rest of t=11                    15 exec 105.6 ms    1 exec  11.5 ms
  3 cold frames t=12..14          48 exec 349.0 ms    3 exec  34.8 ms

Public API unchanged: NDChunkCache.get/invalidate signatures unchanged; GraphArrayView._fill_array signature unchanged.

Copilot summary

This pull request introduces a new benchmark script and optimizes chunk fetching in the NDChunkCache. The benchmark script validates the effectiveness of chunk-coalescing by measuring performance metrics, while the code change in NDChunkCache ensures that multiple per-chunk computations are collapsed into a single bounding-box computation, improving efficiency.

Key changes:

Benchmarking and Validation:

  • Added a standalone script benchmarks/perf_chunk_coalescing.py to benchmark and validate the NDChunkCache chunk-coalescing fix. This script measures wall time, number of SQL executes, and number of blosc2 tensor unpacks in various scenarios, and includes stubs for dependencies to allow standalone execution.

Chunk-Coalescing Optimization:

  • Updated the get method in src/tracksdata/array/_nd_chunk_cache.py to coalesce multiple unready chunk computations into a single bounding-box computation. This reduces the number of calls to compute_func from one per chunk to one per contiguous region, and marks all affected chunks as ready after the computation.
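To make the call-count effect concrete, here is a toy comparison of per-chunk filling against a single coalesced fill. Everything in it is an assumption made for illustration: `ToyChunkCache`, the `coalesce` flag, and the `compute_func(buffer, region)` signature are hypothetical stand-ins and do not reproduce `NDChunkCache` or the benchmark script.

```python
import numpy as np


class ToyChunkCache:
    """Hypothetical stand-in for a chunked cache (not the NDChunkCache code)."""

    def __init__(self, shape, chunk_shape, compute_func, coalesce):
        self.buffer = np.zeros(shape, dtype=np.int64)
        self.chunk_shape = chunk_shape
        # One readiness flag per chunk; assumes shape divides evenly.
        self.ready = np.zeros(
            [s // c for s, c in zip(shape, chunk_shape)], dtype=bool
        )
        self.compute_func = compute_func  # assumed signature: (buffer, region)
        self.coalesce = coalesce

    def get(self, request):
        # Chunk-index window touched by the request.
        grid = tuple(
            slice(s.start // c, (s.stop - 1) // c + 1)
            for s, c in zip(request, self.chunk_shape)
        )
        # Absolute chunk indices that are not ready yet.
        missing = np.argwhere(~self.ready[grid]) + [g.start for g in grid]
        if len(missing):
            if self.coalesce:
                # One bounding box around all missing chunks.
                boxes = [(missing.min(axis=0), missing.max(axis=0) + 1)]
            else:
                # One box per missing chunk (the pre-change behaviour).
                boxes = [(idx, idx + 1) for idx in missing]
            for lo, hi in boxes:
                region = tuple(
                    slice(int(l * c), int(h * c))
                    for l, h, c in zip(lo, hi, self.chunk_shape)
                )
                self.compute_func(self.buffer, region)
                # Mark every chunk inside the box as ready.
                self.ready[
                    tuple(slice(int(l), int(h)) for l, h in zip(lo, hi))
                ] = True
        return self.buffer[request]


def count_calls(coalesce):
    """Return (calls on a cold frame, calls on the same frame once cached)."""
    calls = []

    def fill(buffer, region):
        calls.append(region)
        buffer[region] = 1

    cache = ToyChunkCache((512, 512), (128, 128), fill, coalesce)
    frame = (slice(0, 512), slice(0, 512))
    cache.get(frame)  # cold request
    cold = len(calls)
    cache.get(frame)  # everything is ready now
    return cold, len(calls) - cold
```

With a 4x4 chunk grid, `count_calls(False)` gives 16 fills for the cold frame and `count_calls(True)` gives 1, with no further fills on the cached request in either case. This matches the shape of the exec counts reported in the benchmark, although the toy counts Python calls rather than SQL executes.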

codecov-commenter commented May 31, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 87.82%. Comparing base (a4f7607) to head (4572391).
⚠️ Report is 1 commit behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #298      +/-   ##
==========================================
+ Coverage   87.75%   87.82%   +0.07%     
==========================================
  Files          57       57              
  Lines        4947     5003      +56     
  Branches      872      877       +5     
==========================================
+ Hits         4341     4394      +53     
- Misses        381      384       +3     
  Partials      225      225              

@yfukai yfukai left a comment

LGTM

@yfukai yfukai marked this pull request as ready for review June 7, 2026 22:18
yfukai commented Jun 8, 2026

The failing CI is Codecov-related (not a code issue).
