perf(load): short-circuit is_chunked_array for numpy arrays by FBumann · Pull Request #11354 · pydata/xarray

FBumann · 2026-05-25T16:51:49Z

Description

Dataset.load() / DataArray.load() walk every variable and call is_chunked_array(...) on each — first inside the dict comprehension in Dataset.load (xarray/core/dataset.py:563), then again per variable via Variable.load -> to_duck_array. is_chunked_array itself called is_duck_array twice (once directly, once via is_duck_dask_array).

For the dominant case — a numpy.ndarray — none of that work is needed. This PR:

- Adds an np.ndarray fast path returning False immediately.
- Consolidates to a single is_duck_array(x) check feeding both the dask-collection branch (now via is_dask_collection directly) and the hasattr(x, "chunks") fallback.

Behavior is unchanged on non-numpy inputs: any duck-array with a dask graph or a chunks attribute is still reported as chunked.
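
The two changes described above can be sketched roughly as follows. This is a minimal illustration, not the code in the PR: `is_chunked_array_sketch` and `FakeChunked` are hypothetical names, the `is_duck_array` body mirrors the snippet quoted further down in this thread, and the dask import is wrapped so the example also runs without dask installed (an assumption made for self-containment).

```python
from typing import Any

import numpy as np


def is_duck_array(value: Any) -> bool:
    # Mirrors the attribute-based check quoted later in this thread.
    if isinstance(value, np.ndarray):
        return True
    return (
        hasattr(value, "ndim")
        and hasattr(value, "shape")
        and hasattr(value, "dtype")
        and (
            (hasattr(value, "__array_function__") and hasattr(value, "__array_ufunc__"))
            or hasattr(value, "__array_namespace__")
        )
    )


def is_dask_collection(x: Any) -> bool:
    # Assumption: dask is optional here; without it nothing is a dask collection.
    try:
        from dask.base import is_dask_collection as _is_dask_collection
    except ImportError:
        return False
    return bool(_is_dask_collection(x))


def is_chunked_array_sketch(x: Any) -> bool:
    # Fast path as described in the PR text: numpy arrays are not chunked.
    if isinstance(x, np.ndarray):
        return False
    # A single duck-array check feeds both remaining branches.
    if not is_duck_array(x):
        return False
    return is_dask_collection(x) or hasattr(x, "chunks")


class FakeChunked:
    """Hypothetical duck array that advertises a ``chunks`` attribute."""

    ndim = 1
    shape = (4,)
    dtype = np.dtype("float64")
    chunks = ((2, 2),)

    def __array_namespace__(self, api_version=None):
        return np
```

With these definitions, a plain `np.zeros(3)` takes the fast path, a `FakeChunked()` instance is still reported as chunked through the `chunks` fallback, and a non-array such as a list is rejected by the single `is_duck_array` check.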

Where this fires

is_chunked_array is called from ~25 places across xarray — every site that branches "dask vs. numpy". On numpy-backed data, all of them now skip the dispatch chain. Notable categories:

- Materialization: ds.load(), da.load(), compute(), xr.load_dataset/dataarray/datatree, persist(), .values, .to_numpy(), .to_dataframe(), .to_pandas(), plotting, repr previews.
- CF encode/decode (every variable read or written): coding/times.py:1037,1249, coding/strings.py:182,221, coding/common.py:104.
- Numerical paths: apply_ufunc (computation/apply_ufunc.py:739,763,875), corr/cov/polyval/polyfit, interp, interpolate_na, reductions.
- Structural ops: Dataset.chunk(), interp, vectorized indexer dispatch (isel/sel), Variable._shuffle, contains_only_chunked_or_numpy.
- Groupby internals: groupers.py:121,242,427, groupby.py:351,557,679.
- Backends: backends/common.py:400 (ArrayWriter.add).

Not affected: arithmetic on lazy/dask objects (stays lazy); arithmetic on numpy-backed objects (already pure numpy, never reached is_chunked_array).

Benchmark numbers

asv_bench/benchmarks/indexing.py::Indexing.time_indexing_basic_ds_large (added in #9003 for this exact concern), best of 5×50 iterations, GC off:

| | per call |
|---|---|
| `main` | 0.524 ms |
| this PR | 0.335 ms |
| speedup | ~1.56× |
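
For readers who want to reproduce a "best of 5×50, GC off" style measurement, here is a minimal sketch using the standard library. It assumes that "5×50" means 5 repeats of 50 calls each; `best_ms_per_call` is a hypothetical helper, not part of asv or xarray. `timeit` disables garbage collection during timing by default, which matches the "GC off" condition.

```python
import timeit


def best_ms_per_call(fn, repeat=5, number=50):
    """Return the best per-call time in milliseconds over `repeat` runs."""
    # Each entry in `totals` is the wall time for `number` calls;
    # timeit turns the garbage collector off while timing.
    totals = timeit.repeat(fn, repeat=repeat, number=number)
    return min(totals) / number * 1e3


# Arithmetic check of the reported figures (values taken from the table above).
base_ms = 0.524
branch_ms = 0.335
speedup = base_ms / branch_ms
```

Passing a zero-argument callable such as `lambda: ds.isel(x=0).load()` (with `ds` being whatever dataset is under test) gives a number comparable in spirit to the "per call" column. The ratio of the two reported times rounds to 1.56.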

Scaling check on synthetic isel().load() workloads:

- Speedup scales with variable count (1.20× at 50 vars → 1.29× at 2000 vars), as expected for a per-variable saving.
- Speedup is flat across per-var size (~1.24× from size=0 to size=10,000), confirming the saving is pure dispatch overhead, not work-per-element.

Profiler attribution previously showed ~25% of load() wall time inside the is_chunked_array dispatch chain; that portion is now near-free.

Checklist

- Tests covering chunked paths preserved: any duck-array with a chunks attribute or a dask graph is still reported as chunked
- pytest xarray/tests/test_variable.py: 544 passed, 69 skipped, 8 xfailed, 3 xpassed
- doc/whats-new.rst entry under Internal Changes

Follow-up

A complementary PR — skipping the entire Variable.load dispatch on numpy data — will follow.

AI Disclosure

This PR contains AI-generated content.
- I have tested any AI-generated content in my PR.
- I take responsibility for any AI-generated content in my PR.

Tools: Claude (Claude Code)

[This is Claude Code on behalf of Felix Bumann]

For datasets with many variables, Dataset.load() called is_chunked_array once per variable in its dict comprehension, then again per variable via Variable.load() -> to_duck_array(). The function itself called is_duck_array twice (once directly, once via is_duck_dask_array).

Add a numpy fast-path and consolidate the duck-array check to one call. For non-numpy inputs the behavior is unchanged: any duck-array with a dask graph or a `chunks` attribute is still reported as chunked.

Measured on isel(...).load() of a 400-scalar-var dataset (asv_bench/benchmarks/indexing.py::Indexing.time_indexing_basic_ds_large):

base: 0.524 ms / call (best of 5x50, GC off)
branch: 0.335 ms / call ~1.56x

Profile attribution previously showed ~25% of the load wall time inside the is_chunked_array dispatch chain; that portion is now near-free.

Closes #2 on the fork.

Co-authored-by: Claude <noreply@anthropic.com>

for more information, see https://pre-commit.ci

The previous `isinstance(x, np.ndarray)` short-circuit incorrectly returned False for ndarray subclasses with a `chunks` attribute (e.g. DummyChunkedArray in test_parallelcompat.py, or any third-party chunked array implementation that subclasses ndarray), breaking chunked-array detection on those types.

Narrow the fast path to `isinstance + not hasattr("chunks")` so plain ndarrays and non-chunked subclasses (MaskedArray, np.matrix) still skip the duck-array dispatch, while subclasses that advertise chunks fall through to the full check.

Co-authored-by: Claude <noreply@anthropic.com>
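
The difference between the two fast-path conditions can be illustrated with a small sketch. `ChunkedSubclass` is a hypothetical stand-in for an ndarray subclass that advertises `chunks` (it is not the DummyChunkedArray from the test suite), and the two predicate functions are illustrative names only.

```python
import numpy as np


class ChunkedSubclass(np.ndarray):
    """Hypothetical ndarray subclass that advertises ``chunks``."""

    chunks = ((3,),)


def naive_fast_path(x) -> bool:
    # Original condition: any ndarray (including subclasses) short-circuits.
    return isinstance(x, np.ndarray)


def narrowed_fast_path(x) -> bool:
    # Narrowed condition: only ndarrays without a ``chunks`` attribute short-circuit.
    return isinstance(x, np.ndarray) and not hasattr(x, "chunks")


plain = np.arange(3)
masked = np.ma.masked_array([1, 2, 3], mask=[False, True, False])
chunked = np.arange(3).view(ChunkedSubclass)
```

Under the naive condition `chunked` would take the fast path and be reported as not chunked. Under the narrowed condition it falls through to the full check, while `plain` and `masked` still short-circuit.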

dcherian

Great change. Thanks!

Illviljan

I think this can be done with is_duck_array, it has the numpy short-circuit already.
Triggering isinstance twice for eager (and lazy) arrays seems wasteful too.

xarray/xarray/namedarray/utils.py

Lines 78 to 91 in d022da5

    
    def is_duck_array(value: Any) -> TypeGuard[duckarray[Any, Any]]:
        # TODO: replace is_duck_array with runtime checks via _arrayfunction_or_api protocol on
        # python 3.12 and higher (see https://github.com/pydata/xarray/issues/8696#issuecomment-1924588981)
        if isinstance(value, np.ndarray):
            return True
        return (
            hasattr(value, "ndim")
            and hasattr(value, "shape")
            and hasattr(value, "dtype")
            and (
                (hasattr(value, "__array_function__") and hasattr(value, "__array_ufunc__"))
                or hasattr(value, "__array_namespace__")
            )
        )
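
One possible reading of this suggestion, sketched under assumptions: drop the separate `isinstance` fast path and rely on the short-circuit that `is_duck_array` already has, so each input passes through exactly one `isinstance` check. The function name `is_chunked_array_single_check` is hypothetical, the dask import is guarded for self-containment, and this is not necessarily what the reviewer had in mind.

```python
from typing import Any

import numpy as np


def is_duck_array(value: Any) -> bool:
    # Same logic as the snippet quoted above, without the typing helpers.
    if isinstance(value, np.ndarray):
        return True
    return (
        hasattr(value, "ndim")
        and hasattr(value, "shape")
        and hasattr(value, "dtype")
        and (
            (hasattr(value, "__array_function__") and hasattr(value, "__array_ufunc__"))
            or hasattr(value, "__array_namespace__")
        )
    )


def _is_dask_collection(x: Any) -> bool:
    # Assumption: treat dask as optional so the sketch runs without it.
    try:
        from dask.base import is_dask_collection
    except ImportError:
        return False
    return bool(is_dask_collection(x))


def is_chunked_array_single_check(x: Any) -> bool:
    # The only isinstance call happens inside is_duck_array.
    if not is_duck_array(x):
        return False
    # Cheap attribute lookup first; the dask check runs only if it fails.
    return hasattr(x, "chunks") or _is_dask_collection(x)


class ChunkedSubclass(np.ndarray):
    """Hypothetical ndarray subclass that advertises ``chunks``."""

    chunks = ((3,),)
```

In this form a plain ndarray costs one `isinstance`, one `hasattr`, and one dask-collection check, and ndarray subclasses that advertise `chunks` are handled without a special case. Whether that is faster than an explicit fast path would need measuring.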

FBumann and others added 2 commits May 23, 2026 19:21

[pre-commit.ci] auto fixes from pre-commit.com hooks

d008f82

for more information, see https://pre-commit.ci

github-actions Bot added the topic-NamedArray Lightweight version of Variable label May 25, 2026

FBumann mentioned this pull request May 25, 2026

perf(load): skip Variable.load dispatch for numpy data #11355

Open

6 tasks

dcherian approved these changes May 26, 2026

Illviljan requested changes May 26, 2026

FBumann wants to merge 3 commits into pydata:main from FBumann:perf/load-chunked-check-overhead

3 participants
