proximity/allocation/direction: consistent output dtype and .name across backends#2728
Open
brendancol wants to merge 3 commits into
Open
proximity/allocation/direction: consistent output dtype and .name across backends#2728brendancol wants to merge 3 commits into
brendancol wants to merge 3 commits into
Conversation
) The bounded dask+numpy path declared float64 output meta while the chunk function returns float32, disagreeing with numpy, cupy, dask+cupy, and the unbounded KDTree path. Set the map_overlap meta dtype to float32. Dask backends also leaked an internal dask op name (e.g. _trim-<hash>) into the result .name, where numpy/cupy return None. Reset .name to None in proximity/allocation/direction so all four backends agree. Add a parametrized regression test asserting declared dtype float32 and .name None for all three functions, both bounded and unbounded, on every backend.
brendancol
commented
May 29, 2026
Contributor
Author
brendancol
left a comment
There was a problem hiding this comment.
PR Review: consistent output dtype and .name across backends
Blockers (must fix before merge)
None.
Suggestions (should fix, not blocking)
None.
Nits (optional improvements)
- test_proximity.py:299 - the new test asserts
result.attrs == test_raster.attrs, but thetest_rasterfixture has empty attrs, so that line only checks empty == empty. It still guards against attrs being dropped, and the existing checks already cover attr preservation with the same fixture. Fine as-is; noting it only.
What looks good
- The dtype fix hits the exact root cause. The bounded dask+numpy
map_overlapdeclared float64 meta while the chunk function returns float32.meta=np.array((), dtype=np.float32)lines the declared dtype up with the real data and with the other three backends. - The
.name = Nonereset is the right call. xarray ignores aname=Nonekwarg when the data is a named dask array, so resetting after construction is the only thing that clears the leaked dask op name. The inline comment says why. - Changes are surgical: three identical two-line resets plus one meta dtype, nothing else touched.
- The regression test covers the bounded path (max_distance=10) and the unbounded/KDTree path (np.inf), across all four backends and all three functions. It checks the declared (pre-compute) dtype rather than the post-compute dtype, which is what general_output_checks already covers and what let this bug through.
Checklist
- Output dtype consistent across backends (float32)
- .name consistent across backends (None)
- attrs / coords / dims preserved (unchanged by this PR)
- Bounded and unbounded dask paths both covered by tests
- No premature materialization or unnecessary copies
- No benchmark needed (metadata-only fix)
- No README/docstring change needed (no API change)
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Closes #2723
proximity,allocation, anddirectionreturned backend-dependent output metadata. This makes all four backends agree.float64output meta viada.map_overlap(..., meta=np.array(())), even though the chunk function returnsfloat32and every other backend/path returnsfloat32. Set the meta dtype tofloat32..name: dask backends leaked an internal dask op name (_trim-<hash>,_kdtree_chunk_fn-<hash>,asarray-<hash>) into the result.name, while numpy/cupy returnNone. Reset.nametoNoneafter building the result so all backends match. (xarray ignores aname=Nonekwarg when the data is a named dask array, so the reset has to happen after construction.)attrs, coords, and dims were already preserved and stay that way.
Verified end-to-end on all four backends, both bounded and unbounded
max_distance, for all three functions.Test plan:
float32and.name is Noneacross all four backends, both bounded and unbounded, for proximity/allocation/direction.test_proximity.pysuite passes (93 tests).