Skip to content

proximity: fix bounded GREAT_CIRCLE crash on dask backends#2722

Open
brendancol wants to merge 2 commits into
mainfrom
deep-sweep-accuracy-proximity-2026-05-29
Open

proximity: fix bounded GREAT_CIRCLE crash on dask backends#2722
brendancol wants to merge 2 commits into
mainfrom
deep-sweep-accuracy-proximity-2026-05-29

Conversation

@brendancol
Copy link
Copy Markdown
Contributor

Fixes #2721

Problem

Calling proximity / allocation / direction with distance_metric='GREAT_CIRCLE' and a finite max_distance raises ValueError: The overlapping depth N is larger than your array on dask-backed rasters (both dask+numpy and dask+cupy). The numpy and cupy backends accept the same call and return correct results.

The map_overlap overlap depth was computed as max_distance / cellsize. The cellsize is in degrees, but max_distance for GREAT_CIRCLE is in metres, so the depth came out to hundreds of thousands of pixels.

Fix

Measure the per-pixel pitch with the active distance metric (great-circle metres between adjacent cells) rather than the raw degree cellsize, so the overlap depth is in pixels for every metric. Applied to both the dask+numpy path (_process_dask) and the dask+cupy path (_process_dask_cupy). EUCLIDEAN and MANHATTAN are unaffected because their max_distance already shares units with the cellsize. The now-unused get_dataarray_resolution import is dropped.

Tests

Added test_great_circle_dask_bounded_matches_numpy, parametrized over proximity/allocation/direction, asserting the dask result matches the numpy result. It fails with ValueError on the old code and passes after the fix. Full test_proximity.py suite passes on all four backends (numpy, dask+numpy, cupy, dask+cupy).

Found via an accuracy sweep (Cat 4 projection/unit handling, Cat 5 backend parity).

The dask map_overlap pad depth was computed as max_distance / cellsize,
but cellsize is in degrees while max_distance is in metres for the
GREAT_CIRCLE metric. The resulting depth (hundreds of thousands of
pixels) made dask raise ValueError on valid input that the numpy and
cupy backends handle fine.

Measure the per-pixel pitch with the active distance metric instead of
the raw degree cellsize, so the overlap depth is in pixels for all
metrics. EUCLIDEAN and MANHATTAN are unaffected (max_distance shares
units with the cellsize).

Adds a regression test covering proximity/allocation/direction.
@github-actions github-actions Bot added the performance PR touches performance-sensitive code label May 29, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

performance PR touches performance-sensitive code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Bounded GREAT_CIRCLE proximity/allocation/direction raises ValueError on dask rasters

1 participant