Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions .claude/sweep-metadata-state.csv
Original file line number Diff line number Diff line change
Expand Up @@ -4,4 +4,5 @@ polygonize,2026-05-19,2149,MEDIUM,1,"Audited 2026-05-19 (agent-ad1070530d37a4fdf
rasterize,2026-05-27,2504,HIGH,4,"rasterize() drops like.attrs, rebuilds like.coords via linspace (not bit-identical), and never emits _FillValue/nodatavals even when fill is non-NaN. Cat 1 HIGH: chained pipelines like slope(rasterize(gdf, like=elevation)) silently lose crs/res/transform. Cat 2 MEDIUM: linspace round-trip from re-derived bounds breaks xr.align with like. Cat 4 MEDIUM: rasterize(..., fill=-9999, dtype=int32) emits no _FillValue. All 4 backends share the same final return so the fix is one place. Fixed in deep-sweep-metadata-rasterize-2026-05-17-01 (worktree agent-ab7a9aee97c1e4cdf): _extract_grid_from_like now returns coords/attrs; rasterize() reuses like.coords directly when grid matches, copies like.attrs, and emits _FillValue + nodatavals when fill is not NaN. 9 new tests in TestMetadataPropagation cover attrs propagation, bit-identical coord reuse, fill-value emission, isolation from template attrs, and parity across numpy/cupy/dask+numpy/dask+cupy backends. Full test suite (193 passing) clean. | Re-audited 2026-05-21 (agent-a645dc07f847ae8ae worktree, branch deep-sweep-metadata-rasterize-2026-05-21). 4-backend (numpy/cupy/dask+numpy/dask+cupy) metadata parity reverified: all 4 backends route through the same final xr.DataArray constructor in rasterize(); crs / spatial_ref non-dim coord / coords / dims agree across backends. NEW HIGH finding #2251 (Cat 1): when rasterize(geoms, like=template, bounds=..., width=..., height=..., resolution=...) overrides the grid relative to like, the inherited attrs['transform'] and attrs['res'] from like are propagated unchanged so they describe the template's grid, not the actual output. get_dataarray_resolution() prefers attrs['res'] over calc_res from coords, so downstream slope/aspect/proximity see the wrong cellsize. Same class as #1407 sky_view_factor bug. Fix in rasterize(): out_attrs.pop('res') / out_attrs.pop('transform') when like_attrs is present but reuse_like_coords is False (output grid != template grid). Preserves crs / nodata triplet / spatial_ref handling. 9 new tests in TestLikeStaleGridAttrs2251 cover bounds override, width/height override, resolution override, matching width/height preserves attrs, get_dataarray_resolution consistency, and parity across all 4 backends. Full rasterize test suite (224 passed, 2 skipped) clean. | Re-audited 2026-05-27 (agent-ae44e871ba3e6bc50 worktree, branch deep-sweep-metadata-rasterize-2026-05-27). 4-backend (numpy/cupy/dask+numpy/dask+cupy) metadata parity reverified end-to-end with explicit cupy and dask+cupy live runs on the CUDA host. attrs / coords / dims / non-dim coords (spatial_ref) all agree across backends; the existing TestMetadataPropagation and TestLikeStaleGridAttrs2251 suites still pass cleanly. NEW HIGH finding #2504 (Cat 4): rasterize(..., dtype=<int>) with the default fill=np.nan silently coerced NaN to a platform-specific sentinel (INT_MIN on x86, 0 on Apple Silicon, 0 for unsigned dtypes) and emitted no _FillValue / nodata / nodatavals attr to mark unwritten pixels. Downstream consumers (geotiff writer, rioxarray masks) had no sentinel to key off and treated unwritten cells as legitimate burns -- a metadata propagation failure equivalent in shape to #1407. Fix in rasterize() before any host/device allocation: detect NaN fill against an integer final_dtype via np.issubdtype + float(fill) + np.isnan and raise ValueError with a pointer to fill=0/fill=-9999 or a floating dtype. Same guard fires on all 4 backends because it runs before backend dispatch. 18 new tests in test_rasterize_nan_int_fill_2504.py cover every signed/unsigned int width, the like=<int dtype> branch, all 4 backends, explicit-vs-default NaN, numpy-typed NaN, and the unaffected float-dtype path. The previous TestIntegerDtypeNanFill test (which had pinned the silent cast as observed behaviour on 2026-05-17) was rewritten to pin the raise. Full rasterize test suite (476 passed, 2 skipped) clean."
reproject,2026-05-10,1572;1573,HIGH,1;3;4,geoid_height_raster dropped input attrs and used dims[-2:] for 3D inputs (#1572). reproject/merge ignored nodatavals (rasterio convention) when rioxarray absent (#1573). Fixed in same branch.
resample,2026-05-27,2542,MEDIUM,2;4;5,"Audited 2026-05-27 (agent-a8135a6a246ecb93c worktree, branch deep-sweep-metadata-resample-2026-05-27). Cat 2 MEDIUM + Cat 4 MEDIUM + Cat 5 MEDIUM all rolled into issue #2542. (a) 2D non-identity path dropped scalar non-dim coords like rioxarrays spatial_ref and squeezed time/band selectors; identity path (scale==1.0, agg.copy()) and 3D path (per-band xr.concat) preserved them, so the bug was path-inconsistent (Cat 5). (b) _resolve_nodata reads attrs[nodata] as a fallback sentinel but the output post-processing only refreshed _FillValue and nodatavals, leaving attrs[nodata]=-9999 alongside data that was now NaN. Fix in resample(): refresh attrs[nodata] to NaN whenever the input had it, and carry across zero-dim non-dim coords on the 2D non-identity path. 7 new tests in TestMetadataPropagation cover nodata-attr refresh, spatial_ref/scalar coord carry, identity-vs-downsample coord parity, and the explicit choice to drop spatially-shaped extra coords. 4-backend (numpy/cupy/dask+numpy/dask+cupy) parity verified for spatial_ref carry; nodata-attr refresh verified on numpy/cupy/dask+numpy (dask+cupy non-NaN nodata masking hits a pre-existing xarray xr.where + cupy.astype quirk unrelated to this audit). Full resample test suite (175 passed) clean."
viewshed,2026-05-29,2743,MEDIUM,4;5,output .name differed across backends (None/viewshed/dask-token) and dtype float32 on GPU vs float64 on CPU; added name= param and forced float64 on all backends; attrs/coords/dims already preserved
zonal,2026-05-29,2611,MEDIUM,5,"Audited 2026-05-29 (agent-ae8d8b65cc3a5c40a worktree, branch deep-sweep-metadata-zonal-2026-05-29). CUDA available; all 4 backends (numpy/cupy/dask+numpy/dask+cupy) run live. 5 DataArray-returning functions checked end-to-end: apply, regions, hypsometric_integral, trim, crop. attrs (res/crs/transform/nodatavals), dims, and coords preserved correctly on all 4 backends for every function; trim/crop slice coords with no half-pixel drift. stats() and crosstab() return DataFrames by design so Cat 1-3 DataArray checks N/A. NEW MEDIUM finding #2611 (Cat 5): apply() never set output .name, so numpy/cupy returned None while dask+numpy/dask+cupy inherited a non-deterministic internal dask task name (e.g. _chunk_fn-<hash>). regions/hypsometric_integral/trim/crop all set deterministic names; apply was the outlier. Fix in PR #2611/#2622: add name param (default None) and assign result.name after DataArray construction (setting name= at construction does not override the dask graph name). New parametrized test test_apply_name_consistent_across_backends covers default-None and explicit-name on all 4 backends. Full zonal suite 213 passed. No other CRITICAL/HIGH/MEDIUM findings; no LOW findings to document."
12 changes: 9 additions & 3 deletions xrspatial/gpu_rtx/viewshed.py
Original file line number Diff line number Diff line change
Expand Up @@ -2,7 +2,7 @@
# that the required dependent libraries are installed.

import math
from typing import Union
from typing import Optional, Union

import cupy
import numba as nb
Expand Down Expand Up @@ -187,6 +187,7 @@ def _viewshed_rt(
observer_elev: float,
target_elev: float,
scale: float,
name: Optional[str] = 'viewshed',
) -> xr.DataArray:

H, W = raster.shape
Expand Down Expand Up @@ -257,9 +258,12 @@ def _viewshed_rt(
else:
visgrid = d_visgrid

# Emit float64 to match the CPU backends (the RT kernel works in float32)
visgrid = visgrid.astype(np.float64)

view = xr.DataArray(
visgrid,
name="viewshed",
name=name,
coords=raster.coords,
dims=raster.dims,
attrs=raster.attrs)
Expand All @@ -273,6 +277,7 @@ def viewshed_gpu(
y: Union[int, float],
observer_elev: float,
target_elev: float,
name: Optional[str] = 'viewshed',
) -> xr.DataArray:
if not isinstance(raster.data, cupy.ndarray):
raise TypeError("raster.data must be a cupy array")
Expand All @@ -283,4 +288,5 @@ def viewshed_gpu(
optix = RTX()
scale = create_triangulation(raster, optix)

return _viewshed_rt(raster, optix, x, y, observer_elev, target_elev, scale)
return _viewshed_rt(raster, optix, x, y, observer_elev, target_elev,
scale, name)
70 changes: 70 additions & 0 deletions xrspatial/tests/test_viewshed.py
Original file line number Diff line number Diff line change
Expand Up @@ -459,3 +459,73 @@ def test_viewshed_cpu_memory_guard_passes_with_max_distance():
v = viewshed(raster, x=50.0, y=50.0, observer_elev=5,
max_distance=3.0)
assert v.values[50, 50] == 180.0


def _make_raster(backend):
"""Build a small viewshed input raster for the given backend."""
arr = np.array([
[0, 0, 1, 0, 0],
[1, 3, 0, 0, 0],
[10, 2, 5, 2, -1],
[11, 1, 2, 9, 0]], dtype=np.float64)
xs = np.linspace(1, 5, 5)
ys = np.linspace(1, 4, 4)
attrs = {'res': (1.0, 1.0), 'crs': 'EPSG:4326'}

if backend == "numpy":
data = arr
elif backend == "dask+numpy":
data = da.from_array(arr, chunks=(2, 3))
elif backend == "cupy":
import cupy as cp
data = cp.asarray(arr)
elif backend == "dask+cupy":
import cupy as cp
data = da.from_array(cp.asarray(arr), chunks=(2, 3))
else:
raise ValueError(backend)

return xa.DataArray(data, dims=['y', 'x'],
coords={'y': ys, 'x': xs},
attrs=attrs, name='elevation')


_METADATA_BACKENDS = [
"numpy",
"dask+numpy",
pytest.param("cupy", marks=pytest.mark.skipif(
not has_rtx(), reason="requires rtxpy for the GPU viewshed path")),
pytest.param("dask+cupy", marks=pytest.mark.skipif(
not has_rtx(), reason="requires rtxpy for the GPU viewshed path")),
]


@pytest.mark.parametrize("backend", _METADATA_BACKENDS)
@pytest.mark.parametrize("max_distance", [None, 3.0])
def test_viewshed_output_name_and_dtype_consistent(backend, max_distance):
"""Output .name and dtype must not depend on the backend (#2743).

The default name is 'viewshed' on every backend, and the output dtype
is float64 everywhere (the GPU path internally works in float32).
"""
raster = _make_raster(backend)
result = viewshed(raster, x=3, y=2, observer_elev=1,
max_distance=max_distance)

assert result.name == "viewshed"
assert result.dtype == np.float64
assert result.dims == raster.dims
# attrs/coords still pass through
assert result.attrs == raster.attrs
np.testing.assert_allclose(result.coords['x'].data,
raster.coords['x'].data)
np.testing.assert_allclose(result.coords['y'].data,
raster.coords['y'].data)


@pytest.mark.parametrize("backend", _METADATA_BACKENDS)
def test_viewshed_custom_name(backend):
"""A user-supplied name is honoured on every backend."""
raster = _make_raster(backend)
result = viewshed(raster, x=3, y=2, observer_elev=1, name="my_vs")
assert result.name == "my_vs"
34 changes: 22 additions & 12 deletions xrspatial/viewshed.py
Original file line number Diff line number Diff line change
Expand Up @@ -2,7 +2,7 @@
from math import atan, atan2, fabs
from math import pi as PI
from math import sqrt
from typing import Union
from typing import Optional, Union

import numpy as np
import xarray
Expand Down Expand Up @@ -1510,6 +1510,7 @@ def _viewshed_cpu(
y: Union[int, float],
observer_elev: float = OBS_ELEV,
target_elev: float = TARGET_ELEV,
name: Optional[str] = 'viewshed',
) -> xarray.DataArray:

height, width = raster.shape
Expand Down Expand Up @@ -1598,6 +1599,7 @@ def _viewshed_cpu(
visibility_grid)

visibility = xarray.DataArray(viewshed_img,
name=name,
coords=raster.coords,
attrs=raster.attrs,
dims=raster.dims)
Expand All @@ -1609,7 +1611,8 @@ def viewshed(raster: xarray.DataArray,
y: Union[int, float],
observer_elev: float = OBS_ELEV,
target_elev: float = TARGET_ELEV,
max_distance: float = None) -> xarray.DataArray:
max_distance: float = None,
name: Optional[str] = 'viewshed') -> xarray.DataArray:
"""
Calculate viewshed of a raster (the visible cells in the raster)
for the given viewpoint (observer) location.
Expand All @@ -1634,6 +1637,9 @@ def viewshed(raster: xarray.DataArray,
evaluated. When set and the raster is dask-backed, only the
chunks within the distance window are loaded — this is the most
efficient way to run viewshed on very large dask rasters.
name : str, default='viewshed'
Name of the output DataArray. Set on every backend so the
result name does not depend on which backend ran.

Returns
-------
Expand Down Expand Up @@ -1709,26 +1715,28 @@ def viewshed(raster: xarray.DataArray,
# --- max_distance: extract spatial window for any backend ---
if max_distance is not None:
return _viewshed_windowed(raster, x, y, observer_elev, target_elev,
max_distance)
max_distance, name)

if isinstance(raster.data, np.ndarray):
return _viewshed_cpu(raster, x, y, observer_elev, target_elev)
return _viewshed_cpu(raster, x, y, observer_elev, target_elev, name)

elif has_cuda_and_cupy() and is_cupy_array(raster.data):
if has_rtx():
# Run on gpu
from .gpu_rtx.viewshed import viewshed_gpu
return viewshed_gpu(raster, x, y, observer_elev, target_elev)
return viewshed_gpu(raster, x, y, observer_elev, target_elev, name)
else:
# Convert to numpy and run on cpu
import cupy as cp
raster.data = cp.asnumpy(raster.data)
return _viewshed_cpu(raster, x, y, observer_elev, target_elev)
return _viewshed_cpu(raster, x, y, observer_elev, target_elev,
name)

elif has_dask_array():
import dask.array as da
if isinstance(raster.data, da.Array):
return _viewshed_dask(raster, x, y, observer_elev, target_elev)
return _viewshed_dask(raster, x, y, observer_elev, target_elev,
name)

raise TypeError(f"Unsupported raster array type: {type(raster.data)}")

Expand Down Expand Up @@ -2049,7 +2057,7 @@ def _viewshed_distance_sweep(dask_data, H, W, obs_r, obs_c,


def _viewshed_windowed(raster, x, y, observer_elev, target_elev,
max_distance):
max_distance, name='viewshed'):
"""Run viewshed on a spatial window around the observer.

Works for any backend: numpy, cupy, dask+numpy, dask+cupy. The window
Expand Down Expand Up @@ -2146,11 +2154,11 @@ def _viewshed_windowed(raster, x, y, observer_elev, target_elev,
full_vis = np.full((height, width), INVISIBLE, dtype=np.float64)
full_vis[r_lo:r_hi, c_lo:c_hi] = local_vals

return xarray.DataArray(full_vis, coords=raster.coords,
return xarray.DataArray(full_vis, name=name, coords=raster.coords,
dims=raster.dims, attrs=raster.attrs)


def _viewshed_dask(raster, x, y, observer_elev, target_elev):
def _viewshed_dask(raster, x, y, observer_elev, target_elev, name='viewshed'):
"""Dask-backed viewshed (no max_distance — handled by caller).

Two-tier strategy:
Expand Down Expand Up @@ -2202,8 +2210,10 @@ def _viewshed_dask(raster, x, y, observer_elev, target_elev):
observer_elev, target_elev)
result_np = result.data if isinstance(result.data, np.ndarray) \
else result.data.get()
# GPU path returns float32; emit float64 to match the CPU backends
result_np = result_np.astype(np.float64, copy=False)
vis_da = da.from_array(result_np, chunks=raster.data.chunks)
return xarray.DataArray(vis_da, coords=raster.coords,
return xarray.DataArray(vis_da, name=name, coords=raster.coords,
dims=raster.dims, attrs=raster.attrs)

# --- Tier C: out-of-core distance sweep (CPU only) ---
Expand Down Expand Up @@ -2241,5 +2251,5 @@ def _viewshed_dask(raster, x, y, observer_elev, target_elev):
)

vis_da = da.from_array(visibility, chunks=raster.data.chunks)
return xarray.DataArray(vis_da, coords=raster.coords,
return xarray.DataArray(vis_da, name=name, coords=raster.coords,
dims=raster.dims, attrs=raster.attrs)
Loading