diff --git a/.claude/sweep-metadata-state.csv b/.claude/sweep-metadata-state.csv
index d1f3f853..870e2fd5 100644
--- a/.claude/sweep-metadata-state.csv
+++ b/.claude/sweep-metadata-state.csv
@@ -1,4 +1,5 @@
 module,last_inspected,issue,severity_max,categories_found,notes
+contour,2026-05-29,2700,HIGH,1;5,"Audited 2026-05-29 (agent-ab7fff484a8f57de2 worktree, branch deep-sweep-metadata-contour-2026-05-29). CUDA available; cupy and dask+cupy paths exercised live. contours() returns a list of (level, ndarray) tuples or a GeoDataFrame, not a DataArray, so Cat 2/3 DataArray checks reinterpreted as coordinate-transform + CRS propagation. Coordinate transform (np.interp over input dims, descending y respected) is correct and identical across all 4 backends (tracing is host-side via _contours_numpy). Cat 4 N/A: library convention is NaN-as-nodata; slope/aspect/curvature/focal do not read attrs['nodatavals'] either, so contour not reading it is consistent, not a bug. NEW HIGH finding #2700 (Cat 1/Cat 5): contours(return_type='geopandas') crashed with 'Assigning CRS to a GeoDataFrame without a geometry column is not supported' whenever the input had attrs['crs'] but the result was empty (flat raster, levels outside data range) because _to_geopandas built gpd.GeoDataFrame([], crs=crs) with no geometry column; separately the all-NaN early-return passed crs=None and silently dropped the CRS. Fix (PR #2708): _to_geopandas builds an empty frame with an explicit geometry column so the CRS attaches; all-NaN early-return forwards agg.attrs['crs']. Both empty paths now return a well-formed empty GeoDataFrame carrying the CRS. 4 new tests in TestGeoDataFrame cover populated-CRS, empty-with-CRS, all-NaN-with-CRS, and empty-without-CRS. Full contour suite 28 passed. numpy-return path emits no DataArray attrs by design (list of tuples)."
 aspect,2026-05-29,2682,MEDIUM,4;5,"Audited 2026-05-29 (agent-a3b7c82e34312ffcb worktree, branch deep-sweep-metadata-aspect-2026-05-29). CUDA available; all 4 backends (numpy/cupy/dask+numpy/dask+cupy) run live for aspect/northness/eastness across planar and geodesic methods. Cat 1 attrs, Cat 2 coords, Cat 3 dims, and .name all preserved correctly on every backend: the 3 public functions re-emit coords=agg.coords, dims=agg.dims, attrs=agg.attrs at the xr.DataArray constructor. NEW MEDIUM finding #2682 (Cat 4 + Cat 5): the planar dask backends (_run_dask_numpy, _run_dask_cupy) called map_overlap with a default-dtype meta (np.array(()) / cupy.array(())), so the lazy DataArray advertised float64 while the chunk functions _cpu / _run_cupy cast to and return float32. numpy and cupy backends already reported float32, and the geodesic dask paths already passed dtype=np.float32, so only the two planar dask paths were inconsistent: a backend-inconsistent metadata bug where agg.dtype differs by backend and silently flips float64->float32 on .compute(). Fix in PR #2741: pass dtype=np.float32 / dtype=cupy.float32 to the planar dask meta. northness/eastness derive from aspect so they inherit the corrected dtype. 5 new tests (test_dask_numpy_advertised_dtype_matches_computed parametrized over 4 boundary modes, plus test_dask_cupy_advertised_dtype_matches_computed) assert lazy dtype == computed dtype == float32. Full aspect suite 69 passed. slope.py and curvature.py share the same default-dtype meta pattern on their planar dask paths (out of scope for this aspect-only sweep; likely same inconsistency). No CRITICAL/HIGH/LOW findings."
 geotiff,2026-05-18,1909,HIGH,4;5,"Re-audit 2026-05-15 (agent-a55b69cec1ef2a092 worktree, branch deep-sweep-metadata-geotiff-2026-05-15). 4-backend (numpy/cupy/dask+numpy/dask+cupy) parity reverified after the #1813 modular refactor: full reads, windowed reads, multi-band, band=N selection, no-georef integer pixel coords, crs/crs_wkt/transform/nodata/x_resolution/y_resolution/resolution_unit/image_description/gdal_metadata all agree across backends. DataArray .name and dims agree (y, x for 2D; y, x, band for 3D). NEW HIGH finding #1909: GDS chunked GPU path (_read_geotiff_gpu_chunked_gds) declared the dask graph dtype as float64 when source had an in-range integer nodata sentinel, matching the CPU dask path's #1597 contract, but the per-chunk _chunk_task did not cast its returned cupy array to declared_dtype -- chunks with no sentinel hit returned the raw uint16/int16 source dtype, producing a silent declared/actual dtype mismatch. Fix mirrors the #1597 + #1624 CPU dask pattern: compute declared_dtype before defining _chunk_task, cast inside the task only when arr.dtype != declared_dtype to skip the no-op astype(copy=True). 6 regression tests added in test_chunked_gpu_declared_dtype_1909.py covering declared vs computed parity, CPU/GPU dask declared-dtype agreement, eager paths preserve source dtype, no-nodata round-trip, explicit dtype= kwarg, and sentinel-hit float64 promotion. Pre-existing test failures in test_predictor2_big_endian_gpu_1517.py and test_size_param_validation_gpu_vrt_1776.py exist on main (read_to_array AttributeError after #1813 refactor, tile_size=4 rejected by stricter _validate_tile_size_arg) and are unrelated to this audit. | Re-audited 2026-05-18 (agent-a59a61958f181c31a worktree, branch deep-sweep-metadata-geotiff-2026-05-18). 4-backend (numpy / cupy / dask+numpy / dask+cupy) metadata parity reverified end-to-end: open_geotiff over a tiled uint16 fixture with crs + transform + GDAL_NODATA sentinel emits identical attrs across all 4 backends (crs=32633, crs_wkt, transform 6-tuple, nodata=5, masked_nodata=True, _xrspatial_geotiff_contract=2, extra_tags, image_description, resolution_unit, x_resolution, y_resolution). Multi-band 3D (y, x, band) with band coord, no-georef int64 pixel coords, windowed reads with transform origin shift, and mask_nodata=False keeping integer dtype all agree across the 4 backends. Write round-trip via to_geotiff (numpy, cupy, dask streaming) re-emits crs / transform / nodata / masked_nodata / contract version with byte-stable transform. Band-first (band, y, x) input correctly remaps to (y, x, band) on disk. _populate_attrs_from_geo_info, _set_nodata_attrs, and _extract_rich_tags centralise attrs emission across all read paths (_init_, _backends/dask, _backends/gpu, _backends/vrt) and write paths (_writers/eager, _writers/gpu, _writers/vrt). _ATTRS_CONTRACT_VERSION=2 is stamped on every path including the chunked GPU GDS and chunked VRT inline-attrs branches. No new CRITICAL/HIGH/MEDIUM/LOW findings."
 polygonize,2026-05-19,2149,MEDIUM,1,"Audited 2026-05-19 (agent-ad1070530d37a4fdf worktree, branch deep-sweep-metadata-polygonize-2026-05-19). Output is vector (column, polygon_points / GeoDataFrame / GeoJSON dict / awkward) so Cat 2/3 do not apply in the DataArray sense. Cat 1 MEDIUM finding #2149: GeoDataFrame output drops raster.attrs['crs'] (and crs_wkt and rioxarray rio.crs); GeoDataFrame.crs is always None even when input is georeferenced. Fix: new _detect_raster_crs helper + crs= kwarg threaded into _to_geopandas; df.set_crs is called when a CRS is detected. spatialpandas has no CRS slot and GeoJSON RFC 7946 is WGS84-only, so propagation lives only on the geopandas path. CRS propagation runs at the public API level so all 4 backends (numpy / cupy / dask+numpy / dask+cupy) propagate consistently -- verified end-to-end with EPSG:4326 attrs across all 4 backends. 8 new tests in TestPolygonizeCRSPropagation cover EPSG string/int, crs_wkt, no CRS, unparseable CRS, attrs-vs-rioxarray preference, rioxarray-only path, and simplify interaction. Cat 2 LOW (not fixed): output coords are pixel-space when input has georeferenced x/y or attrs['transform']; user must pass transform= explicitly. Documented behavior, leave as-is. Cat 4 LOW (not fixed): nodatavals from input attrs is not auto-applied as a mask; documented behavior (explicit mask= kwarg)."
diff --git a/xrspatial/contour.py b/xrspatial/contour.py
index 6e38ba7f..fb4f9c5a 100644
--- a/xrspatial/contour.py
+++ b/xrspatial/contour.py
@@ -548,6 +548,14 @@ def _to_geopandas(results, crs=None):
         geom = LineString(coords[:, ::-1])
         records.append({'level': level, 'geometry': geom})
 
+    if not records:
+        # An empty records list has no geometry column, so geopandas refuses
+        # to attach a CRS. Build the frame with an explicit empty geometry
+        # column so the CRS still propagates on an empty result.
+        return gpd.GeoDataFrame(
+            {'level': [], 'geometry': gpd.GeoSeries([])}, crs=crs
+        )
+
     gdf = gpd.GeoDataFrame(records, crs=crs)
     return gdf
 
@@ -628,7 +636,9 @@ def contours(
         vmax = float(np.nanmax(agg.values))
 
         if np.isnan(vmin) or np.isnan(vmax):
-            return [] if return_type == "numpy" else _to_geopandas([], None)
+            if return_type == "numpy":
+                return []
+            return _to_geopandas([], crs=agg.attrs.get('crs', None))
 
         # Exclude exact min/max to avoid tracing along the boundary.
         levels = np.linspace(vmin, vmax, n_levels + 2)[1:-1]
diff --git a/xrspatial/tests/test_contour.py b/xrspatial/tests/test_contour.py
index ae8c7764..3576f034 100644
--- a/xrspatial/tests/test_contour.py
+++ b/xrspatial/tests/test_contour.py
@@ -325,6 +325,60 @@ def test_geopandas_return(self):
         assert 'geometry' in gdf.columns
         assert len(gdf) > 0
 
+    def test_geopandas_propagates_crs(self):
+        """A populated geopandas result carries the input raster's CRS."""
+        pytest.importorskip("geopandas")
+        data = _make_peak()
+        agg = create_test_raster(data, backend='numpy')  # attrs include a crs
+        gdf = contours(agg, levels=[1.5], return_type="geopandas")
+        assert len(gdf) > 0
+        assert gdf.crs == agg.attrs['crs']
+
+    def test_geopandas_empty_result_keeps_crs(self):
+        """Levels with no crossings return an empty GeoDataFrame with the CRS.
+
+        Regression for #2700: gpd.GeoDataFrame(records, crs=crs) raised
+        ValueError when records was empty and crs was not None.
+        """
+        pytest.importorskip("geopandas")
+        import geopandas as gpd
+        data = np.ones((4, 4), dtype=np.float64)  # flat -> no crossings
+        agg = create_test_raster(data, backend='numpy')  # attrs include a crs
+        gdf = contours(agg, levels=[5.0], return_type="geopandas")
+        assert isinstance(gdf, gpd.GeoDataFrame)
+        assert len(gdf) == 0
+        assert 'level' in gdf.columns
+        assert 'geometry' in gdf.columns
+        assert gdf.crs == agg.attrs['crs']
+
+    def test_geopandas_all_nan_keeps_crs(self):
+        """All-NaN input with auto levels keeps the CRS on the empty frame.
+
+        Regression for #2700: the all-NaN early-return path dropped the CRS.
+        """
+        pytest.importorskip("geopandas")
+        import geopandas as gpd
+        data = np.full((4, 4), np.nan, dtype=np.float64)
+        agg = create_test_raster(data, backend='numpy')  # attrs include a crs
+        gdf = contours(agg, return_type="geopandas")
+        assert isinstance(gdf, gpd.GeoDataFrame)
+        assert len(gdf) == 0
+        assert gdf.crs == agg.attrs['crs']
+
+    def test_geopandas_empty_result_no_crs(self):
+        """An empty result with no input CRS returns an empty frame, no crash."""
+        pytest.importorskip("geopandas")
+        import geopandas as gpd
+        data = np.ones((4, 4), dtype=np.float64)
+        agg = xr.DataArray(
+            data, dims=['y', 'x'],
+            coords={'y': np.linspace(2, 0, 4), 'x': np.linspace(0, 2, 4)},
+        )
+        gdf = contours(agg, levels=[5.0], return_type="geopandas")
+        assert isinstance(gdf, gpd.GeoDataFrame)
+        assert len(gdf) == 0
+        assert gdf.crs is None
+
     def test_invalid_return_type(self):
         data = _make_peak()
         agg = create_test_raster(data, backend='numpy')