[EXPERIMENTAL]: Integrate cp-measure #982
…to feature/add_cpmeasure
for more information, see https://pre-commit.ci
…to feature/add_cpmeasure
Note to self:
Ah, I noticed another thing: for the created morphology table, .uns["spatialdata_attrs"] needs to be set; otherwise this runs into problems when querying the SpatialData object afterwards.
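A minimal sketch of the metadata this comment refers to, assuming the usual three-key layout of spatialdata_attrs (region, region_key, instance_key). The helper make_spatialdata_attrs, the element name "cell_labels", and the column names are hypothetical; in practice spatialdata.models.TableModel.parse is the usual way to set this, and the plain dict below only stands in for AnnData.uns.

```python
def make_spatialdata_attrs(region, region_key="region", instance_key="label_id"):
    """Build the dict stored under .uns["spatialdata_attrs"].

    region       -- name of the annotated element (hypothetical: "cell_labels")
    region_key   -- obs column holding the element name for each row
    instance_key -- obs column holding the per-cell label id
    """
    return {"region": region, "region_key": region_key, "instance_key": instance_key}


# Stand-in for AnnData.uns; with a real table this would be adata.uns.
uns = {}
uns["spatialdata_attrs"] = make_spatialdata_attrs("cell_labels")
```

For queries to resolve, the table's obs would also need to contain the two columns named by region_key and instance_key.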
- Resolved dependency conflicts (updated to zarr>=3)
- Moved exp module to experimental to align with main
- Integrated CellProfiler features into experimental module
- Added centrosome and cp_measure dependencies
Moved calculate_image_features from experimental._feature to experimental.im._feature to follow the existing module structure. Now accessible as squidpy.experimental.im.calculate_image_features
- Test basic feature calculation with shapes
- Test copy vs inplace behavior
- Test error cases for invalid keys
- Uses sdata_hne fixture with skimage:label for fast execution
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #982      +/-   ##
==========================================
+ Coverage   73.82%   74.17%   +0.35%
==========================================
  Files          45       47       +2
  Lines        7013     7610     +597
  Branches     1188     1310     +122
==========================================
+ Hits         5177     5645     +468
- Misses       1349     1427      +78
- Partials      487      538      +51
Looking forward to this PR 👀🥸
Introduces _tiling.py with build_tile_specs() and extract_tile() that split a label image into overlapping tiles where each cell is assigned to exactly one tile by centroid. Non-owned cells are zeroed out so downstream processing never double-counts.
Includes 31 tests: deterministic brick-pattern grid (touching and non-touching), coverage verification, and visual regression tests.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
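The centroid-ownership idea described in this commit can be sketched as follows. This is an illustration under stated assumptions, not the code in _tiling.py: build_tiles, centroids, and extract_owned are hypothetical names, the tile grid is a plain regular grid, and boxes are half-open (y0, x0, y1, x1) tuples.

```python
import numpy as np


def build_tiles(shape, tile, overlap):
    """Return (core, padded) boxes as half-open (y0, x0, y1, x1) tuples.

    Cores partition the image; padded boxes extend each core by `overlap`.
    """
    height, width = shape
    specs = []
    for y0 in range(0, height, tile):
        for x0 in range(0, width, tile):
            y1, x1 = min(y0 + tile, height), min(x0 + tile, width)
            padded = (max(y0 - overlap, 0), max(x0 - overlap, 0),
                      min(y1 + overlap, height), min(x1 + overlap, width))
            specs.append(((y0, x0, y1, x1), padded))
    return specs


def centroids(labels):
    """Area-weighted (y, x) centroid for every nonzero label id."""
    ys, xs = np.nonzero(labels)
    ids = labels[ys, xs]
    return {int(i): (float(ys[ids == i].mean()), float(xs[ids == i].mean()))
            for i in np.unique(ids)}


def extract_owned(labels, core, padded, cents):
    """Crop the padded box and zero every cell whose centroid is outside the core."""
    y0, x0, y1, x1 = core
    owned = [i for i, (cy, cx) in cents.items() if y0 <= cy < y1 and x0 <= cx < x1]
    py0, px0, py1, px1 = padded
    crop = labels[py0:py1, px0:px1]
    return np.where(np.isin(crop, owned), crop, 0), owned


labels = np.zeros((8, 8), dtype=int)
labels[1:3, 3:5] = 1  # straddles the x = 4 tile boundary
labels[5:7, 5:7] = 2
cents = centroids(labels)

# Record, per cell, which tile(s) own it and how many of its pixels the crop holds.
owners = {}
for core, padded in build_tiles(labels.shape, tile=4, overlap=2):
    crop, owned = extract_owned(labels, core, padded, cents)
    for i in owned:
        owners.setdefault(i, []).append((core, int((crop == i).sum())))
```

Because the cores partition the image, each centroid falls in exactly one core, so each cell is emitted once. A cell is only seen whole if the overlap is at least as large as its reach beyond the core.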
Refactoring in anticipation of afermg/cp_measure#38 being merged so we can upstream behaviour.
Wires the in-progress cp_measure.featurizer + lazy-tiling refactor onto a
working _tiling.py and closes out the six open notes on the PR.
_tiling.py:
* build_tile_specs now takes (shape, cell_info), so it is agnostic to
whether labels are in memory, dask-backed, or multiscale.
* compute_cell_info is public; new compute_cell_info_multiscale (read
coarsest scale, rescale to target) and compute_cell_info_tiled
(stream tiles, merge boundary-spanning cells via additive accumulators).
* extract_tile_lazy slices an xr.DataArray and materializes only the crop;
extract_tile retained for in-memory callers.
* verify_coverage takes a label_ids set.
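The "additive accumulators" used to merge boundary-spanning cells can be sketched like this. It is a simplified illustration, not compute_cell_info_tiled itself: the function names are hypothetical, non-overlapping tiles are assumed, and the maximum label id is read from the full array up front, which a truly streaming version would avoid.

```python
import numpy as np


def tile_sums(tile, y0, x0, n_labels):
    """Per-label pixel count and coordinate sums for one tile, in global coordinates."""
    ys, xs = np.nonzero(tile)
    ids = tile[ys, xs]
    area = np.bincount(ids, minlength=n_labels + 1)
    sum_y = np.bincount(ids, weights=ys + y0, minlength=n_labels + 1)
    sum_x = np.bincount(ids, weights=xs + x0, minlength=n_labels + 1)
    return area, sum_y, sum_x


def cell_info_tiled(labels, tile):
    """Stream non-overlapping tiles and add up per-label sums across them."""
    n_labels = int(labels.max())  # simplification: needs one pass over the data
    area = np.zeros(n_labels + 1)
    sum_y = np.zeros(n_labels + 1)
    sum_x = np.zeros(n_labels + 1)
    height, width = labels.shape
    for y0 in range(0, height, tile):
        for x0 in range(0, width, tile):
            a, sy, sx = tile_sums(labels[y0:y0 + tile, x0:x0 + tile], y0, x0, n_labels)
            # Sums are additive, so a cell split across tiles merges by plain addition.
            area += a
            sum_y += sy
            sum_x += sx
    ids = np.nonzero(area)[0]
    return {int(i): (int(area[i]), sum_y[i] / area[i], sum_x[i] / area[i]) for i in ids}


labels = np.zeros((8, 8), dtype=int)
labels[1:3, 3:5] = 1  # split between the tiles at x0 = 0 and x0 = 4
info = cell_info_tiled(labels, tile=4)
```

Storing sums rather than per-tile centroids is what makes the merge exact: the centroid is only divided out once, after all tiles have contributed.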
_feature.py:
* Channel names: read via spatialdata.models.get_channel_names so c_coords
set at parse time flow through to output column suffixes.
* Progress: tqdm wrapper around joblib.Parallel(return_as='generator_unordered')
+ periodic logg.info('Tile {n}/{total} done (elapsed ...)') so non-TTY
runs (CI, slurm) also see progress.
* Alignment: _align_to_image_grid replaces the dim-mismatch raise with a
coordinate-system aware crop. Identity-or-integer-pixel-translation is
honored as a 1-to-1 pixel alignment; the overlap rectangle is processed
and out-of-extent cells are counted, not crashed on. Non-pixel-aligned
transforms either raise with a spatialdata.rasterize hint
(align_mode='strict', default) or trigger materialization via
spatialdata.rasterize (align_mode='rasterize') with a warning.
* DropReport: per-run counter for cells dropped due to extent, partial
boundary intersection, cp_measure no-data, or empty tiles. Emitted via
logg.info(report.summary()) at the end of every run.
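A per-run drop counter of the kind described in the DropReport note might look like the sketch below. Only the names DropReport, summary(), and empty_tiles appear in this PR; the other field names and the summary format are assumptions, and the standard-library logging module stands in for logg.info.

```python
import logging
from dataclasses import dataclass

logger = logging.getLogger("drop_report_sketch")


@dataclass
class DropReport:
    """Counts cells (and tiles) skipped during one run."""

    out_of_extent: int = 0     # hypothetical name: cell lies outside the image extent
    partial_boundary: int = 0  # hypothetical name: cell only partly inside the extent
    no_data: int = 0           # hypothetical name: cp_measure returned nothing
    empty_tiles: int = 0       # incremented once per tile, not per cell

    @property
    def cells_dropped(self):
        return self.out_of_extent + self.partial_boundary + self.no_data

    def summary(self):
        return (
            f"dropped {self.cells_dropped} cells "
            f"(out_of_extent={self.out_of_extent}, "
            f"partial_boundary={self.partial_boundary}, "
            f"no_data={self.no_data}); empty_tiles={self.empty_tiles}"
        )


report = DropReport()
report.out_of_extent += 2
report.no_data += 1
report.empty_tiles += 1
logger.info(report.summary())  # emitted once, at the end of the run
```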
Tests: 39 in test_tiling.py (was 30; new coverage for the lazy/multiscale
helpers + verify_coverage edge cases), 35 in test_calculate_image_features
including a TestPR982Concerns class with one regression test per open note.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* compute_cell_info_tiled: replace per-id np.where with
  scipy.ndimage.find_objects and np.bincount sums. One vectorized pass
  per tile instead of O(n_cells) scans.
* _zero_non_owned: replace per-id rewrite loop with np.isin + np.where.
* _classify_dropped_cells: drop the full-array .values + per-cell
  np.where; use compute_cell_info_tiled bboxes for inside/partial/outside
  classification, so the full label array is no longer materialized.
* CellInfo: add bbox_y0/bbox_x0 fields so callers can do bbox math
  without reconstructing from the centroid (which is area-weighted, not
  bbox-centered).
* _relabel_contiguous: replaced by skimage.segmentation.relabel_sequential.
* _align_to_image_grid: flatten nested if/else with elif chain; extract
  _rasterize_to_image_grid so the shapes-key path and the
  align_mode='rasterize' path no longer duplicate the rasterize call.
* DropReport: empty_tile_drop -> empty_tiles (the counter increments per
  tile).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
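The bbox-based inside/partial/outside classification mentioned for _classify_dropped_cells can be illustrated with a small sketch. The function name classify_bbox and the half-open (y0, x0, y1, x1) box convention are assumptions made for this example, not taken from the PR.

```python
def classify_bbox(bbox, extent):
    """Classify a cell's bounding box against an image extent.

    Both arguments are half-open (y0, x0, y1, x1) boxes in the same pixel grid.
    """
    by0, bx0, by1, bx1 = bbox
    ey0, ex0, ey1, ex1 = extent
    if by0 >= ey0 and bx0 >= ex0 and by1 <= ey1 and bx1 <= ex1:
        return "inside"
    if by1 <= ey0 or by0 >= ey1 or bx1 <= ex0 or bx0 >= ex1:
        return "outside"
    return "partial"


extent = (0, 0, 100, 100)
results = [
    classify_bbox((10, 10, 20, 20), extent),    # fully within the extent
    classify_bbox((95, 10, 105, 20), extent),   # crosses the bottom edge
    classify_bbox((100, 10, 110, 20), extent),  # starts exactly at the edge
]
```

Only four integers per cell are compared, which is why working from bboxes avoids materializing the full label array.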
Associated notebook: https://github.com/scverse/squidpy_notebooks/blob/add_cpmeasure_notebook/tutorials/tutorial_cpmeasure.ipynb