[EXPERIMENTAL]: Integrate cp-measure#982

Open
timtreis wants to merge 45 commits into main from feature/add_cpmeasure
@timtreis
Member

@timtreis timtreis commented Mar 28, 2025

@timtreis timtreis marked this pull request as draft March 28, 2025 16:52
@timtreis timtreis added enhancement ✨ New feature or request image 🔬 squidpy2.0 Everything related to a Squidpy 2.0 release sdata compat 🌌 release-added labels Mar 28, 2025
@timtreis
Member Author

timtreis commented May 16, 2025

Note to self:

  • Doesn't correctly parse str names of channels; the log shows indices rather than names:
    INFO Calculating 'cpmeasure' correlation features between channels '0' and '1'.

  • Should show, for permutations, the total number of iterations. In general, the progress bar should contain a (step n of m) readout so one can tell how far along the featurisation is; a run can easily take more than a day, depending on the number of cells and cpu_cores. Should maybe also show the total runtime so far for the steps that are done.

  • Fails if labels and image don't have the same dimensions, despite a transformation that aligns them
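
The readout asked for in the second note could look roughly like this. It is a minimal standard-library sketch, not code from this PR; `format_elapsed` and `format_progress` are hypothetical helper names.

```python
import time


def format_elapsed(seconds: float) -> str:
    """Render a duration in seconds as H:MM:SS."""
    hours, rem = divmod(int(seconds), 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours}:{minutes:02d}:{secs:02d}"


def format_progress(step: int, total: int, elapsed_s: float) -> str:
    """Build a 'step n of m' readout that includes the runtime so far."""
    return f"step {step} of {total} (elapsed {format_elapsed(elapsed_s)})"


if __name__ == "__main__":
    start = time.perf_counter()
    n_steps = 3
    for i in range(1, n_steps + 1):
        # ... the featurisation work for one step would run here ...
        print(format_progress(i, n_steps, time.perf_counter() - start))
```

Emitting the readout as a plain log line, in addition to any progress bar, keeps it visible in non-interactive runs.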

@pakiessling

Ah I noticed another thing.

For the created morphology table, .uns["spatialdata_attrs"] needs to be set; otherwise this runs into problems when querying the SpatialData object afterwards.
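
A sketch of the metadata in question, assuming the table annotates a single labels element. The three keys (`region`, `region_key`, `instance_key`) are the ones SpatialData's table model stores under `.uns["spatialdata_attrs"]`. The element name `"cell_labels"` and the column names are placeholders, `make_spatialdata_attrs` is a hypothetical helper, and a plain dict stands in for `adata.uns` so the snippet runs without AnnData installed.

```python
def make_spatialdata_attrs(region: str, region_key: str, instance_key: str) -> dict:
    """Build the mapping that links a table to the element it annotates."""
    return {
        "region": region,              # name of the annotated element (placeholder)
        "region_key": region_key,      # obs column holding the element name
        "instance_key": instance_key,  # obs column holding the per-cell label id
    }


uns = {}  # stand-in for adata.uns (assumption, to keep the sketch dependency-free)
uns["spatialdata_attrs"] = make_spatialdata_attrs("cell_labels", "region", "label_id")
print(uns["spatialdata_attrs"])
```

In practice `spatialdata.models.TableModel.parse(adata, region=..., region_key=..., instance_key=...)` is the usual way to set this, since it also validates that the two obs columns exist.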

timtreis and others added 6 commits January 27, 2026 13:46
- Resolved dependency conflicts (updated to zarr>=3)
- Moved exp module to experimental to align with main
- Integrated CellProfiler features into experimental module
- Added centrosome and cp_measure dependencies

Moved calculate_image_features from experimental._feature to
experimental.im._feature to follow the existing module structure.
Now accessible as squidpy.experimental.im.calculate_image_features

- Test basic feature calculation with shapes
- Test copy vs inplace behavior
- Test error cases for invalid keys
- Uses sdata_hne fixture with skimage:label for fast execution
@codecov

codecov Bot commented Jan 27, 2026

Codecov Report

❌ Patch coverage is 78.39196% with 129 lines in your changes missing coverage. Please review.
✅ Project coverage is 74.17%. Comparing base (093217d) to head (9848a6f).

Files with missing lines                    Patch %   Lines
src/squidpy/experimental/im/_feature.py     74.40%    67 Missing and 40 partials ⚠️
src/squidpy/experimental/im/_tiling.py      87.70%    11 Missing and 11 partials ⚠️

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #982      +/-   ##
==========================================
+ Coverage   73.82%   74.17%   +0.35%     
==========================================
  Files          45       47       +2     
  Lines        7013     7610     +597     
  Branches     1188     1310     +122     
==========================================
+ Hits         5177     5645     +468     
- Misses       1349     1427      +78     
- Partials      487      538      +51     
Files with missing lines                    Coverage Δ
src/squidpy/experimental/im/_tiling.py      87.70% <87.70%> (ø)
src/squidpy/experimental/im/_feature.py     74.40% <74.40%> (ø)

@LucaMarconato
Member

Looking forward to this PR 👀🥸

timtreis and others added 12 commits January 27, 2026 15:52
Introduces _tiling.py with build_tile_specs() and extract_tile() that
split a label image into overlapping tiles where each cell is assigned
to exactly one tile by centroid. Non-owned cells are zeroed out so
downstream processing never double-counts.

Includes 31 tests: deterministic brick-pattern grid (touching and
non-touching), coverage verification, and visual regression tests.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
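
The centroid-ownership idea in the commit message above can be illustrated as follows. This is a minimal NumPy sketch under my own assumptions, not the PR's `build_tile_specs()` or `extract_tile()`; the names `centroids` and `owned_tile` and the `overlap` parameter are hypothetical.

```python
import numpy as np


def centroids(labels: np.ndarray) -> dict[int, tuple[float, float]]:
    """Area-weighted (y, x) centroid of every non-zero label."""
    ys, xs = np.nonzero(labels)
    ids = labels[ys, xs]
    area = np.bincount(ids)
    sum_y = np.bincount(ids, weights=ys)
    sum_x = np.bincount(ids, weights=xs)
    return {int(i): (sum_y[i] / area[i], sum_x[i] / area[i]) for i in np.unique(ids)}


def owned_tile(labels: np.ndarray, y0: int, y1: int, x0: int, x1: int, overlap: int) -> np.ndarray:
    """Crop the core [y0:y1, x0:x1] plus an overlap margin and zero every
    cell whose centroid lies outside the core."""
    owned = [i for i, (cy, cx) in centroids(labels).items() if y0 <= cy < y1 and x0 <= cx < x1]
    rows = slice(max(y0 - overlap, 0), min(y1 + overlap, labels.shape[0]))
    cols = slice(max(x0 - overlap, 0), min(x1 + overlap, labels.shape[1]))
    crop = labels[rows, cols]
    return np.where(np.isin(crop, owned), crop, 0)


labels = np.zeros((4, 8), dtype=int)
labels[1:3, 1:3] = 1  # entirely in the left half
labels[1:3, 3:6] = 2  # straddles the boundary at x = 4
labels[0:2, 6:8] = 3  # entirely in the right half

left = owned_tile(labels, 0, 4, 0, 4, overlap=2)
right = owned_tile(labels, 0, 4, 4, 8, overlap=2)
print(np.unique(left), np.unique(right))
```

Half-open core intervals mean a centroid that falls exactly on a tile boundary belongs to one tile only, which is what prevents double-counting. The overlap margin only ensures the owning tile sees the whole cell.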
@timtreis
Member Author

timtreis commented Apr 8, 2026

Refactoring in anticipation of afermg/cp_measure#38 being merged so we can upstream behaviour.

timtreis and others added 3 commits May 14, 2026 16:30
Wires the in-progress cp_measure.featurizer + lazy-tiling refactor onto a
working _tiling.py and closes out the six open notes on the PR.

_tiling.py:
* build_tile_specs now takes (shape, cell_info), so it is agnostic to
  whether labels are in memory, dask-backed, or multiscale.
* compute_cell_info is public; new compute_cell_info_multiscale (read
  coarsest scale, rescale to target) and compute_cell_info_tiled
  (stream tiles, merge boundary-spanning cells via additive accumulators).
* extract_tile_lazy slices an xr.DataArray and materializes only the crop;
  extract_tile retained for in-memory callers.
* verify_coverage takes a label_ids set.
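
One way to read "additive accumulators" is sketched below: each tile contributes a pixel count and coordinate sums per cell, and contributions from different tiles are added before dividing. This is an assumption about the approach, not the PR's `compute_cell_info_tiled`; `accumulate_tile` and `centroids_streamed` are hypothetical names.

```python
import numpy as np


def accumulate_tile(tile: np.ndarray, y_off: int, x_off: int, acc: dict[int, list[float]]) -> None:
    """Add one tile's per-cell area and coordinate sums (global coords) into acc."""
    ys, xs = np.nonzero(tile)
    ids = tile[ys, xs]
    area = np.bincount(ids)
    sum_y = np.bincount(ids, weights=ys + y_off)
    sum_x = np.bincount(ids, weights=xs + x_off)
    for i in np.unique(ids):
        entry = acc.setdefault(int(i), [0.0, 0.0, 0.0])  # [area, sum_y, sum_x]
        entry[0] += area[i]
        entry[1] += sum_y[i]
        entry[2] += sum_x[i]


def centroids_streamed(labels: np.ndarray, tile_size: int) -> dict[int, tuple[float, float]]:
    """Centroids computed tile by tile; cells split across tiles merge by addition."""
    acc: dict[int, list[float]] = {}
    height, width = labels.shape
    for y0 in range(0, height, tile_size):
        for x0 in range(0, width, tile_size):
            accumulate_tile(labels[y0:y0 + tile_size, x0:x0 + tile_size], y0, x0, acc)
    return {i: (sy / a, sx / a) for i, (a, sy, sx) in acc.items()}


labels = np.zeros((4, 8), dtype=int)
labels[1:3, 3:6] = 2  # this cell spans the boundary between two 4-pixel tiles
print(centroids_streamed(labels, tile_size=4))
```

Because sums and counts are additive, the result is the same whichever way the image is cut into tiles. A bounding box would merge the same way using min and max instead of sums.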

_feature.py:
* Channel names: read via spatialdata.models.get_channel_names so c_coords
  set at parse time flow through to output column suffixes.
* Progress: tqdm wrapper around joblib.Parallel(return_as='generator_unordered')
  + periodic logg.info('Tile {n}/{total} done (elapsed ...)') so non-TTY
  runs (CI, slurm) also see progress.
* Alignment: _align_to_image_grid replaces the dim-mismatch raise with a
  coordinate-system aware crop. Identity-or-integer-pixel-translation is
  honored as a 1-to-1 pixel alignment; the overlap rectangle is processed
  and out-of-extent cells are counted, not crashed on. Non-pixel-aligned
  transforms either raise with a spatialdata.rasterize hint
  (align_mode='strict', default) or trigger materialization via
  spatialdata.rasterize (align_mode='rasterize') with a warning.
* DropReport: per-run counter for cells dropped due to extent, partial
  boundary intersection, cp_measure no-data, or empty tiles. Emitted via
  logg.info(report.summary()) at the end of every run.
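
The "identity-or-integer-pixel-translation" test and the overlap rectangle from the alignment bullet can be sketched as below. This is an illustration under stated assumptions, not the PR's `_align_to_image_grid`: the transform is taken to be a 3x3 affine in (y, x) order that maps label pixels to image pixels, and `integer_translation` and `overlap_rectangle` are hypothetical names.

```python
import numpy as np


def integer_translation(affine, tol: float = 1e-9):
    """Return (dy, dx) if the 3x3 affine is identity plus a whole-pixel shift, else None."""
    affine = np.asarray(affine, dtype=float)
    if not np.allclose(affine[:2, :2], np.eye(2), rtol=0, atol=tol):
        return None  # scaling, rotation or shear present
    if not np.allclose(affine[2], [0.0, 0.0, 1.0], rtol=0, atol=tol):
        return None  # not an affine in homogeneous form
    shift = affine[:2, 2]
    rounded = np.round(shift)
    if not np.allclose(shift, rounded, rtol=0, atol=tol):
        return None  # sub-pixel translation
    return int(rounded[0]), int(rounded[1])


def overlap_rectangle(image_shape, labels_shape, shift):
    """Overlap of the shifted labels with the image as (y0, y1, x0, x1) in
    image pixel coordinates, or None if the two do not intersect."""
    dy, dx = shift
    y0, x0 = max(dy, 0), max(dx, 0)
    y1 = min(dy + labels_shape[0], image_shape[0])
    x1 = min(dx + labels_shape[1], image_shape[1])
    if y0 >= y1 or x0 >= x1:
        return None
    return y0, y1, x0, x1


shift = integer_translation([[1, 0, 10], [0, 1, -5], [0, 0, 1]])
print(shift, overlap_rectangle((100, 100), (50, 50), shift))
```

A `None` from the first function corresponds to the non-pixel-aligned case, which per the commit message either raises (align_mode='strict') or goes through rasterization (align_mode='rasterize').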

Tests: 39 in test_tiling.py (was 30; new coverage for the lazy/multiscale
helpers + verify_coverage edge cases), 35 in test_calculate_image_features
including a TestPR982Concerns class with one regression test per open note.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* compute_cell_info_tiled: replace per-id np.where with scipy.ndimage.find_objects
  and np.bincount sums. One vectorized pass per tile instead of O(n_cells) scans.
* _zero_non_owned: replace per-id rewrite loop with np.isin + np.where.
* _classify_dropped_cells: drop the full-array .values + per-cell np.where; use
  compute_cell_info_tiled bboxes for inside/partial/outside classification, so
  the full label array is no longer materialized.
* CellInfo: add bbox_y0/bbox_x0 fields so callers can do bbox math without
  reconstructing from the centroid (which is area-weighted, not bbox-centered).
* _relabel_contiguous: replaced by skimage.segmentation.relabel_sequential.
* _align_to_image_grid: flatten nested if/else with elif chain; extract
  _rasterize_to_image_grid so the shapes-key path and the align_mode='rasterize'
  path no longer duplicate the rasterize call.
* DropReport: empty_tile_drop -> empty_tiles (the counter increments per tile).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
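
The remark that the centroid is area-weighted rather than bbox-centered is easy to check on a non-convex cell. The sketch below is illustrative only; `centroid_and_bbox` is a hypothetical helper and its return layout is not the PR's `CellInfo`.

```python
import numpy as np


def centroid_and_bbox(mask: np.ndarray):
    """Return the area-weighted centroid, the bbox center, and the bbox origin (y0, x0)."""
    ys, xs = np.nonzero(mask)
    centroid = (float(ys.mean()), float(xs.mean()))
    y0, y1 = int(ys.min()), int(ys.max())
    x0, x1 = int(xs.min()), int(xs.max())
    bbox_center = ((y0 + y1) / 2, (x0 + x1) / 2)
    return centroid, bbox_center, (y0, x0)


# L-shaped cell: full left column plus full bottom row of a 4x4 grid (7 pixels).
mask = np.zeros((4, 4), dtype=bool)
mask[:, 0] = True
mask[3, :] = True

centroid, bbox_center, origin = centroid_and_bbox(mask)
print(centroid, bbox_center, origin)
```

The centroid is pulled toward the corner where most pixels sit, so the bbox origin cannot be recovered from the centroid and the cell's extent. Storing it explicitly avoids that reconstruction.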