Add a logger by Intron7 · Pull Request #680 · scverse/rapids-singlecell

Intron7 · 2026-05-28T16:26:10Z

This adds a scanpy style logger to rsc.

Intron7 · 2026-05-28T16:26:34Z

coderabbitai · 2026-05-28T16:26:40Z

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

coderabbitai · 2026-05-28T16:34:06Z

📝 Walkthrough

Summary by CodeRabbit

New Features
- Logging infrastructure with five configurable verbosity levels for operation transparency
- Settings system enabling package-level configuration and verbosity control
- Progress tracking with elapsed timing information across preprocessing and analysis functions

Walkthrough

This PR introduces a scverse-style logging system for rapids-singlecell with Pydantic settings-based verbosity control. It adds a new logging.py module with custom HINT log level, timestamp-aware message formatting, and five exported convenience functions, along with a Verbosity enum and Settings model for configuration. The logging infrastructure is then instrumented throughout preprocessing and tools workflows with start/finish markers and optional metadata, and comprehensive tests validate the system's behavior.

Changes

Logging infrastructure and instrumentation

Layer / File(s)	Summary
Core logging module and public API `src/rapids_singlecell/logging.py`	Custom logger with HINT log level (between INFO and DEBUG), `_RootLogger` override capturing `now()` and computing `time_passed`, `_LogFormatter` that injects level-specific prefixes (INFO/HINT/DEBUG), handles `{time_passed}` substitution, appends `deep` metadata, and selects output stream (stdout in IPython, stderr otherwise). Exports five functions—`error`, `warning`, `info`, `hint`, `debug`—each returning a timestamp and accepting optional `time`, `deep`, and `extra` parameters.
Verbosity enum and settings configuration `src/rapids_singlecell/_settings/verbosity.py`, `src/rapids_singlecell/_settings/__init__.py`	`Verbosity` IntEnum with five members (error through debug), a `level` property mapping to Python logging levels, and an `override(...)` context manager for temporary verbosity changes. Pydantic `_RscSettings` model coerces case-insensitive string verbosity names to enum values via `BeforeValidator`, validates post-construction to apply logging level by calling `rapids_singlecell.logging._set_log_level`, and exports a module-level `settings` instance.
Package-level API wiring `src/rapids_singlecell/__init__.py`, `src/rapids_singlecell/decoupler_gpu/_helper/_log.py`	Top-level `__init__.py` imports and exports `logging` module, `Verbosity`, and `settings` via explicit `__all__`; removes prior `cuml.internals.logger.set_level(2)` side-effect. `decoupler_gpu._log` helper now routes output through `rapids_singlecell.logging` (`logg.warning` for "warn" level, `logg.info` otherwise).
Logging in preprocessing functions `src/rapids_singlecell/preprocessing/_*`	Instrumented normalize_total, log1p, pca, hvg, neighbors, bbknn, regress_out, scale, harmony_integrate, scrublet/core, and filter_genes/cells/highly_variable with start-timestamp logging, finish-with-elapsed-time logging, optional debug messages (e.g., zero_center disabled, categorical regression), optional warnings (e.g., n_neighbors capped), and optional hint blocks listing output fields added to `adata`.
Logging in tools functions `src/rapids_singlecell/tools/_*`	Instrumented leiden, louvain, rank_genes_groups, tsne, and umap with start-timestamp logging, finish-with-elapsed-time logging, debug messages for parameters/groups, optional warnings (e.g., UMAP neighbor-graph method validation), and hint/info blocks describing output keys and `uns` fields written.
Comprehensive logging and settings tests `tests/test_logging.py`	Test suite with 11 test functions validating logging defaults and handler installation, log record emission and numeric level mappings, formatted output strings by level, hint "deep" message rendering, elapsed-time tracking (using `monkeypatch` to control timestamps), verbosity propagation to cuml logger, context-manager-based temporary verbosity overrides, string-to-enum coercion, validation errors on invalid names, and public re-exports. Two autouse fixtures attach caplog handler and reset verbosity state after each test.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

scverse/rapids-singlecell#625: Both modify src/rapids_singlecell/preprocessing/_harmony_integrate.py — this PR adds logging, the other updates the Harmony2 function signature and parameter forwarding.

Suggested reviewers

Zethson

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 56.60% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title 'Add a logger' accurately and concisely describes the main change, which is the introduction of a Scanpy-style logging system to the rapids_singlecell package.
Description check	✅ Passed	The description 'This adds a scanpy style logger to rsc' directly relates to the changeset and accurately characterizes the primary purpose of the pull request.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch add-a-logger

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/rapids_singlecell/logging.py`:
- Around line 104-116: The code currently uses truthiness checks on optional
formatter fields which skips valid falsy values (e.g., timedelta(0)); update the
conditionals to explicit None checks: replace "if record.time_passed:" with "if
record.time_passed is not None:" and keep the microseconds branch as an explicit
check (e.g., "if record.time_passed.microseconds != 0:" or similar) so
zero-duration is handled, and replace "if record.deep:" with "if record.deep is
not None:"; ensure the logic inside (replacing "{time_passed}" in record.msg,
adjusting self._style._fmt, and appending ": {deep}" behavior) remains unchanged
apart from these None checks.

In `@src/rapids_singlecell/preprocessing/_neighbors/__init__.py`:
- Around line 190-193: The auto-adjustment block that sets n_neighbors when
n_neighbors > adata.shape[0] needs an explicit guard for empty AnnData: check if
adata.shape[0] == 0 before adjusting and raise a clear ValueError (or log and
exit) instead of proceeding to set n_neighbors to 1; update the code around the
n_neighbors adjustment (the block that references adata.shape and n_neighbors in
src/rapids_singlecell/preprocessing/_neighbors/__init__.py) so empty input is
detected early and handled with an informative error message to avoid later KNN
failures.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: ce41bad9-ff8d-4e9c-ac3c-2dbc9a4ddc78

📥 Commits

Reviewing files that changed from the base of the PR and between 55b331f and 1850beb.

📒 Files selected for processing (20)

src/rapids_singlecell/__init__.py
src/rapids_singlecell/_settings/__init__.py
src/rapids_singlecell/_settings/verbosity.py
src/rapids_singlecell/decoupler_gpu/_helper/_log.py
src/rapids_singlecell/logging.py
src/rapids_singlecell/preprocessing/_harmony_integrate.py
src/rapids_singlecell/preprocessing/_hvg/__init__.py
src/rapids_singlecell/preprocessing/_neighbors/__init__.py
src/rapids_singlecell/preprocessing/_normalize.py
src/rapids_singlecell/preprocessing/_pca.py
src/rapids_singlecell/preprocessing/_regress_out.py
src/rapids_singlecell/preprocessing/_scale.py
src/rapids_singlecell/preprocessing/_scrublet/__init__.py
src/rapids_singlecell/preprocessing/_scrublet/core.py
src/rapids_singlecell/preprocessing/_simple.py
src/rapids_singlecell/tools/_clustering.py
src/rapids_singlecell/tools/_rank_genes_groups/__init__.py
src/rapids_singlecell/tools/_tsne.py
src/rapids_singlecell/tools/_umap.py
tests/test_logging.py

coderabbitai · 2026-05-28T16:34:09Z

+        if record.time_passed:
+            if record.time_passed.microseconds:
+                record.time_passed = timedelta(
+                    seconds=int(record.time_passed.total_seconds())
+                )
+            if "{time_passed}" in record.msg:
+                record.msg = record.msg.replace(
+                    "{time_passed}", str(record.time_passed)
+                )
+            else:
+                self._style._fmt += " ({time_passed})"
+        if record.deep:
+            record.msg = f"{record.msg}: {record.deep}"


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Use explicit None checks for optional formatter fields.

Please avoid truthiness checks on optional values here. This also fixes the zero-duration edge case (timedelta(0)) being skipped.

Suggested patch

- if record.time_passed: + if record.time_passed is not None: if record.time_passed.microseconds: record.time_passed = timedelta( seconds=int(record.time_passed.total_seconds()) ) if "{time_passed}" in record.msg: record.msg = record.msg.replace( "{time_passed}", str(record.time_passed) ) else: self._style._fmt += " ({time_passed})" - if record.deep: + if record.deep is not None: record.msg = f"{record.msg}: {record.deep}"

As per coding guidelines, `**/*.py`: Use `is None` and `is not None` for optional parameters instead of truthiness checks.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

if record.time_passed:

if record.time_passed.microseconds:

record.time_passed = timedelta(

seconds=int(record.time_passed.total_seconds())

)

if "{time_passed}" in record.msg:

record.msg = record.msg.replace(

"{time_passed}", str(record.time_passed)

)

else:

self._style._fmt += " ({time_passed})"

if record.deep:

record.msg = f"{record.msg}: {record.deep}"

if record.time_passed is not None:

if record.time_passed.microseconds:

record.time_passed = timedelta(

seconds=int(record.time_passed.total_seconds())

)

if "{time_passed}" in record.msg:

record.msg = record.msg.replace(

"{time_passed}", str(record.time_passed)

)

else:

self._style._fmt += " ({time_passed})"

if record.deep is not None:

record.msg = f"{record.msg}: {record.deep}"

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/rapids_singlecell/logging.py` around lines 104 - 116, The code currently uses truthiness checks on optional formatter fields which skips valid falsy values (e.g., timedelta(0)); update the conditionals to explicit None checks: replace "if record.time_passed:" with "if record.time_passed is not None:" and keep the microseconds branch as an explicit check (e.g., "if record.time_passed.microseconds != 0:" or similar) so zero-duration is handled, and replace "if record.deep:" with "if record.deep is not None:"; ensure the logic inside (replacing "{time_passed}" in record.msg, adjusting self._style._fmt, and appending ": {deep}" behavior) remains unchanged apart from these None checks.

coderabbitai · 2026-05-28T16:34:09Z

+    if n_neighbors > adata.shape[0]:
+        n_neighbors = 1 + int(0.5 * adata.shape[0])
+        logg.warning(f"n_obs too small: adjusting to `n_neighbors = {n_neighbors}`")
+


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Guard empty AnnData before neighbor auto-adjustment.

When adata.shape[0] == 0, the new adjustment produces n_neighbors = 1, which is still invalid and fails later in KNN with a less direct error. Add an explicit early check for empty input.

💡 Proposed fix

+ if adata.shape[0] == 0: + raise ValueError("Cannot compute neighbors on empty AnnData (`n_obs == 0`).") + if n_neighbors > adata.shape[0]: n_neighbors = 1 + int(0.5 * adata.shape[0]) logg.warning(f"n_obs too small: adjusting to `n_neighbors = {n_neighbors}`")

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

if n_neighbors > adata.shape[0]:

n_neighbors = 1 + int(0.5 * adata.shape[0])

logg.warning(f"n_obs too small: adjusting to `n_neighbors = {n_neighbors}`")

if adata.shape[0] == 0:

raise ValueError("Cannot compute neighbors on empty AnnData (`n_obs == 0`).")

if n_neighbors > adata.shape[0]:

n_neighbors = 1 + int(0.5 * adata.shape[0])

logg.warning(f"n_obs too small: adjusting to `n_neighbors = {n_neighbors}`")

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/rapids_singlecell/preprocessing/_neighbors/__init__.py` around lines 190 - 193, The auto-adjustment block that sets n_neighbors when n_neighbors > adata.shape[0] needs an explicit guard for empty AnnData: check if adata.shape[0] == 0 before adjusting and raise a clear ValueError (or log and exit) instead of proceeding to set n_neighbors to 1; update the code around the n_neighbors adjustment (the block that references adata.shape and n_neighbors in src/rapids_singlecell/preprocessing/_neighbors/__init__.py) so empty input is detected early and handled with an informative error message to avoid later KNN failures.

codecov-commenter · 2026-05-28T16:36:49Z

⚠️ JUnit XML file not found

The CLI was unable to find any JUnit XML files to upload.
For more help, visit our troubleshooting guide.

Intron7 added 2 commits May 28, 2026 18:20

add a logger

bf7bf73

add logging tests

1850beb

Intron7 added the run-gpu-ci label May 28, 2026

github-actions Bot removed the run-gpu-ci label May 28, 2026

coderabbitai Bot reviewed May 28, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add a logger#680

Add a logger#680
Intron7 wants to merge 2 commits into
mainfrom
add-a-logger

Intron7 commented May 28, 2026

Uh oh!

Intron7 commented May 28, 2026

Uh oh!

coderabbitai Bot commented May 28, 2026

Uh oh!

coderabbitai Bot commented May 28, 2026

Summary by CodeRabbit

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Suggested reviewers

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Uh oh!

coderabbitai Bot May 28, 2026

Uh oh!

coderabbitai Bot May 28, 2026

Uh oh!

codecov-commenter commented May 28, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

Intron7 commented May 28, 2026

Uh oh!

Intron7 commented May 28, 2026

Uh oh!

coderabbitai Bot commented May 28, 2026

Uh oh!

coderabbitai Bot commented May 28, 2026

Summary by CodeRabbit

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Suggested reviewers

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 28, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 28, 2026

Choose a reason for hiding this comment

Uh oh!

codecov-commenter commented May 28, 2026

⚠️ JUnit XML file not found

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants