sequential-parameter-optimization · bartzbeielstein · Jun 7, 2026
@@ -1,3 +1,15 @@
+## [18.1.0-rc.1](https://github.com/sequential-parameter-optimization/spotforecast2-safe/compare/v18.0.1...v18.1.0-rc.1) (2026-06-07)
+
+
+### Features
+
+* **preprocessing:** add deviation rule to target-corruption detector ([95d45d2](https://github.com/sequential-parameter-optimization/spotforecast2-safe/commit/95d45d2ee3a6f53b3f17716987f4f32579105897))
+
+
+### Documentation
+
+* add live {python} Examples to all public symbols missing them ([5fac4ca](https://github.com/sequential-parameter-optimization/spotforecast2-safe/commit/5fac4cada521448aa51413d575013a90abf478f0))
+
 ## [18.0.1](https://github.com/sequential-parameter-optimization/spotforecast2-safe/compare/v18.0.0...v18.0.1) (2026-06-07)
 
 

@@ -7,7 +7,7 @@ This card describes what spotforecast2-safe is, how to use it safely, the condit
 | Field | Value |
 | --- | --- |
 | Name | spotforecast2-safe |
-| Version | 18.0.1 |
+| Version | 18.1.0-rc.1 |
 | Type | Deterministic Python library for time series feature engineering and recursive multi-step forecasting. It performs no training of its own. |
 | Developed by | Thomas Bartz-Beielstein, ORCID [0000-0002-5938-5158](https://orcid.org/0000-0002-5938-5158) |
 | Distributed by | the `sequential-parameter-optimization` GitHub organization |
@@ -18,7 +18,7 @@ This card describes what spotforecast2-safe is, how to use it safely, the condit
 
 The library depends only on numpy, pandas, scikit-learn, lightgbm, numba, pyarrow, requests, feature-engine, holidays, astral, and tqdm. It deliberately excludes plotly, matplotlib, spotoptim, optuna, torch, and tensorflow, so no plotting or automated-tuning code ships in this package.
 
-Two Common Platform Enumeration (CPE) identifiers let vulnerability-tracking and software bill of materials (SBOM) tools recognize the package. The wildcard identifier `cpe:2.3:a:sequential_parameter_optimization:spotforecast2_safe:*:*:*:*:*:*:*:*` matches any release; the current release is `cpe:2.3:a:sequential_parameter_optimization:spotforecast2_safe:18.0.1:*:*:*:*:*:*:*`.
+Two Common Platform Enumeration (CPE) identifiers let vulnerability-tracking and software bill of materials (SBOM) tools recognize the package. The wildcard identifier `cpe:2.3:a:sequential_parameter_optimization:spotforecast2_safe:*:*:*:*:*:*:*:*` matches any release; the current release is `cpe:2.3:a:sequential_parameter_optimization:spotforecast2_safe:18.1.0-rc.1:*:*:*:*:*:*:*`.
 
 The library itself is a low-risk component: it is deterministic, its source is fully inspectable, and it fails safe on invalid input. It is built to support high-risk AI systems in the sense of the EU AI Act, but it is not itself such a system. When it is embedded in a high-risk deployment, the duties that attach to that system fall on the integrator, not on the library.
 
@@ -30,7 +30,7 @@ Responsibilities are divided as follows.
 | Distribution | sequential-parameter-optimization on GitHub | repository issue tracker |
 | Deployment, operation, and audit | the system integrator | defined per deployment |
 
-The current release is 18.0.1, with a stable public interface pinned in `spotforecast2_safe.__init__.__all__`. The full version history, including release dates, is recorded in `CHANGELOG.md` and on the GitHub Releases page; it is maintained automatically by the release pipeline and is not repeated here.
+The current release is 18.1.0-rc.1, with a stable public interface pinned in `spotforecast2_safe.__init__.__all__`. The full version history, including release dates, is recorded in `CHANGELOG.md` and on the GitHub Releases page; it is maintained automatically by the release pipeline and is not repeated here.
 
 ## 2. Intended Use and Scope
 
@@ -216,7 +216,7 @@ Maintainer: Thomas Bartz-Beielstein, ORCID [0000-0002-5938-5158](https://orcid.o
 }
 ```
 
-Or as a formatted reference: Bartz-Beielstein, T. (2026). *spotforecast2-safe: Safety-critical subset of spotforecast2* (Version 18.0.1) [Computer software]. https://github.com/sequential-parameter-optimization/spotforecast2-safe
+Or as a formatted reference: Bartz-Beielstein, T. (2026). *spotforecast2-safe: Safety-critical subset of spotforecast2* (Version 18.1.0-rc.1) [Computer software]. https://github.com/sequential-parameter-optimization/spotforecast2-safe
 
 The technical report (`bart26h/index.qmd`) is the long-form reference for design rationale, compliance mapping, and evaluation protocol.
 

@@ -1,10 +1,10 @@
 {
-  "hash": "7dd9705960361ff99aefd892719fe3e3",
+  "hash": "4c054038f0742ab143305463c939dea7",
   "result": {
     "engine": "jupyter",
-    "markdown": "---\ntitle: preprocessing.target_corruption.detect_target_corruption\n---\n\n\n\n```python\npreprocessing.target_corruption.detect_target_corruption(\n    df,\n    *,\n    targets,\n    range_mw,\n    step_mw,\n    window_days,\n)\n```\n\nDetect physically-impossible target-column corruption in the native frame.\n\nApplies two independent rules on the native-cadence (e.g. 15-min) series\nwithin a rolling look-back window ending at the last observed target\ntimestamp:\n\n- **Range rule** (sub-hourly cadence only): an hour is flagged when\n  ``intra-hour max - intra-hour min > range_mw`` for any target column.\n  Vacuously skipped for hourly-or-coarser cadence (intra-hour range is\n  undefined on a single slot per hour).\n- **Step rule**: an hour is flagged when any ``|adjacent-slot diff|``\n  that *touches* that hour exceeds ``step_mw`` for any target column.\n  Applies to all cadences.\n\nFlags are OR-ed across target columns.  ALL native-cadence slots of a\nflagged calendar hour are marked ``True`` in the returned boolean\n``Series``, so downstream NaN-ing operates on full hours rather than\nindividual sub-hourly slots.\n\nThe detector is **inert** (returns all-``False``) unless ``window_days``\nis set AND at least one of ``range_mw`` / ``step_mw`` is set.  If the\ndata is shorter than ``window_days``, the window is clamped to\n``df.index.min()`` without raising.\n\n## Parameters {.doc-section .doc-section-parameters}\n\n| Name        | Type                                              | Description                                                                                                  | Default    |\n|-------------|---------------------------------------------------|--------------------------------------------------------------------------------------------------------------|------------|\n| df          | [pd](`pandas`).[DataFrame](`pandas.DataFrame`)    | Native-cadence ``DataFrame`` indexed by a ``DatetimeIndex``. 
Must contain all columns listed in ``targets``. | _required_ |\n| targets     | [Sequence](`typing.Sequence`)\\[[str](`str`)\\]     | Sequence of target column names to inspect.                                                                  | _required_ |\n| range_mw    | [Optional](`typing.Optional`)\\[[float](`float`)\\] | Maximum allowed intra-hour range (MW).  ``None`` skips the range rule.                                       | _required_ |\n| step_mw     | [Optional](`typing.Optional`)\\[[float](`float`)\\] | Maximum allowed absolute adjacent-slot difference (MW). ``None`` skips the step rule.                        | _required_ |\n| window_days | [Optional](`typing.Optional`)\\[[int](`int`)\\]     | Number of days before the last observed target to include in the scan.  ``None`` makes the detector inert.   | _required_ |\n\n## Returns {.doc-section .doc-section-returns}\n\n| Name   | Type                                     | Description                                                        |\n|--------|------------------------------------------|--------------------------------------------------------------------|\n|        | [pd](`pandas`).[Series](`pandas.Series`) | Boolean ``pd.Series`` aligned to ``df.index``.  ``True`` means the |\n|        | [pd](`pandas`).[Series](`pandas.Series`) | slot belongs to a flagged calendar hour.  All-``False`` when the   |\n|        | [pd](`pandas`).[Series](`pandas.Series`) | detector is inert or no corruption is found.                       
|\n\n## Examples {.doc-section .doc-section-examples}\n\n\n::: {#851d62c6 .cell execution_count=1}\n``` {.python .cell-code}\nimport pandas as pd\nimport numpy as np\nfrom spotforecast2_safe.preprocessing.target_corruption import (\n    detect_target_corruption,\n)\n\n# 15-min cadence; one GW dropout at 12:15 inside the window\nidx = pd.date_range(\"2026-06-03\", periods=48, freq=\"15min\", tz=\"UTC\")\nvals = [55_000.0] * 48\nvals[5] = 44_000.0          # 11 GW step drop  -> flags 12:00 hour\ndf = pd.DataFrame({\"load\": vals}, index=idx)\n\nmask = detect_target_corruption(\n    df, targets=[\"load\"], range_mw=5_000, step_mw=8_000, window_days=3\n)\n# Slots in the 12:00 hour (index 4-7) are flagged\nassert mask.iloc[4:8].all(), \"Slots in the flagged hour must be True\"\nassert not mask.iloc[8:].any(), \"Subsequent clean slots must be False\"\nprint(\"flagged:\", mask.sum(), \"slots\")\n```\n\n::: {.cell-output .cell-output-stdout}\n```\nflagged: 4 slots\n```\n:::\n:::\n\n\n",
+    "markdown": "---\ntitle: preprocessing.target_corruption.detect_target_corruption\n---\n\n\n\n```python\npreprocessing.target_corruption.detect_target_corruption(\n    df,\n    *,\n    targets,\n    range_mw,\n    step_mw,\n    window_days,\n    deviation_mw=None,\n    deviation_ref=None,\n    deviation_slots=2,\n)\n```\n\nDetect physically-impossible target-column corruption in the native frame.\n\nApplies two independent rules on the native-cadence (e.g. 15-min) series\nwithin a rolling look-back window ending at the last observed target\ntimestamp:\n\n- **Range rule** (sub-hourly cadence only): an hour is flagged when\n  ``intra-hour max - intra-hour min > range_mw`` for any target column.\n  Vacuously skipped for hourly-or-coarser cadence (intra-hour range is\n  undefined on a single slot per hour).\n- **Step rule**: an hour is flagged when any ``|adjacent-slot diff|``\n  that *touches* that hour exceeds ``step_mw`` for any target column.\n  Applies to all cadences.\n- **Deviation rule** (dropout-only, all cadences): an hour is flagged\n  when ``target − reference < -deviation_mw`` holds for at least\n  ``deviation_slots`` *consecutive* native-cadence slots within the\n  scan window, where the reference is a published companion column\n  such as the ENTSO-E day-ahead ``\"Forecasted Load\"``.  The rule is\n  asymmetric by design: the known corruption class is exclusively a\n  dropout *below* the day-ahead forecast, while actuals above the\n  forecast are ordinary under-forecasting.  ``NaN`` in either column\n  yields a ``NaN`` difference, which compares ``False`` — so the\n  publication-lag frontier (forecast published, actual not yet) never\n  flags, and a data gap breaks a consecutive run.  On hourly-or-coarser\n  cadence the sustained requirement collapses to a single slot.  The\n  rule is silently skipped when ``deviation_ref`` is missing from the\n  frame (mirroring how absent target columns are skipped).\n\nFlags are OR-ed across target columns.  
ALL native-cadence slots of a\nflagged calendar hour are marked ``True`` in the returned boolean\n``Series``, so downstream NaN-ing operates on full hours rather than\nindividual sub-hourly slots.\n\nThe detector is **inert** (returns all-``False``) unless ``window_days``\nis set AND at least one of ``range_mw`` / ``step_mw`` / ``deviation_mw``\nis set.  If the data is shorter than ``window_days``, the window is\nclamped to ``df.index.min()`` without raising.\n\n## Parameters {.doc-section .doc-section-parameters}\n\n| Name            | Type                                              | Description                                                                                                                                                                                                                                                    | Default    |\n|-----------------|---------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------------|\n| df              | [pd](`pandas`).[DataFrame](`pandas.DataFrame`)    | Native-cadence ``DataFrame`` indexed by a ``DatetimeIndex``. Must contain all columns listed in ``targets``.                                                                                                                                                   | _required_ |\n| targets         | [Sequence](`typing.Sequence`)\\[[str](`str`)\\]     | Sequence of target column names to inspect.                                                                                                                                                                                                                    
| _required_ |\n| range_mw        | [Optional](`typing.Optional`)\\[[float](`float`)\\] | Maximum allowed intra-hour range (MW).  ``None`` skips the range rule.                                                                                                                                                                                         | _required_ |\n| step_mw         | [Optional](`typing.Optional`)\\[[float](`float`)\\] | Maximum allowed absolute adjacent-slot difference (MW). ``None`` skips the step rule.                                                                                                                                                                          | _required_ |\n| window_days     | [Optional](`typing.Optional`)\\[[int](`int`)\\]     | Number of days before the last observed target to include in the scan.  ``None`` makes the detector inert.                                                                                                                                                     | _required_ |\n| deviation_mw    | [Optional](`typing.Optional`)\\[[float](`float`)\\] | Maximum allowed dropout below the reference column (MW, positive magnitude): slots with ``target − reference < -deviation_mw`` are candidates. ``None`` skips the deviation rule.                                                                              | `None`     |\n| deviation_ref   | [Optional](`typing.Optional`)\\[[str](`str`)\\]     | Name of the reference column (e.g. ``\"Forecasted Load\"``).  The rule is skipped when ``None`` or when the column is absent from ``df``.  The reference column itself is never checked as a target by this rule.                                                
| `None`     |\n| deviation_slots | [int](`int`)                                      | Minimum number of *consecutive* sub-hourly slots the dropout must sustain before any hour is flagged (default ``2`` — a single-slot blip is more likely a metering glitch than the oscillating dropout class).  Clamped to ``1`` on hourly-or-coarser cadence. | `2`        |\n\n## Returns {.doc-section .doc-section-returns}\n\n| Name   | Type                                     | Description                                                        |\n|--------|------------------------------------------|--------------------------------------------------------------------|\n|        | [pd](`pandas`).[Series](`pandas.Series`) | Boolean ``pd.Series`` aligned to ``df.index``.  ``True`` means the |\n|        | [pd](`pandas`).[Series](`pandas.Series`) | slot belongs to a flagged calendar hour.  All-``False`` when the   |\n|        | [pd](`pandas`).[Series](`pandas.Series`) | detector is inert or no corruption is found.                       
|\n\n## Examples {.doc-section .doc-section-examples}\n\n\n::: {#ed42fb8e .cell execution_count=1}\n``` {.python .cell-code}\nimport pandas as pd\nimport numpy as np\nfrom spotforecast2_safe.preprocessing.target_corruption import (\n    detect_target_corruption,\n)\n\n# 15-min cadence; one 11 GW dropout at 01:15 inside the window\nidx = pd.date_range(\"2026-06-03\", periods=48, freq=\"15min\", tz=\"UTC\")\nvals = [55_000.0] * 48\nvals[5] = 44_000.0          # 11 GW step drop  -> flags 01:00 hour\ndf = pd.DataFrame({\"load\": vals}, index=idx)\n\nmask = detect_target_corruption(\n    df, targets=[\"load\"], range_mw=5_000, step_mw=8_000, window_days=3\n)\n# Slots in the 01:00 hour (index 4-7) are flagged\nassert mask.iloc[4:8].all(), \"Slots in the flagged hour must be True\"\nassert not mask.iloc[8:].any(), \"Subsequent clean slots must be False\"\nprint(\"flagged:\", mask.sum(), \"slots\")\n```\n\n::: {.cell-output .cell-output-stdout}\n```\nflagged: 4 slots\n```\n:::\n:::\n\n\n::: {#643afacd .cell execution_count=2}\n``` {.python .cell-code}\n# Deviation rule: a sub-threshold dropout the dynamics rules miss.\nimport pandas as pd\nimport numpy as np\nfrom spotforecast2_safe.preprocessing.target_corruption import (\n    detect_target_corruption,\n)\n\nidx = pd.date_range(\"2026-06-07\", periods=16, freq=\"15min\", tz=\"UTC\")\nforecast = pd.Series(48_000.0, index=idx)\nactual = forecast.copy()\n# Two consecutive slots 11.6 GW below the forecast, stepping by\n# only 5.8 GW per slot (below a 6 GW step rule, no range breach).\nactual.iloc[4] = forecast.iloc[4] - 5_800.0\nactual.iloc[5] = forecast.iloc[5] - 11_600.0\nactual.iloc[6] = forecast.iloc[6] - 11_600.0\nactual.iloc[7] = forecast.iloc[7] - 5_800.0\n# Publication-lag frontier: forecast published, actual not yet.\nactual.iloc[12:] = np.nan\ndf = pd.DataFrame({\"Actual Load\": actual, \"Forecasted Load\": forecast})\n\ndyn_only = detect_target_corruption(\n    df, targets=[\"Actual Load\"],\n    range_mw=15_000, 
step_mw=6_000, window_days=3,\n)\nwith_dev = detect_target_corruption(\n    df, targets=[\"Actual Load\"],\n    range_mw=15_000, step_mw=6_000, window_days=3,\n    deviation_mw=8_000, deviation_ref=\"Forecasted Load\",\n)\nassert not dyn_only.any(), \"dynamics rules miss the dropout\"\nassert with_dev.iloc[4:8].any(), \"deviation rule catches it\"\nassert not with_dev.iloc[12:].any(), \"NaN frontier never flags\"\nprint(\"dynamics-only:\", int(dyn_only.sum()), \"| with deviation:\",\n      int(with_dev.sum()))\n```\n\n::: {.cell-output .cell-output-stdout}\n```\ndynamics-only: 0 | with deviation: 4\n```\n:::\n:::\n\n\n",
     "supporting": [
-      "preprocessing.target_corruption.detect_target_corruption_files"
+      "preprocessing.target_corruption.detect_target_corruption_files/figure-html"
     ],
     "filters": [],
     "includes": {}

@@ -66,6 +66,9 @@ configurator.config_entsoe.ConfigEntsoe(
     target_corruption_policy='abort',
     target_max_heal_hours=0,
     target_anchor_zone_hours=168,
+    target_qc_deviation_mw=None,
+    target_qc_deviation_ref=None,
+    target_qc_deviation_slots=2,
 )
 ```
 

@@ -65,6 +65,9 @@ configurator.config_multi.ConfigMulti(
     target_corruption_policy='abort',
     target_max_heal_hours=0,
     target_anchor_zone_hours=168,
+    target_qc_deviation_mw=None,
+    target_qc_deviation_ref=None,
+    target_qc_deviation_slots=2,
 )
 ```
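
The three `target_qc_deviation_*` fields added above appear to map onto the `deviation_mw`, `deviation_ref`, and `deviation_slots` arguments of `detect_target_corruption`; that mapping is an assumption, since the diff does not show the wiring. As a rough illustration of the rule those arguments describe, the sketch below re-implements the documented behavior in plain pandas. The helper name `deviation_flags` is hypothetical and not part of the library, and the sketch ignores the look-back window and the clamping of `deviation_slots` on hourly-or-coarser cadence.

```python
import numpy as np
import pandas as pd


def deviation_flags(target, reference, deviation_mw, deviation_slots=2):
    """Hypothetical sketch of the documented deviation rule (not library code)."""
    # A slot is a candidate when the target sits more than deviation_mw
    # below the reference. NaN differences compare False, so gaps and the
    # publication-lag frontier never become candidates.
    below = (target - reference) < -deviation_mw

    # Give every run of equal consecutive values its own id, then look up
    # the length of the run each slot belongs to.
    run_id = below.astype(int).diff().ne(0).cumsum()
    run_len = run_id.map(run_id.value_counts())

    # Keep only candidates inside a run of at least deviation_slots slots.
    sustained = below & (run_len >= deviation_slots)

    # Expand each sustained slot to every slot of its calendar hour.
    hours = target.index.floor("h")
    flagged_hours = hours[sustained.to_numpy()].unique()
    return pd.Series(hours.isin(flagged_hours), index=target.index)


# Same synthetic data as the documented deviation example.
idx = pd.date_range("2026-06-07", periods=16, freq="15min", tz="UTC")
forecast = pd.Series(48_000.0, index=idx)
actual = forecast.copy()
actual.iloc[4] = forecast.iloc[4] - 5_800.0
actual.iloc[5] = forecast.iloc[5] - 11_600.0
actual.iloc[6] = forecast.iloc[6] - 11_600.0
actual.iloc[7] = forecast.iloc[7] - 5_800.0
actual.iloc[12:] = np.nan  # forecast published, actual not yet

mask = deviation_flags(actual, forecast, deviation_mw=8_000)

# A single-slot blip should not flag with the default deviation_slots=2.
blip = forecast.copy()
blip.iloc[5] = forecast.iloc[5] - 11_600.0
blip_mask = deviation_flags(blip, forecast, deviation_mw=8_000)

print("sustained dropout:", int(mask.sum()), "| single blip:", int(blip_mask.sum()))
```

On this data only slots 5 and 6 exceed the 8 GW threshold, but the whole 01:00 hour (slots 4 to 7) is marked, which matches the full-hour flagging the reference page describes.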