Merged
3 changes: 3 additions & 0 deletions _quarto.yml
@@ -739,8 +739,11 @@ quartodoc:
- calendar.holiday.get_holiday_features
- calendar.holiday.create_holiday_adjacency_df
- calendar.holiday.get_holiday_adjacency_features
- calendar.holiday.create_day_type_df
- calendar.holiday.get_day_type_features
- calendar.features.get_calendar_features
- calendar.features.get_day_night_features
- calendar.features.get_ephemeris_features

# ── Weather ───────────────────────────────────────────────────────────────
- title: "Weather"
71 changes: 71 additions & 0 deletions docs/reference/calendar.features.get_ephemeris_features.qmd
@@ -0,0 +1,71 @@
# calendar.features.get_ephemeris_features { #spotforecast2_safe.calendar.features.get_ephemeris_features }

```python
calendar.features.get_ephemeris_features(
    start,
    cov_end,
    location,
    freq='h',
    timezone='UTC',
)
```

Create continuous solar-geometry features from the ephemeris.

Unlike `get_day_night_features` (which rounds sunrise/sunset to whole
hours and emits a binary daylight flag), this builder exposes the
*continuous* solar geometry that the hour-of-day RBFs encode only
implicitly: the per-hour solar elevation, the exact daylight duration, and
the signed time relative to sunrise and sunset. These features linearise
lighting-load timing and the midday PV offset. They are fully determined by
the date and the fixed coordinates, add no dependency, and leak nothing for
any forecast hour (Xie & Hong 2018, ``xieh18a``; López 2020, ``lope20a``).

The returned DataFrame contains four ``float64`` columns:

- ``solar_elevation`` — solar elevation angle in degrees (negative at
night, peaking at solar noon).
- ``daylight_duration_h`` — exact sunset−sunrise span for the date, hours.
- ``hours_since_sunrise`` — signed hours since that date's sunrise
(negative before sunrise).
- ``hours_to_sunset`` — signed hours until that date's sunset
(negative after sunset).
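
The four columns can be illustrated with a stdlib-only approximation. This is a minimal sketch, not the function's implementation: it assumes local solar time, Cooper's declination formula, no atmospheric refraction, and no equation-of-time correction, and the helper name `approx_solar_features` is hypothetical.

```python
import math


def approx_solar_features(day_of_year, solar_hour, latitude_deg):
    """Approximate solar geometry in local solar time (illustrative only)."""
    lat = math.radians(latitude_deg)
    # Cooper's approximation of the solar declination, in degrees.
    decl_deg = 23.45 * math.sin(math.radians(360.0 / 365.0 * (284 + day_of_year)))
    decl = math.radians(decl_deg)
    # Hour angle: 15 degrees per hour away from solar noon.
    hour_angle = math.radians(15.0 * (solar_hour - 12.0))
    sin_elev = (
        math.sin(lat) * math.sin(decl)
        + math.cos(lat) * math.cos(decl) * math.cos(hour_angle)
    )
    elevation = math.degrees(math.asin(sin_elev))
    # Sunrise/sunset hour angle; clamped so polar day/night does not raise.
    cos_h0 = max(-1.0, min(1.0, -math.tan(lat) * math.tan(decl)))
    daylight_h = 2.0 * math.degrees(math.acos(cos_h0)) / 15.0
    sunrise = 12.0 - daylight_h / 2.0
    sunset = 12.0 + daylight_h / 2.0
    return {
        "solar_elevation": elevation,
        "daylight_duration_h": daylight_h,
        "hours_since_sunrise": solar_hour - sunrise,  # negative before sunrise
        "hours_to_sunset": sunset - solar_hour,       # negative after sunset
    }


# Day 173 is 21 June in a leap year; the latitude is Dortmund's.
rows = [approx_solar_features(173, hour, 51.5136) for hour in range(24)]
max_elevation = max(row["solar_elevation"] for row in rows)
```

Because the sketch ignores longitude and the time zone, its sunrise and sunset are symmetric around 12:00 solar time; an ephemeris-based result will differ by the usual clock-time offsets.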

## Parameters {.doc-section .doc-section-parameters}

| Name | Type | Description | Default |
|----------|-----------------------------------------------------------------------------------------|----------------------------------------------------------------------------------------|------------|
| start | [Union](`typing.Union`)\[[str](`str`), [pd](`pandas`).[Timestamp](`pandas.Timestamp`)\] | Start of the time range. String values are parsed with ``utc=True``. | _required_ |
| cov_end | [Union](`typing.Union`)\[[str](`str`), [pd](`pandas`).[Timestamp](`pandas.Timestamp`)\] | Inclusive end of the time range. String values are parsed with ``utc=True``. | _required_ |
| location | [LocationInfo](`astral.LocationInfo`) | `LocationInfo` describing the geographic location. | _required_ |
| freq | [str](`str`) | Pandas-compatible frequency string for the output index. Defaults to ``"h"`` (hourly). | `'h'` |
| timezone | [str](`str`) | Timezone label applied to the generated index. Defaults to ``"UTC"``. | `'UTC'` |

## Returns {.doc-section .doc-section-returns}

| Name | Type | Description |
|--------|------------------------------------------------|---------------------------------------------------------------------|
| | [pd](`pandas`).[DataFrame](`pandas.DataFrame`) | Columns ``solar_elevation``, ``daylight_duration_h``, ``hours_since_sunrise``, ``hours_to_sunset``; tz-aware `DatetimeIndex` with the requested ``freq``. |

## Examples {.doc-section .doc-section-examples}

```{python}
import pandas as pd
from astral import LocationInfo
from spotforecast2_safe.calendar import get_ephemeris_features

start = pd.Timestamp("2024-06-21", tz="UTC")
cov_end = pd.Timestamp("2024-06-21 23:00", tz="UTC")
location = LocationInfo(latitude=51.5136, longitude=7.4653, timezone="UTC")

feats = get_ephemeris_features(start, cov_end, location)
print("columns:", feats.columns.tolist())
print("shape:", feats.shape)
# Summer solstice: long day and a high midday sun in Dortmund.
print("max elevation:", round(feats["solar_elevation"].max(), 1))
assert feats.shape == (24, 4)
assert feats["solar_elevation"].max() > 50.0
assert feats["daylight_duration_h"].iloc[0] > 14.0
```
63 changes: 63 additions & 0 deletions docs/reference/calendar.holiday.create_day_type_df.qmd
@@ -0,0 +1,63 @@
# calendar.holiday.create_day_type_df { #spotforecast2_safe.calendar.holiday.create_day_type_df }

```python
calendar.holiday.create_day_type_df(
    start,
    end,
    tz='UTC',
    freq='h',
    country_code='DE',
    state='NW',
)
```

Create a day-type refinement of the public-holiday column.

Returns two integer columns derived purely from the weekday and the
public-holiday calendar (pure calendar arithmetic — known years ahead,
leakage-free):

- ``is_workday``: ``1`` when the day is Monday–Friday **and** not a public
holiday, else ``0``.
- ``day_type``: an integer class with public-holiday precedence —
``0`` working day, ``1`` Saturday (non-holiday), ``2`` Sunday
(non-holiday), ``3`` public holiday (any weekday). A public holiday that
falls on a weekend is still classed as ``3``.

These remove some of the worst single-day errors a plain holiday flag
leaves behind (Ziel 2018, ``ziel18a``).
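
The precedence rule can be sketched in plain Python. This is an illustration under stated assumptions, not the library code: the `HOLIDAYS` set is a hypothetical stand-in for the real public-holiday calendar, and the helper names `day_type` and `is_workday` are chosen only to mirror the column names.

```python
from datetime import date, timedelta

# Hypothetical holiday set for illustration; the real function derives
# holidays from country_code and state.
HOLIDAYS = {date(2024, 1, 1)}


def day_type(day, holidays):
    """Return 0 working day, 1 Saturday, 2 Sunday, 3 public holiday."""
    if day in holidays:  # the holiday check comes first, so it takes precedence
        return 3
    weekday = day.weekday()  # Monday = 0 ... Sunday = 6
    if weekday == 5:
        return 1
    if weekday == 6:
        return 2
    return 0


def is_workday(day, holidays):
    """Return 1 for Monday to Friday that is not a holiday, else 0."""
    return int(day_type(day, holidays) == 0)


days = [date(2024, 1, 1) + timedelta(days=i) for i in range(7)]
day_types = [day_type(d, HOLIDAYS) for d in days]
workdays = [is_workday(d, HOLIDAYS) for d in days]
```

Checking the holiday set before the weekday is what classes a holiday that falls on a weekend as ``3`` rather than ``1`` or ``2``.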

## Parameters {.doc-section .doc-section-parameters}

| Name | Type | Description | Default |
|--------------|----------------------------------------------------------------|------------------------------------------------------|------------|
| start | [str](`str`) \| [pd](`pandas`).[Timestamp](`pandas.Timestamp`) | Start date/datetime. | _required_ |
| end | [str](`str`) \| [pd](`pandas`).[Timestamp](`pandas.Timestamp`) | End date/datetime. | _required_ |
| tz | [str](`str`) | Timezone to use if not inferred from start/end. | `'UTC'` |
| freq | [str](`str`) | Frequency of the resulting DataFrame. | `'h'` |
| country_code | [str](`str`) | Country code for holidays (e.g. ``"DE"``, ``"US"``). | `'DE'` |
| state | [str](`str`) | State code for holidays (e.g. ``"NW"``, ``"CA"``). | `'NW'` |

## Returns {.doc-section .doc-section-returns}

| Name | Type | Description |
|--------|------------------------------------------------|----------------------------------------------------------------------|
| | [pd](`pandas`).[DataFrame](`pandas.DataFrame`) | Index covering ``[start, end]`` at *freq* with integer columns ``is_workday`` and ``day_type``; no NaNs. |

## Examples {.doc-section .doc-section-examples}

```{python}
import pandas as pd
from spotforecast2_safe.calendar import create_day_type_df

# 2024-01-01 Mon = New Year (holiday), 02 Tue = workday,
# 06 Sat, 07 Sun.
df = create_day_type_df("2024-01-01", "2024-01-07", freq="D")
print(df["is_workday"].tolist())
print(df["day_type"].tolist())
assert df.loc["2024-01-01", "day_type"] == 3 # holiday
assert df.loc["2024-01-02", "is_workday"] == 1
assert df.loc["2024-01-06", "day_type"] == 1 # Saturday
assert df.loc["2024-01-07", "day_type"] == 2 # Sunday
```
71 changes: 71 additions & 0 deletions docs/reference/calendar.holiday.get_day_type_features.qmd
@@ -0,0 +1,71 @@
# calendar.holiday.get_day_type_features { #spotforecast2_safe.calendar.holiday.get_day_type_features }

```python
calendar.holiday.get_day_type_features(
    data,
    start,
    cov_end,
    forecast_horizon,
    tz='UTC',
    freq='h',
    country_code='DE',
    state='NW',
)
```

Build day-type indicators and align them to a regular time grid.

Generates ``is_workday`` and ``day_type`` via `create_day_type_df()`,
validates temporal coverage with `curate_holidays()`, and reindexes onto
the full ``[start, cov_end]`` grid. Leading or trailing grid cells outside
the generated range would be filled with a constant default via
``fill_value``, but such a default is not safe (``is_workday=0`` is wrong
for a workday). The grid is therefore generated to fully cover the request,
so no fill is expected, and ``fill_value`` only guards the degenerate
empty-overlap case.
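
The alignment step can be pictured with a stdlib-only stand-in for the reindex. This is a rough sketch under stated assumptions: the helpers `hourly_grid` and `align_to_grid`, the hard-coded daily values, and the fill value of ``0`` are all hypothetical and are not taken from the library.

```python
from datetime import date, datetime, timedelta, timezone


def hourly_grid(start, end):
    """Return an inclusive hourly grid from start to end."""
    n_steps = (end - start) // timedelta(hours=1) + 1
    return [start + timedelta(hours=i) for i in range(n_steps)]


def align_to_grid(daily_values, grid, fill_value):
    """Look up each grid timestamp's date; use fill_value where it is missing."""
    return [daily_values.get(ts.date(), fill_value) for ts in grid]


n_data, forecast_horizon = 48, 24
start = datetime(2024, 1, 1, tzinfo=timezone.utc)
cov_end = start + timedelta(hours=n_data + forecast_horizon - 1)
grid = hourly_grid(start, cov_end)

# Hypothetical per-day day_type values that fully cover the grid.
daily_day_type = {date(2024, 1, 1): 3, date(2024, 1, 2): 0, date(2024, 1, 3): 0}
aligned = align_to_grid(daily_day_type, grid, fill_value=0)

# With full coverage, no grid cell falls back to the fill value.
n_filled = sum(ts.date() not in daily_day_type for ts in grid)
```

When the daily values cover every date on the grid, `n_filled` is zero, which is the situation the paragraph above describes as the expected one.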

## Parameters {.doc-section .doc-section-parameters}

| Name | Type | Description | Default |
|------------------|-----------------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------|------------|
| data | [pd](`pandas`).[DataFrame](`pandas.DataFrame`) | Reference time series DataFrame used for temporal coverage validation inside `curate_holidays()`. | _required_ |
| start | [Union](`typing.Union`)\[[str](`str`), [pd](`pandas`).[Timestamp](`pandas.Timestamp`)\] | Start timestamp. String values are parsed with ``utc=True``. | _required_ |
| cov_end | [Union](`typing.Union`)\[[str](`str`), [pd](`pandas`).[Timestamp](`pandas.Timestamp`)\] | Inclusive end timestamp (should cover the full forecast horizon). String values are parsed with ``utc=True``. | _required_ |
| forecast_horizon | [int](`int`) | Number of forecast steps ahead; passed to `curate_holidays()`. | _required_ |
| tz | [str](`str`) | Timezone applied to the generated index. Defaults to ``"UTC"``. | `'UTC'` |
| freq | [str](`str`) | Pandas-compatible frequency string. Defaults to ``"h"``. | `'h'` |
| country_code | [str](`str`) | ISO 3166-1 alpha-2 country code. Defaults to ``"DE"``. | `'DE'` |
| state | [str](`str`) | Sub-national state/region code. Defaults to ``"NW"``. | `'NW'` |

## Returns {.doc-section .doc-section-returns}

| Name | Type | Description |
|--------|------------------------------------------------|-------------------------------------------------------------------------|
| | [pd](`pandas`).[DataFrame](`pandas.DataFrame`) | Integer columns ``is_workday`` and ``day_type``; tz-aware `DatetimeIndex` with the requested *freq*. |

## Examples {.doc-section .doc-section-examples}

```{python}
import pandas as pd
from spotforecast2_safe.calendar import get_day_type_features

forecast_horizon = 24
n_data = 48
data = pd.DataFrame(
    {"load": range(n_data)},
    index=pd.date_range("2024-01-01", periods=n_data, freq="h", tz="UTC"),
)
start = data.index[0]
cov_end = start + pd.Timedelta(hours=(n_data + forecast_horizon - 1))

feats = get_day_type_features(
    data=data, start=start, cov_end=cov_end,
    forecast_horizon=forecast_horizon,
)
print("columns:", feats.columns.tolist())
print("shape:", feats.shape)
# 2024-01-01 is New Year (holiday) → day_type 3, not a workday.
assert feats.loc["2024-01-01 00:00:00+00:00", "day_type"] == 3
assert feats.loc["2024-01-01 00:00:00+00:00", "is_workday"] == 0
assert feats.shape == (n_data + forecast_horizon, 2)
```
2 changes: 2 additions & 0 deletions docs/reference/configurator.config_multi.ConfigMulti.qmd
@@ -33,6 +33,8 @@ configurator.config_multi.ConfigMulti(
include_apparent_temperature=False,
degree_hours_base_heating=15.0,
degree_hours_base_cooling=22.0,
include_ephemeris_features=False,
include_day_type_features=False,
poly_features_degree=1,
max_poly_features=10,
poly_mi_n_jobs=-1,
3 changes: 3 additions & 0 deletions docs/reference/index.qmd
@@ -233,8 +233,11 @@ construction.
| [calendar.holiday.get_holiday_features](calendar.holiday.get_holiday_features.qmd#spotforecast2_safe.calendar.holiday.get_holiday_features) | Build public-holiday indicators and align them to a regular time grid. |
| [calendar.holiday.create_holiday_adjacency_df](calendar.holiday.create_holiday_adjacency_df.qmd#spotforecast2_safe.calendar.holiday.create_holiday_adjacency_df) | Create a DataFrame with binary adjacency indicators for public holidays. |
| [calendar.holiday.get_holiday_adjacency_features](calendar.holiday.get_holiday_adjacency_features.qmd#spotforecast2_safe.calendar.holiday.get_holiday_adjacency_features) | Build holiday-adjacency indicators and align them to a regular time grid. |
| [calendar.holiday.create_day_type_df](calendar.holiday.create_day_type_df.qmd#spotforecast2_safe.calendar.holiday.create_day_type_df) | Create a day-type refinement of the public-holiday column. |
| [calendar.holiday.get_day_type_features](calendar.holiday.get_day_type_features.qmd#spotforecast2_safe.calendar.holiday.get_day_type_features) | Build day-type indicators and align them to a regular time grid. |
| [calendar.features.get_calendar_features](calendar.features.get_calendar_features.qmd#spotforecast2_safe.calendar.features.get_calendar_features) | Create calendar-based features for a contiguous time range. |
| [calendar.features.get_day_night_features](calendar.features.get_day_night_features.qmd#spotforecast2_safe.calendar.features.get_day_night_features) | Create day/night features using astronomical sunrise and sunset times. |
| [calendar.features.get_ephemeris_features](calendar.features.get_ephemeris_features.qmd#spotforecast2_safe.calendar.features.get_ephemeris_features) | Create continuous solar-geometry features from the ephemeris. |

## Weather

6 changes: 6 additions & 0 deletions src/spotforecast2_safe/calendar/__init__.py
@@ -13,19 +13,25 @@
from spotforecast2_safe.calendar.features import (
    get_calendar_features,
    get_day_night_features,
    get_ephemeris_features,
)
from spotforecast2_safe.calendar.holiday import (
    create_day_type_df,
    create_holiday_adjacency_df,
    create_holiday_df,
    get_day_type_features,
    get_holiday_adjacency_features,
    get_holiday_features,
)

__all__ = [
    "create_day_type_df",
    "create_holiday_adjacency_df",
    "create_holiday_df",
    "get_calendar_features",
    "get_day_night_features",
    "get_day_type_features",
    "get_ephemeris_features",
    "get_holiday_adjacency_features",
    "get_holiday_features",
]