-
Notifications
You must be signed in to change notification settings - Fork 0
Add conversion from Solcast to SolarFarmer TSV weather format #13
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from all commits
60b8b2d
40f110d
e534f0a
7a63b64
047a837
6ebfaa4
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -60,6 +60,43 @@ | |
| ) | ||
| df.index.name = "DateTime" | ||
| df.to_csv("weather.tsv", sep="\\t") | ||
|
|
||
| Solcast Column Mapping | ||
| ~~~~~~~~~~~~~~~~~~~~~~ | ||
| When converting a Solcast DataFrame to SolarFarmer TSV format, the following | ||
| column mapping and unit conversions are applied automatically by | ||
| :func:`from_solcast`. | ||
|
|
||
| Solcast ``period_end`` timestamps are shifted to period-beginning by | ||
| subtracting the inferred time resolution (e.g. −30 min for 30-min data). | ||
| ``precipitable_water`` is in kg/m² (equivalent to mm) and is divided by 10 | ||
| to obtain cm as required by SolarFarmer. ``surface_pressure`` is already in | ||
| hPa which equals mbar, so no pressure conversion is needed. | ||
|
|
||
| Only columns that are present in the DataFrame are mapped; ``period_end``, | ||
| ``air_temp``, and ``ghi`` are the most commonly available columns but the | ||
| others are all optional. The ``gti`` (plane-of-array irradiance) column is | ||
| not mapped; SolarFarmer derives POA irradiance internally from GHI/DHI. | ||
|
|
||
| ======================= =========== ===================================== | ||
| Solcast column SF column Unit conversion | ||
| ======================= =========== ===================================== | ||
| ``ghi`` ``GHI`` W/m² → W/m² (none) | ||
| ``dhi`` ``DHI`` W/m² → W/m² (none) | ||
| ``air_temp`` ``TAmb`` °C → °C (none) | ||
| ``wind_speed_10m`` ``WS`` m/s → m/s (none) | ||
| ``surface_pressure`` ``Pressure`` hPa → mbar (hPa = mbar, none) | ||
| ``precipitable_water`` ``Water`` kg/m² → cm (÷ 10) | ||
| ``relative_humidity`` ``RH`` % → % (none) | ||
| ``albedo`` ``Albedo`` fraction → fraction (none) | ||
| ``hsu_loss_fraction`` ``Soiling`` fraction → fraction (none) | ||
| ``kimber_loss_fraction`` ``Soiling`` fraction → fraction (none) | ||
| ``soiling`` ``Soiling`` fraction → fraction (none) | ||
| ======================= =========== ===================================== | ||
|
|
||
| .. note:: | ||
| Only one soiling column should be present in the DataFrame. If multiple | ||
| are provided, the last one wins after column renaming. | ||
| """ | ||
|
|
||
| from __future__ import annotations | ||
|
|
@@ -78,6 +115,8 @@ | |
| "check_sequential_year_timestamps", | ||
| "from_dataframe", | ||
| "from_pvlib", | ||
| "from_solcast", | ||
| "shift_period_end_to_beginning", | ||
| ] | ||
|
|
||
|
|
||
|
|
@@ -130,6 +169,20 @@ def check_sequential_year_timestamps(file_path: str | pathlib.Path) -> None: | |
| "pressure": "Pressure", | ||
| } | ||
|
|
||
| SOLCAST_COLUMN_MAP: dict[str, str] = { | ||
| "ghi": "GHI", | ||
| "dhi": "DHI", | ||
| "air_temp": "TAmb", | ||
| "wind_speed_10m": "WS", | ||
| "surface_pressure": "Pressure", | ||
| "precipitable_water": "Water", | ||
| "relative_humidity": "RH", | ||
| "albedo": "Albedo", | ||
| "hsu_loss_fraction": "Soiling", | ||
| "kimber_loss_fraction": "Soiling", | ||
| "soiling": "Soiling", | ||
| } | ||
|
|
||
|
|
||
| def from_dataframe( | ||
| df: pd.DataFrame, | ||
|
|
@@ -241,6 +294,131 @@ def from_pvlib( | |
| ) | ||
|
|
||
|
|
||
| def from_solcast( | ||
| df: pd.DataFrame, | ||
| output_path: str | pathlib.Path, | ||
| ) -> pathlib.Path: | ||
| """Convert a Solcast DataFrame to a SolarFarmer TSV weather file. | ||
|
|
||
| Wrapper around :func:`from_dataframe` with the standard Solcast column | ||
| mapping. Two automatic conversions are applied before writing: | ||
|
|
||
| * **Timestamp shift**: Solcast timestamps represent ``period_end``; | ||
| SolarFarmer expects ``period_beginning``. The time resolution is | ||
| inferred from the minimum consecutive time difference and subtracted | ||
| from every timestamp. | ||
| * **Precipitable water**: Solcast ``precipitable_water`` is in kg/m² | ||
| (equivalent to mm); SolarFarmer expects cm, so the column is divided | ||
| by 10. | ||
|
|
||
| ``surface_pressure`` is already in hPa which equals mbar, so no pressure | ||
| conversion is needed. | ||
|
|
||
| Only columns that are present in the DataFrame are mapped; the minimum | ||
| required columns are ``period_end`` (as the index), ``air_temp``, and | ||
| ``ghi``. All other columns (``dhi``, ``wind_speed_10m``, | ||
| ``surface_pressure``, ``precipitable_water``, ``relative_humidity``, | ||
| ``albedo``, ``hsu_loss_fraction``, ``kimber_loss_fraction``, ``soiling``) | ||
| are optional and mapped when present. | ||
|
|
||
| .. note:: Requires ``pandas``. Install with ``pip install 'dnv-solarfarmer[all]'``. | ||
|
|
||
| Parameters | ||
| ---------- | ||
| df : pandas.DataFrame | ||
| Solcast-style DataFrame with a DatetimeIndex (``period_end``) and | ||
| any subset of columns: ``ghi``, ``dhi``, ``air_temp``, | ||
| ``wind_speed_10m``, ``surface_pressure``, ``precipitable_water``, | ||
| ``relative_humidity``, ``albedo``, ``hsu_loss_fraction``, | ||
| ``kimber_loss_fraction``, ``soiling``. Unmapped columns are removed. | ||
| Only one soiling column should be present; if multiple are provided, | ||
| the last one wins after column renaming. | ||
| output_path : str or Path | ||
| Destination file path. | ||
|
|
||
| Returns | ||
| ------- | ||
| pathlib.Path | ||
|
|
||
| Raises | ||
| ------ | ||
| ValueError | ||
| If the DataFrame has no DatetimeIndex. | ||
| ImportError | ||
| If pandas is not installed. | ||
| """ | ||
| try: | ||
| import pandas as pd | ||
| except ImportError: | ||
| raise ImportError(PANDAS_INSTALL_MSG) from None | ||
|
|
||
| if not isinstance(df.index, pd.DatetimeIndex): | ||
| raise ValueError( | ||
| "DataFrame must have a DatetimeIndex. " | ||
| "Use df.set_index(pd.to_datetime(df['period_end'])) or similar." | ||
| ) | ||
|
|
||
| out = df.copy() | ||
|
|
||
| # Solcast timestamps are period_end; SolarFarmer expects period_beginning. | ||
| out = shift_period_end_to_beginning(out) | ||
|
|
||
| # Drop columns that have no SolarFarmer equivalent (e.g. gti, any other custom fields). | ||
| out = out[[c for c in out.columns if c in SOLCAST_COLUMN_MAP]] | ||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. so if I add something to my data like "soiling" then I need to do it after calling this function? the documentation says "unmapped columns are unchanged"... is that contradictory or do I misunderstand?
Collaborator
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. well spotted. A mistake in the docs string. I had added the columns removal afterwards to avoid passing the weather_type or other unexpected variables. I will add soiling to column map, so that it is preserved if the soiling is added before hand. |
||
|
|
||
| # precipitable_water: Solcast provides kg/m² (= mm); SolarFarmer expects cm. | ||
| if "precipitable_water" in out.columns: | ||
| out["precipitable_water"] = out["precipitable_water"] / 10 | ||
|
|
||
| return from_dataframe( | ||
| out, | ||
| output_path, | ||
| column_map=SOLCAST_COLUMN_MAP, | ||
| pressure_pa_to_mbar=False, # Solcast surface_pressure is hPa = mbar | ||
| ) | ||
|
|
||
|
|
||
| def shift_period_end_to_beginning(df: pd.DataFrame) -> pd.DataFrame: | ||
| """Shift DatetimeIndex from period_end to period_beginning. | ||
|
|
||
| Infers the time resolution from the minimum consecutive time difference | ||
| and subtracts it from all timestamps. Useful for converters where the | ||
| source data provides period_end timestamps but the target format expects | ||
| period_beginning. | ||
|
|
||
| Parameters | ||
| ---------- | ||
| df : pandas.DataFrame | ||
| DataFrame with a DatetimeIndex representing period_end timestamps. | ||
|
|
||
| Returns | ||
| ------- | ||
| pandas.DataFrame | ||
| DataFrame with DatetimeIndex shifted to period_beginning. | ||
|
|
||
| Raises | ||
| ------ | ||
| ValueError | ||
| If the DataFrame has no DatetimeIndex. | ||
| """ | ||
| try: | ||
| import pandas as pd | ||
| except ImportError: | ||
| raise ImportError(PANDAS_INSTALL_MSG) from None | ||
|
|
||
| if not isinstance(df.index, pd.DatetimeIndex): | ||
| raise ValueError( | ||
| "DataFrame must have a DatetimeIndex. " | ||
| "Use df.set_index(pd.to_datetime(df['period_end'])) or similar." | ||
| ) | ||
|
|
||
| out = df.copy() | ||
| time_deltas = out.index.to_series().diff().dropna() | ||
| inferred_timedelta = time_deltas.median() | ||
| out.index = out.index - inferred_timedelta | ||
| return out | ||
|
|
||
|
|
||
| TSV_COLUMNS: dict = { | ||
| "required": [ | ||
| { | ||
|
|
||
Uh oh!
There was an error while loading. Please reload this page.