Astrape/docs/weather-source-data.md
rpotter6298 9d15860f0b first_commit
2026-04-25 20:35:25 +02:00


# Weather Source Data
## Goal
This subsystem aggregates external weather forecasts and stores them in a clean database-ready shape.
Terminology:
- **forecast**: data from an external weather source, such as Open-Meteo
- **resolved truth**: observed weather for a time that has already happened
- **prediction**: an internal estimate produced by a future Astrape/Gibil model
This module should not produce predictions or confidence scores. A later `weather_predictor.py` subsystem can use this clean forecast database to produce predictions and confidence.
## Subsystem Boundary
Initial classes should stay narrowly scoped:
- `OpenMeteoClient`: fetch raw hourly forecast payloads
- `OpenMeteoParser`: convert API payloads into external forecast runs and points
- `WeatherBuilder`: normalize and select clean forecast records for database use
- `WeatherStore`: persist forecast points and resolved truth
These classes communicate through data models like `WeatherForecastRun`, `WeatherForecastPoint`, and `WeatherResolvedTruth`.
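
As a rough illustration of what these data models might look like, here is a minimal sketch using frozen dataclasses. The field names follow the storage fields listed later in this document; the exact types, the optional defaults, and the `points` tuple on the run are assumptions, not a settled design.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class WeatherForecastPoint:
    """One external forecast value set for one target hour."""
    issued_at: datetime
    target_at: datetime
    horizon_hours: float
    source: str
    temperature_c: Optional[float] = None
    shortwave_radiation_w_m2: Optional[float] = None
    cloud_cover_pct: Optional[float] = None


@dataclass(frozen=True)
class WeatherForecastRun:
    """One API pull: a fetch time plus the points it produced (assumed shape)."""
    issued_at: datetime
    source: str
    points: tuple = ()


@dataclass(frozen=True)
class WeatherResolvedTruth:
    """Observed weather for an hour that has already happened."""
    resolved_at: datetime
    source: str
    temperature_c: Optional[float] = None
    shortwave_radiation_w_m2: Optional[float] = None


# Example values are illustrative only.
issued = datetime(2026, 4, 25, 18, 0, tzinfo=timezone.utc)
point = WeatherForecastPoint(
    issued_at=issued,
    target_at=datetime(2026, 4, 25, 20, 0, tzinfo=timezone.utc),
    horizon_hours=2.0,
    source="open_meteo",
    temperature_c=11.5,
)
run = WeatherForecastRun(issued_at=issued, source="open_meteo", points=(point,))
```

Frozen dataclasses keep the records immutable as they pass between client, parser, builder, and store, which is one way to keep the class boundaries narrow.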
## Core Data Shape
Every weather API pull is a forecast run.
```text
issued_at = when the external forecast was fetched
target_at = the hour being forecast
horizon_hours = target_at - issued_at, expressed in hours
forecast_value = external forecast value for that target hour
```
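
The horizon arithmetic can be sketched as follows. The function names are hypothetical, and flooring `issued_at` to the hour is only one possible convention for getting whole-hour horizons; the document does not specify how fractional horizons should be handled.

```python
from datetime import datetime, timezone


def horizon_hours(issued_at: datetime, target_at: datetime) -> float:
    """Return target_at - issued_at in hours (may be fractional)."""
    if issued_at.tzinfo is None or target_at.tzinfo is None:
        raise ValueError("timestamps must be timezone-aware")
    return (target_at - issued_at).total_seconds() / 3600.0


def floor_to_hour(moment: datetime) -> datetime:
    """Assumed convention: snap a fetch time down to the start of its hour."""
    return moment.replace(minute=0, second=0, microsecond=0)


issued = datetime(2026, 4, 25, 18, 30, tzinfo=timezone.utc)
target = datetime(2026, 4, 25, 21, 0, tzinfo=timezone.utc)

exact = horizon_hours(issued, target)                    # 2.5
bucketed = horizon_hours(floor_to_hour(issued), target)  # 3.0
```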
Later, when `target_at` is in the past, Astrape can attach resolved truth:
```text
resolved_at = the hour the observation applies to (matches target_at)
truth = observed temperature / observed solar radiation
```
That creates rows future modules can use:
```text
target_at | resolved_truth | forecast_1h | forecast_2h | ... | forecast_48h
```
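
One way to build those wide rows from per-point storage is a simple pivot. This is a sketch under the assumption that horizons are whole hours and that timestamps are comparable keys (ISO strings here); `pivot_by_target` is a hypothetical helper name.

```python
from collections import defaultdict


def pivot_by_target(points, truths):
    """Pivot long forecast points into one wide row per target hour.

    points: iterable of (target_at, horizon_hours, forecast_value)
    truths: dict mapping target_at -> resolved truth value
    """
    by_target = defaultdict(dict)
    for target_at, horizon, value in points:
        by_target[target_at][f"forecast_{horizon}h"] = value

    rows = []
    for target_at in sorted(by_target):
        row = {
            "target_at": target_at,
            # None until truth has been attached for this hour.
            "resolved_truth": truths.get(target_at),
        }
        row.update(by_target[target_at])
        rows.append(row)
    return rows


# Illustrative values only.
points = [
    ("2026-04-25T20:00", 1, 11.8),
    ("2026-04-25T20:00", 2, 12.4),
    ("2026-04-25T21:00", 1, 10.9),
]
truths = {"2026-04-25T20:00": 12.0}
rows = pivot_by_target(points, truths)
```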
The future predictor can learn from those rows without needing to know anything about Open-Meteo payloads.
## First Variables
Use Open-Meteo hourly forecast fields:
- `temperature_2m`
- `shortwave_radiation`
- `cloud_cover`
Open-Meteo documents `shortwave_radiation` as the average incoming shortwave solar radiation at the surface over the preceding hour, equivalent to GHI, in W/m². That makes it the right starting solar forecast variable for Astrape.
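
To make the request and payload shape concrete, here is a minimal sketch that builds a forecast URL for these three fields and converts an hourly payload into point dictionaries. The helper names, the sample coordinates, and the sample payload values are illustrative assumptions. The sketch requests `timezone=GMT` and treats the returned timestamps as UTC; no network call is made.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode

FORECAST_URL = "https://api.open-meteo.com/v1/forecast"
HOURLY_FIELDS = ("temperature_2m", "shortwave_radiation", "cloud_cover")


def build_forecast_url(latitude: float, longitude: float) -> str:
    """Build the request URL for the three hourly fields."""
    params = {
        "latitude": latitude,
        "longitude": longitude,
        "hourly": ",".join(HOURLY_FIELDS),
        "timezone": "GMT",
    }
    return f"{FORECAST_URL}?{urlencode(params)}"


def parse_hourly(payload: dict, issued_at: datetime) -> list:
    """Convert an hourly payload into one dict per target hour."""
    hourly = payload["hourly"]
    points = []
    for i, stamp in enumerate(hourly["time"]):
        # Assumption: timestamps carry no offset and are in UTC (timezone=GMT).
        target_at = datetime.fromisoformat(stamp).replace(tzinfo=timezone.utc)
        points.append({
            "issued_at": issued_at,
            "target_at": target_at,
            "temperature_c": hourly["temperature_2m"][i],
            "shortwave_radiation_w_m2": hourly["shortwave_radiation"][i],
            "cloud_cover_pct": hourly["cloud_cover"][i],
        })
    return points


url = build_forecast_url(52.52, 13.41)

# Hand-written sample in the shape of an hourly response; values are made up.
sample_payload = {
    "hourly": {
        "time": ["2026-04-25T18:00", "2026-04-25T19:00"],
        "temperature_2m": [12.1, 11.4],
        "shortwave_radiation": [85.0, 12.0],
        "cloud_cover": [40, 65],
    }
}
issued_at = datetime(2026, 4, 25, 18, 0, tzinfo=timezone.utc)
parsed = parse_hourly(sample_payload, issued_at)
```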
## Storage Shape
Forecast points should be stored as individual rows.
Core fields:
- `issued_at`
- `target_at`
- `horizon_hours`
- `source`
- `temperature_c`
- `shortwave_radiation_w_m2`
- `cloud_cover_pct`
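
A possible table layout for these rows, sketched with SQLite purely for illustration. The table name, the column types, the ISO-string timestamps, and the uniqueness constraint are all assumptions; the document does not name a database engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE weather_forecast_point (
        issued_at                TEXT NOT NULL,  -- ISO 8601, UTC
        target_at                TEXT NOT NULL,  -- ISO 8601, UTC
        horizon_hours            REAL NOT NULL,
        source                   TEXT NOT NULL,
        temperature_c            REAL,
        shortwave_radiation_w_m2 REAL,
        cloud_cover_pct          REAL,
        -- Assumed key: one row per source, run, and target hour.
        UNIQUE (source, issued_at, target_at)
    )
    """
)

insert_sql = (
    "INSERT OR IGNORE INTO weather_forecast_point VALUES (?, ?, ?, ?, ?, ?, ?)"
)
row = (
    "2026-04-25T18:00:00+00:00",
    "2026-04-25T20:00:00+00:00",
    2.0,
    "open_meteo",
    11.5,
    0.0,
    70.0,
)
conn.execute(insert_sql, row)
conn.execute(insert_sql, row)  # storing the same point again is a no-op

stored_count = conn.execute(
    "SELECT COUNT(*) FROM weather_forecast_point"
).fetchone()[0]
```

With a uniqueness constraint like this, re-running the same fetch does not duplicate points, while a later run with a new `issued_at` adds new rows for the same `target_at`.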
Resolved truth should be stored separately. For now, resolved truth comes from the Open-Meteo historical archive API.
Until archive data is available, Astrape can also store the current 0-hour Open-Meteo forecast as provisional truth with `source = open_meteo_zero_hour`. This gives the UI and future joins a near-real-time truth line. Archive truth remains separate with `source = open_meteo_archive`, so later modules can choose whether to prefer archive actuals over provisional 0-hour values.
Core fields for resolved truth:
- `resolved_at`
- `source`
- `temperature_c`
- `shortwave_radiation_w_m2`
The future predictor can join forecast points to truth by `target_at = resolved_at`.
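
The join, including a preference for archive truth over provisional 0-hour truth, might look like the sketch below. The priority order, the dictionary row shape, and the `error` field are assumptions made for illustration; a later module could choose a different policy.

```python
# Lower rank wins; unknown sources sort last. This order is an assumption.
SOURCE_PRIORITY = {"open_meteo_archive": 0, "open_meteo_zero_hour": 1}


def preferred_truth(truth_rows):
    """Pick one truth row per resolved_at, preferring archive actuals."""
    best = {}
    for row in truth_rows:
        rank = SOURCE_PRIORITY.get(row["source"], len(SOURCE_PRIORITY))
        key = row["resolved_at"]
        if key not in best or rank < best[key][0]:
            best[key] = (rank, row)
    return {key: row for key, (rank, row) in best.items()}


def join_forecast_to_truth(forecast_rows, truth_rows, variable="temperature_c"):
    """Join on target_at = resolved_at; skip hours with no truth yet."""
    truth = preferred_truth(truth_rows)
    joined = []
    for forecast in forecast_rows:
        match = truth.get(forecast["target_at"])
        if match is None:
            continue
        joined.append({
            "target_at": forecast["target_at"],
            "horizon_hours": forecast["horizon_hours"],
            "forecast": forecast[variable],
            "truth": match[variable],
            "truth_source": match["source"],
            "error": forecast[variable] - match[variable],
        })
    return joined


# Illustrative values only.
forecast_rows = [
    {"target_at": "2026-04-25T20:00", "horizon_hours": 2, "temperature_c": 12.5},
    {"target_at": "2026-04-25T21:00", "horizon_hours": 3, "temperature_c": 11.0},
]
truth_rows = [
    {"resolved_at": "2026-04-25T20:00", "source": "open_meteo_zero_hour", "temperature_c": 12.25},
    {"resolved_at": "2026-04-25T20:00", "source": "open_meteo_archive", "temperature_c": 12.0},
]
joined = join_forecast_to_truth(forecast_rows, truth_rows)
```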
Open-Meteo archive data can lag behind current time depending on model availability, so the database daemon backfills a configurable historical window instead of assuming the last completed hour is immediately available.
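
The backfill window could be computed roughly as follows: list every hour in the window, then request only the hours that do not yet have archive truth. The function names and the default window size are hypothetical, and the sketch makes no API call.

```python
from datetime import datetime, timedelta, timezone


def backfill_window(now: datetime, window_hours: int) -> list:
    """Return the hourly timestamps in the window, oldest first.

    The window ends at the start of the current hour.
    """
    end = now.replace(minute=0, second=0, microsecond=0)
    return [end - timedelta(hours=h) for h in range(window_hours - 1, -1, -1)]


def missing_hours(now: datetime, already_resolved: set, window_hours: int = 72) -> list:
    """Hours in the window that still lack archive truth."""
    return [
        hour for hour in backfill_window(now, window_hours)
        if hour not in already_resolved
    ]


now = datetime(2026, 4, 25, 20, 35, tzinfo=timezone.utc)
have = {datetime(2026, 4, 25, 19, 0, tzinfo=timezone.utc)}
todo = missing_hours(now, have, window_hours=3)
```

Because each pass re-checks the whole window, hours that the archive had not published on an earlier pass are picked up automatically on a later one.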
## Visual Explorer
We should build a small web view for inspecting forecast history.
Useful first view:
- select a weather variable, such as temperature or shortwave radiation
- select forecast horizons, such as 2h and 4h
- overlay those horizon-specific external forecasts against resolved truth
- plot by `target_at`
Example:
```text
target_at on x-axis
temperature_c on y-axis
line 1: Open-Meteo forecast made 2 hours before target_at
line 2: Open-Meteo forecast made 4 hours before target_at
line 3: resolved truth
```
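
The data preparation behind such a plot can be sketched as a function that turns stored rows into one series per selected horizon plus a truth series. `build_series` and the row dictionaries are hypothetical, and horizons are assumed to be stored as whole hours; the plotting itself is left out.

```python
def build_series(forecast_rows, truth_rows, variable, horizons):
    """Group values into plot-ready series keyed by line name.

    Each series is a list of (timestamp, value) pairs sorted by timestamp.
    """
    wanted = set(horizons)
    series = {f"forecast_{h}h": [] for h in horizons}

    for row in forecast_rows:
        horizon = row["horizon_hours"]
        value = row.get(variable)
        if horizon in wanted and value is not None:
            series[f"forecast_{int(horizon)}h"].append((row["target_at"], value))

    series["truth"] = [
        (row["resolved_at"], row[variable])
        for row in truth_rows
        if row.get(variable) is not None
    ]

    for points in series.values():
        points.sort()
    return series


# Illustrative values only.
forecast_rows = [
    {"target_at": "2026-04-25T21:00", "horizon_hours": 2, "temperature_c": 10.5},
    {"target_at": "2026-04-25T20:00", "horizon_hours": 2, "temperature_c": 12.5},
    {"target_at": "2026-04-25T20:00", "horizon_hours": 4, "temperature_c": 13.0},
    {"target_at": "2026-04-25T20:00", "horizon_hours": 6, "temperature_c": 13.5},
]
truth_rows = [
    {"resolved_at": "2026-04-25T20:00", "temperature_c": 12.0},
]
series = build_series(forecast_rows, truth_rows, "temperature_c", horizons=[2, 4])
```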
This visual layer should read from the cleaned weather database. It should not be part of the Open-Meteo client or parser.
## First Implementation Slice
1. Fetch one Open-Meteo-style hourly forecast run.
2. Parse it into forecast points.
3. Normalize the run through `WeatherBuilder`.
4. Store forecast points through `WeatherStore`.
5. Add resolved truth rows once a source for observed weather is available (for now, the Open-Meteo historical archive, with 0-hour forecasts as provisional truth).
6. Build the visual explorer after forecast/truth storage exists.