# Weather Source Data

## Goal

This subsystem aggregates external weather forecasts and stores them in a clean database-ready shape.

Terminology:
- **forecast**: data from an external weather source, such as Open-Meteo
- **resolved truth**: observed weather for a time that has already happened
- **prediction**: an internal estimate produced by a future Astrape/Gibil model

This module should not produce predictions or confidence scores. A later `weather_predictor.py` subsystem can use this clean forecast database to produce predictions and confidence.

## Subsystem Boundary

Initial classes should stay narrowly scoped:

- `OpenMeteoClient`: fetch raw hourly forecast payloads
- `OpenMeteoParser`: convert API payloads into external forecast runs and points
- `WeatherBuilder`: normalize and select clean forecast records for database use
- `WeatherStore`: persist forecast points and resolved truth

These classes communicate through data models like `WeatherForecastRun`, `WeatherForecastPoint`, and `WeatherResolvedTruth`.
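As a rough illustration, the three data models could be sketched as frozen dataclasses. The field names follow the core fields listed under Storage Shape. The optional types, the `points` tuple on the run, and the example values are assumptions made for this sketch, not the actual Astrape definitions.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class WeatherForecastPoint:
    """One external forecast for one target hour (hypothetical layout)."""
    issued_at: datetime
    target_at: datetime
    horizon_hours: int
    source: str
    temperature_c: Optional[float] = None
    shortwave_radiation_w_m2: Optional[float] = None
    cloud_cover_pct: Optional[float] = None


@dataclass(frozen=True)
class WeatherForecastRun:
    """All points produced by a single API pull (assumed container)."""
    issued_at: datetime
    source: str
    points: tuple[WeatherForecastPoint, ...] = ()


@dataclass(frozen=True)
class WeatherResolvedTruth:
    """Observed weather for an hour that has already happened."""
    resolved_at: datetime
    source: str
    temperature_c: Optional[float] = None
    shortwave_radiation_w_m2: Optional[float] = None


# Illustrative values only, not real forecast data.
issued = datetime(2024, 6, 1, 12, tzinfo=timezone.utc)
target = datetime(2024, 6, 1, 14, tzinfo=timezone.utc)
point = WeatherForecastPoint(issued, target, 2, "open_meteo", temperature_c=21.5)
run = WeatherForecastRun(issued_at=issued, source="open_meteo", points=(point,))
truth = WeatherResolvedTruth(target, "open_meteo_archive", temperature_c=21.3)
```

Keeping the models immutable means the client, parser, builder, and store can pass them along without worrying about one stage mutating another stage's records.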

## Core Data Shape

Every weather API pull is a forecast run.

```text
issued_at = when the external forecast was fetched
target_at = the hour being forecast
horizon_hours = target_at - issued_at
forecast_value = external forecast value for that target hour
```
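The horizon arithmetic above can be sketched in a few lines. The helper name `horizon_hours` is hypothetical, and flooring `issued_at` to the start of its hour is an assumption made here so that horizons come out as whole numbers; the document does not specify how a mid-hour fetch time should be handled.

```python
from datetime import datetime, timedelta, timezone


def horizon_hours(issued_at: datetime, target_at: datetime) -> int:
    """Whole hours between the fetch hour and the target hour.

    Assumption: issued_at is floored to the start of its hour before
    subtracting, so a fetch at 12:17 counts as issued at 12:00.
    """
    issued_hour = issued_at.replace(minute=0, second=0, microsecond=0)
    delta = target_at - issued_hour
    return int(delta.total_seconds() // 3600)


# Illustrative timestamps: a fetch at 12:17 UTC covering four target hours.
issued = datetime(2024, 6, 1, 12, 17, tzinfo=timezone.utc)
first_target = datetime(2024, 6, 1, 12, tzinfo=timezone.utc)
targets = [first_target + timedelta(hours=h) for h in range(4)]
horizons = [horizon_hours(issued, t) for t in targets]
```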

Later, when `target_at` is in the past, Astrape can attach resolved truth:

```text
resolved_at = the hour that actually happened
truth = observed temperature / observed solar radiation
```

That creates rows future modules can use:

```text
target_at | resolved_truth | forecast_1h | forecast_2h | ... | forecast_48h
```

The future predictor can learn from those rows without needing to know anything about Open-Meteo payloads.
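One way to produce that wide row layout from individually stored points is a small pivot keyed by `target_at`. This is a sketch under assumptions: the function name `pivot_by_target` and the tuple-based inputs are hypothetical, and the values are illustrative.

```python
from collections import defaultdict
from datetime import datetime, timezone


def pivot_by_target(points, truths):
    """Build one wide row per target hour.

    points: iterable of (target_at, horizon_hours, forecast_value)
    truths: mapping of resolved_at -> observed value
    """
    columns = defaultdict(dict)
    for target_at, horizon, value in points:
        columns[target_at][f"forecast_{horizon}h"] = value
    return [
        {"target_at": target_at, "resolved_truth": truths.get(target_at), **cols}
        for target_at, cols in sorted(columns.items(), key=lambda item: item[0])
    ]


t14 = datetime(2024, 6, 1, 14, tzinfo=timezone.utc)
t15 = datetime(2024, 6, 1, 15, tzinfo=timezone.utc)
points = [(t14, 1, 21.0), (t14, 2, 20.4), (t15, 1, 22.1)]
truths = {t14: 21.3}  # 15:00 has not resolved yet
wide_rows = pivot_by_target(points, truths)
```

Target hours without resolved truth keep `resolved_truth` as `None`, so a consumer can filter them out or fill them in later.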

## First Variables

Use Open-Meteo hourly forecast fields:

- `temperature_2m`
- `shortwave_radiation`
- `cloud_cover`

Open-Meteo documents `shortwave_radiation` as average incoming solar radiation over the preceding hour at the surface, equivalent to GHI, measured in W/m². That is the right starting solar forecast variable for Astrape.
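The sketch below shows how these three fields might be requested and unpacked. It assumes the public Open-Meteo forecast endpoint with its `latitude`, `longitude`, and `hourly` query parameters, and an `hourly` response object holding parallel arrays; check both against the current API documentation. The function names, coordinates, and sample payload are illustrative, and no network call is made.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode

HOURLY_FIELDS = ("temperature_2m", "shortwave_radiation", "cloud_cover")


def build_forecast_url(latitude: float, longitude: float) -> str:
    """Build a forecast request URL for the three starting variables."""
    query = urlencode({
        "latitude": latitude,
        "longitude": longitude,
        "hourly": ",".join(HOURLY_FIELDS),
    })
    return f"https://api.open-meteo.com/v1/forecast?{query}"


def parse_hourly(payload: dict) -> list[dict]:
    """Turn parallel hourly arrays into one record per target hour.

    Assumption: timestamps are in GMT, which is expected when no
    timezone parameter is sent with the request.
    """
    hourly = payload["hourly"]
    records = []
    for i, stamp in enumerate(hourly["time"]):
        records.append({
            "target_at": datetime.fromisoformat(stamp).replace(tzinfo=timezone.utc),
            "temperature_c": hourly["temperature_2m"][i],
            "shortwave_radiation_w_m2": hourly["shortwave_radiation"][i],
            "cloud_cover_pct": hourly["cloud_cover"][i],
        })
    return records


# Made-up payload in the assumed response shape.
sample_payload = {
    "hourly": {
        "time": ["2024-06-01T12:00", "2024-06-01T13:00"],
        "temperature_2m": [20.1, 21.4],
        "shortwave_radiation": [512.0, 608.0],
        "cloud_cover": [35, 20],
    }
}
url = build_forecast_url(52.52, 13.41)
records = parse_hourly(sample_payload)
```

Renaming the fields at parse time (`temperature_2m` to `temperature_c`, and so on) keeps Open-Meteo's naming out of everything downstream of the parser.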

## Storage Shape

Forecast points should be stored as individual rows.

Core fields:
- `issued_at`
- `target_at`
- `horizon_hours`
- `source`
- `temperature_c`
- `shortwave_radiation_w_m2`
- `cloud_cover_pct`

Resolved truth should be stored separately. For now, resolved truth comes from the Open-Meteo historical archive API.

Until archive data is available, Astrape can also store the current 0-hour Open-Meteo forecast as provisional truth with `source = open_meteo_zero_hour`. This gives the UI and future joins a near-real-time truth line. Archive truth remains separate with `source = open_meteo_archive`, so later modules can choose whether to prefer archive actuals over provisional 0-hour values.

Core fields:
- `resolved_at`
- `source`
- `temperature_c`
- `shortwave_radiation_w_m2`

The future predictor can join forecast points to truth by `target_at = resolved_at`.
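The join can be tried out with an in-memory SQLite database. This is only a sketch: the table names `forecast_points` and `resolved_truth`, the text timestamps, and the policy of preferring archive truth over provisional 0-hour truth are assumptions, and the columns are trimmed to temperature for brevity. The values are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE forecast_points (
    issued_at TEXT, target_at TEXT, horizon_hours INTEGER,
    source TEXT, temperature_c REAL
);
CREATE TABLE resolved_truth (
    resolved_at TEXT, source TEXT, temperature_c REAL
);
""")
conn.executemany(
    "INSERT INTO forecast_points VALUES (?, ?, ?, ?, ?)",
    [
        ("2024-06-01T12:00", "2024-06-01T14:00", 2, "open_meteo", 20.4),
        ("2024-06-01T13:00", "2024-06-01T14:00", 1, "open_meteo", 21.0),
    ],
)
conn.executemany(
    "INSERT INTO resolved_truth VALUES (?, ?, ?)",
    [
        ("2024-06-01T14:00", "open_meteo_zero_hour", 21.1),
        ("2024-06-01T14:00", "open_meteo_archive", 21.3),
    ],
)

# Join on target_at = resolved_at; when both truth sources exist for an
# hour, keep the archive row (one possible policy, not the only one).
JOIN_SQL = """
SELECT f.target_at, f.horizon_hours, f.temperature_c, t.temperature_c, t.source
FROM forecast_points AS f
JOIN resolved_truth AS t ON t.resolved_at = f.target_at
WHERE t.source = (
    SELECT t2.source FROM resolved_truth AS t2
    WHERE t2.resolved_at = f.target_at
    ORDER BY t2.source = 'open_meteo_archive' DESC
    LIMIT 1
)
ORDER BY f.horizon_hours
"""
joined = conn.execute(JOIN_SQL).fetchall()
```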

Open-Meteo archive data can lag behind current time depending on model availability, so the database daemon backfills a configurable historical window instead of assuming the last completed hour is immediately available.
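A backfill window of this kind could be computed as below. The helper name `backfill_hours`, the `window_hours` parameter, and the convention of labelling each hour by its start time are assumptions for illustration; the daemon's real configuration is not described here.

```python
from datetime import datetime, timedelta, timezone


def backfill_hours(now: datetime, window_hours: int) -> list[datetime]:
    """Hours to request again from the archive, oldest first.

    The window ends at the last completed hour (labelled by its start).
    """
    last_completed = now.replace(minute=0, second=0, microsecond=0) - timedelta(hours=1)
    return [
        last_completed - timedelta(hours=offset)
        for offset in range(window_hours - 1, -1, -1)
    ]


# Illustrative clock time: 12:17 UTC, so the last completed hour starts at 11:00.
now = datetime(2024, 6, 1, 12, 17, tzinfo=timezone.utc)
hours_to_fetch = backfill_hours(now, window_hours=3)
```

Requesting the whole window on every pass means an hour the archive had not published yet is picked up by a later pass, provided the store treats repeated rows for the same `resolved_at` and `source` as updates rather than duplicates.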

## Visual Explorer

We should build a small web view for inspecting forecast history.

Useful first view:
- select a weather variable, such as temperature or shortwave radiation
- select forecast horizons, such as 2h and 4h
- overlay those horizon-specific external forecasts against resolved truth
- plot by `target_at`

Example:

```text
target_at on x-axis
temperature_c on y-axis
line 1: Open-Meteo forecast made 2 hours before target_at
line 2: Open-Meteo forecast made 4 hours before target_at
line 3: resolved truth
```

This visual layer should read from the cleaned weather database. It should not be part of the Open-Meteo client or parser.

## First Implementation Slice

1. Fetch one Open-Meteo-style hourly forecast run.
2. Parse it into forecast points.
3. Normalize the run through `WeatherBuilder`.
4. Store forecast points through `WeatherStore`.
5. Add resolved truth rows once observed weather is available, starting with the Open-Meteo historical archive API.
6. Build the visual explorer after forecast/truth storage exists.