Weather Source Data
Goal
This subsystem aggregates external weather forecasts and stores them in a clean database-ready shape.
Terminology:
- forecast: data from an external weather source, such as Open-Meteo
- resolved truth: observed weather for a time that has already happened
- prediction: an internal estimate produced by a future Astrape/Gibil model
This module should not produce predictions or confidence scores. A later weather_predictor.py subsystem can use this clean forecast database to produce predictions and confidence.
Subsystem Boundary
Initial classes should stay narrowly scoped:
- OpenMeteoClient: fetch raw hourly forecast payloads
- OpenMeteoParser: convert API payloads into external forecast runs and points
- WeatherBuilder: normalize and select clean forecast records for database use
- WeatherStore: persist forecast points and resolved truth
These classes communicate through data models like WeatherForecastRun, WeatherForecastPoint, and WeatherResolvedTruth.
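A minimal sketch of what these data models could look like, assuming plain frozen dataclasses. The class names come from this document and the field names follow the core fields listed under Storage Shape; the types, the defaults, and the `"open_meteo"` source label are assumptions for illustration only.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional, Tuple


@dataclass(frozen=True)
class WeatherForecastPoint:
    """One external forecast for one target hour."""
    issued_at: datetime
    target_at: datetime
    horizon_hours: int
    source: str
    temperature_c: Optional[float] = None
    shortwave_radiation_w_m2: Optional[float] = None
    cloud_cover_pct: Optional[float] = None


@dataclass(frozen=True)
class WeatherForecastRun:
    """All points produced by a single weather API pull."""
    issued_at: datetime
    source: str
    points: Tuple[WeatherForecastPoint, ...] = ()


@dataclass(frozen=True)
class WeatherResolvedTruth:
    """Observed weather for an hour that has already happened."""
    resolved_at: datetime
    source: str
    temperature_c: Optional[float] = None
    shortwave_radiation_w_m2: Optional[float] = None


issued = datetime(2024, 6, 1, 10, tzinfo=timezone.utc)
point = WeatherForecastPoint(
    issued_at=issued,
    target_at=datetime(2024, 6, 1, 12, tzinfo=timezone.utc),
    horizon_hours=2,
    source="open_meteo",  # hypothetical source label
    temperature_c=21.4,
)
run = WeatherForecastRun(issued_at=issued, source="open_meteo", points=(point,))
```

Frozen dataclasses keep the models immutable as they pass between classes, which is one way to keep the client, parser, builder, and store from quietly mutating each other's data.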
Core Data Shape
Every weather API pull is a forecast run.
issued_at = when the external forecast was fetched
target_at = the hour being forecast
horizon_hours = target_at - issued_at
forecast_value = external forecast value for that target hour
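The horizon calculation can be sketched as below. One detail is an assumption rather than something this document specifies: because a fetch can happen at any minute, the sketch floors issued_at to the start of its hour so that horizons come out as whole numbers. The function names are hypothetical.

```python
from datetime import datetime, timezone


def floor_to_hour(ts: datetime) -> datetime:
    """Drop minutes, seconds, and microseconds."""
    return ts.replace(minute=0, second=0, microsecond=0)


def horizon_hours(issued_at: datetime, target_at: datetime) -> int:
    """Whole hours between the (floored) fetch time and the target hour."""
    delta = target_at - floor_to_hour(issued_at)
    return int(delta.total_seconds() // 3600)


issued_at = datetime(2024, 6, 1, 10, 17, tzinfo=timezone.utc)
target_at = datetime(2024, 6, 1, 12, 0, tzinfo=timezone.utc)
print(horizon_hours(issued_at, target_at))  # 2
```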
Later, when target_at is in the past, Astrape can attach resolved truth:
resolved_at = the hour the observation applies to
truth = observed temperature / observed solar radiation
That creates rows future modules can use:
target_at | resolved_truth | forecast_1h | forecast_2h | ... | forecast_48h
The future predictor can learn from those rows without needing to know anything about Open-Meteo payloads.
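As an illustration, the wide rows could be assembled from long-format forecast points roughly as follows. This is a sketch under assumptions: points are represented as `(target_at, horizon_hours, value)` tuples, truth as a `target_at -> value` mapping, timestamps as ISO strings, and `build_wide_rows` is a hypothetical helper name.

```python
from collections import defaultdict


def build_wide_rows(points, truths, horizons=(1, 2, 48)):
    """Pivot long forecast points into one row per resolved target hour."""
    by_target = defaultdict(dict)
    for target_at, horizon, value in points:
        by_target[target_at][horizon] = value

    rows = []
    for target_at in sorted(by_target):
        if target_at not in truths:
            continue  # target hour has no resolved truth yet
        row = {"target_at": target_at, "resolved_truth": truths[target_at]}
        for h in horizons:
            # Horizons with no stored forecast are left as None.
            row[f"forecast_{h}h"] = by_target[target_at].get(h)
        rows.append(row)
    return rows


points = [
    ("2024-06-01T12:00", 1, 21.2),
    ("2024-06-01T12:00", 2, 21.0),
    ("2024-06-01T13:00", 1, 22.4),
]
truths = {"2024-06-01T12:00": 21.9}
rows = build_wide_rows(points, truths)
```

Here only the 12:00 target produces a row, because 13:00 has no resolved truth yet.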
First Variables
Use Open-Meteo hourly forecast fields:
- temperature_2m
- shortwave_radiation
- cloud_cover
Open-Meteo documents shortwave_radiation as average incoming solar radiation over the preceding hour at the surface, equivalent to GHI, measured in W/m2. That is the right starting solar forecast variable for Astrape.
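A hedged sketch of requesting and reading these three fields. It assumes the public Open-Meteo forecast endpoint and its column-oriented `hourly` response layout (a `time` list plus one list per variable). The helper names, the sample payload values, and the choice of `forecast_days=2` are illustrative, not part of this design. No network call is made here.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode

HOURLY_FIELDS = ("temperature_2m", "shortwave_radiation", "cloud_cover")


def build_forecast_url(latitude: float, longitude: float, forecast_days: int = 2) -> str:
    """Build a forecast request URL for the three hourly fields."""
    query = {
        "latitude": latitude,
        "longitude": longitude,
        "hourly": ",".join(HOURLY_FIELDS),
        "forecast_days": forecast_days,
        "timezone": "GMT",  # keep timestamps in UTC
    }
    return "https://api.open-meteo.com/v1/forecast?" + urlencode(query, safe=",")


def parse_hourly_payload(payload: dict) -> list:
    """Turn the column-oriented hourly block into one dict per target hour."""
    hourly = payload["hourly"]
    rows = []
    for i, stamp in enumerate(hourly["time"]):
        rows.append({
            # Timestamps are assumed to be naive ISO strings in UTC.
            "target_at": datetime.fromisoformat(stamp).replace(tzinfo=timezone.utc),
            "temperature_c": hourly["temperature_2m"][i],
            "shortwave_radiation_w_m2": hourly["shortwave_radiation"][i],
            "cloud_cover_pct": hourly["cloud_cover"][i],
        })
    return rows


# Made-up sample values in the assumed response shape.
SAMPLE_PAYLOAD = {
    "hourly": {
        "time": ["2024-06-01T12:00", "2024-06-01T13:00"],
        "temperature_2m": [21.4, 22.0],
        "shortwave_radiation": [640.0, 655.0],
        "cloud_cover": [20, 35],
    }
}
url = build_forecast_url(52.52, 13.41)
rows = parse_hourly_payload(SAMPLE_PAYLOAD)
```

Renaming the fields to unit-bearing names at parse time means downstream code never has to know the Open-Meteo variable names.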
Storage Shape
Forecast points should be stored as individual rows.
Core fields:
- issued_at
- target_at
- horizon_hours
- source
- temperature_c
- shortwave_radiation_w_m2
- cloud_cover_pct
Resolved truth should be stored separately. For now, resolved truth comes from the Open-Meteo historical archive API.
Until archive data is available, Astrape can also store the current 0-hour Open-Meteo forecast as provisional truth with source = open_meteo_zero_hour. This gives the UI and future joins a near-real-time truth line. Archive truth remains separate with source = open_meteo_archive, so later modules can choose whether to prefer archive actuals over provisional 0-hour values.
Core fields for resolved truth:
- resolved_at
- source
- temperature_c
- shortwave_radiation_w_m2
The future predictor can join forecast points to truth by target_at = resolved_at.
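One way this storage shape and join could look, sketched with SQLite. The table names, primary keys, the `"open_meteo"` forecast source label, and the sample values are assumptions. The query joins on target_at = resolved_at and prefers archive truth over the provisional 0-hour value when both exist for an hour.

```python
import sqlite3

SCHEMA = """
CREATE TABLE forecast_point (
    issued_at TEXT NOT NULL,
    target_at TEXT NOT NULL,
    horizon_hours INTEGER NOT NULL,
    source TEXT NOT NULL,
    temperature_c REAL,
    shortwave_radiation_w_m2 REAL,
    cloud_cover_pct REAL,
    PRIMARY KEY (source, issued_at, target_at)
);
CREATE TABLE resolved_truth (
    resolved_at TEXT NOT NULL,
    source TEXT NOT NULL,
    temperature_c REAL,
    shortwave_radiation_w_m2 REAL,
    PRIMARY KEY (source, resolved_at)
);
"""

# Join forecasts to truth; pick archive truth first, 0-hour truth otherwise.
JOIN_SQL = """
SELECT f.target_at, f.horizon_hours,
       f.temperature_c AS forecast_c,
       t.temperature_c AS truth_c,
       t.source AS truth_source
FROM forecast_point AS f
JOIN resolved_truth AS t ON t.resolved_at = f.target_at
WHERE t.source = (
    SELECT r.source FROM resolved_truth AS r
    WHERE r.resolved_at = f.target_at
    ORDER BY CASE r.source WHEN 'open_meteo_archive' THEN 0 ELSE 1 END
    LIMIT 1
)
ORDER BY f.target_at, f.horizon_hours
"""


def make_demo_db() -> sqlite3.Connection:
    """In-memory database with a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO forecast_point VALUES (?, ?, ?, ?, ?, ?, ?)",
        [
            ("2024-06-01T10:00", "2024-06-01T12:00", 2, "open_meteo", 21.0, 600.0, 20.0),
            ("2024-06-01T11:00", "2024-06-01T13:00", 2, "open_meteo", 22.5, 650.0, 30.0),
        ],
    )
    conn.executemany(
        "INSERT INTO resolved_truth VALUES (?, ?, ?, ?)",
        [
            ("2024-06-01T12:00", "open_meteo_zero_hour", 21.6, 610.0),
            ("2024-06-01T12:00", "open_meteo_archive", 21.9, 618.0),
            ("2024-06-01T13:00", "open_meteo_zero_hour", 22.8, 640.0),
        ],
    )
    return conn


joined = make_demo_db().execute(JOIN_SQL).fetchall()
```

In this sample, the 12:00 row resolves against archive truth, while 13:00 falls back to the provisional 0-hour value because no archive row exists yet.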
Open-Meteo archive data can lag behind current time depending on model availability, so the database daemon backfills a configurable historical window instead of assuming the last completed hour is immediately available.
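The backfill idea can be sketched as a function that lists the completed hours inside the window that still lack archive truth. The function name, the 72-hour default, and the in-memory set of stored hours are assumptions; a real daemon would read stored hours from the database.

```python
from datetime import datetime, timedelta, timezone


def hours_to_backfill(now: datetime, stored_hours: set, window_hours: int = 72) -> list:
    """Completed hours in the window that have no archive truth stored yet."""
    current_hour = now.replace(minute=0, second=0, microsecond=0)
    # Oldest hour first; the current, still incomplete hour is excluded.
    candidates = [current_hour - timedelta(hours=h) for h in range(window_hours, 0, -1)]
    return [hour for hour in candidates if hour not in stored_hours]


now = datetime(2024, 6, 1, 12, 30, tzinfo=timezone.utc)
stored = {datetime(2024, 6, 1, 10, tzinfo=timezone.utc)}
missing = hours_to_backfill(now, stored, window_hours=3)
```

Re-requesting every missing hour on each pass means a late-arriving archive value is picked up on a later run without any special handling.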
Visual Explorer
We should build a small web output for inspecting forecast history.
Useful first view:
- select a weather variable, such as temperature or shortwave radiation
- select forecast horizons, such as 2h and 4h
- overlay those horizon-specific external forecasts against resolved truth
- plot by target_at
Example:
target_at on x-axis
temperature_c on y-axis
line 1: Open-Meteo forecast made 2 hours before target_at
line 2: Open-Meteo forecast made 4 hours before target_at
line 3: resolved truth
This visual layer should read from the cleaned weather database. It should not be part of the Open-Meteo client or parser.
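A small sketch of how the explorer might prepare its plot lines from cleaned database rows, leaving the charting library open. It assumes points arrive as `(target_at, horizon_hours, value)` tuples and truth as a `target_at -> value` mapping; `build_series` is a hypothetical helper.

```python
def build_series(points, truths, horizons=(2, 4)):
    """Group values into one line per selected horizon, plus a truth line."""
    series = {f"forecast_{h}h": [] for h in horizons}
    for target_at, horizon, value in sorted(points, key=lambda p: p[0]):
        key = f"forecast_{horizon}h"
        if key in series:  # skip horizons the user did not select
            series[key].append((target_at, value))
    series["resolved_truth"] = sorted(truths.items())
    return series


points = [
    ("2024-06-01T13:00", 2, 22.5),
    ("2024-06-01T12:00", 2, 21.0),
    ("2024-06-01T12:00", 4, 20.6),
    ("2024-06-01T12:00", 6, 20.1),
]
truths = {"2024-06-01T12:00": 21.9}
series = build_series(points, truths)
```

Each value in `series` is a list of `(target_at, value)` pairs sorted by target_at, ready to be drawn as one line on a shared x-axis.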
First Implementation Slice
- Fetch one Open-Meteo-style hourly forecast run.
- Parse it into forecast points.
- Normalize the run through WeatherBuilder.
- Store forecast points through WeatherStore.
- Add resolved truth rows when we have a source for observed weather.
- Build the visual explorer after forecast/truth storage exists.