Astrape/docs/weather-source-data.md
rpotter6298 9d15860f0b first_commit
2026-04-25 20:35:25 +02:00

Weather Source Data

Goal

This subsystem aggregates external weather forecasts and stores them in a clean database-ready shape.

Terminology:

  • forecast: data from an external weather source, such as Open-Meteo
  • resolved truth: observed weather for a time that has already happened
  • prediction: an internal estimate produced by a future Astrape/Gibil model

This module should not produce predictions or confidence scores. A later weather_predictor.py subsystem can use this clean forecast database to produce predictions and confidence.

Subsystem Boundary

Initial classes should stay narrowly scoped:

  • OpenMeteoClient: fetch raw hourly forecast payloads
  • OpenMeteoParser: convert API payloads into external forecast runs and points
  • WeatherBuilder: normalize and select clean forecast records for database use
  • WeatherStore: persist forecast points and resolved truth

These classes communicate through data models like WeatherForecastRun, WeatherForecastPoint, and WeatherResolvedTruth.
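
As a rough illustration of the data models named above, here is a minimal sketch using frozen dataclasses. The field names are an assumption borrowed from the storage fields listed later in this document; the real models may carry more or different attributes.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class WeatherForecastPoint:
    """One external forecast value set for one target hour (fields assumed)."""
    issued_at: datetime
    target_at: datetime
    horizon_hours: int
    source: str
    temperature_c: Optional[float]
    shortwave_radiation_w_m2: Optional[float]
    cloud_cover_pct: Optional[float]


@dataclass(frozen=True)
class WeatherForecastRun:
    """One weather API pull and the points parsed from it (fields assumed)."""
    issued_at: datetime
    source: str
    points: tuple[WeatherForecastPoint, ...] = ()


@dataclass(frozen=True)
class WeatherResolvedTruth:
    """Observed weather for an hour that has already happened (fields assumed)."""
    resolved_at: datetime
    source: str
    temperature_c: Optional[float]
    shortwave_radiation_w_m2: Optional[float]
```

Keeping the models immutable means the parser, builder, and store can pass them around without any class mutating another's data.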

Core Data Shape

Every weather API pull is a forecast run.

issued_at = when the external forecast was fetched
target_at = the hour being forecast
horizon_hours = (target_at - issued_at), expressed in whole hours
forecast_value = external forecast value for that target hour
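
The horizon calculation above can be sketched as follows. The function name is hypothetical, and truncating issued_at to the start of its hour is an assumption, so that a fetch at 18:35 counts as the 18:00 run.

```python
from datetime import datetime


def horizon_hours(issued_at: datetime, target_at: datetime) -> int:
    """Whole hours from the fetch hour to the forecast target hour."""
    # Assumption: fetch times are truncated to the start of their hour.
    issued_hour = issued_at.replace(minute=0, second=0, microsecond=0)
    delta = target_at - issued_hour
    # Floor division keeps the result an integer number of hours;
    # targets before the fetch hour come out negative.
    return int(delta.total_seconds() // 3600)
```

Both timestamps should be either timezone-aware or naive; Python raises a TypeError when the two kinds are subtracted.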

Later, when target_at is in the past, Astrape can attach resolved truth:

resolved_at = the hour the observation applies to
truth = observed value for that hour (temperature, solar radiation)

That creates rows future modules can use:

target_at | resolved_truth | forecast_1h | forecast_2h | ... | forecast_48h

The future predictor can learn from those rows without needing to know anything about Open-Meteo payloads.
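
One way to assemble rows of that shape from long-format records is a simple pivot. This is a sketch under assumptions: points are plain dicts keyed by the storage field names, truths is a mapping from resolved_at to the observed value, and pivot_rows is a hypothetical helper name.

```python
from collections import defaultdict


def pivot_rows(points, truths, value_key="temperature_c"):
    """Turn per-horizon forecast points into one wide row per target hour.

    points: iterable of dicts with target_at, horizon_hours, and value_key.
    truths: dict mapping resolved_at -> observed value.
    """
    by_target = defaultdict(dict)
    for point in points:
        column = f"forecast_{point['horizon_hours']}h"
        by_target[point["target_at"]][column] = point[value_key]

    rows = []
    for target_at in sorted(by_target):
        # Truth is None until the target hour has been resolved.
        row = {"target_at": target_at, "resolved_truth": truths.get(target_at)}
        row.update(by_target[target_at])
        rows.append(row)
    return rows
```

Horizons that were never fetched for a target hour are simply absent from that row, so a consumer has to decide how to treat missing columns.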

First Variables

Use Open-Meteo hourly forecast fields:

  • temperature_2m
  • shortwave_radiation
  • cloud_cover

Open-Meteo documents shortwave_radiation as average incoming solar radiation over the preceding hour at the surface, equivalent to GHI, measured in W/m². That is the right starting solar forecast variable for Astrape.
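
A minimal sketch of requesting and unpacking those three fields follows. It only builds the request URL and parses a payload that has already been fetched; no network call is made. The helper names are hypothetical, requesting timezone=GMT is an assumption, and SAMPLE_PAYLOAD is a hand-written stand-in for the hourly block, not real API output.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode

HOURLY_FIELDS = ("temperature_2m", "shortwave_radiation", "cloud_cover")


def build_forecast_url(latitude: float, longitude: float) -> str:
    """Build an Open-Meteo forecast URL for the three hourly fields."""
    params = {
        "latitude": latitude,
        "longitude": longitude,
        "hourly": ",".join(HOURLY_FIELDS),
        "timezone": "GMT",  # assumption: keep all timestamps in UTC
    }
    # safe="," keeps the field list readable instead of percent-encoded.
    return "https://api.open-meteo.com/v1/forecast?" + urlencode(params, safe=",")


def parse_hourly(payload: dict) -> list[dict]:
    """Zip the parallel hourly arrays into one record per target hour."""
    hourly = payload["hourly"]
    records = []
    for i, stamp in enumerate(hourly["time"]):
        records.append({
            # Timestamps arrive without an offset; UTC is attached because
            # the request above asked for GMT.
            "target_at": datetime.fromisoformat(stamp).replace(tzinfo=timezone.utc),
            "temperature_c": hourly["temperature_2m"][i],
            "shortwave_radiation_w_m2": hourly["shortwave_radiation"][i],
            "cloud_cover_pct": hourly["cloud_cover"][i],
        })
    return records


# Hand-written stand-in payload for illustration only.
SAMPLE_PAYLOAD = {
    "hourly": {
        "time": ["2026-04-25T18:00", "2026-04-25T19:00"],
        "temperature_2m": [12.0, 11.0],
        "shortwave_radiation": [40.0, 0.0],
        "cloud_cover": [75, 80],
    }
}
```

The renaming from API field names to temperature_c, shortwave_radiation_w_m2, and cloud_cover_pct happens in the parsing step, so nothing downstream needs to know the Open-Meteo names.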

Storage Shape

Forecast points should be stored as individual rows.

Core fields:

  • issued_at
  • target_at
  • horizon_hours
  • source
  • temperature_c
  • shortwave_radiation_w_m2
  • cloud_cover_pct

Resolved truth should be stored separately. For now, resolved truth comes from the Open-Meteo historical archive API.

Until archive data is available, Astrape can also store the current 0-hour Open-Meteo forecast as provisional truth with source = open_meteo_zero_hour. This gives the UI and future joins a near-real-time truth line. Archive truth remains separate with source = open_meteo_archive, so later modules can choose whether to prefer archive actuals over provisional 0-hour values.

Core fields:

  • resolved_at
  • source
  • temperature_c
  • shortwave_radiation_w_m2

The future predictor can join forecast points to truth by target_at = resolved_at.
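
The join described above can be sketched with SQLite. The database engine, table names, and primary keys are assumptions for illustration; the column names and the two truth source values come from this document. The query prefers archive truth and falls back to the provisional 0-hour value when no archive row exists yet.

```python
import sqlite3

SCHEMA = """
CREATE TABLE forecast_points (
    issued_at TEXT NOT NULL,
    target_at TEXT NOT NULL,
    horizon_hours INTEGER NOT NULL,
    source TEXT NOT NULL,
    temperature_c REAL,
    shortwave_radiation_w_m2 REAL,
    cloud_cover_pct REAL,
    PRIMARY KEY (source, issued_at, target_at)
);
CREATE TABLE resolved_truth (
    resolved_at TEXT NOT NULL,
    source TEXT NOT NULL,
    temperature_c REAL,
    shortwave_radiation_w_m2 REAL,
    PRIMARY KEY (source, resolved_at)
);
"""

# Join forecasts to truth on target_at = resolved_at, keeping one truth row
# per hour: archive if present, otherwise the provisional 0-hour value.
JOIN_SQL = """
SELECT f.target_at, f.horizon_hours, f.temperature_c AS forecast_c,
       t.temperature_c AS truth_c, t.source AS truth_source
FROM forecast_points AS f
JOIN resolved_truth AS t ON t.resolved_at = f.target_at
WHERE t.source = (
    SELECT source FROM resolved_truth
    WHERE resolved_at = f.target_at
    ORDER BY CASE source WHEN 'open_meteo_archive' THEN 0 ELSE 1 END
    LIMIT 1
)
ORDER BY f.target_at, f.horizon_hours
"""


def open_memory_db() -> sqlite3.Connection:
    """Create an in-memory database with both tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn
```

Storing timestamps as ISO 8601 text in a single timezone keeps the equality join exact. Because source is part of the truth key, archive and provisional rows for the same hour can coexist.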

Open-Meteo archive data can lag behind current time depending on model availability, so the database daemon backfills a configurable historical window instead of assuming the last completed hour is immediately available.
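
The backfill window can be sketched as a list of hours to re-request on every daemon pass. The function name and the choice to exclude the current, still incomplete hour are assumptions.

```python
from datetime import datetime, timedelta


def backfill_hours(now: datetime, window_hours: int) -> list[datetime]:
    """Hourly timestamps in the trailing window, oldest first.

    The current hour is excluded because it has not finished yet.
    """
    current_hour = now.replace(minute=0, second=0, microsecond=0)
    return [current_hour - timedelta(hours=h) for h in range(window_hours, 0, -1)]
```

Re-requesting the whole window each pass lets hours that were missing earlier get filled in once the archive catches up, provided the store writes them idempotently.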

Visual Explorer

We should build a small web output for inspecting forecast history.

Useful first view:

  • select a weather variable, such as temperature or shortwave radiation
  • select forecast horizons, such as 2h and 4h
  • overlay those horizon-specific external forecasts against resolved truth
  • plot by target_at

Example:

target_at on x-axis
temperature_c on y-axis
line 1: Open-Meteo forecast made 2 hours before target_at
line 2: Open-Meteo forecast made 4 hours before target_at
line 3: resolved truth

This visual layer should read from the cleaned weather database. It should not be part of the Open-Meteo client or parser.
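
Before anything is drawn, the explorer needs one series per selected horizon plus a truth series. Here is a minimal sketch of that preparation step, assuming rows arrive as plain dicts keyed by the storage field names. The helper names are hypothetical and no plotting library is implied.

```python
def horizon_series(points, horizons, value_key="temperature_c"):
    """Return {horizon_hours: [(target_at, value), ...]} sorted by target_at."""
    series = {h: [] for h in horizons}
    for point in points:
        horizon = point["horizon_hours"]
        if horizon in series:
            series[horizon].append((point["target_at"], point[value_key]))
    for line in series.values():
        line.sort(key=lambda pair: pair[0])
    return series


def truth_series(truths, value_key="temperature_c"):
    """Return [(resolved_at, value), ...] sorted by resolved_at."""
    pairs = [(t["resolved_at"], t[value_key]) for t in truths]
    return sorted(pairs, key=lambda pair: pair[0])
```

Each returned list maps directly to one line on the chart, with the timestamp on the x-axis and the chosen variable on the y-axis.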

First Implementation Slice

  1. Fetch one Open-Meteo-style hourly forecast run.
  2. Parse it into forecast points.
  3. Normalize the run through WeatherBuilder.
  4. Store forecast points through WeatherStore.
  5. Add resolved truth rows from the Open-Meteo archive, using provisional 0-hour values until archive data is available.
  6. Build the visual explorer after forecast/truth storage exists.