# Weather Source Data

## Goal

This subsystem aggregates external weather forecasts and stores them in a clean database-ready shape.

Terminology:

- **forecast**: data from an external weather source, such as Open-Meteo
- **resolved truth**: observed weather for a time that has already happened
- **prediction**: an internal estimate produced by a future Astrape/Gibil model

This module should not produce predictions or confidence scores. A later `weather_predictor.py` subsystem can use this clean forecast database to produce predictions and confidence.

## Subsystem Boundary

Initial classes should stay narrowly scoped:

- `OpenMeteoClient`: fetch raw hourly forecast payloads
- `OpenMeteoParser`: convert API payloads into external forecast runs and points
- `WeatherBuilder`: normalize and select clean forecast records for database use
- `WeatherStore`: persist forecast points and resolved truth

These classes communicate through data models like `WeatherForecastRun`, `WeatherForecastPoint`, and `WeatherResolvedTruth`.

## Core Data Shape

Every weather API pull is a forecast run.

```text
issued_at = when the external forecast was fetched
target_at = the hour being forecast
horizon_hours = target_at - issued_at
forecast_value = external forecast value for that target hour
```

Later, when `target_at` is in the past, Astrape can attach resolved truth:

```text
resolved_at = the hour that actually happened
truth = observed temperature / observed solar radiation
```

That creates rows future modules can use:

```text
target_at | resolved_truth | forecast_1h | forecast_2h | ... | forecast_48h
```

The future predictor can learn from those rows without needing to know anything about Open-Meteo payloads.

## First Variables

Use Open-Meteo hourly forecast fields:

- `temperature_2m`
- `shortwave_radiation`
- `cloud_cover`

Open-Meteo documents `shortwave_radiation` as average incoming solar radiation over the preceding hour at the surface, equivalent to GHI, measured in W/m2. That is the right starting solar forecast variable for Astrape.

## Storage Shape

Forecast points should be stored as individual rows.

Core fields:

- `issued_at`
- `target_at`
- `horizon_hours`
- `source`
- `temperature_c`
- `shortwave_radiation_w_m2`
- `cloud_cover_pct`

Resolved truth should be stored separately. For now, resolved truth comes from the Open-Meteo historical archive API.

Until archive data is available, Astrape can also store the current 0-hour Open-Meteo forecast as provisional truth with `source = open_meteo_zero_hour`. This gives the UI and future joins a near-real-time truth line. Archive truth remains separate with `source = open_meteo_archive`, so later modules can choose whether to prefer archive actuals over provisional 0-hour values.

Core fields:

- `resolved_at`
- `source`
- `temperature_c`
- `shortwave_radiation_w_m2`

The future predictor can join forecast points to truth by `target_at = resolved_at`.

Open-Meteo archive data can lag behind current time depending on model availability, so the database daemon backfills a configurable historical window instead of assuming the last completed hour is immediately available.

## Visual Explorer

We should build a small web output for inspecting forecast history.

Useful first view:

- select a weather variable, such as temperature or shortwave radiation
- select forecast horizons, such as 2h and 4h
- overlay those horizon-specific external forecasts against resolved truth
- plot by `target_at`

Example:

```text
target_at on x-axis
temperature_c on y-axis
line 1: Open-Meteo forecast made 2 hours before target_at
line 2: Open-Meteo forecast made 4 hours before target_at
line 3: resolved truth
```

This visual layer should read from the cleaned weather database. It should not be part of the Open-Meteo client or parser.

## First Implementation Slice

1. Fetch one Open-Meteo-style hourly forecast run.
2. Parse it into forecast points.
3. Normalize the run through `WeatherBuilder`.
4. Store forecast points through `WeatherStore`.
5. Add resolved truth rows when we have a source for observed weather.
6. Build the visual explorer after forecast/truth storage exists.
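
## Sketch: Data Models and Truth Join

The following is a minimal sketch of how the forecast and truth shapes described in Core Data Shape and Storage Shape might look in Python, and how the `target_at = resolved_at` join could produce the wide rows. The field names come from this document. Everything else is an assumption made for illustration: the dataclass layout, deriving `horizon_hours` as a property rounded to the nearest whole hour, the helper name `build_wide_rows`, the forecast `source` value `"open_meteo"`, the policy of preferring archive truth over provisional 0-hour truth, and the sample numbers, which are not real observations.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass(frozen=True)
class WeatherForecastPoint:
    issued_at: datetime
    target_at: datetime
    source: str
    temperature_c: Optional[float]
    shortwave_radiation_w_m2: Optional[float]
    cloud_cover_pct: Optional[float]

    @property
    def horizon_hours(self) -> int:
        # horizon_hours = target_at - issued_at, rounded to whole hours
        # (rounding is an assumption; the doc only gives the subtraction).
        return round((self.target_at - self.issued_at).total_seconds() / 3600)


@dataclass(frozen=True)
class WeatherResolvedTruth:
    resolved_at: datetime
    source: str
    temperature_c: Optional[float]
    shortwave_radiation_w_m2: Optional[float]


def build_wide_rows(points, truths, variable="temperature_c"):
    """Hypothetical helper: one row per target_at with forecast_<N>h columns."""
    truth_by_hour = {}
    for t in truths:
        # Assumed policy: archive truth replaces provisional 0-hour truth.
        existing = truth_by_hour.get(t.resolved_at)
        if existing is None or t.source == "open_meteo_archive":
            truth_by_hour[t.resolved_at] = t

    rows = {}
    for p in points:
        row = rows.setdefault(
            p.target_at, {"target_at": p.target_at, "resolved_truth": None}
        )
        row[f"forecast_{p.horizon_hours}h"] = getattr(p, variable)

    # Join forecast rows to truth by target_at == resolved_at.
    for target_at, row in rows.items():
        truth = truth_by_hour.get(target_at)
        if truth is not None:
            row["resolved_truth"] = getattr(truth, variable, None)

    return [rows[key] for key in sorted(rows)]


# Illustrative values only.
target = datetime(2024, 6, 1, 12, tzinfo=timezone.utc)
points = [
    WeatherForecastPoint(target - timedelta(hours=2), target, "open_meteo", 21.0, 640.0, 20.0),
    WeatherForecastPoint(target - timedelta(hours=4), target, "open_meteo", 20.0, 610.0, 35.0),
]
truths = [
    WeatherResolvedTruth(target, "open_meteo_zero_hour", 21.4, 655.0),
    WeatherResolvedTruth(target, "open_meteo_archive", 21.8, 662.0),
]
rows = build_wide_rows(points, truths)
```

Here `rows` holds a single entry for the shared `target_at`, with `forecast_2h`, `forecast_4h`, and `resolved_truth` columns. Keeping the archive-versus-provisional choice inside the join, rather than in storage, matches the intent that later modules decide which truth source to prefer.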