Astrape/docs/ingestion-and-storage.md

Ingestion & Storage

Purpose

Astrape needs a reliable way to collect energy-related data, normalize it, store it, and give Gibil a clean view of the current system state. The first version should favor boring, inspectable data flows over cleverness.

Gibil should not need to know whether a value came from Modbus, Home Assistant, a weather API, a price API, or a manual override. It should receive timestamped observations and snapshots with enough metadata to decide whether the data is fresh and trustworthy.

Initial Sources

Sigen Inverter

  • Protocol: Modbus TCP
  • Polling target: every 5-10 seconds for fast-changing electrical state
  • Initial metrics:
    • solar_power_w
    • battery_soc_pct
    • battery_charge_w
    • battery_discharge_w
    • grid_import_w
    • grid_export_w
    • daily_yield_kwh
  • Risk: the register map must be confirmed against real hardware before this collector can be implemented
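
The metric list above reports import/export and charge/discharge as separate values. If the inverter turns out to expose a single signed power reading (an assumption until the register map is confirmed), the split could look like the sketch below. The function name and the sign convention are hypothetical.

```python
def split_signed_power(power_w: float) -> tuple[float, float]:
    """Split one signed reading into (positive_part_w, negative_part_w).

    Assumption: positive means import (or charge) and negative means export
    (or discharge). The real convention depends on the confirmed register map.
    """
    if power_w >= 0:
        return float(power_w), 0.0
    return 0.0, float(-power_w)


# A reading of -1250 W becomes 0 W import and 1250 W export.
grid_import_w, grid_export_w = split_signed_power(-1250.0)
```

Keeping both halves non-negative means downstream code never has to remember which sign means which direction.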

Home Assistant / Ganymede

  • Preferred integration: MQTT
  • Direction: HASS/Ganymede should publish selected state to Astrape where possible
  • Initial metrics:
    • home_power_w
    • indoor_temp_c
    • selected device states
    • selected sensor values needed for water/heating logic
  • Reasoning: MQTT keeps Astrape loosely coupled and avoids making HASS a synchronous dependency for every decision tick

Weather

  • Preferred first source: Open-Meteo
  • Polling target: hourly forecast refresh
  • Initial metrics:
    • outdoor_temp_c
    • cloud_cover_pct
    • ghi_w_m2 (global horizontal irradiance; stored as shortwave_radiation_w_m2 in the forecast tables)
    • wind_speed_m_s
  • Use: external forecast history for generation and heating models

Grid Pricing

  • First implementation: static time-of-use config
  • Later implementation: spot pricing API if needed
  • Initial metrics:
    • grid_price_per_kwh
    • price_stage
    • cheap_window_active
  • Reasoning: static config lets Gibil produce useful behavior before price API work is settled
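
A static time-of-use config can produce all three price metrics with a simple lookup. The sketch below is illustrative only: the schedule, stage names, and prices are hypothetical placeholders, and `price_metrics` is not an existing Astrape function.

```python
from bisect import bisect_right
from datetime import time

# Hypothetical schedule. Each entry applies from its start time until the
# next entry's start; the first entry must start at 00:00.
TOU_SCHEDULE = [
    (time(0, 0), "cheap", 0.50),
    (time(6, 0), "normal", 1.20),
    (time(22, 0), "cheap", 0.50),
]


def price_metrics(local_time: time) -> dict:
    """Return the price metrics that apply at the given local wall-clock time."""
    starts = [start for start, _, _ in TOU_SCHEDULE]
    # Find the last schedule entry whose start is <= local_time.
    index = bisect_right(starts, local_time) - 1
    _, stage, price = TOU_SCHEDULE[index]
    return {
        "grid_price_per_kwh": price,
        "price_stage": stage,
        "cheap_window_active": stage == "cheap",
    }


night = price_metrics(time(3, 0))
```

Because the output has the same shape a spot-price collector would produce, swapping in a price API later should not change what Gibil consumes.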

Manual Inputs

  • Purpose: allow operator-supplied values when a real integration is not available yet
  • Inputs may come from local config or a small authenticated admin path
  • Manual data should be marked clearly with source = manual

Observation Shape

Every collector should produce normalized observations.

observed_at: timestamp when the measurement was true
received_at: timestamp when Astrape received it
source: sigen | hass | weather | price | manual
metric: stable metric name
value: number, string, or boolean
unit: W | kWh | pct | C | W/m2 | m/s | SEK/kWh | state | none
quality: ok | stale | estimated | missing | error
metadata: source-specific context

Guidelines:

  • observed_at and received_at are both needed because pushed data may arrive late
  • metric names should be stable and boring
  • raw source names/registers/entities belong in metadata, not in the metric name
  • Gibil should be able to ignore stale or low-quality observations
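
The shape and guidelines above could be modeled roughly as follows. This is a minimal sketch, not the project's actual model: the class layout, the `is_usable` helper, and the rule that only `ok` quality counts as usable are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone
from typing import Union


@dataclass(frozen=True)
class Observation:
    observed_at: datetime          # when the measurement was true
    received_at: datetime          # when Astrape received it
    source: str                    # sigen | hass | weather | price | manual
    metric: str                    # stable metric name
    value: Union[float, str, bool]
    unit: str = "none"
    quality: str = "ok"
    metadata: dict = field(default_factory=dict)  # raw registers, entity ids, etc.

    def is_usable(self, now: datetime, max_age: timedelta) -> bool:
        """True when quality is ok and the measurement itself is recent enough."""
        return self.quality == "ok" and (now - self.observed_at) <= max_age


now = datetime(2026, 1, 1, 12, 0, 10, tzinfo=timezone.utc)
obs = Observation(
    observed_at=now - timedelta(seconds=10),
    received_at=now - timedelta(seconds=8),
    source="sigen",
    metric="battery_soc_pct",
    value=64.0,
    unit="pct",
    metadata={"register": "hypothetical"},
)
```

Freshness is measured from `observed_at` rather than `received_at`, so a late-arriving push does not look newer than it is.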

Derived Snapshots

Gibil should reason from snapshots, not directly from loose individual observations.

A snapshot is the best-known whole-system state at a decision tick. It can include:

  • current solar generation
  • current home consumption
  • battery SoC
  • battery charge/discharge power
  • grid import/export
  • current price stage
  • active forecast window
  • stale/missing input flags

Snapshots should be persisted because they explain what Gibil knew when it made a decision.
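
One way to derive such a snapshot is to take the newest observation per metric and record a quality flag for every required input. The sketch below assumes observations are plain dicts and that the caller supplies the required metric names and a maximum age; all of these are illustrative choices.

```python
from datetime import datetime, timedelta, timezone


def build_snapshot(observations, required_metrics, now, max_age):
    """Pick the newest observation per metric and flag stale or missing inputs."""
    latest = {}
    for obs in observations:
        current = latest.get(obs["metric"])
        if current is None or obs["observed_at"] > current["observed_at"]:
            latest[obs["metric"]] = obs

    values, input_quality = {}, {}
    for metric in required_metrics:
        obs = latest.get(metric)
        if obs is None:
            input_quality[metric] = "missing"
            continue
        values[metric] = obs["value"]
        too_old = now - obs["observed_at"] > max_age
        input_quality[metric] = "stale" if too_old else obs.get("quality", "ok")
    return {"created_at": now, "snapshot": values, "input_quality": input_quality}


now = datetime(2026, 1, 1, 12, 0, tzinfo=timezone.utc)
observations = [
    {"metric": "solar_power_w", "value": 2400.0, "observed_at": now - timedelta(seconds=5)},
    {"metric": "solar_power_w", "value": 2100.0, "observed_at": now - timedelta(seconds=15)},
    {"metric": "battery_soc_pct", "value": 64.0, "observed_at": now - timedelta(minutes=10)},
]
snapshot = build_snapshot(
    observations,
    required_metrics=("solar_power_w", "battery_soc_pct", "home_power_w"),
    now=now,
    max_age=timedelta(seconds=60),
)
```

Stale values are kept in the snapshot but flagged, so Gibil can decide whether a last-known value is better than none.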

Storage Choice

Use TimescaleDB as the first primary store.

Reasons:

  • It is Postgres, so querying and joining data stays straightforward
  • It handles time-series retention and aggregation well
  • It works for raw observations, derived snapshots, decisions, forecasts, and events
  • It leaves room for later model training without needing a second historical store immediately

InfluxDB remains a reasonable alternative, but TimescaleDB is the better default if we want relational joins, auditability, and forecast training queries.

The runtime expects ASTRAPE_DATABASE_URL to point at TimescaleDB. Weather ingest also expects ASTRAPE_LATITUDE and ASTRAPE_LONGITUDE.
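
A small loader can fail fast when these variables are missing. This is a sketch under assumptions: `RuntimeConfig` and `load_config` are hypothetical names, the example values are placeholders, and requiring the coordinates for every daemon (not only weather ingest) is a simplification.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class RuntimeConfig:
    database_url: str
    latitude: float
    longitude: float


def load_config(env=os.environ) -> RuntimeConfig:
    """Read the three documented variables, raising if any is unset or empty."""
    names = ("ASTRAPE_DATABASE_URL", "ASTRAPE_LATITUDE", "ASTRAPE_LONGITUDE")
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return RuntimeConfig(
        database_url=env["ASTRAPE_DATABASE_URL"],
        latitude=float(env["ASTRAPE_LATITUDE"]),
        longitude=float(env["ASTRAPE_LONGITUDE"]),
    )


# Placeholder values for illustration; real deployments read os.environ.
example = load_config({
    "ASTRAPE_DATABASE_URL": "postgresql://astrape@localhost/astrape",
    "ASTRAPE_LATITUDE": "59.3",
    "ASTRAPE_LONGITUDE": "18.1",
})
```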

Initial Tables

observations

Raw normalized metric samples from all collectors.

Core fields:

  • id
  • observed_at
  • received_at
  • source
  • metric
  • value_num
  • value_text
  • value_bool
  • unit
  • quality
  • metadata

Notes:

  • populate exactly one of value_num, value_text, or value_bool, chosen by the metric type; leave the other two null
  • keep metadata as JSON for source-specific details
  • make this a hypertable on observed_at
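
Choosing the value column can be done with a small helper at write time. The sketch below is one possible approach, with a hypothetical function name; it returns the three columns in the order value_num, value_text, value_bool.

```python
def value_columns(value):
    """Return (value_num, value_text, value_bool) with exactly one populated."""
    # bool must be checked before int/float because bool is a subclass of int.
    if isinstance(value, bool):
        return (None, None, value)
    if isinstance(value, (int, float)):
        return (float(value), None, None)
    if isinstance(value, str):
        return (None, value, None)
    raise TypeError(f"unsupported observation value type: {type(value).__name__}")


soc_columns = value_columns(64)
```

Without the explicit boolean check, `True` would be stored as `1.0` in value_num, which is an easy mistake to miss in Python.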

snapshots

Periodic whole-system state used by Gibil.

Core fields:

  • id
  • created_at
  • snapshot
  • input_quality

Notes:

  • store the snapshot as JSON initially
  • this can be normalized later if query patterns demand it

decisions

Gibil outputs and reasoning.

Core fields:

  • id
  • created_at
  • snapshot_id
  • stage
  • recommendations
  • reasons
  • confidence

Notes:

  • decisions should be explainable enough to debug after the fact
  • this table becomes the audit trail for HASS-facing behavior

weather_forecast_points

Cleaned forecast points from external weather sources.

Core fields:

  • id
  • issued_at
  • target_at
  • horizon_hours
  • source
  • temperature_c
  • shortwave_radiation_w_m2
  • cloud_cover_pct

Notes:

  • this stores external forecasts, not internal predictions
  • make this a hypertable on target_at
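
The horizon_hours field follows from issued_at and target_at. The sketch below assumes the forecast has already been parsed into one dict per target hour; that input shape, the function name, and the source label are hypothetical and do not mirror any weather API's response format.

```python
from datetime import datetime, timedelta, timezone


def to_forecast_points(issued_at, hourly_rows, source):
    """Attach issued_at, source, and horizon_hours to each parsed hourly row."""
    points = []
    for row in hourly_rows:
        horizon_hours = (row["target_at"] - issued_at) / timedelta(hours=1)
        if horizon_hours < 0:
            continue  # target hour is already in the past, so it is not a forecast
        points.append({
            "issued_at": issued_at,
            "target_at": row["target_at"],
            "horizon_hours": horizon_hours,
            "source": source,
            "temperature_c": row["temperature_c"],
            "shortwave_radiation_w_m2": row["shortwave_radiation_w_m2"],
            "cloud_cover_pct": row["cloud_cover_pct"],
        })
    return points


issued = datetime(2026, 1, 1, 6, 0, tzinfo=timezone.utc)
rows = [
    {"target_at": issued + timedelta(hours=h), "temperature_c": 1.0,
     "shortwave_radiation_w_m2": 0.0, "cloud_cover_pct": 80.0}
    for h in (-1, 0, 3)
]
points = to_forecast_points(issued, rows, source="openmeteo")
```

Storing the horizon explicitly makes it cheap to ask later how accurate forecasts are at a given number of hours ahead.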

weather_resolved_truth

Observed weather for target hours that have already happened.

Core fields:

  • id
  • resolved_at
  • source
  • temperature_c
  • shortwave_radiation_w_m2

Notes:

  • future prediction modules can join this to weather_forecast_points
  • make this a hypertable on resolved_at
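
The join mentioned above can be tried out locally with SQLite as a stand-in. This sketch assumes resolved_at holds the target hour, so that it matches target_at; the document does not state the join key, and the sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # local stand-in, not TimescaleDB
conn.executescript("""
CREATE TABLE weather_forecast_points (
    issued_at TEXT, target_at TEXT, horizon_hours REAL,
    source TEXT, temperature_c REAL
);
CREATE TABLE weather_resolved_truth (
    resolved_at TEXT, source TEXT, temperature_c REAL
);
""")
conn.executemany(
    "INSERT INTO weather_forecast_points VALUES (?, ?, ?, ?, ?)",
    [
        ("2025-12-31T12:00:00Z", "2026-01-01T12:00:00Z", 24, "openmeteo", 3.0),
        ("2026-01-01T11:00:00Z", "2026-01-01T12:00:00Z", 1, "openmeteo", 1.5),
    ],
)
conn.execute(
    "INSERT INTO weather_resolved_truth VALUES (?, ?, ?)",
    ("2026-01-01T12:00:00Z", "openmeteo", 1.0),
)

# Forecast error per horizon: forecast minus what actually happened.
errors = conn.execute("""
SELECT f.horizon_hours, f.temperature_c - t.temperature_c AS error_c
FROM weather_forecast_points AS f
JOIN weather_resolved_truth AS t ON t.resolved_at = f.target_at
ORDER BY f.horizon_hours
""").fetchall()
```

The same query shape should carry over to Postgres, with timestamp columns replacing the text timestamps used here.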

sigen_plant_snapshots

High-resolution Sigenergy plant state from Modbus TCP.

Core fields:

  • observed_at
  • received_at
  • source
  • solar_power_w
  • battery_soc_pct
  • battery_soh_pct
  • battery_power_w
  • grid_power_w
  • grid_import_w
  • grid_export_w
  • load_power_w
  • plant_active_power_w
  • accumulated_pv_energy_kwh
  • daily_consumed_energy_kwh
  • accumulated_consumed_energy_kwh
  • status fields for EMS, running state, and grid sensor state
  • raw_values

Notes:

  • raw polling target is 5 seconds (SIGEN_POLL_SECONDS=5)
  • make this a hypertable on observed_at
  • keep raw JSON during integration so unsupported or surprising registers can be debugged
  • rollup views should preserve averages, min/max spikes, and sample counts so short-duration usage signatures are not erased completely

Initial rollups:

  • sigen_plant_snapshots_1m
  • sigen_plant_snapshots_15m
  • sigen_plant_snapshots_1h
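
The rollup requirement (averages, min/max spikes, sample counts) can be illustrated in plain Python. This is a sketch of the aggregation logic only; in TimescaleDB the rollups would more likely be continuous aggregates, and the function name and sample values here are hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone


def rollup(samples, bucket: timedelta):
    """Group (observed_at, value) samples into fixed buckets with avg/min/max/count."""
    buckets = defaultdict(list)
    width = bucket.total_seconds()
    for observed_at, value in samples:
        epoch = observed_at.timestamp()
        # Floor the timestamp to the start of its bucket.
        buckets[epoch - (epoch % width)].append(value)
    return [
        {
            "bucket_start": datetime.fromtimestamp(start, tz=timezone.utc),
            "avg": sum(values) / len(values),
            "min": min(values),
            "max": max(values),
            "samples": len(values),
        }
        for start, values in sorted(buckets.items())
    ]


t0 = datetime(2026, 1, 1, 12, 0, tzinfo=timezone.utc)
load_power_w = [
    (t0, 100.0),
    (t0 + timedelta(seconds=5), 3000.0),   # short spike, e.g. a kettle
    (t0 + timedelta(seconds=10), 200.0),
    (t0 + timedelta(minutes=1), 150.0),
]
one_minute = rollup(load_power_w, timedelta(minutes=1))
```

In the first bucket the average is 1100 W, but the retained max of 3000 W still shows that a short spike happened.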

system_events

Operational events from collectors, storage, Gibil, and publishers.

Core fields:

  • id
  • created_at
  • component
  • severity
  • event_type
  • message
  • metadata

Notes:

  • this should capture stale data, auth failures, bad Modbus reads, publish failures, and degraded-mode decisions

Retention

Initial retention targets:

  • raw 5-10 second observations: 7-30 days
  • 1-minute aggregates: 6-12 months
  • 15-minute/hourly aggregates: keep indefinitely unless storage becomes a problem
  • decisions: keep indefinitely
  • system events: keep indefinitely or archive after a year

Retention should be revisited after real sample rates and database size are known.
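
Until real numbers exist, row counts can be estimated from the polling interval and retention window. The helper below is a rough arithmetic sketch with a hypothetical name; it counts rows only and says nothing about bytes on disk.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def rows_retained(interval_seconds: int, retention_days: int, series: int = 1) -> int:
    """Rows kept for `series` independent streams sampled every interval_seconds."""
    return (SECONDS_PER_DAY // interval_seconds) * retention_days * series


# One wide row per poll (as in sigen_plant_snapshots) at 5 s for 30 days.
plant_rows = rows_retained(5, 30)

# One row per metric per poll (as in observations), seven Sigen metrics.
observation_rows = rows_retained(5, 30, series=7)

# 1-minute aggregates kept for a year.
minute_rows = rows_retained(60, 365)
```

At 5-second polling, one stream produces 17,280 rows per day, so a 30-day window holds 518,400 rows per stream.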

First Slice

The first implementation slice should prove the shape before touching real hardware.

  1. Define the observation and snapshot models.
  2. Add a manual collector only if needed for operator-supplied values.
  3. Store observations in TimescaleDB or a local development substitute.
  4. Build one snapshot from the latest observations.
  5. Let Gibil make a simple stage decision from that snapshot.
  6. Persist the decision with reasons.

This gives us the whole loop:

collector -> observations -> snapshot -> Gibil decision -> stored audit trail

MQTT publishing can come immediately after this loop exists.
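
The whole loop can be exercised in memory before any database or hardware exists. Everything in this sketch is a placeholder: the lists stand in for tables, the stage names are invented, and the decision rule is hypothetical and is not Gibil's real logic.

```python
from datetime import datetime, timezone

# In-memory stand-ins for the observations, snapshots, and decisions tables.
observations, snapshots, decisions = [], [], []


def collect_manual(metric, value, unit, now):
    """Manual collector: record an operator-supplied value as an observation."""
    observations.append({
        "observed_at": now, "received_at": now, "source": "manual",
        "metric": metric, "value": value, "unit": unit, "quality": "ok",
    })


def build_snapshot(now):
    """Snapshot: newest value per metric (later observations overwrite earlier)."""
    values = {}
    for obs in sorted(observations, key=lambda o: o["observed_at"]):
        values[obs["metric"]] = obs["value"]
    snap = {"id": len(snapshots) + 1, "created_at": now, "snapshot": values}
    snapshots.append(snap)
    return snap


def decide(snap, now):
    """Hypothetical rule, only to exercise the loop end to end."""
    state = snap["snapshot"]
    surplus_w = state.get("solar_power_w", 0) - state.get("home_power_w", 0)
    decision = {
        "id": len(decisions) + 1, "created_at": now, "snapshot_id": snap["id"],
        "stage": "surplus" if surplus_w > 0 else "deficit",
        "reasons": [f"solar minus home load is {surplus_w} W"],
    }
    decisions.append(decision)  # stored audit trail
    return decision


now = datetime(2026, 1, 1, 12, 0, tzinfo=timezone.utc)
collect_manual("solar_power_w", 2400, "W", now)
collect_manual("home_power_w", 900, "W", now)
decision = decide(build_snapshot(now), now)
```

Each decision keeps its snapshot_id, so the audit trail can always answer what Gibil knew at the time.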

Open Questions

  • Should development use real TimescaleDB from day one, or start on SQLite or plain Postgres?
  • What is the exact MQTT topic namespace for HASS/Ganymede integration?
  • Which HASS entities should be included in the first read-only state feed?
  • How should the gibil IPA identity authenticate to MQTT and HASS?
  • What high-resolution retention target is acceptable on the Astrape VM?
  • Should snapshots be created on a fixed schedule, on new data, or both?