# Evaluation harness

Scenario-based evaluation for pumpingStation. Each scenario scripts a stream of inputs against a configured station, ticks the simulator at 1 s resolution, records every state, and prints a summary + event log + expectation check.

Separate from unit tests (`test/`): those verify individual pieces of logic in isolation; scenarios check end-to-end behaviour over time with realistic input trajectories.

## Run

```bash
# One scenario
node simulations/run.js levelbased-steady

# All scenarios at once
node simulations/run.js --all
```

Per-tick records are written to `simulations/logs/.jsonl` for post-hoc analysis (e.g. streaming into InfluxDB for Grafana, or pandas / jq for one-off exploration).

## Scenario file shape

```js
// simulations/scenarios/.js
module.exports = {
  name: 'scenario-identifier',
  description: 'one sentence: what the scenario is testing',
  durationSec: 1200,
  config: { /* PumpingStation config, same shape as nodeClass builds */ },
  setup: async (ps) => {
    // Optional. Wire fake MGCs, calibrate initial level, etc.
  },
  inputs: (t, ps) => {
    // Called every tick (t in seconds). Drive inflow, mode changes,
    // operator actions, etc.
    ps.setManualInflow(0.005, Date.now(), 'm3/s');
  },
  expectations: [
    { name: 'no safety trips', type: 'safety_trips_eq', value: 0 },
    { name: 'level stays below overflow', type: 'max_level_bounded', value: 4.5 },
  ],
};
```

## Supported expectation types

| Type | Semantics |
|---|---|
| `max_level_bounded` | max level across the run must be `≤ value` |
| `min_level_bounded` | min level across the run must be `≥ value` |
| `max_demand_bounded` | max percControl must be `≤ value` |
| `max_demand_gt` | max percControl must be `> value` |
| `safety_trips_eq` | total ticks with `safetyActive` must equal `value` |
| `safety_trips_gt` | total ticks with `safetyActive` must be `> value` |
| `end_state_eq` | final record's `field` must equal `value` |
| `threshold_issues_eq` | startup guardrail issue count must equal `value` |

Add new expectation types in `run.js` (`evalExpectation`).

## Output

Example run:

```
═══ Scenario: levelbased-steady ═══
Constant sewer inflow below pump capacity; level converges inside the RAMP zone with demand matching inflow.
Duration: 1200s, 1s ticks

─── Samples (every 10%) ───
  t(s)  level(m)  vol(m3)  dir       netFlow(m3/s)  src        demand  safe
────────────────────────────────────────────────────────────────────────────────────────
     0      2.00    20.00  steady                0  —              0%  ·
   120      2.64    26.40  draining        -0.0026  predicted     62%  ·
   240      2.30    23.00  draining        -0.0004  predicted     68%  ·
  ...

─── Events (3) ───
  t=   15s  direction  steady → filling
  t=  134s  direction  filling → draining

─── Metrics ───
  level        min=2.00  max=2.73  end=2.33  m
  percControl  min=0%  max=73%  end=66%
  safety       trips=0 ticks
  threshold    issues=0 at startup

─── Expectations ───
  ✓ no safety trips: 0 ticks with safetyActive (expected 0)
  ✓ level stays below overflow: max level = 2.73 m (bound: ≤ 4.5)
  ✓ level stays above outflow: min level = 2.00 m (bound: ≥ 0.2)
  ✓ no threshold issues on init: 0 threshold issues at startup (expected 0)

Log: simulations/logs/levelbased-steady.jsonl (1200 records)
✅ PASS
```

## Why separate from `test/`?

| | `test/` | `simulations/` |
|---|---|---|
| runner | `node --test` | `node simulations/run.js` |
| scope | one function / small behaviour | end-to-end scenario over time |
| duration | milliseconds | seconds to minutes (simulated) |
| assertion style | tight, exact (`assert.equal`) | tolerance / bounds / event counts |
| output | TAP | summary table + JSONL for analysis |
| purpose | catch regressions | analyse how the system responds to input |

Unit tests live under `test/basic/`, `test/integration/`, `test/edge/`. Scenarios live here under `simulations/scenarios/`.

## Sending logs to Grafana (optional)

The JSONL output has one record per tick. To stream into InfluxDB for Grafana viewing, adapt a small consumer:

```bash
jq -c '{
  measurement: "pumping_station_eval",
  tags: { scenario: "'$SCENARIO'" },
  fields: {
    level: .level,
    volume: .volume,
    demand: .percControl,
    safety: (.safetyActive|if . then 1 else 0 end)
  },
  timestamp: (.t | tonumber | . * 1000000000)
}' simulations/logs/$SCENARIO.jsonl \
  | influx write --bucket=telemetry ...
```

The `t` field is seconds from the scenario start (not wall-clock), so point the Grafana time range at `now() - $duration` after running.
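
The jq pipeline emits JSON objects, while `influx write` (in the InfluxDB 2.x CLI, as far as we know) reads line protocol, and scenario-relative `t` values only land in a `now() - $duration` window if they are shifted to wall-clock time. A minimal Node sketch of such a consumer follows. The helper names `toLineProtocol` and `convert`, the `startMs` anchor, and the assumption that `level`, `volume`, and `percControl` are plain numbers in each record are all illustrative and not part of `run.js`.

```javascript
// Hypothetical helpers: turn per-tick JSONL records into InfluxDB line protocol.
// Field names (t, level, volume, percControl, safetyActive) follow the jq
// example; the measurement name is reused from it. startMs is an assumed
// wall-clock anchor (milliseconds since epoch) for t = 0.

function toLineProtocol(record, scenario, startMs) {
  // Escape commas, equals signs and spaces in the tag value.
  const tag = String(scenario).replace(/[,= ]/g, '\\$&');

  // Keep only finite numeric fields; line protocol has no null or NaN.
  const numeric = {
    level: record.level,
    volume: record.volume,
    demand: record.percControl,
  };
  const fields = Object.entries(numeric)
    .filter(([, v]) => Number.isFinite(v))
    .map(([k, v]) => `${k}=${v}`);
  // Integer field, hence the trailing "i".
  fields.push(`safety=${record.safetyActive ? 1 : 0}i`);

  // Shift scenario-relative seconds to wall-clock nanoseconds.
  const offsetMs = BigInt(Math.round(Number(record.t) * 1000));
  const tsNs = (BigInt(startMs) + offsetMs) * 1000000n;

  return `pumping_station_eval,scenario=${tag} ${fields.join(',')} ${tsNs}`;
}

function convert(jsonlText, scenario, startMs) {
  return jsonlText
    .split('\n')
    .filter((line) => line.trim() !== '')
    .map((line) => toLineProtocol(JSON.parse(line), scenario, startMs));
}

// Inline sample instead of a real log file, so the sketch runs on its own.
const sample = [
  '{"t":0,"level":2,"volume":20,"percControl":0,"safetyActive":false}',
  '{"t":120,"level":2.64,"volume":26.4,"percControl":62,"safetyActive":false}',
].join('\n');

// Anchor t = 0 so that the last sample lands at roughly "now".
const startMs = Date.now() - 120 * 1000;
console.log(convert(sample, 'levelbased-steady', startMs).join('\n'));
```

Anchoring `startMs` at `Date.now() - durationSec * 1000` places the run's final tick at roughly the current time, which matches the `now() - $duration` window described above. The printed lines can then be piped to whatever writer you already use.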