Bumps machineGroupControl (e1e1977) and pumpingStation (ef07f2a) — example dashboard JSON tweaks committed on each submodule's development branch. Adds docs/research/ and docs/prd/ for the dashboardAPI v2 graph-aware Grafana generator workflow (Gitea issues #32-#43). Ignores .prototypes/ — throwaway spike code lives there per the /prototype skill.
Research brief: graph-aware Grafana dashboard generator in dashboardAPI
Date: 2026-05-26
Context: follows /grill-me session that locked design constraints; feeds into /prd.
Questions
- Node-RED lifecycle: how does a custom node reliably detect "deploy complete" across deploy types?
- Prior art: existing Node-RED → Grafana auto-dashboard generators
- Grafana HTTP API: idempotent dashboard updates by UID, version conflicts, RBAC
- Dynamic min/max envelope pattern: dashed reference lines that vary over time
- EVOLV-internal scaffolding already in place
Design constraints already settled in /grill-me
- dashboardAPI = dashboard generator, not just an InfluxDB writer.
- One dashboardAPI instance = one Grafana dashboard. Multiple instances coexist.
- Single source of truth: regen on Node-RED deploy clobbers manual Grafana edits.
- Trigger: HTTP API push from dashboardAPI to Grafana, fired on Node-RED deploy.
- Auth: per-flow Grafana service-account token.
- Templates centralized in nodes/dashboardAPI/src/templates/ per node type.
- Per-instance _measurement = node name (already in influxdbFormatter).
- No data duplication between parent and child panels (MGC shows group-level only).
- Predicted-vs-measured = 2 panels side by side; predicted only when no measured registered.
- Per-pump panel set: %control / flow / delta P / measured-from-children / efficiency / dashed dynamic bounds.
- Static config bounds → dashed reference lines that follow the live operating envelope (top/bottom dashed + act value).
What's already in this codebase
- Child registration is fully graph-aware. ChildRegistrationUtils keeps a Map<id, {child, softwareType, position, registeredAt}> with type-aware accessors getAllChildren(), getChildById(), getChildrenOfType(). (nodes/generalFunctions/src/helper/childRegistrationUtils.js:19-106)
- dashboardAPI already iterates its children. extractChildren() reads nodeSource.childRegistrationUtils.registeredChildren.values(). (nodes/dashboardAPI/src/specificClass.js:151-163)
- Grafana upsert URL is already constructed but not yet dispatched. grafanaUpsertUrl() builds the target URL; the HTTP send is missing. (nodes/dashboardAPI/src/specificClass.js:107-110)
- InfluxDB schema is measurement: nodeName, tags from flattened config (id, softwareType, role, positionVsParent, uuid, tagCode, geoLocation, category, type, model, unit). (nodes/generalFunctions/src/helper/outputUtils.js:44,99-117; formatters/influxdbFormatter.js:12-20)
- Lifecycle hooks: only node.on('close') and node.on('input') are used. No EVOLV node currently subscribes to RED.events.on('flows:started') or similar, so this is net-new wiring. (nodes/generalFunctions/src/nodered/BaseNodeAdapter.js:164,184)
- dashboardAPI's bearer token is stored as a plain defaults field, NOT as a Node-RED credentials: block, so it's not encrypted at rest today. (nodes/dashboardAPI/dashboardAPI.html:15-16; src/nodeClass.js:38-42) Contradicts the grilling assumption that "the existing InfluxDB credentials path" is already in place; it isn't.
- No outbound external HTTPS pattern exists anywhere in EVOLV nodes. Net-new code path.
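The token finding above could be addressed along these lines. This is a minimal sketch, not the existing dashboardAPI code: the function name registerDashboardApi, the credential key grafanaToken, the legacy config key bearerToken, and the tokenNeedsMigration flag are all hypothetical names chosen for illustration. Only RED.nodes.createNode, RED.nodes.registerType and its credentials option are standard Node-RED API.

```javascript
// Sketch only: registerDashboardApi, grafanaToken, bearerToken and
// tokenNeedsMigration are hypothetical names, not the current EVOLV code.
function registerDashboardApi(RED) {
  function DashboardApiNode(config) {
    RED.nodes.createNode(this, config);

    // Credentials are kept out of the exported flow JSON by Node-RED.
    const creds = this.credentials || {};

    // Fallback for flows saved before a migration, where the token
    // would still sit in the plain node config.
    this.grafanaToken = creds.grafanaToken || config.bearerToken || '';
    this.tokenNeedsMigration = !creds.grafanaToken && Boolean(config.bearerToken);
  }

  RED.nodes.registerType('dashboardAPI', DashboardApiNode, {
    // 'password' fields are never sent back to the editor in clear text.
    credentials: { grafanaToken: { type: 'password' } },
  });
}

module.exports = registerDashboardApi;
```

The editor-side registration in the node's .html file would need a matching credentials entry for the field to be editable. The tokenNeedsMigration flag is one way to surface a warning for flows that still carry the token in plain config.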
External options
- Legacy Grafana API (POST /api/dashboards/db with overwrite: true). Skips version + uid-uniqueness checks → idempotent. Returns 412 Precondition Failed on stale version when overwrite=false. Minimum RBAC: dashboards:write scoped to a folder. (docs)
- Grafana 12 Kubernetes-style API (/apis/dashboard.grafana.app/v1/...). Returns 409 Conflict instead of 412. Newer but couples integration to Grafana 12+.
- flows:started runtime event fires on every deploy (full / nodes / flows) with {type, diff} payload. De-dupe by inspecting diff.added/changed/removed. Runtime events are undocumented; must read source. (Node-RED packages/.../runtime/lib/flows/index.js)
- nodes-started event is deprecated; use flows:started.
- Dashed-line dynamic bands: the only path that works today is emitting min/max as separate Influx fields + applying fieldConfig.overrides[].properties[].id = "custom.lineStyle" with {fill: "dash", dash: [10,10]}. Per-series override via byName matcher.
- Grafana thresholds are static-only (open issue grafana/grafana#115398, Needs Prioritisation). Dead end for time-varying bands.
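To make the legacy-API option concrete, the request could be assembled roughly as below. This is a sketch under assumptions: buildUpsertRequest, classifyUpsertStatus, upsertDashboard and their parameters are hypothetical helpers, the commit message text is invented, and the global fetch default assumes Node.js 18 or newer. The endpoint, the overwrite flag and the 412 status come from the notes above.

```javascript
// Sketch only: helper names and parameters are hypothetical.
// Assumes Grafana is served at the root of baseUrl (no sub-path).
function buildUpsertRequest(baseUrl, token, dashboard, folderUid) {
  return {
    url: new URL('/api/dashboards/db', baseUrl).toString(),
    options: {
      method: 'POST',
      headers: {
        Authorization: `Bearer ${token}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({
        // A stable uid with a null id lets the same payload create or replace.
        dashboard: { ...dashboard, id: null },
        folderUid,
        // overwrite: true skips the version check, so repeated pushes are idempotent.
        overwrite: true,
        message: 'Regenerated by dashboardAPI on Node-RED deploy',
      }),
    },
  };
}

// Map the HTTP status to a coarse outcome the node can report.
function classifyUpsertStatus(status) {
  if (status === 200) return 'ok';
  if (status === 412) return 'version-conflict'; // only expected when overwrite=false
  if (status === 401 || status === 403) return 'auth';
  return 'error';
}

// fetchImpl is injectable so the send path can be exercised without a Grafana instance.
async function upsertDashboard(baseUrl, token, dashboard, folderUid, fetchImpl = fetch) {
  const { url, options } = buildUpsertRequest(baseUrl, token, dashboard, folderUid);
  const res = await fetchImpl(url, options);
  return classifyUpsertStatus(res.status);
}
```

Keeping the request builder pure leaves the network call as a thin wrapper, which may help given that no outbound HTTPS pattern exists in EVOLV nodes yet.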
Prior art
- No relevant prior art found. Every "node-red + grafana" tutorial puts Influx in the middle and hand-builds dashboards. No npm package pushes Grafana dashboards from Node-RED. Greenfield lane.
- Grafana Foundation SDK / dashboards-as-code (docs) — assumes out-of-band CI generation, not a live Node-RED instance.
- Operating-envelope plotting in Grafana — community thread 57225 asks the exact question, no accepted answer.
- Known Grafana bugs around custom.lineStyle: #75259 (transforms) and #86546 (overlapping dashed → solid).
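For reference, the custom.lineStyle override discussed in this brief could be generated per bound series as follows. This is a sketch: dashedOverride and envelopeFieldConfig are hypothetical helper names, and the series names passed in ('min', 'max') are assumed; the actual display names depend on how the Influx query labels its fields.

```javascript
// Sketch only: helper names are hypothetical; series names are assumptions.
function dashedOverride(seriesName) {
  return {
    // byName matches the series display name exactly.
    matcher: { id: 'byName', options: seriesName },
    properties: [
      { id: 'custom.lineStyle', value: { fill: 'dash', dash: [10, 10] } },
    ],
  };
}

// Build the fieldConfig for a panel whose bound series should render dashed.
function envelopeFieldConfig(boundSeries) {
  return { defaults: {}, overrides: boundSeries.map(dashedOverride) };
}

// Example: the min and max series are dashed, the actual value keeps the default style.
const fieldConfig = envelopeFieldConfig(['min', 'max']);
```

Whether the two bugs listed above affect this output against real series is still untested.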
Open unknowns
- (O-1) flows:started + diff reliability. Does diff cleanly distinguish "this dashboardAPI's flow changed" from "an unrelated flow changed" across all three deploy modes? Source-readable but needs an actual spike to verify edge cases (e.g. a Modified Nodes deploy that adds a child measurement to a pumpingStation registered to a dashboardAPI in a different tab). → Candidate for /prototype.
- (O-2) Dashed-line rendering against real Influx series. Two open Grafana bugs (#75259, #86546) affect custom.lineStyle. Untested whether either bites with EVOLV's emission pattern. → Candidate for /prototype.
- (O-3) Legacy /api/dashboards/db vs v12 K8s API. Which to commit to? Locks integration to a Grafana version family. Local stack uses grafana/grafana:latest, so the version drifts on docker compose pull. → PRD-time decision; pin Grafana image.
- (O-4) Bearer-token storage migration. Assumption that "follow existing creds pattern" doesn't hold; dashboardAPI stores it as plain config today. Need to migrate to Node-RED credentials: block. Risk: token currently sitting in flow.json of users' existing flows. → PRD-time decision; migration step in first issue.
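A starting point for the O-1 spike might look like the sketch below. It is an assumption-laden illustration, not verified behaviour: shouldRegenerate, subscribeToDeploys, getWatchedIds and regenerate are hypothetical names, and treating a missing diff as "regenerate" is a guess about full deploys that the spike would need to confirm. The diff.added/changed/removed fields are the ones named in this brief.

```javascript
// Sketch only: all function and parameter names here are hypothetical.
function shouldRegenerate(event, watchedIds) {
  const diff = event && event.diff;
  // Assumption: no diff means anything may have changed, so regenerate.
  if (!diff) return true;
  const touched = [].concat(diff.added || [], diff.changed || [], diff.removed || []);
  return touched.some((id) => watchedIds.has(id));
}

function subscribeToDeploys(RED, node, getWatchedIds, regenerate) {
  const onStarted = (event) => {
    // Resolve watched ids lazily: children register after construction.
    if (shouldRegenerate(event, getWatchedIds())) regenerate(event);
  };
  RED.events.on('flows:started', onStarted);
  // Runtime listeners outlive the node instance, so detach on close
  // to avoid stacking handlers across redeploys.
  node.on('close', () => RED.events.removeListener('flows:started', onStarted));
}
```

Here getWatchedIds would return the ids of the dashboardAPI node and its registered children. The cross-tab edge case in O-1 is exactly the situation where this id-matching approach could miss a relevant change.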
Recommended next step
/prd — commit the design, resolve O-3 and O-4 explicitly, and queue O-1 and O-2 for /prototype before the first issue ships.