Add deployment blueprint

Extend architecture review with security positioning
2026-03-23 11:54:24 +01:00 · 2026-03-23 11:35:40 +01:00
2 changed files with 346 additions and 6 deletions
--- a/architecture/deployment-blueprint.md
+++ b/architecture/deployment-blueprint.md
@@ -0,0 +1,270 @@
+# EVOLV Deployment Blueprint
+
+## Purpose
+
+This document turns the current EVOLV architecture into a concrete deployment model.
+
+It focuses on:
+
+- target infrastructure layout
+- container/service topology
+- environment and secret boundaries
+- rollout order from edge to site to central
+
+It is the local source document behind the wiki deployment pages.
+
+## 1. Deployment Principles
+
+- edge-first operation: plant logic must continue when central is unavailable
+- site mediation: site services protect field systems and absorb plant-specific complexity
+- central governance: external APIs, analytics, IAM, CI/CD, and shared dashboards terminate centrally
+- layered telemetry: InfluxDB is deployed at edge, site, and central wherever it is operationally justified
+- configuration authority: `tagcodering` should become the source of truth for configuration
+- secrets hygiene: tracked manifests contain variables only; secrets live in server-side env or secret stores
+
+## 2. Layered Deployment Model
+
+### 2.1 Edge node
+
+Purpose:
+
+- interface with PLCs and field assets
+- execute local Node-RED logic
+- retain local telemetry for resilience and digital-twin use cases
+
+Recommended services:
+
+- `evolv-edge-nodered`
+- `evolv-edge-influxdb`
+- optional `evolv-edge-grafana`
+- optional `evolv-edge-broker`
+
+Should not host:
+
+- public API ingress
+- central IAM
+- source control or CI/CD
+
+### 2.2 Site node
+
+Purpose:
+
+- aggregate one or more edge nodes
+- host plant-local dashboards and engineering visibility
+- mediate traffic between edge and central
+
+Recommended services:
+
+- `evolv-site-nodered` or `coresync-site`
+- `evolv-site-influxdb`
+- `evolv-site-grafana`
+- optional `evolv-site-broker`
+
+### 2.3 Central platform
+
+Purpose:
+
+- fleet-wide analytics
+- API and integration ingress
+- engineering lifecycle and releases
+- identity and governance
+
+Recommended services:
+
+- reverse proxy / ingress
+- API gateway
+- IAM
+- central InfluxDB
+- central Grafana
+- Gitea
+- CI/CD runner/controller
+- optional broker for asynchronous site/central workflows
+- configuration services over `tagcodering`
+
+## 3. Target Container Topology
+
+### 3.1 Edge host
+
+Minimum viable edge stack:
+
+```text
+edge-host-01
+  - Node-RED
+  - InfluxDB
+  - optional Grafana
+```
+
+Preferred production edge stack:
+
+```text
+edge-host-01
+  - Node-RED
+  - InfluxDB
+  - local health/export service
+  - optional local broker
+  - optional local dashboard service
+```
+
+### 3.2 Site host
+
+Minimum viable site stack:
+
+```text
+site-host-01
+  - Site Node-RED / CoreSync
+  - Site InfluxDB
+  - Site Grafana
+```
+
+Preferred production site stack:
+
+```text
+site-host-01
+  - Site Node-RED / CoreSync
+  - Site InfluxDB
+  - Site Grafana
+  - API relay / sync service
+  - optional site broker
+```
+
+### 3.3 Central host group
+
+Central should not remain a single undifferentiated host. Over time it should split into at least these responsibility groups:
+
+```text
+central-ingress
+  - reverse proxy
+  - API gateway
+  - IAM
+
+central-observability
+  - central InfluxDB
+  - Grafana
+
+central-engineering
+  - Gitea
+  - CI/CD
+  - deployment orchestration
+
+central-config
+  - tagcodering-backed config services
+```
+
+For early rollout these may be colocated, but the responsibility split should remain clear.
+
+## 4. Compose Strategy
+
+The current repository shows:
+
+- `docker-compose.yml` as a development stack
+- `temp/cloud.yml` as a broad central-stack example
+
+For production, EVOLV should not rely on one flat compose file for every layer.
+
+Recommended split:
+
+- `compose.edge.yml`
+- `compose.site.yml`
+- `compose.central.yml`
+- optional overlay files for site-specific differences
+
+Benefits:
+
+- clearer ownership per layer
+- smaller blast radius during updates
+- easier secret and env separation
+- easier rollout per site
+
+## 5. Environment And Secrets Strategy
+
+### 5.1 Current baseline
+
+`temp/cloud.yml` now uses environment variables instead of inline credentials. That is the minimum acceptable baseline.
+
+### 5.2 Recommended production rule
+
+- tracked compose files contain `${VARIABLE}` placeholders only
+- real secrets live in server-local `.env` files or a managed secret store
+- no shared default production passwords in git
+- separate env files per layer and per environment
+
+Suggested structure:
+
+```text
+/opt/evolv/
+  compose.edge.yml
+  compose.site.yml
+  compose.central.yml
+  env/
+    edge.env
+    site.env
+    central.env
+```
+
+## 6. Recommended Network Flow
+
+### 6.1 Northbound
+
+- edge publishes or syncs upward to site
+- site aggregates and forwards selected data to central
+- central exposes APIs and dashboards to approved consumers
+
+### 6.2 Southbound
+
+- central issues advice, approved config, or mediated requests
+- site validates and relays to edge where appropriate
+- edge remains the execution point near PLCs
+
+### 6.3 Forbidden direct path
+
+- enterprise or internet clients should not directly query PLC-connected edge runtimes
+
+## 7. Rollout Order
+
+### Phase 1: Edge baseline
+
+- deploy edge Node-RED
+- deploy local InfluxDB
+- validate PLC connectivity
+- validate local telemetry and resilience
+
+### Phase 2: Site mediation
+
+- deploy site Node-RED / CoreSync
+- connect one or more edge nodes
+- validate site-local dashboards and outage behavior
+
+### Phase 3: Central services
+
+- deploy ingress, IAM, API, Grafana, central InfluxDB
+- deploy Gitea and CI/CD services
+- validate controlled northbound access
+
+### Phase 4: Configuration backbone
+
+- connect runtime layers to `tagcodering`
+- reduce config duplication in flows
+- formalize config promotion and rollback
+
+### Phase 5: Smart telemetry policy
+
+- classify signals
+- define reconstruction rules
+- define authoritative layer per horizon
+- validate analytics and auditability
+
+## 8. Immediate Technical Recommendations
+
+- treat `docker/settings.js` as development-only and create hardened production settings separately
+- split deployment manifests by layer
+- define env files per layer and environment
+- formalize healthchecks and backup procedures for every persistent service
+- decide whether a broker is required at every layer (edge, site, central) or only at selected layers
+
+## 9. Next Technical Work Items
+
+1. create draft `compose.edge.yml`, `compose.site.yml`, and `compose.central.yml`
+2. define server directory layout and env-file conventions
+3. define production Node-RED settings profile
+4. define site-to-central sync path
+5. define deployment and rollback runbook
--- a/architecture/stack-architecture-review.md
+++ b/architecture/stack-architecture-review.md
@@ -364,7 +364,77 @@ Questions still open:
 - telemetry transport or only synchronization/eventing?
 - durability expectations and replay behavior?

-## 7. Recommended Ideal Stack
+## 7. Security And Regulatory Positioning
+
+### 7.1 Purdue-style layering is a good fit
+
+EVOLV's preferred structure aligns well with a Purdue-style OT/IT layering approach:
+
+- PLCs and field assets stay at the operational edge
+- edge runtimes stay close to the process
+- site systems mediate between OT and broader enterprise concerns
+- central services host APIs, identity, analytics, and engineering workflows
+
+This matters because it supports segmented trust boundaries rather than direct enterprise-to-field reach-through.
+
+### 7.2 NIS2 alignment
+
+Directive (EU) 2022/2555 (NIS2) requires cybersecurity risk-management measures, incident handling, and stronger governance for covered entities.
+
+This architecture supports that by:
+
+- limiting direct exposure of field systems
+- separating operational layers
+- enabling central policy and oversight
+- preserving local operation during upstream failure
+
+### 7.3 CER alignment
+
+Directive (EU) 2022/2557 (Critical Entities Resilience Directive) focuses on resilience of essential services.
+
+The edge-plus-site approach supports that direction because:
+
+- local/site layers can continue during central disruption
+- essential service continuity does not depend on one central runtime
+- degraded-mode behavior can be explicitly designed per layer
+
+### 7.4 Cyber Resilience Act alignment
+
+Regulation (EU) 2024/2847 (Cyber Resilience Act) creates cybersecurity requirements for products with digital elements.
+
+For EVOLV, that means the platform should keep strengthening:
+
+- secure configuration handling
+- vulnerability and update management
+- release traceability
+- lifecycle ownership of components and dependencies
+
+### 7.5 GDPR alignment where personal data is present
+
+Regulation (EU) 2016/679 (GDPR) applies whenever EVOLV processes personal data.
+
+The architecture helps by:
+
+- centralizing ingress
+- reducing unnecessary propagation of data to field layers
+- making access, retention, and audit boundaries easier to define
+
+### 7.6 What can and cannot be claimed
+
+The defensible claim is that EVOLV can be deployed in a way that supports compliance with strict European cybersecurity and resilience expectations.
+
+The non-defensible claim is that EVOLV is automatically compliant purely because of the architecture diagram.
+
+Actual compliance still depends on implementation and operations, including:
+
+- access control
+- patch and vulnerability management
+- incident response
+- logging and audit evidence
+- retention policy
+- data classification
+
+## 8. Recommended Ideal Stack

 The ideal EVOLV stack should be layered around operational boundaries, not around tools.

@@ -446,7 +516,7 @@ These should be explicit architecture elements:
 - versioned configuration and schema management
 - rollout/rollback strategy

-## 8. Recommended Opinionated Choices
+## 9. Recommended Opinionated Choices

 ### 8.1 Keep Node-RED as the orchestration layer, not the whole platform

@@ -501,7 +571,7 @@ The architecture should be designed so that `tagcodering` can mature into:
 - site/central configuration exchange point
 - API-served configuration source for runtime layers

-## 9. Suggested Phasing
+## 10. Suggested Phasing

 ### Phase 1: Stabilize contracts

@@ -533,13 +603,13 @@ The architecture should be designed so that `tagcodering` can mature into:
 - advisory services from central
 - auditability of downward recommendations and configuration changes

-## 10. Immediate Open Questions Before Wiki Finalization
+## 11. Immediate Open Questions Before Wiki Finalization

 1. Which signals are allowed to use reconstruction-aware smart storage, and which must remain raw or near-raw for audit/compliance reasons?
 2. How should `tagcodering` be exposed to runtime layers: direct database access, a dedicated API, or both?
 3. What exact responsibility split should EVOLV use between API synchronization and broker-based eventing?

-## 11. Recommended Wiki Structure
+## 12. Recommended Wiki Structure

 The wiki should not be one long page. It should be split into:

@@ -549,6 +619,6 @@ The wiki should not be one long page. It should be split into:
 4. security and access-boundary model
 5. configuration architecture centered on `tagcodering`

-## 12. Next Step
+## 13. Next Step

 Use this document as the architecture baseline. The companion markdown page in `architecture/` can then be shaped into a wiki-ready visual overview page with Mermaid diagrams and shorter human-readable sections.
Author	SHA1	Message	Date
znetsixe	1c4a3f9685	Add deployment blueprint	2026-03-23 11:54:24 +01:00
znetsixe	9ca32dddfb	Extend architecture review with security positioning	2026-03-23 11:35:40 +01:00
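
Section 5.2 of `architecture/deployment-blueprint.md` requires that tracked compose files contain `${VARIABLE}` placeholders only. A check along the following lines could enforce that rule in CI. This is a minimal sketch, not part of either commit: the function name `find_inline_secrets`, the keyword list used to recognise secret-like keys, and the sample compose text are all assumptions made for illustration. It scans line by line rather than parsing YAML, so it only covers simple `KEY: value` and `- KEY=value` entries.

```python
import re

# Assumed convention: keys containing these words are treated as secrets.
SECRET_KEY = re.compile(r"(PASSWORD|SECRET|TOKEN|API_KEY)", re.IGNORECASE)
# Matches "KEY: value" and "- KEY=value", the two forms of compose `environment:` entries.
ASSIGNMENT = re.compile(r"^\s*(?:-\s*)?([A-Za-z_][A-Za-z0-9_]*)\s*[:=]\s*(.*?)\s*$")
# A value that is nothing but a ${VARIABLE} placeholder, optionally quoted.
# Forms with defaults such as ${VAR:-admin} do not match on purpose,
# because the default itself would end up tracked in git.
PLACEHOLDER = re.compile(r"^[\"']?\$\{[A-Za-z_][A-Za-z0-9_]*\}[\"']?$")


def find_inline_secrets(compose_text: str) -> list[tuple[int, str]]:
    """Return (line_number, key) for each secret-like key with a literal value."""
    findings = []
    for number, line in enumerate(compose_text.splitlines(), start=1):
        match = ASSIGNMENT.match(line)
        if not match:
            continue
        key, value = match.groups()
        if SECRET_KEY.search(key) and value and not PLACEHOLDER.match(value):
            findings.append((number, key))
    return findings


# Illustrative compose fragment (hypothetical, not taken from the repository).
SAMPLE = """\
services:
  influxdb:
    image: influxdb:2
    environment:
      DOCKER_INFLUXDB_INIT_PASSWORD: ${INFLUXDB_PASSWORD}
      DOCKER_INFLUXDB_INIT_ADMIN_TOKEN: changeme
  grafana:
    image: grafana/grafana
    environment:
      - GF_SECURITY_ADMIN_PASSWORD=${GRAFANA_PASSWORD}
"""

if __name__ == "__main__":
    for number, key in find_inline_secrets(SAMPLE):
        print(f"line {number}: {key} has a literal value; use a ${{VARIABLE}} placeholder")
```

Run against `compose.edge.yml`, `compose.site.yml`, and `compose.central.yml` once they exist, a non-empty result could fail the pipeline. A real implementation would probably parse the YAML properly and take the list of secret-like keys from configuration instead of a hard-coded pattern.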