feat(multi-env): Phase B — make factory machinery env-capable (no activation)

ADR-0002 Phase B. Makes postgres/iac, argocd, and the conventions docs
multi-environment-capable WITHOUT activating any sandbox yet — every app
stays prod-only, so this change is behaviour-neutral:
  - postgres/iac `tofu plan` is a no-op (proven: the elision flatten keys
    are bare app names, db=<app>, role=<app>_role — identical addresses)
  - the argocd apps.yaml render is byte-identical (181→181 lines, diff
    empty) since no app declares `envs`

postgres/iac:
- variables.tf: `applications` becomes set(object({name, envs=optional(["prod"])}))
- main.tf: a `local.app_instances` flatten of applications × envs keyed by the
  elided instance id (env=prod → "<app>"); per-app resources iterate it and
  reference each.key / each.value.{database,role}. For prod-only apps every
  resource address + attribute is unchanged. (main.tf also got a full
  `tofu fmt` pass — the pgbouncer function block reindents 4→2 spaces, which
  is cosmetic; the correctness gate is the CI tofu plan, not the text diff.)
- terraform.tfvars: string entries → { name = "..." } objects.

argocd/templates/apps.yaml:
- after the prod Application, a `range $app_attr.envs` loop renders one extra
  Application per non-prod env: name/namespace `<app>-<env>`, shared repoURL,
  helm.valueFiles [values.yaml, values-<env>.yaml], per-env syncPolicy override.
  Renders nothing while no app sets `envs` → prod render unchanged.

docs:
- doc/runbooks/new-web-app/conventions.md (FR, authoritative): new section
  "Plusieurs environnements pour une même app" — elision rule, suffix rule,
  snake-case owner-role exception, erp/erp-sandbox table, ADR-0002 link.
- vibe/guidebooks/lab-ecosystem/naming-conventions.md (EN mirror): the env
  coordinate section + a "Two sandbox models" section reconciling the
  separate-cluster (ADR-0001, names repeat) vs in-cluster sibling (ADR-0002,
  <env> suffix) strategies; Last Updated bumped; ADR-0002 cross-links.

Activation (erp gets envs=["prod","sandbox"] in postgres tfvars + argocd
values + erp/iac) is Phase D, gated by its own plan review.

Refs ADR-0002 (factory#15). Phase A = tools#2 (merged). Phase C = erp#11 (merged).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
2026-06-28 16:28:28 +02:00
parent 8a1a63ee10
commit c00c4cdd5c
6 changed files with 178 additions and 59 deletions

View File

@@ -3,8 +3,8 @@
# Naming conventions — the `<app>` join key
> **Status**: 🟢 Active
> **Last Updated**: 2026-06-23
> **Related**: [Lab ecosystem](README.md) · [Factory brick](01-factory.md) · [Secrets & Vault](secrets-and-vault.md) · [PRD — isolation boundary](../../PRD/safe-prod-like-environment/isolation-boundary.md)
> **Last Updated**: 2026-06-25
> **Related**: [Lab ecosystem](README.md) · [Factory brick](01-factory.md) · [Secrets & Vault](secrets-and-vault.md) · [PRD — isolation boundary](../../PRD/safe-prod-like-environment/isolation-boundary.md) · [ADR 0002 — per-application environments](../../ADR/0002-per-application-environments.md)
> **Upstream (source of truth)**: [doc/runbooks/new-web-app/conventions.md](../../../doc/runbooks/new-web-app/conventions.md) (French, authoritative)
## TL;DR
@@ -83,9 +83,35 @@ The symptom is always the same: a brick that *looks* provisioned but never conne
✅ Choose a short, stable, lowercase kebab-case name up front and reuse it character-for-character.
❌ Never introduce variants (case, separators, plurals); nothing will warn you.
## Why this makes a sandbox safe
## Multiple environments per app (the `<env>` coordinate)
The `<app>` convention is also the reason a **production-like sandbox can reuse the exact same names** without colliding with production. Because every brick derives its resource names from `<app>` and from nothing else, an entire parallel universe of the platform — its own Vault, its own Postgres instance, its own k3s namespace scope — can host an `erp` named identically to the production `erp`, provided the two universes never share a backing store. Identity comes from the *environment boundary*, not from the name; the name is free to repeat. This is what lets QA and recovery drills run against `erp`, `webapp`, etc. with realistic identifiers instead of mangled `erp-staging`-style aliases that would themselves break the name-wiring. See the PRD's [isolation boundary](../../PRD/safe-prod-like-environment/isolation-boundary.md) for how that environment fence is drawn.
A single application can run as several deployed instances — `prod`, `sandbox`, and so on — **without becoming a separate app**: same repo, same chart, same version. A second coordinate `<env>` extends the join key, governed by an **elision rule** ([ADR 0002](../../ADR/0002-per-application-environments.md)):
- `env` defaults to `prod`, and **`prod` elides** — when `env == prod` no suffix is added, so every derived name is exactly the single-coordinate output of the mapping above. Existing apps are unaffected (their plan is a no-op).
- Non-prod envs take the **`<app>-<env>`** suffix everywhere — namespace, Vault paths / roles / policies, ArgoCD Application, DNS, GCS state sub-prefix — with the one snake-case exception inherited from the `_role` convention: the Postgres owner role is `<app>_<env>_role`.
- One repo, one chart, and one CI JWT role (`gitea_cicd_<app>`) serve every env; per-env differences are a `values-<env>.yaml` overlay.
Worked example — `erp` (prod, elided) and `erp-sandbox`:
| System | `erp` (env = prod) | `erp-sandbox` |
| --- | --- | --- |
| PostgreSQL database | `erp` | `erp-sandbox` |
| PostgreSQL owner role | `erp_role` | `erp_sandbox_role` |
| Namespace + ServiceAccount | `erp` | `erp-sandbox` |
| Vault dynamic DB creds | `postgres/creds/erp` | `postgres/creds/erp-sandbox` |
| Vault KV config | `kvv2/erp/config` | `kvv2/erp-sandbox/config` |
| ArgoCD Application | `erp` | `erp-sandbox` |
| Internal DNS | `erp.arcodange.lab` | `erp-sandbox.arcodange.lab` |
| Gitea repo / chart / CI JWT | `arcodange-org/erp` · chart · `gitea_cicd_erp` | shared |
## Two sandbox models, two naming strategies
There are two distinct ways to stand up a non-production copy, and they treat the join key differently — by design, not by accident.
- **Separate-cluster sandbox** ([ADR 0001](../../ADR/0001-safe-prod-like-environment.md)) — a whole parallel universe (its own Vault, Postgres, k3s) on the control node, for rehearsing dangerous *infrastructure* changes. The two universes never share a backing store, so identity comes from the *environment boundary*, not the name: the sandbox hosts an `erp` named identically to production. Names repeat freely; no `<env>` suffix is needed, so the name-wiring stays intact and drills run against realistic identifiers.
- **In-cluster sibling instance** ([ADR 0002](../../ADR/0002-per-application-environments.md)) — a second instance on the *same* cluster (e.g. `erp-sandbox` beside `erp`), for rehearsing *application-data* writes against the real API. Here there is no cluster fence to disambiguate by, so the `<env>` suffix *is* the separator: every derived name carries `-sandbox` to avoid colliding with prod's namespace, database, Vault paths, and DNS.
Both keep the name-wiring coherent — one by repeating the slug behind a cluster fence, the other by extending the slug with the elided `<env>` coordinate. See the PRD's [isolation boundary](../../PRD/safe-prod-like-environment/isolation-boundary.md) for how the separate-cluster fence is drawn, and [ADR 0002](../../ADR/0002-per-application-environments.md) for why the in-cluster sibling's blast radius stays bounded to one app's data.
## See also
@@ -93,4 +119,5 @@ The `<app>` convention is also the reason a **production-like sandbox can reuse
- [Secrets & Vault](secrets-and-vault.md) — how `gitea_cicd_<app>` and the `<app>` / `<app>-ops` policies fit the auth model.
- [Factory brick](01-factory.md) — where the ArgoCD app-of-apps, the Postgres OpenTofu, and the IaC live.
- [PRD — isolation boundary](../../PRD/safe-prod-like-environment/isolation-boundary.md) — why identical names are safe across environments.
- [ADR 0001 — Safe, production-like environment](../../ADR/0001-safe-prod-like-environment.md).
- [ADR 0001 — Safe, production-like environment](../../ADR/0001-safe-prod-like-environment.md) — the separate-cluster sandbox model.
- [ADR 0002 — Per-application environments](../../ADR/0002-per-application-environments.md) — the `<env>` coordinate + elision rule, and the in-cluster sibling sandbox model.