
vibe > PRD > Safe, production-like environment > Isolation boundary

Isolation boundary

Status: In design
Last Updated: 2026-06-23
Upstream: Safe, production-like environment
Related: ADR 0001 · INV-001 (prod blast-radius couplings)

The isolation boundary is the load-bearing part of this PRD: the sandbox must be unable to mutate real prod even on a wrong command. Every prod coupling that a sandbox run could touch is mapped below to a concrete control. The boundary is the cluster + Vault + state + DNS zone — not the names (see the naming note).

Prod couplings → sandbox controls

| Prod coupling | What it can break in prod | Sandbox control |
| --- | --- | --- |
| Ansible inventory hosts.yml (192.168.1.201-203) | Wipe disks, reset k3s, corrupt Longhorn on the live Pis. | Separate inventory/sandbox/hosts.yml (VM/cloud hosts only) plus a pre-task guard that aborts if any target IP is in 192.168.1.201-203 unless i_mean_prod=true is set explicitly. |
| OpenTofu state in gs://arcodange-tf (prefixes) | A sandbox apply rewrites live state and re-plans prod resources. | A sandbox prefix family (sandbox/factory/main, sandbox/tools/..., sandbox/factory/postgres) via a backend-config override, or a separate bucket gs://arcodange-tf-sandbox. Sandbox runs never touch prod state. |
| Gitea provider base_url gitea.arcodange.lab + ArgoCD repoURL / targetRevision | Sandbox commits/pushes into the prod forge; ArgoCD syncs sandbox refs onto the prod cluster. | Sandbox Gitea on the sandbox cluster (or org arcodange-sandbox); the sandbox app-of-apps points at a sandbox branch so the sandbox cluster syncs only sandbox refs. |
| Vault provider address vault.arcodange.lab + unseal key ~/.arcodange/cluster-keys.json | Sandbox writes clobber prod policies/auth/mounts; a botched init overwrites the prod unseal key. | A separate sandbox Vault; override the unseal-key path to ~/.arcodange/sandbox/cluster-keys.json so prod's key can never be overwritten. |
| PostgreSQL provider host 192.168.1.202 (superuser) | Drop or alter live DBs, including ERP business records. | Sandbox PG is the docker-compose on the sandbox pi2-equivalent; a guard refuses apply if host == 192.168.1.202 and workspace != prod. |
| Cloudflare account / OVH arcodange.fr / Zoho live mail | A wrong MX/SPF/DKIM silently breaks arcodange.fr mail for days. | DNS/email modules run plan-only against a throwaway zone/subdomain with a separate token. The real arcodange.fr token is never exported into a sandbox shell. Real public DNS/ACME is out of scope. |
| Longhorn backup bucket | A restore drill overwrites prod backups. | Sandbox backup target is a separate bucket/prefix so restore drills cannot overwrite prod backups. |
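
The two guards named in the table (the Ansible pre-task guard and the PostgreSQL apply guard) share the same logic: compare the target against a known prod address and refuse unless prod was requested explicitly. The following is a minimal Python sketch of that logic only, not the actual Ansible task or OpenTofu check. The function names, the exception class, and the assumption that targets arrive as IP literals are all hypothetical; the addresses, the `i_mean_prod` flag, and the `workspace != prod` rule come from the table.

```python
import ipaddress

# Prod Pi addresses named in the table (192.168.1.201-203).
PROD_IPS = {ipaddress.ip_address(f"192.168.1.{n}") for n in range(201, 204)}
# Prod PostgreSQL host named in the table.
PROD_PG_HOST = "192.168.1.202"


class ProdTargetError(RuntimeError):
    """Raised when a run would reach a prod address without explicit consent."""


def guard_inventory(target_ips, i_mean_prod=False):
    """Abort if any target is a prod Pi, unless i_mean_prod is exactly True.

    Assumes every target is an IP literal; hostnames would need resolving first.
    Returns the list of prod addresses that were found among the targets.
    """
    parsed = (ipaddress.ip_address(ip) for ip in target_ips)
    hits = sorted(str(ip) for ip in parsed if ip in PROD_IPS)
    if hits and i_mean_prod is not True:
        raise ProdTargetError(
            f"refusing to run: prod targets {hits} without i_mean_prod=true"
        )
    return hits


def guard_pg_apply(host, workspace):
    """Refuse an apply when the host is prod PostgreSQL and the workspace is not prod."""
    if host == PROD_PG_HOST and workspace != "prod":
        raise ProdTargetError(
            f"refusing apply: host {host} reached from workspace {workspace!r}"
        )
```

Both guards fail closed: the only way through to a prod address is an explicit opt-in (`i_mean_prod=True`, or a workspace literally named `prod`), so a mistyped inventory or a stale variable aborts the run instead of proceeding.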

The <app> naming note

The <app> key threads one kebab-case identifier through the Gitea repo, the PG db + role, the Vault paths/policies, the k8s namespace + SA, the ArgoCD Application, the GCS state prefix, and DNS — see conventions.

Because <app> keys everything within a cluster / Vault / DB / zone, the sandbox can reuse identical <app> names with no collision. The isolation boundary is the cluster + Vault + state + DNS zone, not the names. This is deliberate: runbooks read identically in both environments, so a drill exercises the exact same convention chain an operator runs in prod.
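
One way to picture why identical names do not collide: treat every <app>-keyed resource as qualified by the boundary it lives in. The sketch below is illustrative only. The class names, the cluster names, the sandbox Vault address and zone, and the `demo-app` key are placeholders invented here; the prod Vault address, state bucket, and zone are taken from the table above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Boundary:
    """The four things this PRD names as the isolation boundary."""
    cluster: str
    vault: str
    state: str
    dns_zone: str


@dataclass(frozen=True)
class AppResource:
    """An <app>-keyed resource, always qualified by the boundary it lives in."""
    boundary: Boundary
    kind: str  # e.g. "k8s-namespace", "vault-policy", "pg-role"
    app: str   # the kebab-case <app> key


# Cluster names and the sandbox Vault/zone are hypothetical placeholders.
PROD = Boundary("prod-cluster", "vault.arcodange.lab",
                "gs://arcodange-tf", "arcodange.fr")
SANDBOX = Boundary("sandbox-cluster", "vault.sandbox.invalid",
                   "gs://arcodange-tf-sandbox", "sandbox.invalid")

# Same <app> key and same kind in both environments.
prod_ns = AppResource(PROD, "k8s-namespace", "demo-app")
sandbox_ns = AppResource(SANDBOX, "k8s-namespace", "demo-app")
```

Here `prod_ns.app == sandbox_ns.app` holds while `prod_ns != sandbox_ns`: the name is shared, and the identity differs only through the boundary.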

Caution

The real arcodange.fr Cloudflare token must never be exported into a sandbox shell. DNS/email work in the sandbox is plan-only against a throwaway zone with its own separate token. Exporting the prod token into a sandbox session would defeat the entire isolation boundary — a single tofu apply could rewrite live public DNS or mail records.
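
A sandbox entry script could enforce this rule mechanically before any tofu command runs. The following is a minimal sketch under stated assumptions, not an existing tool in this repo: it assumes the token is passed through the `CLOUDFLARE_API_TOKEN` environment variable, and that a SHA-256 fingerprint of the prod token is recorded somewhere the sandbox can read, so the guard never needs the prod token itself. Both function names are hypothetical.

```python
import hashlib


def token_fingerprint(token: str) -> str:
    """Return the SHA-256 hex digest of a token, safe to store and compare."""
    return hashlib.sha256(token.encode("utf-8")).hexdigest()


def assert_no_prod_token(env, prod_fingerprints, var="CLOUDFLARE_API_TOKEN"):
    """Raise if `var` in `env` holds a token whose fingerprint is listed as prod.

    `env` is any mapping (os.environ in real use); `prod_fingerprints` is a set
    of hex digests produced by token_fingerprint().
    """
    token = env.get(var)
    if token and token_fingerprint(token) in prod_fingerprints:
        raise RuntimeError(
            f"{var} holds a prod token; unset it before any sandbox run"
        )
```

In real use the call would be `assert_no_prod_token(os.environ, known_prod_fingerprints)` at the top of the sandbox wrapper. An unset variable or a sandbox-only token passes; only a match against a recorded prod fingerprint aborts.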