Add a root AGENTS.md (ecosystem map of factory/tools/cms + agent operating rules + the persona cohort & workflow) and a new vibe/ knowledge base for LLM agents, modeled on tree-docs conventions and the factory house style.

vibe/ folders (each with a README hub + contribution rules):
- ADR/ optimized MADR-lite; canonical home going forward (doc/adr stays historical)
- PRD/ one subfolder per PRD, mandatory STATUS.md, QA strategy for big ones
- investigations/ single INV-NNN-slug.md, or stub + folder w/ notebooks
- guidebooks/ tree-docs maps; lab-ecosystem guidebook of factory+tools+cms
- runbooks/ [AGENT]/[HUMAN] step procedures (EN; doc/runbooks stays FR)
- shareouts/ dated FR handouts (decks/mp4)

Seed content (first ADR + PRD): a safe, production-like environment to rehearse risky changes and recovery without touching real prod: a local-only sandbox (k3d + arm64 VMs) with a hard prod/sandbox isolation boundary. Includes INV-001 (prod blast-radius couplings), the ecosystem guidebook, and a FR shareout.

Conventions enforced: no-tombstone rule, breadcrumb spine, bidirectional cross-links, theme:base mermaid (MCP-validated) + ordered-list-after-diagram.

Built with a Workflow + persona cohort; 24 files, zero dead links.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
vibe > Guidebooks > Lab ecosystem
Lab ecosystem
Status: ✅ Active
Last Updated: 2026-06-23
Related: ADR-0001 · safe prod-like environment | PRD · safe prod-like environment | INV-001 · prod blast-radius couplings
What this is
This guidebook is the end-to-end map of the Arcodange home lab — how the three repos (factory, tools, cms), the three Raspberry Pis, and the cloud edge wire together into one running system. It is a descriptive reference map, not a procedure: it answers "how does this fit together right now?". For "how do I add a new app step by step?" see the new-web-app runbook; for "why was it built this way?" see the factory ADRs.
The lab is run from one control node — a MacBook Pro M4 — driving everything via Ansible (imperative host setup) and OpenTofu (declarative cloud/Gitea/Vault/Postgres state). The three Pis (pi1/pi2/pi3 = 192.168.1.201-203) sit behind a home Livebox. pi1 is the k3s server; pi2/pi3 are agents. Gitea + PostgreSQL run as Docker Compose outside k3s on pi2's disk; everything else runs inside k3s on Longhorn distributed block storage. The public edge is a Cloudflared Zero-Trust tunnel into the internal Traefik, with Cloudflare DNS and Zoho email fronting arcodange.fr.
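The host layout above can be written down as a small inventory. This is only a sketch for illustration: the `Node` class and `build_inventory` helper are hypothetical names that do not exist in any of the three repos, and the real source of truth remains the Ansible inventory. It assumes pi1, pi2 and pi3 map in order to 192.168.1.201, .202 and .203.

```python
from dataclasses import dataclass
from ipaddress import IPv4Address, IPv4Network

# The home LAN behind the Livebox.
LAN = IPv4Network("192.168.1.0/24")


@dataclass(frozen=True)
class Node:
    """Hypothetical record for one Raspberry Pi (illustration only)."""
    name: str
    ip: IPv4Address
    k3s_role: str                  # "server" or "agent"
    outside_k3s: tuple[str, ...]   # services run via Docker Compose on the host


def build_inventory() -> list[Node]:
    """Assumes pi1..pi3 take 192.168.1.201..203 in order."""
    first_ip = IPv4Address("192.168.1.201")
    # Only Gitea and PostgreSQL are stated to run outside k3s, on pi2.
    compose_services = {"pi2": ("gitea", "postgresql")}
    nodes = []
    for offset in range(3):
        name = f"pi{offset + 1}"
        role = "server" if name == "pi1" else "agent"
        nodes.append(
            Node(name, first_ip + offset, role, compose_services.get(name, ()))
        )
    return nodes


if __name__ == "__main__":
    for node in build_inventory():
        assert node.ip in LAN
        extras = ", ".join(node.outside_k3s) or "-"
        print(f"{node.name}  {node.ip}  k3s {node.k3s_role}  outside k3s: {extras}")
```

Keeping the role and the out-of-cluster services on the same record makes the one asymmetry in the lab visible at a glance: pi2 is both a k3s agent and the host of the services the cluster depends on.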
The whole lab, end to end
%%{init: {'theme': 'base'}}%%
flowchart TB
classDef ctrl fill:#2563eb,stroke:#1e40af,color:#fff
classDef host fill:#0891b2,stroke:#0e7490,color:#fff
classDef proc fill:#059669,stroke:#047857,color:#fff
classDef store fill:#7c3aed,stroke:#6d28d9,color:#fff
classDef edge fill:#d97706,stroke:#b45309,color:#fff
classDef dead fill:#6b7280,stroke:#4b5563,color:#fff
MAC["Control node (MacBook Pro M4)<br>Ansible + OpenTofu"]:::ctrl
subgraph LAN["Home LAN (Livebox) — 192.168.1.0/24"]
subgraph PI2["pi2 · 192.168.1.202 (docker-compose, outside k3s)"]
GITEA["Gitea<br>arcodange-org/*"]:::host
PG[("PostgreSQL")]:::store
end
subgraph K3S["k3s cluster — pi1 server, pi2/pi3 agents"]
ARGO["ArgoCD app-of-apps<br> /argocd"]:::proc
LH[("Longhorn<br>block storage")]:::store
VAULT["Vault + VSO<br>secrets"]:::store
TRAEFIK["Traefik<br>ingress"]:::proc
TOOLS["tools namespace<br>(Vault, Grafana, CrowdSec, …)"]:::host
APPS["app namespaces<br>(webapp, erp, cms, …)"]:::host
end
OLLAMA["pi3 · ollama"]:::host
end
subgraph CLOUD["Cloud edge"]
CF["Cloudflare DNS<br>+ Cloudflared tunnel"]:::edge
ZOHO["Zoho<br>email (arcodange.fr)"]:::edge
GCS[("GCS gs://arcodange-tf<br>OpenTofu state + Longhorn backup")]:::store
end
INTERNET(["Internet"]):::edge
MAC -- "Ansible: provision hosts, k3s, docker-compose" --> PI2
MAC -- "Ansible: k3s, Longhorn, Traefik" --> K3S
MAC -- "OpenTofu: Gitea/Vault/PG/Cloudflare/OVH state" --> GITEA
MAC -- "OpenTofu state" --> GCS
GITEA -- "repoURL chart/" --> ARGO
ARGO -- "Application CRDs (prune+selfHeal)" --> TOOLS
ARGO -- "Application CRDs (prune+selfHeal)" --> APPS
VAULT -- "VSO injects secrets into pods" --> TOOLS
VAULT -- "VSO injects secrets into pods" --> APPS
APPS -- "dynamic creds" --> PG
LH -. "PVCs" .- TOOLS
LH -. "PVCs" .- APPS
LH -- "backup target" --> GCS
INTERNET --> CF -- "tunnel" --> TRAEFIK --> APPS
INTERNET --> ZOHO
- The control node (MacBook) provisions the three Pis with Ansible (OS, disks, Docker, k3s, Longhorn, Traefik) and manages all SaaS/Gitea/Vault/Postgres state with OpenTofu.
- On pi2, Gitea and PostgreSQL run as Docker Compose outside k3s, on the local disk — they are the source-of-truth services the cluster depends on.
- OpenTofu keeps its state in GCS (`gs://arcodange-tf`), and Longhorn pushes volume backups to the same GCS project.
- Gitea hosts every app repo; each repo's `chart/` directory is the deployable Helm chart.
- ArgoCD's app-of-apps turns each Gitea repo into an `Application` CRD (automated `prune` + `selfHeal`) that deploys into the `tools` namespace and the per-app namespaces.
- Vault is the single source of truth for secrets; the Vault Secrets Operator (VSO) injects them into pods via Kubernetes auth, and apps draw dynamic PostgreSQL credentials from Vault against `pi2`.
- Longhorn provides the PVCs the in-cluster workloads mount, and backs up to GCS.
- The public edge routes Internet traffic through Cloudflare DNS and a Cloudflared Zero-Trust tunnel into the internal Traefik, which fronts the app namespaces; Zoho handles `arcodange.fr` email.
Note
The ArgoCD Helm chart under `argocd/` is defined and templated, but ArgoCD itself is not currently deployed in-cluster (its install step is commented out in the `03_cicd` provisioning). The app-of-apps wiring documented here is the intended steady state; see 01 · factory for the caveat.
Deploy / secrets / DNS flows
- Deploy flow. Push to a Gitea repo → CI builds an image into the Gitea registry → ArgoCD (via the app-of-apps and, for some apps, the Image Updater) syncs the `chart/` directory into the matching namespace with `prune` + `selfHeal`. The whole chain keys off one `<app>` identifier; see naming-conventions.md.
- Secrets flow. Vault is the single source of truth (no sops/age). CI authenticates to Vault via Gitea OIDC JWT (role `gitea_cicd_<app>`); pods receive secrets at runtime via VSO (Kubernetes auth + `VaultDynamicSecret` CRDs). Detail in secrets-and-vault.md.
- DNS / edge flow. Internal names resolve under `*.arcodange.lab` (Pi-hole + Step-CA-issued TLS). Public traffic for `arcodange.fr` enters through Cloudflare and a Cloudflared tunnel to internal Traefik; public TLS is Let's Encrypt via Traefik's DNS-challenge (DuckDNS). Email runs through Zoho. Edge detail in 03 · cms.
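Because every flow keys off the single `<app>` identifier, the per-system names can be derived from it mechanically. The sketch below is illustrative, not the lab's tooling: `derived_names` is a hypothetical helper, and only the Vault role pattern `gitea_cicd_<app>` and the `chart/` directory are taken from this page. The Gitea path under `arcodange-org/` and the namespace being equal to `<app>` are assumptions; naming-conventions.md is the authority.

```python
import re

# The master index describes <app> as a kebab-case name.
KEBAB_CASE = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def derived_names(app: str) -> dict[str, str]:
    """Derive per-system names from one <app> join key (illustrative only)."""
    if not KEBAB_CASE.fullmatch(app):
        raise ValueError(f"{app!r} is not kebab-case")
    return {
        # Stated on this page: the CI role used for Gitea OIDC JWT auth.
        "vault_ci_role": f"gitea_cicd_{app}",
        # Stated on this page: the deployable Helm chart directory.
        "chart_path": "chart/",
        # Assumption: repos live under the arcodange-org organisation.
        "gitea_repo": f"arcodange-org/{app}",
        # Assumption: the "matching namespace" is named exactly <app>.
        "k8s_namespace": app,
    }


if __name__ == "__main__":
    for system, name in derived_names("webapp").items():
        print(f"{system:14} {name}")
```

Rejecting names that are not kebab-case at this single point reflects why a join key matters: a name that drifts in one system silently breaks the chain in the others.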
Master index
| Page | What it maps | Status |
|---|---|---|
| 01 · factory | The cornerstone admin repo: Ansible host/cluster provisioning, ArgoCD app-of-apps, OpenTofu (`iac/`), and per-app PostgreSQL (`postgres/iac/`) | ✅ Active |
| 02 · tools | The `tools` namespace: Vault, VSO, Prometheus, Grafana, CrowdSec, poolers, Redis/KeyDB, Plausible + ClickHouse, the `tool` library chart | ✅ Active |
| 03 · cms | The public-facing site: Nuxt static site, Cloudflare zone + tunnel + Turnstile, Zoho email (MX/SPF/DKIM/DMARC/BIMI + aliases) | ✅ Active |
| naming-conventions.md | The `<app>` join key: one kebab-case name reused identically across Gitea, PG, Vault, k8s, ArgoCD, GCS, DNS | ✅ Active |
| secrets-and-vault.md | How Vault is the single source of truth: Gitea OIDC JWT for CI, VSO injection for pods, dynamic PostgreSQL creds | ✅ Active |
| storage-and-recovery.md | Longhorn block storage, GCS backup target, and the tested power-cut recovery sequence | ✅ Active |
Status legend
✅ done · 🟡 beta · 🔴 critical · ⚠️ known issue · ❌ disabled · ⬜ not started.
Maintenance rule
Important
If you alter a component documented here, update its page in the same change. A reference map that drifts from reality sends readers (and agents) confidently down dead paths. The PR that changes the component is the PR that updates its guidebook page — treat the doc edit as part of the diff, not a follow-up.
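One way to make this rule checkable is a small pre-merge script that compares the changed paths against a component-to-page mapping. This is a minimal sketch, not an existing check in the repo: `missing_doc_updates` is a hypothetical helper, and both the path prefixes and the page filename in the usage example are placeholders, not confirmed repo paths.

```python
def missing_doc_updates(
    changed_files: list[str],
    component_to_page: dict[str, str],
) -> list[str]:
    """Return guidebook pages that should have changed but did not.

    component_to_page maps a path prefix to the page documenting it.
    A page is reported when a file under its prefix changed and the page
    itself is absent from the same change set.
    """
    changed = set(changed_files)
    missing = set()
    for path in changed:
        for prefix, page in component_to_page.items():
            if path.startswith(prefix) and page not in changed:
                missing.add(page)
    return sorted(missing)


if __name__ == "__main__":
    # Placeholder mapping: prefixes and page filename are assumptions.
    mapping = {
        "argocd/": "vibe/guidebooks/lab-ecosystem/01-factory.md",
        "iac/": "vibe/guidebooks/lab-ecosystem/01-factory.md",
    }
    for page in missing_doc_updates(["argocd/values.yaml"], mapping):
        print(f"component changed without updating {page}")
```

In practice the changed-file list would come from the diff of the PR under review, and a non-empty result would fail the check so the doc edit lands in the same change.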
Cross-references
- ADR-0001 · safe prod-like environment: the decision this map supports.
- PRD · safe prod-like environment: the product framing of an isolated, prod-like sandbox.
- INV-001 · prod blast-radius couplings: the couplings (the `<app>` join key, shared Vault/PG/Longhorn) that make blast radius real.
- doc/adr: the canonical infrastructure ADRs (in French).
- new-web-app conventions: the authoritative source for the `<app>` naming convention.