factory/vibe/guidebooks/lab-ecosystem
Gabriel Radureau dbe32161dc docs(vibe): add factory-provisioning guidebook (Ansible + OpenTofu)
Deep, code-grounded tree-docs guidebook under vibe/guidebooks/factory-provisioning/,
explored from the actual playbooks/roles and tofu code:

- Hub: the two provisioning engines (operator-run Ansible vs CI-applied OpenTofu),
  a green-field bring-up flow, master index, maintenance rule.
- ansible/ sub-tree: ordered pages 01-system .. 06-recover, an inventory & variables
  concept page, and a Tier-1/Tier-2 roles reference (hashicorp_vault, step_ca,
  crowdsec, pihole, deploy_docker_compose + the gitea_* family and helpers).
- opentofu/ sub-tree: factory-iac (Cloudflare/OVH/GCP/Gitea/Vault edge +
  cloudflare_token module), postgres-iac (per-app DB/role/pgbouncer lookup),
  ci-apply-flow (Gitea OIDC-JWT -> Vault -> auto-approve apply).

Cross-linked bidirectionally with the lab-ecosystem guidebook and the safe-env
ADR/PRD (the sandbox rehearses exactly these engines). 14 mermaid diagrams
MCP-validated; zero dead links. Authored by the Lab Cartographer cohort.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-23 21:11:51 +02:00

Lab ecosystem

Status: Active
Last Updated: 2026-06-23
Related: ADR-0001 (safe prod-like environment), PRD (safe prod-like environment), INV-001 (prod blast-radius couplings)

What this is

This guidebook is the end-to-end map of the Arcodange home lab: how the three repos (factory, tools, cms), the three Raspberry Pis, and the cloud edge wire together into one running system. It is a descriptive reference map, not a procedure; it answers "how does this fit together right now?" For "how do I add a new app step by step?" see the new-web-app runbook; for "why was it built this way?" see the factory ADRs.

The lab is run from one control node, a MacBook Pro M4, which drives everything via Ansible (imperative host setup) and OpenTofu (declarative cloud/Gitea/Vault/Postgres state). The three Pis (pi1, pi2, pi3 at 192.168.1.201 to 192.168.1.203) sit behind a home Livebox. pi1 is the k3s server; pi2 and pi3 are agents. Gitea and PostgreSQL run as Docker Compose services outside k3s, on pi2's disk; everything else runs inside k3s on Longhorn distributed block storage. The public edge is a Cloudflared Zero-Trust tunnel into the internal Traefik, with Cloudflare DNS and Zoho email fronting arcodange.fr.
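As a rough illustration of the topology described above, the hosts can be written down as a small inventory structure. This is a minimal sketch, not the lab's actual Ansible inventory: the `Host` class, the field names, and the helper functions are hypothetical, and the per-host addresses assume pi1, pi2, and pi3 map to .201, .202, and .203 in order.

```python
from dataclasses import dataclass
from ipaddress import IPv4Address


@dataclass(frozen=True)
class Host:
    """One Raspberry Pi in the lab (hypothetical model, for illustration)."""
    name: str
    ip: IPv4Address
    k3s_role: str  # "server" or "agent"
    compose_services: tuple[str, ...] = ()  # services run outside k3s


# Addresses and roles as stated in the paragraph above; the one-to-one
# mapping of names onto the .201-.203 range is an assumption.
HOSTS = (
    Host("pi1", IPv4Address("192.168.1.201"), "server"),
    Host("pi2", IPv4Address("192.168.1.202"), "agent", ("gitea", "postgresql")),
    Host("pi3", IPv4Address("192.168.1.203"), "agent"),
)


def k3s_server() -> Host:
    """Return the single k3s server, failing loudly if the model is inconsistent."""
    servers = [h for h in HOSTS if h.k3s_role == "server"]
    if len(servers) != 1:
        raise ValueError(f"expected exactly one k3s server, found {len(servers)}")
    return servers[0]


def hosts_running(service: str) -> list[str]:
    """Names of the hosts that run the given Docker Compose service."""
    return [h.name for h in HOSTS if service in h.compose_services]
```

For example, `k3s_server().name` gives `"pi1"` and `hosts_running("gitea")` gives `["pi2"]`, which matches the split between the k3s cluster and the Compose services on pi2.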

The whole lab, end to end

```mermaid
%%{init: {'theme': 'base'}}%%
flowchart TB
    classDef ctrl fill:#2563eb,stroke:#1e40af,color:#fff
    classDef host fill:#0891b2,stroke:#0e7490,color:#fff
    classDef proc fill:#059669,stroke:#047857,color:#fff
    classDef store fill:#7c3aed,stroke:#6d28d9,color:#fff
    classDef edge fill:#d97706,stroke:#b45309,color:#fff
    classDef dead fill:#6b7280,stroke:#4b5563,color:#fff

    MAC["Control node (MacBook Pro M4)<br>Ansible + OpenTofu"]:::ctrl

    subgraph LAN["Home LAN (Livebox) — 192.168.1.0/24"]
        subgraph PI2["pi2 · 192.168.1.202 (docker-compose, outside k3s)"]
            GITEA["Gitea<br>arcodange-org/*"]:::host
            PG[("PostgreSQL")]:::store
        end
        subgraph K3S["k3s cluster — pi1 server, pi2/pi3 agents"]
            ARGO["ArgoCD app-of-apps<br> /argocd"]:::proc
            LH[("Longhorn<br>block storage")]:::store
            VAULT["Vault + VSO<br>secrets"]:::store
            TRAEFIK["Traefik<br>ingress"]:::proc
            TOOLS["tools namespace<br>(Vault, Grafana, CrowdSec, …)"]:::host
            APPS["app namespaces<br>(webapp, erp, cms, …)"]:::host
        end
        OLLAMA["pi3 · ollama"]:::host
    end

    subgraph CLOUD["Cloud edge"]
        CF["Cloudflare DNS<br>+ Cloudflared tunnel"]:::edge
        ZOHO["Zoho<br>email (arcodange.fr)"]:::edge
        GCS[("GCS gs://arcodange-tf<br>OpenTofu state + Longhorn backup")]:::store
    end

    INTERNET(["Internet"]):::edge

    MAC -- "Ansible: provision hosts, k3s, docker-compose" --> PI2
    MAC -- "Ansible: k3s, Longhorn, Traefik" --> K3S
    MAC -- "OpenTofu: Gitea/Vault/PG/Cloudflare/OVH state" --> GITEA
    MAC -- "OpenTofu state" --> GCS

    GITEA -- "repoURL chart/" --> ARGO
    ARGO -- "Application CRDs (prune+selfHeal)" --> TOOLS
    ARGO -- "Application CRDs (prune+selfHeal)" --> APPS
    VAULT -- "VSO injects secrets into pods" --> TOOLS
    VAULT -- "VSO injects secrets into pods" --> APPS
    APPS -- "dynamic creds" --> PG
    LH -. "PVCs" .- TOOLS
    LH -. "PVCs" .- APPS
    LH -- "backup target" --> GCS

    INTERNET --> CF -- "tunnel" --> TRAEFIK --> APPS
    INTERNET --> ZOHO
```

  1. The control node (MacBook) provisions the three Pis with Ansible (OS, disks, Docker, k3s, Longhorn, Traefik) and manages all SaaS/Gitea/Vault/Postgres state with OpenTofu.
  2. On pi2, Gitea and PostgreSQL run as Docker Compose outside k3s, on the local disk — they are the source-of-truth services the cluster depends on.
  3. OpenTofu keeps its state in GCS (gs://arcodange-tf), and Longhorn pushes volume backups to the same GCS project.
  4. Gitea hosts every app repo; each repo's chart/ directory is the deployable Helm chart.
  5. ArgoCD's app-of-apps turns each Gitea repo into an Application CRD (automated prune + selfHeal) that deploys into the tools namespace and the per-app namespaces.
  6. Vault is the single source of truth for secrets; the Vault Secrets Operator (VSO) injects them into pods via Kubernetes auth, and apps obtain dynamic, Vault-issued credentials for the PostgreSQL instance on pi2.
  7. Longhorn provides the PVCs the in-cluster workloads mount, and backs up to GCS.
  8. The public edge routes Internet traffic through Cloudflare DNS and a Cloudflared Zero-Trust tunnel into the internal Traefik, which fronts the app namespaces; Zoho handles arcodange.fr email.
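The wiring in the steps above can also be read as a dependency graph, which makes it easy to ask which parts of the lab are affected when one component is unavailable. The sketch below is one possible reading of the diagram, not a tool that exists in the repos: the `DEPENDS_ON` table and the `affected_by` function are hypothetical, and the edges are simplified (for instance, Vault is treated as a single node).

```python
# Who needs whom, as read from the diagram: key depends on each value.
# This table is an illustrative simplification, not generated from the repos.
DEPENDS_ON = {
    "argocd": {"gitea"},
    "tools": {"argocd", "vault", "longhorn"},
    "apps": {"argocd", "vault", "longhorn", "postgresql", "traefik"},
}


def affected_by(component: str) -> set[str]:
    """Return every component that depends, directly or transitively, on `component`."""
    affected: set[str] = set()
    frontier = [component]
    while frontier:
        current = frontier.pop()
        for name, deps in DEPENDS_ON.items():
            # A component is affected if it depends on something already affected.
            if current in deps and name not in affected:
                affected.add(name)
                frontier.append(name)
    return affected
```

With this table, `affected_by("gitea")` returns `argocd`, `tools`, and `apps`, while `affected_by("postgresql")` returns only `apps`, which reflects why the Compose services on pi2 are described as the source-of-truth services the cluster depends on.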

Note

The ArgoCD Helm chart under argocd/ is defined and templated, but ArgoCD itself is not currently deployed in-cluster (its install step is commented out in the 03_cicd provisioning). The app-of-apps wiring documented here is the intended steady state; see 01 · factory for the caveat.

Deploy / secrets / DNS flows

  • Deploy flow. Push to a Gitea repo → CI builds an image into the Gitea registry → ArgoCD (via the app-of-apps and, for some apps, the Image Updater) syncs the chart/ directory into the matching namespace with prune + selfHeal. The whole chain keys off one <app> identifier — see naming-conventions.md.
  • Secrets flow. Vault is the single source of truth (no sops/age). CI authenticates to Vault via Gitea OIDC JWT (role gitea_cicd_<app>); pods receive secrets at runtime via VSO (Kubernetes auth + VaultDynamicSecret CRDs). Detail in secrets-and-vault.md.
  • DNS / edge flow. Internal names resolve under *.arcodange.lab (Pi-hole + Step-CA-issued TLS). Public traffic for arcodange.fr enters through Cloudflare and a Cloudflared tunnel to internal Traefik; public TLS is Let's Encrypt via Traefik's DNS-challenge (DuckDNS). Email runs through Zoho. Edge detail in 03 · cms.
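Because the flows above key off a single `<app>` identifier, the per-system names can in principle be derived from it. The following is a minimal sketch under stated assumptions: `derived_names` is a hypothetical helper, the Vault role pattern `gitea_cicd_<app>` comes from the secrets flow, and treating the Kubernetes namespace as identical to `<app>` is an assumption based on the phrase "matching namespace". The authoritative rules live in naming-conventions.md.

```python
import re

# Kebab-case: lowercase alphanumeric words joined by single hyphens.
KEBAB_CASE = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def derived_names(app: str) -> dict[str, str]:
    """Derive illustrative per-system names from one <app> identifier."""
    if not KEBAB_CASE.fullmatch(app):
        raise ValueError(f"not a kebab-case app identifier: {app!r}")
    return {
        # Pattern stated in the secrets flow: role gitea_cicd_<app>.
        "vault_ci_role": f"gitea_cicd_{app}",
        # Assumption: the "matching namespace" is named exactly <app>.
        "k8s_namespace": app,
    }
```

For example, `derived_names("webapp")` yields the role `gitea_cicd_webapp` and the namespace `webapp`, while an identifier such as `Web_App` is rejected before any name is derived.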

Master index

| Page | What it maps | Status |
| --- | --- | --- |
| 01 · factory | The cornerstone admin repo: Ansible host/cluster provisioning, ArgoCD app-of-apps, OpenTofu (iac/), and per-app PostgreSQL (postgres/iac/) | Active |
| 02 · tools | The tools namespace: Vault, VSO, Prometheus, Grafana, CrowdSec, poolers, Redis/KeyDB, Plausible + ClickHouse, the tool library chart | Active |
| 03 · cms | The public-facing site: Nuxt static site, Cloudflare zone + tunnel + Turnstile, Zoho email (MX/SPF/DKIM/DMARC/BIMI + aliases) | Active |
| naming-conventions.md | The <app> join key: one kebab-case name reused identically across Gitea, PG, Vault, k8s, ArgoCD, GCS, DNS | Active |
| secrets-and-vault.md | How Vault is the single source of truth: Gitea OIDC JWT for CI, VSO injection for pods, dynamic PostgreSQL creds | Active |
| storage-and-recovery.md | Longhorn block storage, GCS backup target, and the tested power-cut recovery sequence | Active |

Status legend

done · 🟡 beta · 🔴 critical · ⚠️ known issue · disabled · not started.

Maintenance rule

Important

If you alter a component documented here, update its page in the same change. A reference map that drifts from reality sends readers (and agents) confidently down dead paths. The PR that changes the component is the PR that updates its guidebook page — treat the doc edit as part of the diff, not a follow-up.
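One way to make this rule checkable is to compare the paths changed in a PR against the guidebook page that documents them. The sketch below is purely illustrative: the `PAGE_FOR_PREFIX` mapping, the page file name `01-factory.md`, and the `missing_doc_updates` function are hypothetical and would need to be replaced with the real component-to-page mapping.

```python
# Hypothetical mapping from component path prefixes to the guidebook page
# that documents them. The page file name is an assumption for illustration.
FACTORY_PAGE = "vibe/guidebooks/lab-ecosystem/01-factory.md"
PAGE_FOR_PREFIX = {
    "argocd/": FACTORY_PAGE,
    "iac/": FACTORY_PAGE,
    "postgres/iac/": FACTORY_PAGE,
}


def missing_doc_updates(changed_paths: list[str]) -> set[str]:
    """Return guidebook pages that should have changed in this diff but did not."""
    changed = set(changed_paths)
    required = {
        page
        for prefix, page in PAGE_FOR_PREFIX.items()
        if any(path.startswith(prefix) for path in changed)
    }
    # A page is only "missing" if it is required and absent from the diff.
    return required - changed
```

A diff that touches only `argocd/values.yaml` would be reported as missing the factory page, while a diff that also edits that page returns an empty set.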

Cross-references