factory/vibe/guidebooks/lab-ecosystem/README.md
Gabriel Radureau 7647a68cdc docs(vibe): bootstrap vibe/ knowledge tree + ecosystem AGENTS.md
Add a root AGENTS.md (ecosystem map of factory/tools/cms + agent operating
rules + the persona cohort & workflow) and a new vibe/ knowledge base for LLM
agents, modeled on tree-docs conventions and the factory house style.

vibe/ folders (each with a README hub + contribution rules):
- ADR/      optimized MADR-lite; canonical home going forward (doc/adr stays historical)
- PRD/      one subfolder per PRD, mandatory STATUS.md, QA strategy for big ones
- investigations/  single INV-NNN-slug.md, or stub + folder w/ notebooks
- guidebooks/      tree-docs maps; lab-ecosystem guidebook of factory+tools+cms
- runbooks/        [AGENT]/[HUMAN] step procedures (EN; doc/runbooks stays FR)
- shareouts/       dated FR handouts (decks/mp4)

Seed content (first ADR + PRD): a safe, production-like environment to rehearse
risky changes and recovery without touching real prod — local-only sandbox
(k3d + arm64 VMs) with a hard prod/sandbox isolation boundary. Includes
INV-001 (prod blast-radius couplings), the ecosystem guidebook, and a FR shareout.

Conventions enforced: no-tombstone rule, breadcrumb spine, bidirectional
cross-links, theme:base mermaid (MCP-validated) + ordered-list-after-diagram.
Built with a Workflow + persona cohort; 24 files, zero dead links.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-23 11:52:37 +02:00


vibe > Guidebooks > Lab ecosystem

Lab ecosystem

Status: Active
Last Updated: 2026-06-23
Related: ADR-0001 · safe prod-like environment; PRD · safe prod-like environment; INV-001 · prod blast-radius couplings

What this is

This guidebook is the end-to-end map of the Arcodange home lab — how the three repos (factory, tools, cms), the three Raspberry Pis, and the cloud edge wire together into one running system. It is a descriptive reference map, not a procedure: it answers "how does this fit together right now?". For "how do I add a new app step by step?" see the new-web-app runbook; for "why was it built this way?" see the factory ADRs.

The lab is run from one control node, a MacBook Pro M4, which drives everything via Ansible (imperative host setup) and OpenTofu (declarative cloud/Gitea/Vault/Postgres state). The three Pis (pi1, pi2, pi3 at 192.168.1.201-203) sit behind a home Livebox. pi1 is the k3s server; pi2 and pi3 are agents. Gitea and PostgreSQL run under Docker Compose, outside k3s, on pi2's local disk; everything else runs inside k3s on Longhorn distributed block storage. The public edge is a Cloudflared Zero-Trust tunnel into the internal Traefik, with Cloudflare providing DNS and Zoho providing email for arcodange.fr.
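The host layout described above can be written down as a small data structure. This is only an illustrative sketch: the `Host` class, its field names, and the helper functions are hypothetical and do not come from the factory repo, and it assumes pi1, pi2, pi3 map to .201, .202, .203 in that order (only pi2's address is stated explicitly on this page).

```python
from dataclasses import dataclass
from ipaddress import IPv4Address, IPv4Network
from typing import Optional, Tuple

LAN = IPv4Network("192.168.1.0/24")  # home LAN behind the Livebox


@dataclass(frozen=True)
class Host:
    """One Raspberry Pi in the lab (hypothetical model, for illustration)."""
    name: str
    ip: IPv4Address
    k3s_role: str                       # "server" or "agent"
    compose_services: Tuple[str, ...] = ()  # Docker Compose services outside k3s


HOSTS = (
    Host("pi1", IPv4Address("192.168.1.201"), "server"),  # IP assumed from the range
    Host("pi2", IPv4Address("192.168.1.202"), "agent", ("gitea", "postgresql")),
    Host("pi3", IPv4Address("192.168.1.203"), "agent"),   # IP assumed from the range
)


def k3s_server() -> Host:
    """Return the single k3s server node."""
    servers = [h for h in HOSTS if h.k3s_role == "server"]
    if len(servers) != 1:
        raise RuntimeError("expected exactly one k3s server")
    return servers[0]


def host_running(service: str) -> Optional[Host]:
    """Find the host that runs a given Docker Compose service, if any."""
    return next((h for h in HOSTS if service in h.compose_services), None)
```

For example, `host_running("gitea")` returns the pi2 entry, while a service that lives inside k3s (such as Vault) returns `None` because it is not pinned to a host in this model.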

The whole lab, end to end

%%{init: {'theme': 'base'}}%%
flowchart TB
    classDef ctrl fill:#2563eb,stroke:#1e40af,color:#fff
    classDef host fill:#0891b2,stroke:#0e7490,color:#fff
    classDef proc fill:#059669,stroke:#047857,color:#fff
    classDef store fill:#7c3aed,stroke:#6d28d9,color:#fff
    classDef edge fill:#d97706,stroke:#b45309,color:#fff
    classDef dead fill:#6b7280,stroke:#4b5563,color:#fff

    MAC["Control node (MacBook Pro M4)<br>Ansible + OpenTofu"]:::ctrl

    subgraph LAN["Home LAN (Livebox) — 192.168.1.0/24"]
        subgraph PI2["pi2 · 192.168.1.202 (docker-compose, outside k3s)"]
            GITEA["Gitea<br>arcodange-org/*"]:::host
            PG[("PostgreSQL")]:::store
        end
        subgraph K3S["k3s cluster — pi1 server, pi2/pi3 agents"]
            ARGO["ArgoCD app-of-apps<br> /argocd"]:::proc
            LH[("Longhorn<br>block storage")]:::store
            VAULT["Vault + VSO<br>secrets"]:::store
            TRAEFIK["Traefik<br>ingress"]:::proc
            TOOLS["tools namespace<br>(Vault, Grafana, CrowdSec, …)"]:::host
            APPS["app namespaces<br>(webapp, erp, cms, …)"]:::host
        end
        OLLAMA["pi3 · ollama"]:::host
    end

    subgraph CLOUD["Cloud edge"]
        CF["Cloudflare DNS<br>+ Cloudflared tunnel"]:::edge
        ZOHO["Zoho<br>email (arcodange.fr)"]:::edge
        GCS[("GCS gs://arcodange-tf<br>OpenTofu state + Longhorn backup")]:::store
    end

    INTERNET(["Internet"]):::edge

    MAC -- "Ansible: provision hosts, k3s, docker-compose" --> PI2
    MAC -- "Ansible: k3s, Longhorn, Traefik" --> K3S
    MAC -- "OpenTofu: Gitea/Vault/PG/Cloudflare/OVH state" --> GITEA
    MAC -- "OpenTofu state" --> GCS

    GITEA -- "repoURL chart/" --> ARGO
    ARGO -- "Application CRDs (prune+selfHeal)" --> TOOLS
    ARGO -- "Application CRDs (prune+selfHeal)" --> APPS
    VAULT -- "VSO injects secrets into pods" --> TOOLS
    VAULT -- "VSO injects secrets into pods" --> APPS
    APPS -- "dynamic creds" --> PG
    LH -. "PVCs" .- TOOLS
    LH -. "PVCs" .- APPS
    LH -- "backup target" --> GCS

    INTERNET --> CF -- "tunnel" --> TRAEFIK --> APPS
    INTERNET --> ZOHO
  1. The control node (MacBook) provisions the three Pis with Ansible (OS, disks, Docker, k3s, Longhorn, Traefik) and manages all SaaS/Gitea/Vault/Postgres state with OpenTofu.
  2. On pi2, Gitea and PostgreSQL run as Docker Compose outside k3s, on the local disk — they are the source-of-truth services the cluster depends on.
  3. OpenTofu keeps its state in GCS (gs://arcodange-tf), and Longhorn pushes volume backups to the same GCS project.
  4. Gitea hosts every app repo; each repo's chart/ directory is the deployable Helm chart.
  5. ArgoCD's app-of-apps turns each Gitea repo into an Application CRD (automated prune + selfHeal) that deploys into the tools namespace and the per-app namespaces.
  6. Vault is the single source of truth for secrets; the Vault Secrets Operator (VSO) injects them into pods via Kubernetes auth, and apps draw dynamic PostgreSQL credentials from Vault against pi2.
  7. Longhorn provides the PVCs the in-cluster workloads mount, and backs up to GCS.
  8. The public edge routes Internet traffic through Cloudflare DNS and a Cloudflared Zero-Trust tunnel into the internal Traefik, which fronts the app namespaces; Zoho handles arcodange.fr email.
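The steps above amount to a dependency graph, and that graph can be queried for "what is affected downstream if this component is unavailable?". The sketch below is a simplified reading of the diagram, not an authoritative inventory: the node names, the exact edge set, and the names `DEPENDS_ON` and `affected_by` are assumptions made for illustration.

```python
from collections import deque
from typing import Dict, Set

# component -> components it depends on (simplified from the diagram)
DEPENDS_ON: Dict[str, Set[str]] = {
    "gitea": set(),
    "postgresql": set(),
    "vault": set(),
    "longhorn": set(),
    "traefik": set(),
    "argocd": {"gitea"},                       # reads chart/ from Gitea repos
    "tools": {"argocd", "vault", "longhorn"},  # deployed by ArgoCD, secrets, PVCs
    "apps": {"argocd", "vault", "longhorn", "postgresql", "traefik"},
}


def affected_by(component: str) -> Set[str]:
    """Return every component that depends on `component`, directly or transitively."""
    # Invert the edges so we can walk from a dependency to its dependents.
    dependents: Dict[str, Set[str]] = {name: set() for name in DEPENDS_ON}
    for name, deps in DEPENDS_ON.items():
        for dep in deps:
            dependents[dep].add(name)

    seen: Set[str] = set()
    queue = deque([component])
    while queue:  # breadth-first walk over dependents
        current = queue.popleft()
        for nxt in dependents[current]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen
```

Under these assumed edges, `affected_by("gitea")` reaches ArgoCD and, through it, both the tools and app namespaces, whereas `affected_by("postgresql")` only reaches the app namespaces.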

Note

The ArgoCD Helm chart under argocd/ is defined and templated, but ArgoCD itself is not currently deployed in-cluster (its install step is commented out in the 03_cicd provisioning). The app-of-apps wiring documented here is the intended steady state; see 01 · factory for the caveat.

Deploy / secrets / DNS flows

  • Deploy flow. Push to a Gitea repo → CI builds an image into the Gitea registry → ArgoCD (via the app-of-apps and, for some apps, the Image Updater) syncs the chart/ directory into the matching namespace with prune + selfHeal. The whole chain keys off one <app> identifier — see naming-conventions.md.
  • Secrets flow. Vault is the single source of truth (no sops/age). CI authenticates to Vault via Gitea OIDC JWT (role gitea_cicd_<app>); pods receive secrets at runtime via VSO (Kubernetes auth + VaultDynamicSecret CRDs). Detail in secrets-and-vault.md.
  • DNS / edge flow. Internal names resolve under *.arcodange.lab (Pi-hole + Step-CA-issued TLS). Public traffic for arcodange.fr enters through Cloudflare and a Cloudflared tunnel to internal Traefik; public TLS is Let's Encrypt via Traefik's DNS-challenge (DuckDNS). Email runs through Zoho. Edge detail in 03 · cms.
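Because the deploy and secrets flows key off a single `<app>` identifier, the derived names can be computed rather than typed by hand. A minimal sketch, with its assumptions stated: the Vault role pattern `gitea_cicd_<app>`, the `arcodange-org` Gitea organisation, and the `*.arcodange.lab` domain appear on this page; the namespace being equal to `<app>` and the internal host being `<app>.arcodange.lab` are guesses for illustration, and `derived_names` is a hypothetical helper. naming-conventions.md remains the authority.

```python
import re
from typing import Dict

# kebab-case: lowercase alphanumeric words joined by single hyphens
KEBAB_CASE = re.compile(r"[a-z][a-z0-9]*(-[a-z0-9]+)*")


def derived_names(app: str) -> Dict[str, str]:
    """Derive per-system names from one <app> identifier (illustrative only)."""
    if not KEBAB_CASE.fullmatch(app):
        raise ValueError(f"{app!r} is not kebab-case")
    return {
        "gitea_repo": f"arcodange-org/{app}",
        "vault_ci_role": f"gitea_cicd_{app}",
        "k8s_namespace": app,                     # assumption, not from this page
        "internal_host": f"{app}.arcodange.lab",  # assumption, not from this page
    }
```

Rejecting anything that is not kebab-case up front keeps a typo from silently producing a second, mismatched identity across systems.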

Master index

| Page | What it maps | Status |
|---|---|---|
| 01 · factory | The cornerstone admin repo: Ansible host/cluster provisioning, ArgoCD app-of-apps, OpenTofu (iac/), and per-app PostgreSQL (postgres/iac/) | Active |
| 02 · tools | The tools namespace: Vault, VSO, Prometheus, Grafana, CrowdSec, poolers, Redis/KeyDB, Plausible + ClickHouse, the tool library chart | Active |
| 03 · cms | The public-facing site: Nuxt static site, Cloudflare zone + tunnel + Turnstile, Zoho email (MX/SPF/DKIM/DMARC/BIMI + aliases) | Active |
| naming-conventions.md | The <app> join key: one kebab-case name reused identically across Gitea, PG, Vault, k8s, ArgoCD, GCS, DNS | Active |
| secrets-and-vault.md | How Vault is the single source of truth: Gitea OIDC JWT for CI, VSO injection for pods, dynamic PostgreSQL creds | Active |
| storage-and-recovery.md | Longhorn block storage, GCS backup target, and the tested power-cut recovery sequence | Active |

Status legend

done · 🟡 beta · 🔴 critical · ⚠️ known issue · disabled · not started.

Maintenance rule

Important

If you alter a component documented here, update its page in the same change. A reference map that drifts from reality sends readers (and agents) confidently down dead paths. The PR that changes the component is the PR that updates its guidebook page — treat the doc edit as part of the diff, not a follow-up.
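One way to make this rule checkable is to compare the paths touched by a change against the guidebook pages that document them. The sketch below is an assumption-heavy illustration: the directory prefixes are the factory paths mentioned in this guidebook, but the page filename `01-factory.md`, the mapping itself, and the function `missing_doc_updates` are hypothetical.

```python
from typing import Dict, Iterable, Set

GUIDEBOOK = "vibe/guidebooks/lab-ecosystem/"

# path prefix in the repo -> guidebook page that documents it
# (page filename is hypothetical; adjust to the real file names)
PAGE_FOR_PREFIX: Dict[str, str] = {
    "argocd/": GUIDEBOOK + "01-factory.md",
    "iac/": GUIDEBOOK + "01-factory.md",
    "postgres/iac/": GUIDEBOOK + "01-factory.md",
}


def missing_doc_updates(changed_paths: Iterable[str]) -> Set[str]:
    """Return guidebook pages that should have changed in this diff but did not."""
    changed = set(changed_paths)
    expected = {
        page
        for path in changed
        for prefix, page in PAGE_FOR_PREFIX.items()
        if path.startswith(prefix)
    }
    return expected - changed
```

A reviewer or CI step could feed it the list of changed files and treat a non-empty result as a prompt to update the page in the same change.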

Cross-references