factory/vibe/guidebooks/lab-ecosystem/01-factory.md
Gabriel Radureau 4823394e0e docs(vibe): add applications/ guidebook (webapp + url-shortener)
2026-06-23 21:58:36 +02:00


01 · factory

Status: Active
Last Updated: 2026-06-23
Downstream: 02 · tools · 03 · cms
Deeper dive: Factory provisioning guidebook (page-by-page walkthrough of the Ansible playbooks/roles and OpenTofu modules summarized here)
Related: naming-conventions.md · secrets-and-vault.md · storage-and-recovery.md

factory is the cornerstone admin repo: it provisions the hosts and the cluster, declares what gets deployed, and owns the platform-level cloud/Gitea/Vault/Postgres state that every app leans on. It has four pillars — Ansible (imperative host & cluster setup), ArgoCD (declarative app-of-apps), iac/ (OpenTofu for the cloud/Gitea/Vault edge), and postgres/iac/ (per-app PostgreSQL provisioning). The repos tools and cms are deployed by factory's ArgoCD and are mapped in 02 · tools and 03 · cms.

Pillar 1 — Ansible (ansible/)

The collection lives at ansible/arcodange/factory/. The inventory groups the three Pis and pins the service placement; numbered playbooks run an ordered narrative from bare OS to backups; recover/ holds the disaster-recovery playbooks.

Inventory (inventory/hosts.yml)

| Group | Hosts | Purpose |
| --- | --- | --- |
| raspberries | pi1, pi2, pi3 (192.168.1.201-203) | All three Pis; ansible_user: pi |
| postgres | pi2 | The PostgreSQL host (docker-compose, outside k3s) |
| gitea | children of postgres (→ pi2) | Gitea co-located with PG on pi2 |
| pihole | pi1, pi3 | Internal DNS resolvers |
| step_ca | pi1, pi2, pi3 | Step-CA PKI for *.arcodange.lab (primary pi1, replicas pi2/pi3) |
| local | localhost + the Pis | Control-node-local tasks |

Numbered playbooks (playbooks/)

| Playbook | Imports / does | Notes |
| --- | --- | --- |
| 01_system | system/system.yml → rpi base, DNS, SSL, prepare disks, Docker, iSCSI, k3s install (--docker --disable traefik), CoreDNS, cert-issuer, Longhorn/Traefik config | k3s v1.34.3+k3s1 via upstream k3s-ansible; pi1 server, pi2/pi3 agents |
| 02_setup | setup/setup.yml → PostgreSQL + Gitea docker-compose; optional backup-NFS share | Stands up the two out-of-cluster source-of-truth services on pi2 |
| 03_cicd | Gitea act-runner docker-compose on pi1/pi3 (raspberries:&local:!gitea), plus the ArgoCD/Image-Updater install | See the ArgoCD caveat below |
| 04_tools | tools/tools.yml → hashicorp_vault.yml, crowdsec.yml | Platform tooling that bootstraps the cluster's Vault + CrowdSec |
| 05_backup | backup/backup.yml → postgres.yml, gitea.yml, k3s_pvc.yml to /mnt/backups | Scheduled PG/Gitea/PVC backups; cron-report wiring present |
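
The host pattern raspberries:&local:!gitea used by 03_cicd can be read as set arithmetic over the inventory groups. The following is a minimal sketch of that reading, not Ansible's own resolver: it supports only plain group names with the & (intersection) and ! (exclusion) prefixes, and it assumes that "localhost + the Pis" in the local group means all three Pis.

```python
# Inventory groups as listed in the inventory table (hosts only, no variables).
# Assumption: "local" contains localhost plus all three Pis.
INVENTORY = {
    "raspberries": {"pi1", "pi2", "pi3"},
    "postgres": {"pi2"},
    "gitea": {"pi2"},  # gitea is a child of postgres, so it resolves to pi2
    "local": {"localhost", "pi1", "pi2", "pi3"},
}


def resolve_pattern(pattern: str, inventory: dict[str, set[str]]) -> set[str]:
    """Resolve a simplified Ansible host pattern.

    Handles ':'-separated group names with '&' (intersection) and '!'
    (exclusion) prefixes. Wildcards, regexes, and slices are left out.
    """
    included: set[str] = set()
    required: list[set[str]] = []
    excluded: set[str] = set()
    for term in pattern.split(":"):
        if term.startswith("&"):
            required.append(inventory[term[1:]])
        elif term.startswith("!"):
            excluded |= inventory[term[1:]]
        else:
            included |= inventory[term]
    # Unions first, then intersections, then exclusions.
    for group in required:
        included &= group
    return included - excluded


runner_hosts = resolve_pattern("raspberries:&local:!gitea", INVENTORY)
print(sorted(runner_hosts))  # ['pi1', 'pi3']
```

The result matches the table: the act-runners land on every Pi except the one hosting Gitea.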

Recovery playbooks (playbooks/recover/)

| Playbook | When to use |
| --- | --- |
| longhorn.yml | Recover Longhorn after a power cut when Volume CRDs still exist (CSI driver registration loss) |
| longhorn_data.yml | Recover app data from raw replica .img files when Volume CRDs are gone (block-device level) |

The tested power-cut recovery sequence (Longhorn restore → Vault unseal → VSO re-auth → ERP scaled up last) is documented in CLUSTER_RECOVERY.md at the lab root (outside this repo) and summarized in storage-and-recovery.md. Background on PVC recovery is in the Longhorn PVC recovery ADR.

Key roles

deploy_docker_compose (renders compose stacks), gitea_repo / gitea_token / gitea_secret / gitea_sync (Gitea repo/token/secret/mirror management), traefik_certs, playwright, plus sub-roles step_ca, hashicorp_vault, crowdsec, pihole.

Pillar 2 — ArgoCD app-of-apps (argocd/)

A Helm chart whose templates/apps.yaml loops over values.gitea_applications and emits one Application CRD per app. Each Application derives everything from the app name: repoURL = https://gitea.arcodange.lab/<org>/<app>, path = chart, namespace = <app> (CreateNamespace=true), with syncPolicy.automated prune: true + selfHeal: true by default.
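
The derivation above can be sketched outside Helm as a plain function from app name to manifest. This is an illustration of the naming rules only, not the chart's actual template: the function name and the DEFAULT_ORG placeholder are hypothetical (the default org is not stated on this page), and fields a real Application also needs, such as the project and the destination cluster, are omitted.

```python
import json

GITEA_BASE = "https://gitea.arcodange.lab"  # from the repoURL pattern above
DEFAULT_ORG = "<org>"  # placeholder: the chart's default org is not stated here


def build_application(app: str, org: str = DEFAULT_ORG,
                      prune: bool = True, self_heal: bool = True) -> dict:
    """Derive a partial ArgoCD Application manifest from the app name."""
    return {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "Application",
        "metadata": {"name": app},
        "spec": {
            "source": {
                "repoURL": f"{GITEA_BASE}/{org}/{app}",
                "path": "chart",
            },
            # The namespace is the app name; ArgoCD creates it on first sync.
            "destination": {"namespace": app},
            "syncPolicy": {
                "automated": {"prune": prune, "selfHeal": self_heal},
                "syncOptions": ["CreateNamespace=true"],
            },
        },
    }


manifest = build_application("telegram-gateway", org="arcodange")
print(json.dumps(manifest, indent=2))
```

Because every field is a function of the app name (plus an optional org override), adding an app to values.gitea_applications is enough to get a complete Application.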

Tip

Deeper dive: the Applications guidebook maps what these Application CRDs deploy — the common app-repo pattern (Dockerfile + chart/ + optional iac/ + CI) every app in the list below shares, and the two archetypes (Go + Postgres vs Rust + SQLite).

| App | Org override | Image Updater |
| --- | --- | --- |
| url-shortener | | |
| tools | | explicit prune+selfHeal |
| webapp | | digest strategy |
| telegram-gateway | arcodange | digest strategy |
| erp | | |
| cms | | digest strategy |
| dance-lessons-coach | arcodange | digest strategy |

Note

The chart also templates a longhorn_backup_target and the ArgoCD Image Updater config (argocd.arcodange.lab). ArgoCD itself is not currently deployed in-cluster — its install is commented out in 03_cicd. This page documents the intended steady state; treat ArgoCD as "designed, not live" until that step is enabled.

Pillar 3 — OpenTofu (iac/)

Manages the cloud/Gitea/Vault edge. State lives in GCS (backend "gcs", bucket arcodange-tf, prefix factory/main). Tofu authenticates to Vault via Gitea OIDC JWT (mount gitea_jwt, role gitea_cicd).

| Provider | Used for |
| --- | --- |
| go-gitea/gitea (0.6.0) | Repos, users, action secrets (e.g. the restricted tofu_module_reader CI user, CMS secrets) |
| vault (4.4.0) | KV secrets + policies + k8s auth roles (e.g. Longhorn GCS-backup creds & policy) |
| google (7.0.1) | GCS backup bucket + service account + HMAC key for Longhorn |
| cloudflare/cloudflare (~> 5) | R2 bucket, API tokens, CMS edge wiring (detailed in 03 · cms) |
| ovh/ovh (2.8.0) | OAuth2 client + IAM policy for the arcodange.fr domain (registrar = OVH) |

modules/cloudflare_token is a reusable scoped-token factory. The whole module reuses the <app> name as the GCS state prefix (<app>/main) — see naming-conventions.md.
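
The Vault authentication mentioned at the start of this pillar (mount gitea_jwt, role gitea_cicd) comes down to one call to Vault's JWT auth login endpoint. The sketch below only builds that HTTP request so its shape is visible. It does not show how OpenTofu or the CI job performs the login, and the Vault address and token value are hypothetical placeholders.

```python
import json
import urllib.request


def build_vault_jwt_login(vault_addr: str, jwt: str,
                          mount: str = "gitea_jwt",
                          role: str = "gitea_cicd") -> urllib.request.Request:
    """Build (but do not send) a Vault JWT-auth login request."""
    body = json.dumps({"role": role, "jwt": jwt}).encode("utf-8")
    return urllib.request.Request(
        url=f"{vault_addr.rstrip('/')}/v1/auth/{mount}/login",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical address and token, for illustration only.
request = build_vault_jwt_login(
    "https://vault.example.invalid", "<oidc-jwt-from-gitea>"
)
print(request.get_method(), request.full_url)
```

Sending the request (for example with urllib.request.urlopen) against a reachable Vault would return JSON whose auth.client_token field carries the resulting Vault token.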

Pillar 4 — per-app PostgreSQL (postgres/iac/)

OpenTofu using the cyrilgdn/postgresql provider against PG on 192.168.1.202 (state prefix factory/postgres). It iterates over a var.applications set and, per app, creates:

| Resource | Name pattern | Purpose |
| --- | --- | --- |
| Database | <app> | The app's database (template0, owned by the role) |
| Owner role (non-login) | <app>_role | Database owner; granted to dynamic users by Vault |
| Editor role (login) | credentials_editor | Shared admin role that can grant the per-app roles |
| user_lookup() function | per-<app> db | SECURITY DEFINER lookup for pgbouncer auth (granted to pgbouncer_auth, revoked from public) |

Current applications set: webapp, erp, crowdsec, plausible, dance-lessons-coach. Vault's PostgreSQL secrets engine then issues dynamic credentials on top of these roles — see secrets-and-vault.md. The pooler (pgbouncer) that consumes user_lookup() lives in the tools namespace — see 02 · tools.
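
The per-app naming rules in the table can be made concrete with a small sketch that expands the applications set into the names OpenTofu would create. It models the naming convention only, not the cyrilgdn/postgresql resources themselves. The class and function names are hypothetical.

```python
from dataclasses import dataclass

SHARED_EDITOR_ROLE = "credentials_editor"  # one shared login role, not per app
POOLER_AUTH_ROLE = "pgbouncer_auth"        # role that user_lookup() is granted to


@dataclass(frozen=True)
class AppPostgresPlan:
    database: str         # <app>
    owner_role: str       # <app>_role, non-login database owner
    lookup_function: str  # created once inside each app database


def plan_for(app: str) -> AppPostgresPlan:
    """Expand one app name into the object names from the table above."""
    return AppPostgresPlan(
        database=app,
        owner_role=f"{app}_role",
        lookup_function="user_lookup",
    )


applications = {"webapp", "erp", "crowdsec", "plausible", "dance-lessons-coach"}
for app in sorted(applications):
    plan = plan_for(app)
    print(f"{plan.database}: owner={plan.owner_role}, "
          f"function={plan.lookup_function}() for {POOLER_AUTH_ROLE}")
```

Only the database and the owner role vary per app. credentials_editor is shared across all of them, which is what lets it grant each per-app role.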

Provisioning order

```mermaid
%%{init: {'theme': 'base'}}%%
flowchart LR
    classDef proc fill:#059669,stroke:#047857,color:#fff
    classDef store fill:#7c3aed,stroke:#6d28d9,color:#fff
    S1["01_system<br>OS + k3s + Longhorn"]:::proc --> S2["02_setup<br>PG + Gitea (pi2)"]:::proc --> S3["03_cicd<br>runners + ArgoCD"]:::proc --> S4["04_tools<br>Vault + CrowdSec"]:::proc --> S5["05_backup<br>PG/Gitea/PVC"]:::proc
    IAC["iac/ + postgres/iac<br>(OpenTofu state in GCS)"]:::store -. "declares cloud/Gitea/Vault/PG" .- S2
```

  1. 01_system lays the OS, disks, Docker, and k3s with Longhorn + Traefik onto the three Pis.
  2. 02_setup stands up PostgreSQL and Gitea as docker-compose on pi2 — the out-of-cluster source-of-truth services.
  3. 03_cicd registers the Gitea act-runners (and is where ArgoCD would install, currently commented out).
  4. 04_tools bootstraps the cluster's Vault and CrowdSec.
  5. 05_backup schedules PostgreSQL, Gitea, and k3s-PVC backups to /mnt/backups.
  6. In parallel, OpenTofu (iac/ and postgres/iac/) declares the cloud, Gitea, Vault, and PostgreSQL objects, keeping state in GCS.

Cross-references