factory/vibe/guidebooks/factory-provisioning/ansible/02-setup.md
Gabriel Radureau dbe32161dc docs(vibe): add factory-provisioning guidebook (Ansible + OpenTofu)
Deep, code-grounded tree-docs guidebook under vibe/guidebooks/factory-provisioning/,
explored from the actual playbooks/roles and tofu code:

- Hub: the two provisioning engines (operator-run Ansible vs CI-applied OpenTofu),
  a green-field bring-up flow, master index, maintenance rule.
- ansible/ sub-tree: ordered pages 01-system .. 06-recover, an inventory & variables
  concept page, and a Tier-1/Tier-2 roles reference (hashicorp_vault, step_ca,
  crowdsec, pihole, deploy_docker_compose + the gitea_* family and helpers).
- opentofu/ sub-tree: factory-iac (Cloudflare/OVH/GCP/Gitea/Vault edge +
  cloudflare_token module), postgres-iac (per-app DB/role/pgbouncer lookup),
  ci-apply-flow (Gitea OIDC-JWT -> Vault -> auto-approve apply).

Cross-linked bidirectionally with the lab-ecosystem guidebook and the safe-env
ADR/PRD (the sandbox rehearses exactly these engines). 14 mermaid diagrams
MCP-validated; zero dead links. Authored by the Lab Cartographer cohort.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-23 21:11:51 +02:00


02 · Setup — Postgres, Gitea, NFS backup target

Note

Status: active · Last Updated: 2026-06-23
Upstream: Ansible sub-hub · 01 · System
Downstream: 03 · CI/CD
Related: Inventory & variables · Roles reference · Storage & recovery · Secrets & Vault

What it does

02 · Setup deploys the stateful services the rest of the platform leans on: a PostgreSQL server and a Gitea instance — both running as Docker Compose stacks on pi2, outside K3s — plus the in-cluster NFS backup target. The wrapper playbooks/02_setup.yml imports playbooks/setup/setup.yml, which pings the Pis, then imports three sub-playbooks: backup_nfs.yml (tagged never), postgres.yml, and gitea.yml.

Important

Postgres and Gitea do not run in Kubernetes. They are Docker Compose stacks on pi2 (the sole member of the postgres group, which gitea inherits as a child — see Inventory & variables). K3s only references them: Traefik exposes Gitea via an ExternalName Service, and the pg-fix-table-ownership CronJob reaches Postgres over the LAN. This keeps the two services available even when the cluster is being rebuilt.

Ordered steps

| # | Sub-playbook | Purpose | Key vars / versions |
|---|---|---|---|
| 1 | setup/backup_nfs.yml | Provision the shared backup volume: a Longhorn RWX PVC backups-rwx (50Gi), a Longhorn RecurringJob, a busybox deploy to spawn the share-manager, then mount the resulting NFS share at /mnt/backups on every Pi. | tags: never; backup_size: 50Gi; RecurringJob thrice-a-month-backup (cron 0 5 */2 * *, retain 2) |
| 2 | setup/postgres.yml | Deploy the Postgres Compose stack (deploy_docker_compose + deploy_postgresql role), create the gitea DB/user, create the pgbouncer auth_user + user_lookup() functions in both postgres and gitea DBs, publish the K8s Secret postgres-admin-credentials, and install the pg-fix-table-ownership CronJob. | Postgres 16.3-alpine; container postgres; CronJob daily 0 3 * * * |
| 3 | setup/gitea.yml | Deploy the Gitea Compose stack (deploy_docker_compose + deploy_gitea role), create admin arcodange, mint an API token via gitea_token, upload the avatar, register the SSH key, create org arcodange-org, then delete the temp token. | Gitea 1.25.5; base URL http://pi2:3000 |
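
To make the RecurringJob schedule in row 1 concrete, the sketch below expands the day-of-month field of 0 5 */2 * * under standard five-field cron semantics. The helper name expand_day_of_month is hypothetical and only understands * and */N; it is not part of the playbooks.

```python
def expand_day_of_month(field: str, days_in_month: int = 31) -> list[int]:
    """Expand a cron day-of-month field written as '*' or '*/N'.

    Illustrative helper only: ranges and comma lists are not handled.
    """
    if field == "*":
        step = 1
    elif field.startswith("*/"):
        step = int(field[2:])
        if step < 1:
            raise ValueError(f"step must be positive: {field!r}")
    else:
        raise ValueError(f"unsupported field: {field!r}")
    # Day-of-month counts from 1, so '*/2' selects 1, 3, 5, ...
    return list(range(1, days_in_month + 1, step))


# Split the schedule from the table into its five cron fields.
minute, hour, day_of_month, month, day_of_week = "0 5 */2 * *".split()
days = expand_day_of_month(day_of_month)
print(f"{hour.zfill(2)}:{minute.zfill(2)} on days {days}")
```

In a 31-day month this gives 16 runs (days 1, 3, ..., 31), so the schedule fires every other day, which is more often than the name thrice-a-month-backup suggests.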

NFS backup target — how the share is born

%%{init: {'theme':'base', 'themeVariables': {'primaryColor':'#1f2937','primaryTextColor':'#f9fafb','lineColor':'#6b7280','fontSize':'13px'}}}%%
flowchart TD
  classDef cluster fill:#1e4032,stroke:#22c55e,color:#f0fdf4;
  classDef host fill:#1e3a5f,stroke:#3b82f6,color:#f9fafb;

  pvc["RWX PVC backups-rwx (50Gi)<br>longhorn-system"]:::cluster
  rj["RecurringJob thrice-a-month-backup<br>cron 0 5 */2 * *"]:::cluster
  dep["busybox Deployment rwx-nfs<br>mounts the PVC"]:::cluster
  sm["Longhorn share-manager<br>(spawned by the mount)"]:::cluster
  svc["Service nfs-backups-rwx<br>ClusterIP :2049"]:::cluster
  mount["mount /mnt/backups on pi1/pi2/pi3<br>NFS vers=4.1"]:::host

  pvc --> rj
  pvc --> dep --> sm --> svc --> mount
  1. A ReadWriteMany Longhorn PVC (backups-rwx, 50Gi) is created in longhorn-system.
  2. A RecurringJob is attached to the volume so Longhorn snapshots/backs it up on the 0 5 */2 * * schedule.
  3. A busybox Deployment (rwx-nfs) mounts the PVC — the act of mounting an RWX volume makes Longhorn spawn an NFS share-manager pod.
  4. A stable ClusterIP Service (nfs-backups-rwx, port 2049) is created (or reused) to front the share-manager.
  5. Each Pi installs nfs-common and mounts the share at /mnt/backups (vers=4.1, nofail, x-systemd.automount), persisted in fstab.
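
Step 5 can be pictured as one fstab entry per Pi. The sketch below only assembles such a line from the options listed above. The server address, the export path, and the nfs filesystem type are assumptions, since the real values come from the nfs-backups-rwx Service and the Longhorn volume and are not given here.

```python
def fstab_line(server: str, export: str, mountpoint: str = "/mnt/backups") -> str:
    """Build an /etc/fstab entry for the backup share (illustrative only)."""
    options = ["vers=4.1", "nofail", "x-systemd.automount"]
    # fstab fields: device, mountpoint, fstype, options, dump, fsck pass.
    return f"{server}:{export} {mountpoint} nfs {','.join(options)} 0 0"


# Placeholder values: substitute the real Service address and export path.
print(fstab_line("<service-cluster-ip>", "/<export-path>"))
```

Of the options, nofail keeps a Pi booting when the share is unavailable, and x-systemd.automount defers the actual mount until /mnt/backups is first accessed.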

Postgres — what gets created

| Artifact | Where | Purpose |
|---|---|---|
| Compose stack arcodange_factory | pi2 Docker | Runs postgres:16.3-alpine, container postgres, port 5432, data under /home/pi/arcodange/docker_composes/postgres/data. |
| gitea DB + user | inside Postgres | Created by the deploy_postgresql role from applications_databases.gitea (gitea_database). |
| pgbouncer auth_user (pgbouncer_auth) | postgres + gitea DBs | Login role used by the pgbouncer pooler for SCRAM lookups. |
| user_lookup(text) function | postgres + gitea DBs | SECURITY DEFINER function over pg_shadow; EXECUTE granted only to pgbouncer_auth. |
| K8s Secret postgres-admin-credentials | kube-system | Base64-encoded admin user and password so the in-cluster CronJob can authenticate. |
| CronJob pg-fix-table-ownership | kube-system | Runs postgres:16.3 daily at 03:00; discovers the %_role roles, derives each database name by stripping _role, and runs ALTER TABLE ... OWNER TO on every table in the public schema, repairing ownership after a restore. |
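
The ownership repair in the last row reduces to a naming rule: a role called <db>_role owns the public tables of database <db>. The sketch below mirrors that rule in Python to show which roles are covered and which are not. The function names and the sample role and table names are invented for illustration; the actual job runs from a postgres:16.3 container, not from Python.

```python
from typing import Optional

SUFFIX = "_role"


def database_for_role(role: str) -> Optional[str]:
    """Return the database implied by a '<db>_role' role name, else None."""
    if role.endswith(SUFFIX) and len(role) > len(SUFFIX):
        return role[: -len(SUFFIX)]
    return None


def ownership_statements(role: str, tables: list[str]) -> list[str]:
    """Build the ALTER TABLE statements that would re-own the given public tables."""
    return [f'ALTER TABLE public."{table}" OWNER TO "{role}";' for table in tables]


# Hypothetical roles: one follows the convention, one does not.
for role in ["shop_role", "legacy_owner"]:
    db = database_for_role(role)
    if db is None:
        print(f"{role}: skipped, does not follow the <db>_role convention")
        continue
    for statement in ownership_statements(role, ["orders", "customers"]):
        print(f"[{db}] {statement}")
```

A role such as legacy_owner falls through the rule, which is the case where ownership would have to be repaired by hand.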

Gitea — bootstrap sequence

  1. Compose deploy via deploy_docker_compose, then the deploy_gitea role wires Gitea to the Postgres DB (host/db/user/password pulled from the compose env).
  2. Admin user arcodange (arcodange@gmail.com) is created with --random-password --admin if absent.
  3. API token is minted by the gitea_token role and used for the next HTTP calls.
  4. Avatar upload, idempotent SSH public key registration, then creation of the org arcodange-org (full name "Arcodange") and upload of its avatar.
  5. Cleanup — a post_tasks invocation of gitea_token with gitea_token_delete: true removes the temporary token.
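
Steps 3 to 5 follow a mint, use, delete lifecycle for the bootstrap token. As a language-neutral illustration, here is the same pattern as a Python context manager. The mint and delete callables are hypothetical stand-ins for what the gitea_token role does, not real Gitea or Ansible APIs.

```python
from contextlib import contextmanager
from typing import Callable, Iterator


@contextmanager
def temporary_token(mint: Callable[[], str], delete: Callable[[str], None]) -> Iterator[str]:
    """Yield a short-lived token and delete it once the caller is done."""
    token = mint()
    try:
        yield token
    finally:
        # Runs even if the body raises, so the token does not outlive the run.
        delete(token)


# In-memory stand-ins for the real mint/delete operations (demo assumptions).
live_tokens: set[str] = set()


def fake_mint() -> str:
    live_tokens.add("tmp-token")
    return "tmp-token"


def fake_delete(token: str) -> None:
    live_tokens.discard(token)


with temporary_token(fake_mint, fake_delete) as token:
    used_token = token  # avatar upload, SSH key and org creation would go here

print("tokens left after the run:", live_tokens)
```

One difference is worth keeping in mind: finally always runs, whereas by default Ansible does not reach post_tasks for a host whose earlier tasks failed, so a failed run may leave the temporary token behind.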

Gotchas

Warning

The NFS play is never-tagged and order-sensitive. backup_nfs.yml only runs when explicitly tagged, and several of its tasks are themselves tagged never: Créer PVC RWX (create the RWX PVC), Lancer un Deployment pour déclencher NFS (launch a Deployment to trigger NFS), and Attendre que le pod rwx-nfs soit Running (wait for the rwx-nfs pod to be Running). The RWX volume must already exist for the busybox Deployment to spawn the share-manager; running the mount step before the share-manager is Running will hang in the until retry loop.
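
The never tag behaves differently from ordinary tags, which is why a plain run skips the NFS play. The function below is a simplified model of Ansible's selection rule, not Ansible code. It ignores the special values all, tagged, and untagged as well as --skip-tags, and the tag name nfs is invented for the example.

```python
from typing import Optional


def is_selected(item_tags: set[str], requested: Optional[set[str]]) -> bool:
    """Simplified model of whether a play or task runs for a given --tags value.

    requested=None models a run without --tags.
    """
    if requested is None:
        # Default run: everything except items tagged 'never'.
        return "never" not in item_tags
    if item_tags & requested:
        # An explicitly requested tag selects the item, even one tagged 'never'.
        return True
    # 'always' items run with any --tags selection unless also tagged 'never'.
    return "always" in item_tags and "never" not in item_tags


print(is_selected({"never"}, None))            # False: plain run skips it
print(is_selected({"never"}, {"never"}))       # True: requested explicitly
print(is_selected({"never", "nfs"}, {"nfs"}))  # True: another tag on the item was requested
print(is_selected(set(), {"nfs"}))             # False: untagged item, unrelated selection
```

Under this model, running playbooks/02_setup.yml without --tags leaves backup_nfs.yml untouched; it has to be requested through never or through another tag carried by the play and its tasks.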

Warning

Postgres lives on pi2 outside K3s. Treat it as a single-host service: there is no Postgres pod to kubectl get. The cluster only sees the postgres-admin-credentials Secret and the pg-fix-table-ownership CronJob, both of which reach the DB over the LAN at pi2:5432. A pi2 outage takes Postgres (and Gitea) down regardless of cluster health.
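
Because there is no pod to inspect, a first health check is simply whether pi2:5432 accepts TCP connections from where you stand. A minimal sketch using only the Python standard library follows. The function name is hypothetical, and a successful connect says nothing about authentication or database state.

```python
import socket


def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False
```

From a machine on the LAN, port_open("pi2", 5432) would be the Postgres check and port_open("pi2", 3000) the equivalent for Gitea, assuming pi2 resolves from that machine.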

Caution

pg-fix-table-ownership exists because restores break ownership. After a Longhorn/data recovery, tables can come back owned by the wrong role and apps lose write access. The daily CronJob silently re-owns every public table to the <db>_role matching each %_role PostgreSQL role. If you add a database whose owning role does not follow the <db>_role naming convention, this job will not fix it — see Naming conventions.

Note

The admin password is random and printed once. Gitea's admin is created with --random-password; capture it from the play output (or reset it via docker exec) — it is not stored in the inventory. The bootstrap API token is deliberately deleted at the end, so re-running the play re-mints a fresh one.