factory/vibe/guidebooks/factory-provisioning/ansible/roles.md
Gabriel Radureau dbe32161dc docs(vibe): add factory-provisioning guidebook (Ansible + OpenTofu)
Deep, code-grounded tree-docs guidebook under vibe/guidebooks/factory-provisioning/,
explored from the actual playbooks/roles and tofu code:

- Hub: the two provisioning engines (operator-run Ansible vs CI-applied OpenTofu),
  a green-field bring-up flow, master index, maintenance rule.
- ansible/ sub-tree: ordered pages 01-system .. 06-recover, an inventory & variables
  concept page, and a Tier-1/Tier-2 roles reference (hashicorp_vault, step_ca,
  crowdsec, pihole, deploy_docker_compose + the gitea_* family and helpers).
- opentofu/ sub-tree: factory-iac (Cloudflare/OVH/GCP/Gitea/Vault edge +
  cloudflare_token module), postgres-iac (per-app DB/role/pgbouncer lookup),
  ci-apply-flow (Gitea OIDC-JWT -> Vault -> auto-approve apply).

Cross-linked bidirectionally with the lab-ecosystem guidebook and the safe-env
ADR/PRD (the sandbox rehearses exactly these engines). 14 mermaid diagrams
MCP-validated; zero dead links. Authored by the Lab Cartographer cohort.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-23 21:11:51 +02:00


Roles reference

Note

Status: active · Last Updated: 2026-06-23
Upstream: Ansible sub-hub · Lab ecosystem · 01 factory
Downstream: Inventory & variables
Related: Secrets & Vault · Storage & recovery · Naming conventions · ADR-0001 safe prod-like environment

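Roles live in two places, by reuse scope: shared roles sit in the top-level roles/ directory (for example roles/deploy_docker_compose), while playbook-scoped roles are nested under the playbook that uses them (for example playbooks/tools/roles/hashicorp_vault or playbooks/setup/roles/).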

This page is split by altitude. Tier 1 covers the heavyweight platform-service roles (one subsection each); Tier 2 is a single table of the smaller building-block roles.


Tier 1 — platform-service roles

hashicorp_vault

playbooks/tools/roles/hashicorp_vault · runs on localhost in the 04 · Tools stage. It initializes and unseals the cluster Vault and wires Gitea as an OIDC provider so CI jobs can authenticate to Vault.

The tasks/main.yml flow is:

  1. Init (init.yml) — first run only. Lists the Vault server pods in the tools namespace, checks vault operator init -status, and if uninitialized runs vault operator init with key-shares=1, key-threshold=1 (defaults from defaults/main.yml). The JSON output — unseal keys + initial root token — is written to ~/.arcodange/cluster-keys.json (dir 0700, file 0600).
  2. Unseal (unseal.yml) — required after every reboot. Reads the keys file and runs vault operator unseal for each server, then revokes the initial root token (idempotent — tolerates an already-revoked token).
  3. Generate a fresh root token (new_root_token.yml) — runs the generate-root OTP/nonce dance using the unseal keys to mint a short-lived vault_root_token.
  4. Set up Gitea OIDC (gitea_oidc_auth.yml) — drives Gitea through the bundled playwright_setupGiteaApp.js (via the playwright role) to create an OAuth2 app, then applies the bundled OpenTofu hashicorp_vault.tf inside a disposable ghcr.io/opentofu/opentofu container (state on a throwaway docker volume) to provision the Vault JWT/OIDC backend. Finally it renders oidc_jwt_token.sh.j2 into the Gitea Actions secret vault_oauth__sh_b64 (base64) at org scope, then propagates the same secret to each user in gitea_secret_propagation_users (Action secrets are per-owner, so user-owned repos can't read org secrets).
  5. Revoke the temp root token — the always block of main.yml revokes vault_root_token no matter how step 4 ended, so no long-lived root token survives the run.

| Var | Default | Meaning |
| --- | --- | --- |
| vault_unseal_keys_path | ~/.arcodange/cluster-keys.json | Where unseal keys + root token are stored. |
| vault_unseal_keys_shares / _key_threshold | 1 / 1 | Single-key seal (lab posture; threshold <= shares). |
| vault_address | https://vault.arcodange.lab | The cluster Vault endpoint. |
| gitea_admin_user / gitea_admin_password | arcodange@gmail.com / (prompted) | Credentials Playwright uses to create the OAuth app. |
| vault_oidc_force_reset | false | When true, vault auth disable gitea + gitea_jwt before re-applying. |
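
The guarantee in step 5 (the temporary root token is revoked however step 4 ends) is the familiar try/finally shape. Below is a minimal Python sketch of that control flow only. It is not the role's code: `mint`, `setup`, and `revoke` are hypothetical callables standing in for new_root_token.yml, gitea_oidc_auth.yml, and the revoke task in the always block.

```python
from typing import Callable


def run_with_temp_root_token(
    mint: Callable[[], str],
    setup: Callable[[str], None],
    revoke: Callable[[str], None],
) -> None:
    """Mint a short-lived token, run setup with it, and always revoke it."""
    token = mint()  # stands in for step 3 (hypothetical callable)
    try:
        setup(token)  # stands in for step 4; may raise
    finally:
        # Stands in for step 5: runs on success and on failure alike,
        # so the token never outlives the run.
        revoke(token)
```

If `setup` raises, the exception still propagates to the caller after `revoke` has run, which mirrors a failed play that has nevertheless cleaned up its token.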

Caution

vault_oidc_force_reset=true is destructive: it disables and wipes all gitea_cicd_* per-app JWT roles created by the bundled tofu, every run. Default is off. Likewise, losing ~/.arcodange/cluster-keys.json means the Vault can never be unsealed again — that file is the single point of failure for the whole secret plane (see Secrets & Vault).

step_ca

playbooks/ssl/roles/step_ca · runs on the step_ca group (all three Pis) in the 01 · System stage via ssl/step-ca.yml. It is the lab's internal ACME/CA for *.arcodange.lab certificates, run active/standby: primary pi1, replicas pi2/pi3. The tasks/main.yml imports five task files in order:

  1. install: installs the step / step-ca binaries.
  2. init (init.yml): primary only. Runs step ca init (non-interactive, password file) with a creates: guard so it is idempotent. The CA name is Arcodange Lab CA, DNS ssl-ca.arcodange.lab, listen :8443.
  3. sync (sync.yml): replicates the CA from primary to standbys. It takes a lockfile on the primary (.sync.lock), computes a deterministic tar | sha256sum checksum of ~/.step, compares it to the last checksum cached on the controller, and only rsyncs (pull → controller → push to standbys) when the checksum changed. This is how the standbys hold an identical CA without a shared filesystem.
  4. systemd: installs and enables the step-ca unit (the restart step-ca handler fires on cert/config change).
  5. provisioners (provisioners.yml): primary only. Ensures a JWK provisioner named cert-manager exists: lists provisioners, generates the JWK keypair (creates: guard) under ~/.step/provisioners/, and registers it with step ca provisioner add. This is what lets in-cluster cert-manager request certs from the CA.

| Var | Default | Meaning |
| --- | --- | --- |
| step_ca_primary | pi1 | The writable CA node; standbys sync from it. |
| step_ca_fqdn | ssl-ca.arcodange.lab | CA DNS name; URL is https://{fqdn}:8443. |
| step_ca_provisioner_name / _type | cert-manager / JWK | The cert-manager provisioner. |
| step_ca_force_reinit | false | When true, stops the service and wipes ~/.step before re-init. |

| Secret | Source |
| --- | --- |
| vault_step_ca_password | CA root password, from vaulted step_ca/step_ca_vault.yml. |
| vault_step_ca_jwk_password | cert-manager JWK provisioner password, same vaulted file. |
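
The checksum gate in the sync step can be illustrated with a short Python sketch. It is an approximation under stated assumptions, not the role's implementation: the role pipes tar into sha256sum, whereas this version hashes file paths and contents in sorted order to get the same "deterministic digest of a tree" property. The function names and the cache file location are hypothetical, and the cache file is assumed to live outside the hashed tree.

```python
import hashlib
from pathlib import Path
from typing import Optional


def tree_checksum(root: Path) -> str:
    """Hash every file under root in sorted order so the digest is deterministic."""
    digest = hashlib.sha256()
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        digest.update(path.relative_to(root).as_posix().encode())
        digest.update(b"\0")  # separator between name and contents
        digest.update(path.read_bytes())
        digest.update(b"\0")  # separator between files
    return digest.hexdigest()


def needs_sync(root: Path, cache_file: Path) -> bool:
    """Return True, and refresh the cached checksum, only when the tree changed."""
    current = tree_checksum(root)
    cached: Optional[str] = None
    if cache_file.exists():
        cached = cache_file.read_text().strip()
    if cached == current:
        return False  # nothing changed: skip the rsync
    cache_file.write_text(current + "\n")
    return True
```

The point of caching the last digest on the controller is that an unchanged CA costs one hash per run instead of a full copy to every standby.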

Caution

step_ca_force_reinit=true wipes the entire CA (~/.step) on the primary and re-issues a new root — every previously issued *.arcodange.lab cert immediately becomes untrusted until clients reload the new root. Use only for a deliberate PKI rebuild.

crowdsec

playbooks/tools/roles/crowdsec · runs on localhost in the 04 · Tools stage. It wires CrowdSec's decisions into Traefik as a bouncer middleware with a Turnstile CAPTCHA. The tasks/main.yml flow:

  1. Vault → K8s secret plumbing: creates a ServiceAccount (factory-ansible-tool-crowdsec-traefik-plugin), a VaultAuth (kubernetes auth, role factory_crowdsec_conf), and a VaultStaticSecret that reads kvv2/cms/factory/turnstile into a K8s secret (refreshAfter: 30s). The Turnstile sitekey/secret come from there.
  2. Bouncer key: finds the CrowdSec LAPI pod in tools and runs cscli bouncers add traefik-plugin (deletes + re-adds on conflict) to obtain the bouncer API key.
  3. CAPTCHA HTML: inject_captcha_html.yml pushes captcha.html into the Traefik PVC; this task is tagged never (opt-in only) so the default run skips it.
  4. Traefik Middleware: applies a traefik.io/v1alpha1 Middleware named crowdsec-bouncer (crowdsec in kube-system) configured with the bouncer key, stream mode, Turnstile (captchaProvider: turnstile + site/secret keys), and a Redis cache at redis.tools:6379.
  5. Restart Traefik: scales the Traefik Deployment to 0 then back to 1 (with a rescue/always guard guaranteeing it scales back up) to load the new middleware.

| Var | Default | Meaning |
| --- | --- | --- |
| traefik_pvc_name | traefik | The PVC the (tagged-never) captcha.html inject targets. |

| Secret | Source |
| --- | --- |
| Turnstile sitekey + secret | Vault kvv2/cms/factory/turnstile, surfaced via VaultStaticSecret. |
| Bouncer API key | Minted at runtime by cscli bouncers add. |
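
The "deletes + re-adds on conflict" behaviour of the bouncer-key step can be sketched in Python as follows. Everything here is illustrative: `FakeLapi` is a hypothetical in-memory stand-in for the CrowdSec LAPI registry (the role itself shells out to cscli inside the pod), and `ensure_bouncer_key` is an invented name.

```python
import secrets
from typing import Dict


class BouncerExists(Exception):
    """Raised when a bouncer with the same name is already registered."""


class FakeLapi:
    """Hypothetical in-memory stand-in for the LAPI bouncer registry."""

    def __init__(self) -> None:
        self.keys: Dict[str, str] = {}

    def add(self, name: str) -> str:
        if name in self.keys:
            raise BouncerExists(name)
        self.keys[name] = secrets.token_hex(16)
        return self.keys[name]

    def delete(self, name: str) -> None:
        self.keys.pop(name, None)


def ensure_bouncer_key(lapi: FakeLapi, name: str = "traefik-plugin") -> str:
    """Return a usable key for the bouncer, re-registering it on conflict."""
    try:
        return lapi.add(name)
    except BouncerExists:
        # An existing registration cannot hand back its key, so replace it.
        lapi.delete(name)
        return lapi.add(name)
```

One consequence of this approach is that a rerun yields a different key than the previous run, so the consumer of the key has to be updated in the same run, which is consistent with the role re-applying the Middleware afterwards.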

pihole

playbooks/dns/roles/pihole · runs on the pihole group (pi1, pi3) in the 01 · System stage. It configures HA DNS: two Pi-hole nodes kept in sync. The tasks/main.yml includes three task files:

  1. ha_pihole_setup.yml: waits for a manual Pi-hole install (it prints the curl … | sudo bash command and uses wait_for on /etc/pihole/pihole-FTL.db for up to 10 minutes; Pi-hole itself is not installed by Ansible). It then patches pihole.toml (listen port, listeningMode = "ALL", enable /etc/dnsmasq.d) and writes three dnsmasq drop-ins: 10-custom-rules.conf (wildcard address=/fqdn/ip from pihole_custom_dns), 20-rpis.conf (<host>.home → preferred_ip for every Pi), and 99-upstream.conf (explicit upstream from pihole_upstream_dns).
  2. gravity_setup.yml: sets up Gravity Sync between the two nodes: a pihole_gravity system user with a freshly rotated ed25519 keypair each run, cross-authorized authorized_keys, full sudo (/etc/sudoers.d/gravity-sync), the installer, and a generated gravity-sync.conf (each node points REMOTE_HOST at the other), then runs the sync.
  3. client_setup.yml: points DNS clients at the Pi-hole pair by editing /etc/resolv.conf (insert nameservers after search) and the active NetworkManager connections via nmcli (per-interface ipv4.dns + dns-priority, eth0 50 / wlan0 100).

| Var | Default | Meaning |
| --- | --- | --- |
| pihole_primary | pi1 | First node; the other is derived as the secondary. |
| pihole_ports | 8081o,443os,… | Web-interface listen ports. |
| pihole_custom_dns | {} | FQDN→IP wildcard records (validated as IPv4). |
| pihole_upstream_dns | [8.8.8.8, 1.1.1.1, 8.8.4.4] | Explicit upstreams (avoids DHCP-provided DNS). |
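
To make the dnsmasq drop-ins concrete, here is a hedged Python sketch of how pihole_custom_dns and pihole_upstream_dns could map to file contents. The address=/fqdn/ip form comes from the description above; using dnsmasq's server= directive for the upstream file is an assumption about what the role writes, and the function names and the sample IP in the usage note are invented for illustration.

```python
import ipaddress
from typing import Dict, List


def render_custom_rules(custom_dns: Dict[str, str]) -> str:
    """Render wildcard records, rejecting any value that is not an IPv4 address."""
    lines = []
    for fqdn, ip in sorted(custom_dns.items()):
        ipaddress.IPv4Address(ip)  # raises ValueError for non-IPv4 input
        lines.append(f"address=/{fqdn}/{ip}\n")
    return "".join(lines)


def render_upstream(upstreams: List[str]) -> str:
    """Render one upstream resolver per line (server= is an assumed directive)."""
    return "".join(f"server={ip}\n" for ip in upstreams)
```

For example, `render_custom_rules({"arcodange.lab": "192.168.1.10"})` produces a single `address=/arcodange.lab/192.168.1.10` line, which dnsmasq applies to the domain and all of its subdomains.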

Warning

This role is not fully idempotent: it depends on a human running the Pi-hole installer first, it rotates the gravity SSH key on every run, and it grants the pihole_gravity user passwordless sudo ALL. Treat reruns as state-changing, not no-ops.

deploy_docker_compose

roles/deploy_docker_compose · shared. This is the generic compose mechanism every app deploy builds on. The caller passes a dockercompose_content dict; the tasks/main.yml:

  1. Derives app_name from dockercompose_content.name and creates /<root_path>/<partition>/<app_name>/ plus data/ and scripts/.
  2. Writes the compose file with to_nice_yaml and validates it with validate: 'docker compose -f %s config' — a bad compose fails the task before anything is written live.
  3. Writes a small wrapper script scripts/docker-compose that runs docker compose -f <the file> "$@", so the app can be driven without remembering the path.

| Var | Default | Meaning |
| --- | --- | --- |
| app_name | (dockercompose_content.name) | App directory name. |
| app_owner / app_group | pi / docker | File ownership. |
| root_path | /home/pi/arcodange | Base path; partition (docker_composes) nests under it. |
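
The directory layout and wrapper script described above can be sketched in Python. This is an illustration under assumptions rather than the role's templates: the compose file name docker-compose.yml, the /bin/sh shebang, and the function names are hypothetical; only the path shape and the docker compose -f <file> "$@" call come from the description.

```python
from pathlib import PurePosixPath
from typing import Dict


def app_layout(root_path: str, partition: str, app_name: str) -> Dict[str, PurePosixPath]:
    """Compute the per-app paths under <root_path>/<partition>/<app_name>/."""
    app_dir = PurePosixPath(root_path) / partition / app_name
    return {
        "app_dir": app_dir,
        "data_dir": app_dir / "data",
        "scripts_dir": app_dir / "scripts",
        "compose_file": app_dir / "docker-compose.yml",  # assumed file name
        "wrapper": app_dir / "scripts" / "docker-compose",
    }


def render_wrapper(compose_file: PurePosixPath) -> str:
    """Render a wrapper that forwards all arguments to docker compose."""
    return f'#!/bin/sh\nexec docker compose -f "{compose_file}" "$@"\n'
```

With the defaults in the table, `app_layout("/home/pi/arcodange", "docker_composes", "gitea")` places the app under /home/pi/arcodange/docker_composes/gitea, and the wrapper lets an operator run `scripts/docker-compose ps` from that directory without spelling out the compose file path.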

Tier 2 — building-block roles

Smaller roles, mostly Gitea/forge plumbing and one-shot helpers. Shared roles live in roles/; deploy_gitea/deploy_postgresql are nested under playbooks/setup/roles/.

| Role | Purpose | Key vars / notes | Secrets |
| --- | --- | --- | --- |
| gitea_repo | Ensure a repo exists across Gitea + GitHub + GitLab and add 8h push mirrors (sync_on_commit: true) to GitHub/GitLab. | Creates missing repos on each forge; mirror URLs + namespace IDs in vars/main.yml. | github_api_token, gitlab_api_token (from gitea_vault). |
| gitea_token | Generate / replace / delete a Gitea access token via docker exec … gitea admin user generate-access-token. | Stores the raw token in the fact named by gitea_token_fact_name; gitea_token_replace / gitea_token_delete toggles; scopes default to write:admin,organization,package,repository,user. | The minted token itself (a fact, not persisted). |
| gitea_secret | PUT a Gitea Actions secret at user or org scope. | gitea_secret_name / _value; gitea_owner_type (user or org) selects the API path. | gitea_api_token (Authorization). |
| gitea_sync | List repos on all three forges, diff them, and call gitea_repo for the repos missing somewhere. | Computes repos_incomplete = all - common (repos not present on every forge); loops gitea_repo over the gaps. | GitHub/GitLab/Gitea API tokens. |
| traefik_certs | Extract the live *.arcodange.lab cert from Traefik's acme.json. | kubectl exec into Traefik → jq the LetsEncrypt wildcard cert → traefik_cert_pem fact; no-op if already set. | None (reads in-cluster acme.json). |
| playwright | Run a Playwright browser-automation script in Docker. | Builds playwright:<version> (default 1.47.0) from files/, runs the script with playwright_env injected as -e; default script loginGitea.js. Used by hashicorp_vault for the OIDC app setup. | Script-specific env (e.g. Gitea admin creds). |
| deploy_gitea | Deploy Gitea: template app.ini.j2, docker compose up, then health-check :3000 until ready. | Compose source is /home/pi/arcodange/docker_composes/gitea; admin user arcodange. | (consumes the vaulted Gitea compose env). |
| deploy_postgresql | Deploy Postgres via compose, then per-app create DB + user (create_db_and_user.yml). | Waits on pg_isready, loops applications_databases ({app: {db_name, db_user, db_password}}). | Per-app DB passwords from applications_databases. |
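
The diff that gitea_sync performs across the three forges amounts to set arithmetic: the repos needing attention are those in the union of all listings but not in their intersection. The Python sketch below shows that calculation under assumptions; the function name and the repo names in the usage note are hypothetical, and the real role works on API responses rather than in-memory sets.

```python
from typing import Dict, Set


def incomplete_repos(forges: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Map each repo missing somewhere to the set of forges that lack it."""
    if not forges:
        return {}
    all_repos = set().union(*forges.values())
    common = set.intersection(*forges.values())
    return {
        repo: {forge for forge, repos in forges.items() if repo not in repos}
        for repo in sorted(all_repos - common)
    }
```

Given listings where "tools" is absent from GitLab and "blog" is absent from GitHub, the function returns exactly those two repos with the forge each one is missing from, which is the input a per-repo loop over gitea_repo would need.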

Role dependency view

How the roles relate: shared building blocks feed the setup-stage app deploys, and a few platform-service roles include shared roles directly.

```mermaid
%%{init: {'theme':'base', 'themeVariables': {'primaryColor':'#1f2937','primaryTextColor':'#f9fafb','lineColor':'#6b7280','fontSize':'14px'}}}%%
flowchart TD
  classDef shared fill:#1e3a5f,stroke:#3b82f6,color:#f9fafb;
  classDef setup fill:#1e4620,stroke:#22c55e,color:#f9fafb;
  classDef platform fill:#4a2c1e,stroke:#f59e0b,color:#f9fafb;

  dc["deploy_docker_compose<br/>generic compose writer"]:::shared
  pw["playwright<br/>browser automation"]:::shared
  gt["gitea_token<br/>mint access token"]:::shared
  gs["gitea_secret<br/>PUT Actions secret"]:::shared
  gr["gitea_repo<br/>mirror to GitHub/GitLab"]:::shared
  gsync["gitea_sync<br/>diff 3 forges"]:::shared
  tc["traefik_certs<br/>extract lab cert"]:::shared

  dpg["deploy_postgresql"]:::setup
  dgi["deploy_gitea"]:::setup

  hv["hashicorp_vault"]:::platform
  sca["step_ca"]:::platform
  cs["crowdsec"]:::platform
  ph["pihole"]:::platform

  gsync --> gr
  hv --> pw
  hv --> gs
  dc -. "used by app deploys" .-> dpg
  dc -. "used by app deploys" .-> dgi
```

  1. gitea_sync → gitea_repo: the sync role include-loops gitea_repo for each repo missing from one of the three forges.
  2. hashicorp_vault → playwright: Vault's OIDC setup drives Gitea through Playwright to create the OAuth app.
  3. hashicorp_vault → gitea_secret: the rendered vault_oauth__sh_b64 is published as a Gitea Actions secret at org and user scope.
  4. deploy_docker_compose → deploy_postgresql / deploy_gitea: the generic compose writer is the substrate the setup-stage app deploys lean on.
  5. step_ca, crowdsec, pihole stand alone: they configure their own services (PKI, WAF, DNS) without including other roles.
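
The include relationships listed above can also be written down as a small adjacency map, which makes the "stands alone" claim checkable. This Python sketch is only a restatement of the list, not something the repository ships; the `INCLUDES` table and the `standalone` helper are hypothetical names.

```python
from typing import Dict, List

# Direct role-to-role includes, transcribed from the list above.
INCLUDES: Dict[str, List[str]] = {
    "gitea_sync": ["gitea_repo"],
    "hashicorp_vault": ["playwright", "gitea_secret"],
    "step_ca": [],
    "crowdsec": [],
    "pihole": [],
}

PLATFORM_ROLES: List[str] = ["hashicorp_vault", "step_ca", "crowdsec", "pihole"]


def standalone(roles: List[str], includes: Dict[str, List[str]]) -> List[str]:
    """Return the roles that include no other role, preserving input order."""
    return [role for role in roles if not includes.get(role)]
```

Calling `standalone(PLATFORM_ROLES, INCLUDES)` leaves out hashicorp_vault, the only Tier 1 role that pulls in shared roles.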

See also

  • Inventory & variables — the groups (gitea, postgres, step_ca, pihole) these roles target, and the vaulted group_vars they read.
  • Secrets & Vault — where hashicorp_vault's OIDC tokens and the kvv2/cms/factory/turnstile path fit the broader secret model.
  • Storage & recovery — how the compose data/ dirs and the step-ca state relate to backup and disaster recovery.