factory/vibe/guidebooks/factory-provisioning/ansible/04-tools.md
Gabriel Radureau dbe32161dc docs(vibe): add factory-provisioning guidebook (Ansible + OpenTofu)
Deep, code-grounded tree-docs guidebook under vibe/guidebooks/factory-provisioning/,
explored from the actual playbooks/roles and tofu code:

- Hub: the two provisioning engines (operator-run Ansible vs CI-applied OpenTofu),
  a green-field bring-up flow, master index, maintenance rule.
- ansible/ sub-tree: ordered pages 01-system .. 06-recover, an inventory & variables
  concept page, and a Tier-1/Tier-2 roles reference (hashicorp_vault, step_ca,
  crowdsec, pihole, deploy_docker_compose + the gitea_* family and helpers).
- opentofu/ sub-tree: factory-iac (Cloudflare/OVH/GCP/Gitea/Vault edge +
  cloudflare_token module), postgres-iac (per-app DB/role/pgbouncer lookup),
  ci-apply-flow (Gitea OIDC-JWT -> Vault -> auto-approve apply).

Cross-linked bidirectionally with the lab-ecosystem guidebook and the safe-env
ADR/PRD (the sandbox rehearses exactly these engines). 14 mermaid diagrams
MCP-validated; zero dead links. Authored by the Lab Cartographer cohort.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-23 21:11:51 +02:00


04 · Tools — Vault + CrowdSec

Note

Status: active · Last Updated: 2026-06-23
Upstream: Ansible sub-hub · Factory provisioning hub
Downstream: Roles reference (deep mechanics of the hashicorp_vault and crowdsec roles)
Related: Secrets & Vault · 05 · Backup · 03 · CI/CD · ADR-0001 safe prod-like environment

Stage 4 installs the operational tooling layer on top of a running cluster: HashiCorp Vault (the lab's single secret store) and CrowdSec (the WAF/IPS that fronts Traefik). The entry point playbooks/04_tools.yml is a one-line wrapper that imports playbooks/tools/tools.yml, which in turn chains two sub-playbooks — hashicorp_vault.yml then crowdsec.yml. Both run against localhost (they drive the cluster through kubectl / kubernetes.core, not over SSH to the Pis).
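The wrapper chain described above can be sketched as two small playbook files. This is a minimal illustration, not a copy of the repository: the play names are assumptions, and only the file names and their order come from this page.

```yaml
# playbooks/04_tools.yml (sketch): the one-line wrapper.
# import_playbook paths resolve relative to the importing file.
- name: Stage 4 - tools   # play name is an assumption
  ansible.builtin.import_playbook: tools/tools.yml
---
# playbooks/tools/tools.yml (sketch): chains the two sub-playbooks in order.
- name: Vault   # play name is an assumption
  ansible.builtin.import_playbook: hashicorp_vault.yml

- name: CrowdSec   # play name is an assumption
  ansible.builtin.import_playbook: crowdsec.yml
```

Because imports are processed statically and in order, the CrowdSec plays are only reached once the Vault plays have completed.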

Important

Vault is the chokepoint of the whole secret model. This page covers what the playbook orchestrates; the byte-level role internals (init, unseal, root-token minting, the OpenTofu OIDC backend) live in the Roles reference. Read Secrets & Vault first for the conceptual model — the two auth backends, the unseal posture, and why there is no secret material in git.


What stage 4 deploys

Sub-playbook File Builds Role invoked
Vault tools/hashicorp_vault.yml Initialises + unseals Vault, wires the Gitea OIDC/JWT auth backends via OpenTofu, publishes the vault_oauth__sh_b64 Gitea Action secret hashicorp_vault
CrowdSec tools/crowdsec.yml A VaultAuth + VaultStaticSecret for the Turnstile captcha keys, a fresh bouncer API key, and the Traefik crowdsec middleware crowdsec

Step 1 — hashicorp_vault.yml

The credential prompt

The play opens with a single vars_prompt for the Gitea admin password (gitea_admin_password, marked unsafe: true because the password may contain characters such as { that Jinja would otherwise try to template). This is the only interactive input the stage needs; everything else is derived or minted on the fly.
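A prompt of this shape might look like the sketch below. Only the variable name and the unsafe: true flag come from this page; the play name, the prompt wording, and the sanity-check task are assumptions added to keep the example self-contained.

```yaml
- name: Provision HashiCorp Vault (prompt sketch)   # play name is an assumption
  hosts: localhost
  connection: local
  gather_facts: false
  vars_prompt:
    - name: gitea_admin_password
      prompt: Gitea admin password   # wording is an assumption
      private: true                  # do not echo the input
      unsafe: true                   # keep the value out of Jinja templating
  tasks:
    # Hypothetical sanity check; it never prints the password itself.
    - name: Confirm the prompt was answered
      ansible.builtin.assert:
        that:
          - gitea_admin_password | length > 0
        quiet: true
```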

Orchestration flow

%%{init: {'theme':'base', 'themeVariables': {'primaryColor':'#1f2937','primaryTextColor':'#f9fafb','lineColor':'#6b7280','fontSize':'14px'}}}%%
flowchart TD
  classDef prompt fill:#5f4a1e,stroke:#d97706,color:#fffbeb;
  classDef mint fill:#1e3a5f,stroke:#3b82f6,color:#f9fafb;
  classDef vault fill:#4c1d95,stroke:#7c3aed,color:#f5f3ff;
  classDef revoke fill:#5f1e1e,stroke:#ef4444,color:#fef2f2;

  P["vars_prompt:<br/>gitea_admin_password"]:::prompt
  T["Mint temp GITEA_ADMIN_TOKEN<br/>(role gitea_token, replace=true)"]:::mint
  R["Run hashicorp_vault role:<br/>init · unseal · OIDC backend · gitea secret"]:::vault
  D["post_tasks:<br/>delete GITEA_ADMIN_TOKEN"]:::revoke

  P --> T --> R --> D

1. Mint a temporary token. The arcodange.factory.gitea_token role generates a GITEA_ADMIN_TOKEN with scopes write:admin,write:organization,write:repository,write:user (and gitea_token_replace: true, so any stale token of the same name is rotated). It is stashed in the fact vault_GITEA_ADMIN_TOKEN.
2. Run the hashicorp_vault role. It is invoked with three vars: the Postgres admin credentials (read straight out of the Postgres host's docker-compose environment via hostvars[groups.postgres[0]]), the gitea_admin_token (the temporary token from step 1), and the prompted gitea_admin_password. The role does the heavy lifting; see below.
3. Revoke the temporary token. A post_tasks block re-invokes gitea_token with gitea_token_delete: true, so the admin token never outlives the run.
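The three steps above could be laid out roughly as follows. This is a sketch under stated assumptions: gitea_token_replace, gitea_token_delete, gitea_admin_token, gitea_admin_password and the vault_GITEA_ADMIN_TOKEN fact are named on this page, while gitea_token_name, gitea_token_scopes, gitea_token_fact_name, postgres_admin_credentials and postgres_admin_env are hypothetical names used only for illustration.

```yaml
- name: Provision HashiCorp Vault (orchestration sketch)
  hosts: localhost
  gather_facts: false
  pre_tasks:
    - name: Mint a temporary Gitea admin token
      ansible.builtin.include_role:
        name: arcodange.factory.gitea_token
      vars:
        gitea_token_name: GITEA_ADMIN_TOKEN                # hypothetical var name
        gitea_token_scopes: "write:admin,write:organization,write:repository,write:user"  # hypothetical var name
        gitea_token_fact_name: vault_GITEA_ADMIN_TOKEN     # hypothetical var name
        gitea_token_replace: true                          # rotate a stale token of the same name
  tasks:
    - name: Run the hashicorp_vault role
      ansible.builtin.include_role:
        name: hashicorp_vault
      vars:
        # Hypothetical: the real lookup path inside hostvars is not shown on this page.
        postgres_admin_credentials: "{{ hostvars[groups.postgres[0]].postgres_admin_env }}"
        gitea_admin_token: "{{ vault_GITEA_ADMIN_TOKEN }}"
        gitea_admin_password: "{{ gitea_admin_password }}"
  post_tasks:
    - name: Revoke the temporary Gitea admin token
      ansible.builtin.include_role:
        name: arcodange.factory.gitea_token
      vars:
        gitea_token_name: GITEA_ADMIN_TOKEN                # hypothetical var name
        gitea_token_delete: true
```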

What the hashicorp_vault role does

The role's tasks/main.yml runs a fixed sequence; the OIDC backend setup is wrapped in a block/always so the freshly minted root token is always revoked, even on failure:

| Phase | Task file | What happens |
| --- | --- | --- |
| Init | init.yml | First-time only. Checks vault operator init -status; if uninitialised, runs vault operator init with 1 key share / threshold 1 and writes the keys to ~/.arcodange/cluster-keys.json (mode 600). Idempotent on re-run. |
| Unseal | unseal.yml | Reads cluster-keys.json and runs vault operator unseal on every server pod. Required on every reboot, because Vault always restarts sealed. |
| Root token | new_root_token.yml | Mints a one-shot root token via the generate-root OTP/nonce dance (using the unseal key), needed to authenticate the OpenTofu apply. |
| OIDC backend | gitea_oidc_auth.yml | Drives a Playwright script to register/read the Gitea OAuth app, then runs OpenTofu in a throwaway Docker volume to provision the gitea (OIDC) + gitea_jwt (JWT) auth backends, the admin identity, and the kvv1 static secrets. Finally writes the vault_oauth__sh_b64 script to Gitea Actions secrets. |
| Revoke | revoke_token.yml (in always) | Revokes the root token unconditionally. |
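The phase sequence can be pictured as a tasks/main.yml along these lines. The task file names come from the table; the task names are assumptions, and so is the placement of new_root_token.yml just before the block rather than inside it.

```yaml
# roles/hashicorp_vault/tasks/main.yml (sketch, not the actual file)
- name: Initialise Vault on first run
  ansible.builtin.include_tasks: init.yml

- name: Unseal every server pod
  ansible.builtin.include_tasks: unseal.yml

- name: Mint a one-shot root token
  ansible.builtin.include_tasks: new_root_token.yml

- name: Provision the auth backends with the root token
  block:
    - name: Register the OAuth app and apply the OpenTofu config
      ansible.builtin.include_tasks: gitea_oidc_auth.yml
  always:
    # Runs whether the block succeeded or failed, so the token never lingers.
    - name: Revoke the root token
      ansible.builtin.include_tasks: revoke_token.yml
```

The always section is what gives the guarantee stated above: a failure inside the block still reaches the revoke step before the play aborts.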

Important

The OpenTofu apply runs the hashicorp_vault.tf inside an ephemeral Docker volume (docker volume create → tofu init + tofu apply → docker volume rm), with the state in a GCS backend (gs://arcodange-tf, prefix tools/hashicorp_vault/gitea_oidc). The CA is mounted read-only via VAULT_CACERT. The destroy step is commented out by design: this provisions, it does not tear down.
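One way to express the create, init, apply, remove cycle as Ansible tasks is sketched below. Everything beyond the Docker and OpenTofu commands themselves is an assumption: the variable names (tofu_volume, tofu_src_dir, vault_ca_file, vault_addr, vault_root_token), the image tag, and the mount layout are hypothetical, and the credentials for the GCS state backend are deliberately left out.

```yaml
- name: Apply hashicorp_vault.tf from a throwaway Docker volume (sketch)
  vars:
    tofu_volume: tofu-gitea-oidc                    # hypothetical volume name
    tofu_image: ghcr.io/opentofu/opentofu:latest    # assumed image; its entrypoint is tofu
    # Shared docker run prefix; every mount path below is an assumption.
    tofu_run: >-
      docker run --rm
      -v {{ tofu_src_dir }}:/workspace
      -v {{ tofu_volume }}:/workspace/.terraform
      -v {{ vault_ca_file }}:/ca.crt:ro
      -e VAULT_ADDR={{ vault_addr }}
      -e VAULT_TOKEN={{ vault_root_token }}
      -e VAULT_CACERT=/ca.crt
      -w /workspace
      {{ tofu_image }}
  block:
    - name: Create the ephemeral volume
      ansible.builtin.command: docker volume create {{ tofu_volume }}

    - name: Initialise providers and the state backend
      ansible.builtin.command: "{{ tofu_run }} init -input=false"
      no_log: true   # the command line carries the root token

    - name: Apply without an interactive approval
      ansible.builtin.command: "{{ tofu_run }} apply -auto-approve -input=false"
      no_log: true
  always:
    - name: Remove the ephemeral volume
      ansible.builtin.command: docker volume rm {{ tofu_volume }}
```

Keeping the provider cache in a volume that is removed afterwards means nothing from the apply persists on the control node; the state itself lives only in the remote backend.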

The vault_oauth__sh_b64 Gitea secret

The last act of the role renders oidc_jwt_token.sh.j2 (an OIDC authorization-code → access-token helper for CI), base64-encodes it, and publishes it as the org-level Gitea Action secret vault_oauth__sh_b64. Because Gitea Action secrets are scoped per owner, the role then re-publishes the identical secret to each user-owned namespace listed in gitea_secret_propagation_users — repos under a personal account cannot read org-level secrets. This is what lets a Gitea Actions workflow obtain the OIDC JWT that authenticates to Vault under the gitea_cicd_<app> role (the CI half of the secret model).
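The render, encode, and publish sequence might be written as below. This is a sketch that assumes the role talks to the Gitea REST API through ansible.builtin.uri, which this page does not confirm. gitea_base_url and gitea_org are hypothetical variables, and the use of the admin-only sudo query parameter to reach user namespaces is likewise an assumption.

```yaml
- name: Render the CI helper script and base64-encode it
  ansible.builtin.set_fact:
    vault_oauth_sh_b64: "{{ lookup('ansible.builtin.template', 'oidc_jwt_token.sh.j2') | b64encode }}"

- name: Publish the secret at organisation level
  ansible.builtin.uri:
    url: "{{ gitea_base_url }}/api/v1/orgs/{{ gitea_org }}/actions/secrets/vault_oauth__sh_b64"
    method: PUT
    headers:
      Authorization: "token {{ gitea_admin_token }}"
    body_format: json
    body:
      data: "{{ vault_oauth_sh_b64 }}"
    status_code: [201, 204]   # 201 = created, 204 = updated

- name: Re-publish the same secret to each user-owned namespace
  ansible.builtin.uri:
    # sudo=<user> makes the admin token act as that user (assumed mechanism).
    url: "{{ gitea_base_url }}/api/v1/user/actions/secrets/vault_oauth__sh_b64?sudo={{ item }}"
    method: PUT
    headers:
      Authorization: "token {{ gitea_admin_token }}"
    body_format: json
    body:
      data: "{{ vault_oauth_sh_b64 }}"
    status_code: [201, 204]
  loop: "{{ gitea_secret_propagation_users }}"
```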

Caution

The role has an off-by-default vault_oidc_force_reset flag. When set, it runs vault auth disable gitea and gitea_jwt before re-applying — which wipes every gitea_cicd_<app> per-app JWT role created by the tools-repo IaC. Leave it false unless you are deliberately rebuilding the OIDC backend from scratch (e.g. bound_issuer config drift).
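A guarded reset of this kind could look like the following sketch. The flag name and the two backend paths come from this page; running the Vault CLI through kubectl exec, and the vault_namespace, vault_pod and vault_root_token variables, are assumptions made for illustration.

```yaml
- name: Disable the Gitea auth backends before re-applying (destructive)
  ansible.builtin.command: >-
    kubectl exec -n {{ vault_namespace }} {{ vault_pod }} --
    env VAULT_TOKEN={{ vault_root_token }}
    vault auth disable {{ item }}
  loop:
    - gitea
    - gitea_jwt
  # Skipped unless the operator opts in explicitly.
  when: vault_oidc_force_reset | default(false) | bool
  no_log: true   # the command line carries the root token
```

Disabling an auth mount removes every role configured under it, which is why the per-app gitea_cicd_<app> roles disappear along with the backend.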


Step 2 — crowdsec.yml

The CrowdSec sub-playbook is a thin wrapper that runs the crowdsec role to bolt a CrowdSec-bouncer middleware onto Traefik. The role's tasks/main.yml wires three things together.

| Step | What it creates | Detail |
| --- | --- | --- |
| Turnstile secret | ServiceAccount + VaultAuth + VaultStaticSecret in kube-system | Authenticates via the Kubernetes auth backend (role factory_crowdsec_conf) and pulls the Cloudflare Turnstile keys from kvv2 path cms/factory/turnstile into a K8s Secret (refreshAfter: 30s). |
| Bouncer key | A CrowdSec LAPI bouncer named traefik-plugin | Runs cscli bouncers add traefik-plugin inside the LAPI pod; on collision it deletes and re-adds, so the run is repeatable. |
| Traefik middleware | A traefik.io/v1alpha1 Middleware named crowdsec | Stream mode, captcha provider turnstile (site/secret keys from the Turnstile secret), Redis cache, trusted-IP allow-lists. |
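The bouncer key and middleware rows could translate into tasks like the sketch below. It simplifies the collision handling by deleting any existing bouncer up front instead of reacting to a failed add. The option names assume the community crowdsec-bouncer-traefik-plugin; the plugin key crowdsec-bouncer, the namespaces, and all crowdsec_* and turnstile_* variables are hypothetical.

```yaml
- name: Drop any stale bouncer so the add below cannot collide
  ansible.builtin.command: >-
    kubectl exec -n {{ crowdsec_namespace }} deploy/{{ crowdsec_lapi_deployment }} --
    cscli bouncers delete traefik-plugin
  register: bouncer_delete
  failed_when: false                      # "not found" is fine on a first run
  changed_when: bouncer_delete.rc == 0

- name: Register the bouncer and capture its API key
  ansible.builtin.command: >-
    kubectl exec -n {{ crowdsec_namespace }} deploy/{{ crowdsec_lapi_deployment }} --
    cscli bouncers add traefik-plugin -o raw
  register: bouncer_add
  no_log: true                            # stdout is the API key

- name: Apply the Traefik crowdsec middleware
  kubernetes.core.k8s:
    state: present
    definition:
      apiVersion: traefik.io/v1alpha1
      kind: Middleware
      metadata:
        name: crowdsec
        namespace: kube-system            # assumed namespace
      spec:
        plugin:
          crowdsec-bouncer:               # must match the plugin name in Traefik's static config
            enabled: true
            crowdsecMode: stream
            crowdsecLapiKey: "{{ bouncer_add.stdout | trim }}"
            captchaProvider: turnstile
            captchaSiteKey: "{{ turnstile_site_key }}"       # assumed to be read from the Turnstile Secret
            captchaSecretKey: "{{ turnstile_secret_key }}"
            redisCacheEnabled: true
            redisCacheHost: "{{ crowdsec_redis_host }}"
            clientTrustedIPs: "{{ crowdsec_trusted_ips }}"
```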

After applying the middleware the role cleans up Failed CrowdSec pods and bounces Traefik (scale to 0 → back to 1, inside a block/rescue/always that guarantees Traefik returns to 1 replica no matter what) so the new middleware config is loaded.
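A bounce with that guarantee can be sketched with kubernetes.core.k8s_scale inside block/rescue/always. The Deployment name traefik, the kube-system namespace, and the crowdsec_namespace variable are assumptions; the structure is the point of the example.

```yaml
- name: Remove Failed CrowdSec pods
  ansible.builtin.command: >-
    kubectl delete pods -n {{ crowdsec_namespace }}
    --field-selector=status.phase=Failed

- name: Bounce Traefik so it loads the new middleware
  block:
    - name: Scale Traefik down
      kubernetes.core.k8s_scale:
        api_version: apps/v1
        kind: Deployment
        name: traefik            # assumed Deployment name
        namespace: kube-system   # assumed namespace
        replicas: 0
        wait_timeout: 60
  rescue:
    - name: Report that the scale-down failed
      ansible.builtin.debug:
        msg: Traefik scale-down failed; restoring one replica anyway.
  always:
    # Runs after the block or the rescue, so ingress always comes back.
    - name: Scale Traefik back to one replica
      kubernetes.core.k8s_scale:
        api_version: apps/v1
        kind: Deployment
        name: traefik
        namespace: kube-system
        replicas: 1
        wait_timeout: 120
```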

Note

The Turnstile keys come from the CMS-managed Vault path cms/factory/turnstile — they are provisioned outside this stage. CrowdSec only reads them here. See Secrets & Vault for how VaultStaticSecret materialises a Vault path into a Kubernetes Secret.
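For orientation, the three Turnstile objects might look like the manifests below, written against the Vault Secrets Operator CRDs. The role name, Vault path, namespace, and refresh interval come from this page; the object names, the kubernetes auth mount, and reading "kvv2" as the KV mount name are assumptions.

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: crowdsec-turnstile          # hypothetical name
  namespace: kube-system
---
apiVersion: secrets.hashicorp.com/v1beta1
kind: VaultAuth
metadata:
  name: crowdsec-turnstile          # hypothetical name
  namespace: kube-system
spec:
  method: kubernetes
  mount: kubernetes                 # assumed auth mount path
  kubernetes:
    role: factory_crowdsec_conf
    serviceAccount: crowdsec-turnstile
---
apiVersion: secrets.hashicorp.com/v1beta1
kind: VaultStaticSecret
metadata:
  name: crowdsec-turnstile          # hypothetical name
  namespace: kube-system
spec:
  vaultAuthRef: crowdsec-turnstile
  type: kv-v2
  mount: kvv2                       # assumed mount name
  path: cms/factory/turnstile
  refreshAfter: 30s                 # re-sync interval from Vault
  destination:
    name: crowdsec-turnstile        # Kubernetes Secret to create and keep in sync
    create: true
```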


Gotchas

Warning

  • Vault must be unsealed before anything secret-dependent recovers. Stage 4's unseal step reads ~/.arcodange/cluster-keys.json; if that file is missing, init/unseal cannot proceed and the OpenTofu apply (which needs a live Vault) fails. The same file gates step 2 of the power-cut recovery order.
  • Docker is required on the control node. The OIDC backend provisioning shells out to docker run … opentofu and docker volume. The Playwright step also runs containerised. A control node without Docker will fail this stage.
  • gitea_admin_password is unsafe. Do not strip the unsafe: true flag from the prompt — passwords with {/} are mangled by Jinja templating otherwise.
  • Re-running is safe by default. Init and unseal are idempotent; the temp admin token and root token are both revoked on the way out. Only vault_oidc_force_reset makes a re-run destructive.
  • CrowdSec bounces Traefik. The middleware step briefly scales Traefik to 0 — expect a short ingress blip during stage 4. The always block restores it to 1 even if the scale-down errors.
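The first two gotchas lend themselves to a quick preflight before running the stage. The play below is a hypothetical helper, not something this page says exists in the repository; it only checks that Docker responds and that the unseal keys file, when present, has the expected mode.

```yaml
- name: Preflight checks for stage 4 (hypothetical helper play)
  hosts: localhost
  connection: local
  gather_facts: false
  tasks:
    - name: Check that the Docker daemon is reachable
      ansible.builtin.command: docker info
      changed_when: false

    - name: Look for the unseal keys file
      ansible.builtin.stat:
        path: "{{ lookup('ansible.builtin.env', 'HOME') }}/.arcodange/cluster-keys.json"
      register: cluster_keys

    - name: Warn when the keys file is missing
      ansible.builtin.debug:
        msg: cluster-keys.json not found; this is only expected before the very first init.
      when: not cluster_keys.stat.exists

    - name: Verify the keys file is private
      ansible.builtin.assert:
        that:
          - cluster_keys.stat.mode == '0600'
        fail_msg: cluster-keys.json should be mode 0600.
      when: cluster_keys.stat.exists
```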

Where stage 4 sits

%%{init: {'theme':'base', 'themeVariables': {'primaryColor':'#1f2937','primaryTextColor':'#f9fafb','lineColor':'#6b7280','fontSize':'14px'}}}%%
flowchart LR
  classDef done fill:#1e3a5f,stroke:#3b82f6,color:#f9fafb;
  classDef here fill:#4c1d95,stroke:#7c3aed,color:#f5f3ff;
  classDef next fill:#1e3a5f,stroke:#3b82f6,color:#f9fafb;

  s03["03 · CI/CD"]:::done
  s04["04 · Tools<br/>Vault · CrowdSec"]:::here
  s05["05 · Backup"]:::next

  s03 --> s04 --> s05

1. 03 · CI/CD registered the act_runner executors. That is a prerequisite, since the vault_oauth__sh_b64 secret published here is consumed by those CI runners.
2. 04 · Tools (this page) stands up Vault and CrowdSec.
3. 05 · Backup is next. It schedules the cron dumps that protect the state the cluster now holds.