# Ansible: factory provisioning

> [!NOTE]
> **Status:** ✅ active · **Last Updated:** 2026-06-23
> **Upstream:** [Factory provisioning hub](../README.md) · [Lab ecosystem · 01 factory](../../lab-ecosystem/01-factory.md)
> **Downstream:** [01 · System](01-system.md) · [02 · Setup](02-setup.md) · [03 · CI/CD](03-cicd.md) · [04 · Tools](04-tools.md) · [05 · Backup](05-backup.md) · [06 · Recover](06-recover.md) · [Inventory & variables](inventory.md) · [Roles reference](roles.md)
> **Related:** [Secrets & Vault](../../lab-ecosystem/secrets-and-vault.md) · [Storage & recovery](../../lab-ecosystem/storage-and-recovery.md) · [Naming conventions](../../lab-ecosystem/naming-conventions.md) · [ADR-0001 safe prod-like environment](../../../ADR/0001-safe-prod-like-environment.md)

Ansible is the **imperative half** of the factory: it takes three bare Raspberry Pis (`pi1`, `pi2`, `pi3`) and turns them into a running K3s cluster with Docker, Longhorn storage, Gitea CI runners, CrowdSec, and Vault. OpenTofu (the declarative half) then provisions everything that lives *outside* the cluster; see the [OpenTofu sub-hub](../opentofu/README.md).

---

## Collection layout

Everything ships as a single Ansible **collection** committed under [`ansible/arcodange/factory/`](../../../../ansible/arcodange/factory). The collection root, not the repo root, is what `ansible-galaxy collection install` and the FQCN references (`arcodange.factory.`) resolve against.

| File | Path | What it declares |
| --- | --- | --- |
| `galaxy.yml` | [`ansible/arcodange/factory/galaxy.yml`](../../../../ansible/arcodange/factory/galaxy.yml) | Collection identity: **namespace `arcodange`**, **name `factory`**, **version `1.0.0`**. Together they form the FQCN prefix `arcodange.factory.*` used by every role and playbook import. |
| `requirements.yml` | [`ansible/requirements.yml`](../../../../ansible/requirements.yml) | External dependencies pulled at install time (see table below). |
| `ansible.cfg` | [`ansible/arcodange/factory/ansible.cfg`](../../../../ansible/arcodange/factory/ansible.cfg) | `collections_path = ~/.ansible/collections` and `scp_if_ssh = True` for the SSH connection plugin. |
| `inventory/` | [`ansible/arcodange/factory/inventory/`](../../../../ansible/arcodange/factory/inventory) | `hosts.yml` + `group_vars/`. Detailed in [Inventory & variables](inventory.md). |
| `playbooks/` | [`ansible/arcodange/factory/playbooks/`](../../../../ansible/arcodange/factory/playbooks) | The numbered pipeline `01..05` plus the `recover/` branch. |
| `roles/` | [`ansible/arcodange/factory/roles/`](../../../../ansible/arcodange/factory/roles) | Seven reusable roles. Detailed in [Roles reference](roles.md). |

### External dependencies (`requirements.yml`)

| Dependency | Type | Why it is needed |
| --- | --- | --- |
| `geerlingguy.docker` | role | Installs and configures the Docker engine on each Pi. |
| `ansible.posix` | collection | POSIX primitives (mounts, sysctl, `synchronize`). |
| `community.crypto` | collection | Certificate/key generation for the step-ca PKI and Traefik. |
| `community.docker` | collection | Manages containers and Compose stacks (Gitea, act_runner). |
| `community.general` | collection | Broad utility modules used across the pipeline. |
| `kubernetes.core` | collection | `k8s` / `helm` modules used by every K3s-facing task. Needs the `kubernetes` Python lib at runtime. |
| `k3s-ansible` (`git+https://github.com/k3s-io/k3s-ansible.git`) | git role/collection | Upstream playbooks that install and cluster K3s itself. |

> [!TIP]
> The runtime Python libraries (`kubernetes`, `jmespath`, `dnspython`) that `kubernetes.core` and friends import are declared in the **repo-root `pyproject.toml`**, not in `requirements.yml`.
`uv sync` installs them; `ansible-galaxy` installs the Galaxy/git content. Both steps are required.

---

## Invocation pattern

The control node runs Ansible from a `uv`-managed venv. The `localhost` inventory entry sets `ansible_python_interpreter: "{{ ansible_playbook_python }}"`, so `uv run` is enough to put Ansible on the venv's Python, with no hardcoded interpreter path. The full recipe lives in [`ansible/README.md`](../../../../ansible/README.md).

1. **Sync the venv.** This installs `ansible-core` plus the runtime Python deps:

   ```sh
   uv sync
   ```

2. **Install collection dependencies.** This pulls the Galaxy + git content from `requirements.yml`:

   ```sh
   uv run ansible-galaxy collection install -r ansible/requirements.yml
   ```

3. **Run a stage.** Point `-i` at the inventory directory and pass one numbered playbook:

   ```sh
   uv run ansible-playbook \
     -i ansible/arcodange/factory/inventory \
     ansible/arcodange/factory/playbooks/.yml
   ```

### The vault password (`ANSIBLE_VAULT_PASSWORD_FILE`)

Encrypted vars are decrypted with a password that is **sourced from the cluster, not stored on disk**. `ANSIBLE_VAULT_PASSWORD_FILE` points at a tiny executable script that reads the K8s secret `arcodange-ansible-vault` from the `kube-system` namespace:

```sh
kubectl get secret -n kube-system arcodange-ansible-vault \
  --template='{{index .data.pass | base64decode}}'
```

> [!IMPORTANT]
> The same `arcodange-ansible-vault` secret in `kube-system` is consumed by the Gitea CI runners (needed for the Gitea mailer). Create it once with `kubectl create secret generic arcodange-ansible-vault --from-literal="pass=" -n kube-system`. See [Secrets & Vault](../../lab-ecosystem/secrets-and-vault.md) for how this fits the broader secret model.

---

## The provisioning pipeline

The numbered playbooks are meant to be run **in order** on a fresh cluster: each is a thin wrapper that `import_playbook`s a stage directory (e.g. `01_system.yml` → `system/system.yml`).
The `recover/` playbooks are **not** part of the linear sequence; they are an on-demand branch used only during disaster recovery.

```mermaid
%%{init: {'theme':'base', 'themeVariables': {'primaryColor':'#1f2937','primaryTextColor':'#f9fafb','lineColor':'#6b7280','fontSize':'14px'}}}%%
flowchart LR
    classDef stage fill:#1e3a5f,stroke:#3b82f6,color:#f9fafb;
    classDef recover fill:#5f1e1e,stroke:#ef4444,color:#fef2f2;

    s01["01 · System<br/>Docker · K3s · Longhorn · DNS · SSL"]:::stage
    s02["02 · Setup<br/>Gitea · Postgres · NFS backup"]:::stage
    s03["03 · CI/CD<br/>act_runner registration"]:::stage
    s04["04 · Tools<br/>CrowdSec · Vault"]:::stage
    s05["05 · Backup<br/>cron reports · PVC/db dumps"]:::stage
    rec["recover/*<br/>Longhorn + data restore"]:::recover

    s01 --> s02 --> s03 --> s04 --> s05
    s05 -. "on disaster" .-> rec
    rec -. "rejoin pipeline" .-> s01
```

1. **`01 · System`**: base OS hardening on each Pi, then Docker, Longhorn disk prep + iSCSI, K3s install, CoreDNS, the step-ca cert issuer, and final K3s config (kubeconfig, Longhorn, Traefik).
2. **`02 · Setup`**: deploys the cluster-resident services: Gitea, PostgreSQL (on `pi2`), and the NFS backup target.
3. **`03 · CI/CD`**: fetches a Gitea runner-registration token and rolls out the `act_runner` Docker Compose stack on every non-Gitea Pi so CI jobs have executors.
4. **`04 · Tools`**: installs the operational tooling layer: CrowdSec (WAF/IPS) and HashiCorp Vault.
5. **`05 · Backup`**: schedules the cron-driven backup + email-report jobs and the Gitea / Postgres / K3s-PVC dump routines.
6. **`recover/*` (on demand)**: invoked only after data loss to rebuild Longhorn and replay volume data; once recovered, the cluster re-enters the normal pipeline at `01 · System`.
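
The in-order run described above can be scripted. The following is a minimal sketch, not the repository's own tooling: `run_pipeline` and the `DRY_RUN` switch are hypothetical, and the glob assumes every stage wrapper in `playbooks/` follows the `NN_name.yml` pattern suggested by `01_system.yml`. It stops at the first failing stage and leaves `recover/` alone, because the pattern only matches numbered files at the top level of the directory.

```sh
#!/bin/sh
# Sketch: run the numbered stage playbooks in order, stopping at the first failure.
# Assumptions (not from the repo): the helper name run_pipeline, the DRY_RUN
# switch, and the NN_name.yml naming of every stage wrapper.

# usage: run_pipeline <playbooks-dir> <inventory-dir>
run_pipeline() {
    playbook_dir=$1
    inventory=$2
    # The shell expands the glob in sorted order, so 01_* runs before 02_*.
    # Subdirectories such as recover/ never match the pattern.
    for playbook in "$playbook_dir"/[0-9][0-9]_*.yml; do
        # Skip the literal pattern left behind when nothing matches.
        [ -f "$playbook" ] || continue
        if [ "${DRY_RUN:-0}" = "1" ]; then
            # Dry run: print the command instead of executing it.
            printf 'uv run ansible-playbook -i %s %s\n' "$inventory" "$playbook"
        else
            # A failed stage aborts the run, since later stages build on earlier ones.
            uv run ansible-playbook -i "$inventory" "$playbook" || return 1
        fi
    done
}
```

From the repo root, `DRY_RUN=1 run_pipeline ansible/arcodange/factory/playbooks ansible/arcodange/factory/inventory` previews the commands. Without `DRY_RUN` the stages run for real, so `ANSIBLE_VAULT_PASSWORD_FILE` should already be exported as described earlier.
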

---

## Index

| # | Page | Covers | State |
| --- | --- | --- | --- |
| 01 | [System](01-system.md) | RPi hardening, Docker, K3s, Longhorn/iSCSI, CoreDNS, step-ca SSL | ✅ |
| 02 | [Setup](02-setup.md) | Gitea, PostgreSQL, NFS backup target | ✅ |
| 03 | [CI/CD](03-cicd.md) | Gitea `act_runner` registration & Compose deploy | ✅ |
| 04 | [Tools](04-tools.md) | CrowdSec, HashiCorp Vault | ✅ |
| 05 | [Backup](05-backup.md) | Cron report jobs, Gitea/Postgres/PVC dumps | ✅ |
| 06 | [Recover](06-recover.md) | Longhorn + data restore (on-demand DR branch) | 🟡 |
| - | [Inventory & variables](inventory.md) | `hosts.yml` groups, `group_vars/` layering, host→service mapping | ✅ |
| - | [Roles reference](roles.md) | The seven `arcodange.factory.*` roles | ✅ |

---

## Maintenance rule

> [!IMPORTANT]
> **Alter a playbook, role, inventory entry, or `group_vars` → update the matching page here in the same change.** Adding a stage, renaming a role, bumping the K3s version or a `requirements.yml` dependency, or moving a host between groups all change what the pages above describe; edit the page in the PR that changes the code, never as a follow-up. This is the [factory-provisioning maintenance rule](../README.md#maintenance-rule) applied to the Ansible half; the guidebooks' full [Rules to contribute](../../README.md#rules-to-contribute) also apply.