[vibe](../../README.md) > [Guidebooks](../README.md) > [Tools](README.md) > **Secrets & VSO**

# Tools: Secrets & VSO

> **Status:** ✅ Active
> **Last Updated:** 2026-06-23
> **Upstream:** [Tools](README.md) · [Components](components.md)
> **Downstream:** consumed by every `tools`-namespace pod and by every app's CI/CD
> **Related:** [secrets-and-vault concept](../lab-ecosystem/secrets-and-vault.md) · [naming-conventions concept](../lab-ecosystem/naming-conventions.md) · [storage-and-recovery](../lab-ecosystem/storage-and-recovery.md) · [tofu CI apply flow](../factory-provisioning/opentofu/ci-apply-flow.md) · [postgres IaC](../factory-provisioning/opentofu/postgres-iac.md) · [safe-env ADR](../../ADR/0001-safe-prod-like-environment.md)

This page maps how secrets live in **HashiCorp Vault** (engines, auth backends) and how they reach **Kubernetes pods** via the **Vault Secrets Operator (VSO)**. The keystone is the **`app_policy` + `app_roles` module pair**: the machinery that turns a single `<app>` name into a matched set of Vault policies, roles, and CI identities. It is the same `<app>` join key documented in the [naming-conventions concept](../lab-ecosystem/naming-conventions.md).

Vault itself runs as a component in the `tools` namespace; see the [Components](components.md) page for its deploy shape. The admin/bootstrap layer (the `kvv1` engine, the `gitea_jwt` auth backend, the base `gitea_cicd` role, the Kubernetes auth backend mount) is created **by factory's Ansible-managed Vault Terraform** in [`hashicorp_vault.tf`](https://gitea.arcodange.lab/arcodange-org/factory/src/branch/main/ansible/arcodange/factory/playbooks/tools/roles/hashicorp_vault/files/hashicorp_vault.tf); everything in this page that is *per-app* is created by the IaC under [`hashicorp-vault/iac`](https://gitea.arcodange.lab/arcodange-org/tools/src/branch/main/hashicorp-vault/iac).

> [!CAUTION]
> Vault runs **standalone** with file/raft storage and starts **sealed** after any restart or node reboot. Until it is unsealed, every VSO read fails and no app can fetch DB creds or config; pods that depend on a `VaultDynamicSecret` will not start. Unseal procedure and key custody live in [storage-and-recovery](../lab-ecosystem/storage-and-recovery.md).

---

## 1) Vault engines & auth backends

All engines below are mounted by [`hashicorp-vault/iac/main.tf`](https://gitea.arcodange.lab/arcodange-org/tools/src/branch/main/hashicorp-vault/iac/main.tf) except `kvv1`, which is bootstrapped by factory's Ansible Vault Terraform.

| Mount | Type | Holds | Defined in |
|---|---|---|---|
| `kvv1` | KV **v1** | Admin / cloud secrets: `kvv1/google/credentials`, `kvv1/gitea/*`, `kvv1/cloudflare/*`, `kvv1/ovh/*`, `kvv1/postgres/credentials`, `kvv1/admin/*` | factory [`hashicorp_vault.tf`](https://gitea.arcodange.lab/arcodange-org/factory/src/branch/main/ansible/arcodange/factory/playbooks/tools/roles/hashicorp_vault/files/hashicorp_vault.tf) |
| `kvv2` | KV **v2** (versioned) | Per-app config secrets under `kvv2/<app>/*` | [`main.tf`](https://gitea.arcodange.lab/arcodange-org/tools/src/branch/main/hashicorp-vault/iac/main.tf) |
| `transit` | transit | The **VSO client-cache encryption key** `vso-client-cache`, which lets VSO persist its client cache encrypted so it survives an operator restart without re-auth storms | [`main.tf`](https://gitea.arcodange.lab/arcodange-org/tools/src/branch/main/hashicorp-vault/iac/main.tf) |
| `postgres` | database | **Dynamic** Postgres creds at `postgres/creds/<app>`; connects to the DB through `pgbouncer.tools:5432` using the `credentials_editor` root account | [`main.tf`](https://gitea.arcodange.lab/arcodange-org/tools/src/branch/main/hashicorp-vault/iac/main.tf) |

The `postgres` connection is configured with `allowed_roles = ["*"]` and a root-rotation statement (`ALTER USER … WITH PASSWORD`); the editor username/password come from the sensitive `POSTGRES_CREDENTIALS_EDITOR_*` variables.

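
One practical difference between the two KV mounts in the table above is path addressing: for KV v2, Vault's HTTP API and ACL policies insert a segment such as `data/` or `metadata/` after the mount, whereas KV v1 paths are used as-is. That is why policies on this page reference `kvv2/data/<app>/*` while the secrets themselves are written as `kvv2/<app>/config`. The following is a minimal Python sketch of that mapping; the helper names `kv2_api_path` and `kv1_api_path` are hypothetical and do not exist in any repo described here.

```python
# Sketch only: models how logical secret paths map to API/policy paths.
# Function names are illustrative assumptions, not part of the lab's code.

KV2_SEGMENTS = {"data", "metadata", "delete", "undelete", "destroy"}


def kv2_api_path(mount: str, secret_path: str, kind: str = "data") -> str:
    """Return the API/policy path for a secret on a KV v2 mount.

    KV v2 places a segment (data, metadata, ...) between the mount
    and the secret path.
    """
    if kind not in KV2_SEGMENTS:
        raise ValueError(f"unknown KV v2 path kind: {kind!r}")
    return f"{mount.strip('/')}/{kind}/{secret_path.strip('/')}"


def kv1_api_path(mount: str, secret_path: str) -> str:
    """Return the API/policy path for a secret on a KV v1 mount (no extra segment)."""
    return f"{mount.strip('/')}/{secret_path.strip('/')}"


print(kv2_api_path("kvv2", "plausible/config"))              # kvv2/data/plausible/config
print(kv2_api_path("kvv2", "plausible/config", "metadata"))  # kvv2/metadata/plausible/config
print(kv1_api_path("kvv1", "google/credentials"))            # kvv1/google/credentials
```

When a policy rule appears not to match, comparing the logical path with the `data/`-prefixed form is usually the first thing worth checking.
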
### Auth backends

| Backend | Mount | Who uses it | Role(s) |
|---|---|---|---|
| `kubernetes` | `kubernetes` | VSO controller + every app pod's ServiceAccount | `vault-secret-operator` (VSO itself), `<app>` (one per app), `factory_crowdsec_conf` |
| `gitea_jwt` | `gitea_jwt` | CI/OpenTofu jobs running in Gitea Actions | `gitea_cicd` (base, factory-bootstrapped) + per-app `gitea_cicd_<app>` |

- **`kubernetes`** auth ([`main.tf`](https://gitea.arcodange.lab/arcodange-org/tools/src/branch/main/hashicorp-vault/iac/main.tf)) is configured against `https://kubernetes.default.svc:443`. The VSO role `vault-secret-operator` binds SA `hashicorp-vault-vault-secrets-operator-controller-manager` in ns `tools`, `audience = vault`, and carries the `edit-vso-client-cache` policy (encrypt/decrypt on `transit/.../vso-client-cache`).
- **`gitea_jwt`** is the OIDC/JWT backend for CI. Its backend, `default_role = gitea_cicd`, and the base `gitea_cicd` role are created by factory's Vault bootstrap; the Vault provider in each IaC project logs in via `auth_login_jwt { mount = "gitea_jwt", role = "gitea_cicd[_<app>]" }` using the `TERRAFORM_VAULT_AUTH_JWT` env var. See the [tofu CI apply flow](../factory-provisioning/opentofu/ci-apply-flow.md) for how the token is minted in the pipeline.

### Terraform state

Each IaC project keeps its state in the **`arcodange-tf` GCS bucket** under a distinct prefix:

| Project | GCS prefix |
|---|---|
| Vault admin/app machinery | `tools/hashicorp_vault/main` |
| Plausible | `tools/plausible/main` |
| CrowdSec | `tools/crowdsec/main` |

---

## 2) The `app_policy` + `app_roles` modules: the `<app>` join-key machinery

> [!IMPORTANT]
> These two modules are the heart of the secrets layer. Given a single `<app>` name they emit a **matched, name-derived** set of Vault objects so that an app's runtime, its CI, and its database identity all line up on the same key. This is the Vault half of the lab-wide [naming convention](../lab-ecosystem/naming-conventions.md): the same `<app>` string also names the Kubernetes namespace, the ServiceAccount, the Postgres `<app>_role`, and the Gitea repo.

The two modules live on **opposite sides of the trust boundary**:

- [`modules/app_policy`](https://gitea.arcodange.lab/arcodange-org/tools/src/branch/main/hashicorp-vault/iac/modules/app_policy) is declared **once, centrally**, in the Vault admin project ([`main.tf`](https://gitea.arcodange.lab/arcodange-org/tools/src/branch/main/hashicorp-vault/iac/main.tf), `for_each` over `var.applications`). It creates the **policies and the CI identity** (the privileged bits) so the app's own repo never holds them.
- [`modules/app_roles`](https://gitea.arcodange.lab/arcodange-org/tools/src/branch/main/hashicorp-vault/iac/modules/app_roles) is declared **by the subordinate app project** (pulled over SSH as a Git module), running under the `<app>-ops` policy. It creates the **roles** the app needs.

### `app_roles`: runtime roles (declared by the app repo)

For `<app>`, [`app_roles/main.tf`](https://gitea.arcodange.lab/arcodange-org/tools/src/branch/main/hashicorp-vault/iac/modules/app_roles/main.tf) creates:

| Resource | Path | Key settings |
|---|---|---|
| Kubernetes auth role | `auth/kubernetes/role/<app>` | `bound_service_account_names = [<app>] + extras`, `bound_service_account_namespaces = [<app>] + extras`, `token_ttl = 3600` (1h), `token_policies = [default, <app>]`, `audience = vault` |
| Postgres dynamic role | `postgres/roles/<app>` | `db_name = postgres`; creation SQL: `CREATE ROLE "{{name}}" WITH LOGIN PASSWORD … VALID UNTIL …` then `GRANT <app>_role TO "{{name}}"`; revocation: `REASSIGN OWNED BY "{{name}}" TO <app>_role` then `REVOKE ALL ON DATABASE … FROM "{{name}}"` |

> [!IMPORTANT]
> The Postgres dynamic role's creation SQL does `GRANT <app>_role TO {{name}}` and its revocation does `REASSIGN OWNED BY {{name}} TO <app>_role`.
> **The non-login `<app>_role` must already exist in Postgres**: it is created by factory's [postgres IaC](../factory-provisioning/opentofu/postgres-iac.md) (`postgresql_role.app_role["<app>"]`, owner of the `<app>` database). If that role is missing, every ephemeral-user creation/revocation fails. This is the ordering dependency between the two repos: **factory postgres/iac before tools app_roles**.

> [!NOTE]
> The Kubernetes auth role binds **both** SA names **and** namespaces; the check is an **AND**. A token presenting SA `<app>` from the wrong namespace (or any other SA from ns `<app>`) is rejected. The default binding is SA `<app>` in ns `<app>`; the `service_account_names` / `service_account_namespaces` inputs widen it (e.g. CrowdSec/Plausible run in ns `tools`, not a namespace named after the app).

The Postgres role can be skipped with `disable_database = true`; the DB name defaults to `<app>` but can be overridden via `database`.

### `app_policy`: policies + CI identity (declared centrally)

For `<app>`, [`app_policy/main.tf`](https://gitea.arcodange.lab/arcodange-org/tools/src/branch/main/hashicorp-vault/iac/modules/app_policy/main.tf) creates:

| Resource | Name | Grants |
|---|---|---|
| **App policy** | `<app>` | `read,list` on `kvv2/data/<app>/*`; `read` on `postgres/creds/*`. This is what the runtime pod can do |
| **Ops policy** | `<app>-ops` | The CI bundle (below) |
| **JWT role** | `gitea_cicd_<app>` (mount `gitea_jwt`) | `token_policies = [default] + <app>'s ops_policies`, `bound_audiences = [gitea_app_id]`, `user_claim = email`, `role_type = jwt` |
| **Identity group** | `<app>-ops` | Internal group carrying the `<app>-ops` policy, so Vault users mapped to their Gitea entity inherit ops rights |

The **`<app>-ops` policy** is the privilege set a CI job needs to *manage* the app's own corner of Vault and the clouds:

- `create/update` on `auth/token/create`; `read` on `sys/mounts/auth/*` (so the Vault provider works);
- full CRUD on `postgres/roles/*` and on `auth/kubernetes/role/*` (so `app_roles` can apply). The k8s-role rule is **parameter-constrained**: it may only set `bound_service_account_names`/`bound_service_account_namespaces` to the whitelisted `[<app>] + extras` lists and `token_policies` to `["default","<app>"]`, preventing a CI job from minting a role with broader bindings;
- full CRUD on the app's KV-v2 data, delete/undelete/destroy, and `metadata` (`kvv2/data|delete|undelete|destroy|metadata/<app>/*`);
- `read` on `kvv1/google/credentials` (the GCS backend SA) and `kvv1/gitea/tofu_module_reader` (the bot SSH key that lets CI pull the `app_roles` Git module);
- CRUD on `kvv1/cloudflare/*` and `kvv1/ovh/*` (cloud DNS/edge secrets scoped to the app).

> [!NOTE]
> The policy document is post-processed with two `replace()` calls. The Vault provider serializes the whitelisted list parameters as a JSON-encoded string (`"["webapp"]"`); the replaces strip the outer quotes so Vault receives a real list. If you change those `allowed_parameter` blocks, keep the replaces in sync.

### Apps wired in `terraform.tfvars`

[`terraform.tfvars`](https://gitea.arcodange.lab/arcodange-org/tools/src/branch/main/hashicorp-vault/iac/terraform.tfvars) declares the `applications` set the central `app_policy` `for_each` walks:

| `<app>` | Extra SA | Extra ns | Extra ops policy | Notes |
|---|---|---|---|---|
| `webapp` | - | - | - | defaults: SA `webapp` / ns `webapp` |
| `erp` | - | - | - | defaults |
| `cms` | `cloudflared` | - | `factory__cf_r2_arcodange_tf` | extra SA for the Cloudflare tunnel; extra ops policy for the CF R2 Terraform-state bucket |
| `crowdsec` | - | `tools` | - | runs in ns `tools` |
| `plausible` | - | `tools` | - | runs in ns `tools` |

> [!NOTE]
> `terraform.tfvars` uses the key `ops_policies` for the CMS extra policy while `variables.tf` declares the optional attribute as `policies`; the central `main.tf` passes `each.value.policies` into the module's `ops_policies` input. Read these together when adding a new app so the extra-policy list actually lands on the JWT role.

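
To make the join-key idea and the AND binding check in this section concrete, here is a minimal Python model. It is a sketch under stated assumptions, not the modules' actual implementation: the class `AppBinding`, its field names, and the dictionary keys are hypothetical; only the derived names and the `[<app>] + extras` rule come from the tables above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AppBinding:
    """Hypothetical model of one entry in the `applications` set."""

    app: str
    extra_service_accounts: tuple[str, ...] = ()
    extra_namespaces: tuple[str, ...] = ()

    @property
    def bound_service_account_names(self) -> list[str]:
        # [<app>] + extras, as in the Kubernetes auth role table.
        return [self.app, *self.extra_service_accounts]

    @property
    def bound_service_account_namespaces(self) -> list[str]:
        return [self.app, *self.extra_namespaces]

    def derived_names(self) -> dict[str, str]:
        """Every Vault/Postgres name that follows from the single <app> key."""
        return {
            "app_policy": self.app,
            "ops_policy": f"{self.app}-ops",
            "jwt_role": f"gitea_cicd_{self.app}",
            "kubernetes_role": f"auth/kubernetes/role/{self.app}",
            "postgres_dynamic_role": f"postgres/roles/{self.app}",
            "postgres_owner_role": f"{self.app}_role",
        }

    def accepts(self, service_account: str, namespace: str) -> bool:
        """Both the SA name AND the namespace must be in the bound lists."""
        return (
            service_account in self.bound_service_account_names
            and namespace in self.bound_service_account_namespaces
        )


crowdsec = AppBinding("crowdsec", extra_namespaces=("tools",))
cms = AppBinding("cms", extra_service_accounts=("cloudflared",))

print(crowdsec.accepts("crowdsec", "tools"))    # True: ns widened by the extra
print(crowdsec.accepts("crowdsec", "default"))  # False: right SA, wrong ns
print(cms.accepts("cloudflared", "cms"))        # True: extra SA in the app ns
print(cms.derived_names()["jwt_role"])          # gitea_cicd_cms
```

Note that widening a list adds combinations rather than pairs: in this model `crowdsec` is accepted from both ns `crowdsec` and ns `tools`, which is worth keeping in mind when adding extras.
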

---

## 3) VSO CRDs: how a secret becomes a Kubernetes Secret

The [Vault Secrets Operator](https://developer.hashicorp.com/vault/docs/platform/k8s/vso) watches three custom resources and writes plain Kubernetes `Secret` objects that pods consume normally (env / volume). The app repo ships the CRDs; the operator does the Vault round-trips.

| CRD | What it does | Refresh / rotation |
|---|---|---|
| `VaultAuth` | Picks the auth method (`kubernetes`), the `mount`, the Vault `role` (= `<app>`), and the pod **ServiceAccount** (= `<app>`) used to log in; references a `VaultConnection` (here the in-cluster `default` → `http://hashicorp-vault.tools.svc.cluster.local:8200`) | n/a (used by the other two CRDs via `vaultAuthRef`) |
| `VaultStaticSecret` | Reads a **KV-v2** path → writes a k8s `Secret` | `refreshAfter` (the lab uses `30s`) |
| `VaultDynamicSecret` | Reads `postgres/creds/<app>` (a **dynamic** lease) → writes a k8s `Secret`; `rolloutRestartTargets` lists Deployments to restart when creds rotate | follows the Vault lease TTL (1h); VSO renews/re-issues and restarts the targets |

### Worked example: Plausible (`tools` namespace)

Files under [`plausible/resources`](https://gitea.arcodange.lab/arcodange-org/tools/src/branch/main/plausible/resources):

1. **`VaultAuth` `plausible`** ([`vaultauth.yaml`](https://gitea.arcodange.lab/arcodange-org/tools/src/branch/main/plausible/resources/vaultauth.yaml)): `method: kubernetes`, `role: plausible`, `serviceAccount: plausible`, `audiences: [vault]`. This is the Vault role `app_roles` created in [`plausible/iac/main.tf`](https://gitea.arcodange.lab/arcodange-org/tools/src/branch/main/plausible/iac/main.tf).
2. **`VaultStaticSecret` `plausible`** ([`vaultsecret.yaml`](https://gitea.arcodange.lab/arcodange-org/tools/src/branch/main/plausible/resources/vaultsecret.yaml)): `kvv2` path `plausible/config` → Secret `plausible-config` (`refreshAfter: 30s`). The config payload holds **`SECRET_KEY_BASE`** and **`TOTP_VAULT_KEY`**, both **generated by Terraform** (`random_password`, base64-encoded) and written to `kvv2/plausible/config` via `vault_kv_secret_v2` in the plausible IaC.
3. **`VaultStaticSecret` `plausible-geoip`** ([`geoipsecret.yaml`](https://gitea.arcodange.lab/arcodange-org/tools/src/branch/main/plausible/resources/geoipsecret.yaml)): `kvv2` path `plausible/geoip` → Secret `plausible-geoip` exposing **`LICENSE_KEY`** (the MaxMind GeoIP license, an admin-seeded value, fed to the `geoipupdate` sidecar via env `GEOIPUPDATE_LICENSE_KEY`).
4. **`VaultDynamicSecret` `plausible-db-credentials`** ([`vaultdynamicsecret.yaml`](https://gitea.arcodange.lab/arcodange-org/tools/src/branch/main/plausible/resources/vaultdynamicsecret.yaml)): `postgres/creds/plausible` → Secret `plausible-db-credentials`; `rolloutRestartTargets` restarts Deployment `plausible`. An **init container** ([`add-initcontainer.yaml`](https://gitea.arcodange.lab/arcodange-org/tools/src/branch/main/plausible/add-initcontainer.yaml)) reads `username`/`password` from that Secret and writes `DATABASE_URL` (`postgres://${DB_USER}:${DB_PASS}@${DB_HOST}:${DB_PORT}/${DB_NAME}`) into a shared `generated-secrets` volume the app reads.

### Worked example: CrowdSec (`tools` namespace)

Templates under [`crowdsec/templates`](https://gitea.arcodange.lab/arcodange-org/tools/src/branch/main/crowdsec/templates):

1. **`VaultAuth` `crowdsec`** ([`vaultauth.yaml`](https://gitea.arcodange.lab/arcodange-org/tools/src/branch/main/crowdsec/templates/vaultauth.yaml)): `role: crowdsec`, `serviceAccount: crowdsec`.
2. **`VaultDynamicSecret` `crowdsec-db-credentials`** ([`vaultdynamicsecret.yaml`](https://gitea.arcodange.lab/arcodange-org/tools/src/branch/main/crowdsec/templates/vaultdynamicsecret.yaml)): `postgres/creds/crowdsec` → Secret `crowdsec-db-credentials`; `rolloutRestartTargets` restarts Deployment **`crowdsec-lapi`** (the Local API that owns the DB connection).

### `factory_auth.tf`: the Ansible CrowdSec/Traefik plugin reader

Separately from the per-app machinery, [`factory_auth.tf`](https://gitea.arcodange.lab/arcodange-org/tools/src/branch/main/hashicorp-vault/iac/factory_auth.tf) wires a Kubernetes auth role **`factory_crowdsec_conf`** for SA **`factory-ansible-tool-crowdsec-traefik-plugin`** in ns **`kube-system`** (`token_ttl = 3600`). It carries policy `factory_crowdsec_conf`, which grants `read,list` on **`kvv2/data/cms/factory/*`**. This is how the Ansible-deployed CrowdSec/Traefik bouncer plugin reads the **Turnstile** configuration that the [`cms` repo](https://gitea.arcodange.lab/arcodange-org/cms) writes into `kvv2/cms/factory/*`: a cross-repo handoff entirely through Vault, with no shared file. The producer side (the Turnstile widget and the `vault_kv_secret_v2` write) is documented on the [CMS Cloudflare page](../cms/cloudflare.md).

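
Returning to the Plausible init container from the worked example earlier in this section: it assembles `DATABASE_URL` from the `username`/`password` keys of the dynamic Secret. The sketch below shows the same assembly in Python. It is an illustration only; the function name `build_database_url` is hypothetical, the host and port values in the usage line are assumptions, and the percent-encoding step is an addition of this sketch (the actual init container may not do it).

```python
from urllib.parse import quote


def build_database_url(
    username: str, password: str, host: str, port: int, dbname: str
) -> str:
    """Assemble postgres://USER:PASS@HOST:PORT/DB from dynamic credentials.

    Credentials are percent-encoded so that characters such as '@', ':'
    or '/' in a generated password cannot break the URL structure.
    """
    user = quote(username, safe="")
    secret = quote(password, safe="")
    return f"postgres://{user}:{secret}@{host}:{port}/{dbname}"


# Assumed example values; real leases use Vault-generated names and passwords.
url = build_database_url("v-kube-abc", "p@ss/w:rd", "pgbouncer.tools", 5432, "plausible")
print(url)  # postgres://v-kube-abc:p%40ss%2Fw%3Ard@pgbouncer.tools:5432/plausible
```

Because the credentials change on every lease, the URL has to be rebuilt after each rotation, which is what the rollout restart of the target Deployment achieves.
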

---

## 4) Secret-paths inventory

| Path | Engine | Holds | Producer | Consumer |
|---|---|---|---|---|
| `kvv2/<app>/config` | KV v2 | App runtime config | app CI (KV CRUD via `<app>-ops`) | `VaultStaticSecret` → pod |
| `kvv2/plausible/config` | KV v2 | `SECRET_KEY_BASE`, `TOTP_VAULT_KEY` | Plausible IaC (`random_password` → `vault_kv_secret_v2`) | `VaultStaticSecret plausible` → `plausible-config` |
| `kvv2/plausible/geoip` | KV v2 | `LICENSE_KEY` (MaxMind) | admin-seeded | `VaultStaticSecret plausible-geoip` → `geoipupdate` sidecar |
| `kvv2/cms/factory/turnstile` | KV v2 | Cloudflare Turnstile config | `cms` repo IaC | `factory_crowdsec_conf` k8s role → Ansible CrowdSec/Traefik plugin |
| `postgres/creds/<app>` | database | Ephemeral DB user (`username`/`password`, 1h lease) | Vault on demand (role `<app>`, `GRANT <app>_role`) | `VaultDynamicSecret` → pod (e.g. `plausible-db-credentials`, `crowdsec-db-credentials`) |
| `transit/.../vso-client-cache` | transit | VSO client-cache encryption key | Vault admin IaC | VSO controller (encrypt/decrypt its cache) |
| `kvv1/cloudflare/*` | KV v1 | Cloudflare DNS/edge secrets | admin | app CI (`<app>-ops` CRUD) |
| `kvv1/ovh/*` | KV v1 | OVH secrets | admin | app CI (`<app>-ops` CRUD) |
| `kvv1/gitea/tofu_module_reader` | KV v1 | Bot SSH key to pull the `app_roles` Git module | admin | app CI (`<app>-ops` read) |
| `kvv1/google/credentials` | KV v1 | GCS Terraform-backend SA key | admin | every IaC CI job (read) |

---

## 5) Secrets flow

```mermaid
%%{init: {'theme': 'base'}}%%
flowchart TB
    classDef eng fill:#7c3aed,stroke:#5b21b6,color:#ffffff
    classDef auth fill:#b45309,stroke:#92400e,color:#ffffff
    classDef crd fill:#059669,stroke:#047857,color:#ffffff
    classDef k8s fill:#2563eb,stroke:#1e40af,color:#ffffff
    classDef ci fill:#be123c,stroke:#9f1239,color:#ffffff

    subgraph VAULT["Vault (tools ns)"]
        KV2["kvv2 engine<br/>kvv2/&lt;app&gt;/*"]:::eng
        PG["postgres engine<br/>postgres/creds/&lt;app&gt;"]:::eng
        TR["transit<br/>vso-client-cache"]:::eng
        KKUB["kubernetes auth<br/>role &lt;app&gt; (SA AND ns)"]:::auth
        KJWT["gitea_jwt auth<br/>gitea_cicd_&lt;app&gt;"]:::auth
    end

    subgraph RUNTIME["Runtime path"]
        VA["VaultAuth<br/>role &lt;app&gt;, SA &lt;app&gt;"]:::crd
        VSS["VaultStaticSecret<br/>kvv2/&lt;app&gt;/config"]:::crd
        VDS["VaultDynamicSecret<br/>postgres/creds/&lt;app&gt;"]:::crd
        SEC["k8s Secret<br/>&lt;app&gt;-config / -db-credentials"]:::k8s
        POD["App pod<br/>(SA &lt;app&gt;)"]:::k8s
    end

    subgraph CICD["CI path"]
        GHA["Gitea Actions<br/>OpenTofu job"]:::ci
        TOFU["apply app_roles<br/>(under &lt;app&gt;-ops)"]:::ci
    end

    KKUB --> VA
    VA --> VSS
    VA --> VDS
    KV2 --> VSS
    PG --> VDS
    VSS --> SEC
    VDS -- "rolloutRestart on rotation" --> SEC
    SEC --> POD
    TR -. "encrypts client cache" .-> VA
    GHA -- "JWT login" --> KJWT
    KJWT --> TOFU
    TOFU -- "creates" --> KKUB
    TOFU -- "creates" --> PG
```

1. **Vault** mounts the engines (`kvv2`, `postgres`, `transit`) and the two auth backends (`kubernetes`, `gitea_jwt`), all in the `tools` namespace.
2. A pod's `VaultAuth` logs in through the **`kubernetes`** backend with SA `<app>` against role `<app>`; the role accepts only when **both** the SA name **and** its namespace match (AND).
3. `VaultStaticSecret` reads `kvv2/<app>/config` and `VaultDynamicSecret` reads `postgres/creds/<app>` using that auth; VSO writes the values into ordinary k8s `Secret` objects.
4. The pod consumes the Secret (env or volume); on a dynamic-cred **rotation** VSO restarts the `rolloutRestartTargets` Deployment so it picks up the new credentials.
5. The **`transit`** key `vso-client-cache` encrypts VSO's client cache so an operator restart doesn't trigger a re-auth storm.
6. On the CI side, a **Gitea Actions** OpenTofu job logs into the **`gitea_jwt`** backend as `gitea_cicd_<app>` (audience = the Gitea OAuth app id, identity from the `email` claim).
7. Running under the `<app>-ops` policy, that job **applies the `app_roles` module**, creating/updating the Kubernetes auth role and the Postgres dynamic role for `<app>`. This closes the loop so the runtime path in steps 2-4 works.

---

## Gotchas

- **Vault must be unsealed after every restart.** Sealed Vault → all VSO reads fail → dynamic-secret consumers won't start. See [storage-and-recovery](../lab-ecosystem/storage-and-recovery.md).
- **The Kubernetes auth role binds SA *and* namespace (AND).** The wrong namespace, or a different SA in the right namespace, is rejected. Apps in ns `tools` (CrowdSec, Plausible) widen the binding via `service_account_namespaces`.
- **The Postgres dynamic role depends on `<app>_role` existing.** `GRANT <app>_role TO {{name}}` (create) and `REASSIGN OWNED BY {{name}} TO <app>_role` (revoke) both fail if factory's [postgres IaC](../factory-provisioning/opentofu/postgres-iac.md) hasn't created the `<app>_role` non-login role first. Order: **factory postgres/iac → tools app_roles**.
- **The `ops_policies` vs `policies` key mismatch** in `terraform.tfvars` / `variables.tf` (see §2): read both when adding an app's extra ops policy.
- **The sandbox uses a separate Vault.** Per the [safe-env ADR](../../ADR/0001-safe-prod-like-environment.md), the prod-like sandbox stands up its own Vault instance; none of the paths or roles above are shared with it. Don't assume a secret seeded in prod exists in the sandbox.
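
For the first gotcha, a quick way to tell "sealed Vault" apart from other VSO failures is Vault's unauthenticated `sys/seal-status` endpoint, which returns a JSON body containing a `sealed` boolean. The following is a minimal Python sketch, assuming the in-cluster address used by the `default` `VaultConnection`; the function names are hypothetical and this is not the lab's unseal tooling (that lives in storage-and-recovery).

```python
import json
from urllib.request import urlopen

# In-cluster address taken from the VaultConnection described on this page.
VAULT_ADDR = "http://hashicorp-vault.tools.svc.cluster.local:8200"


def parse_seal_status(body: str) -> bool:
    """Return True when a seal-status JSON payload reports a sealed Vault."""
    return bool(json.loads(body)["sealed"])


def vault_is_sealed(addr: str = VAULT_ADDR, timeout: float = 5.0) -> bool:
    """Query GET /v1/sys/seal-status (no token required) and parse the result."""
    with urlopen(f"{addr}/v1/sys/seal-status", timeout=timeout) as response:
        return parse_seal_status(response.read().decode("utf-8"))


# Offline demonstration with a hand-written sample payload (not real output).
sample = '{"sealed": true, "t": 3, "n": 5, "progress": 0}'
print(parse_seal_status(sample))  # True
```

Calling `vault_is_sealed()` from a pod inside the cluster would perform the real request; it is left out of the demonstration because it needs network access to Vault.
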