vibe > Guidebooks > Lab ecosystem > Secrets & Vault
Secrets & Vault
Status: 🟢 Active
Last Updated: 2026-06-23
Related: Lab ecosystem · Tools brick · Storage & recovery · Naming conventions
Decision: ADR 0001: Safe, production-like environment
TL;DR
HashiCorp Vault is the single source of truth for every secret in the lab. There is no sops, no age, and no secret file in git: if a credential exists, Vault either stores it or mints it on demand. Two kinds of consumer read secrets, and each authenticates differently: pods use the Kubernetes auth backend (via the Vault Secrets Operator), and CI / OpenTofu use Gitea OIDC JWT (one role gitea_cicd_<app> per app). Vault holds static config in KV, keeps encryption keys in the transit engine, and issues short-lived, dynamic PostgreSQL credentials so no long-lived DB password is ever written down. The trade-off: Vault comes up sealed on every restart and must be manually unsealed (1 key, threshold 1) before anything that needs a secret can come back.
Why Vault, and only Vault
The lab made a deliberate choice: one secret store, accessed over the network, rather than encrypted secret files scattered through the repos. That choice shapes the rest of the design:
- No secret material in git. Charts and OpenTofu reference Vault paths, never values. A leaked repo leaks no credentials.
- One revocation point. Rotating or revoking a credential happens in Vault; consumers pick up the change on their next read or lease renewal.
- Dynamic over static. Where a backend supports it (Postgres), Vault issues a fresh, time-boxed credential per consumer instead of a shared static password.
Vault itself runs as the hashicorp-vault chart in the tools namespace. Its full configuration — engines, auth backends, policies, the per-app role/policy modules — lives in the tools repo; see the Tools brick for the deployment context.
What Vault mounts
| Mount | Type | Purpose |
|---|---|---|
| kvv2/ | KV v2 (versioned) | Application static config, e.g. kvv2/<app>/config. Versioned so a bad write can be rolled back. |
| KV v1 | KV v1 (unversioned) | Flat secrets that don't need history. |
| transit/ | Transit | Encryption-as-a-service: encrypt/decrypt and sign without exposing the key. |
| postgres/ | Database (dynamic) | Issues short-lived PostgreSQL credentials on demand: postgres/creds/<app> hands out a fresh login user, granted <app>_role, with a lease that expires. |
The <app> slug threads through every one of these paths — kvv2/<app>/config, postgres/creds/<app> — exactly as described in Naming conventions.
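As a minimal sketch of that threading, the paths in the table can be derived from a single slug. The class name and the example slug "webapp" are illustrative assumptions, not names from the tools repo.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VaultPaths:
    """Vault paths derived from one <app> slug (illustrative helper)."""

    app: str

    @property
    def kv_config(self) -> str:
        # Static config in the versioned KV mount.
        return f"kvv2/{self.app}/config"

    @property
    def db_creds(self) -> str:
        # Endpoint that hands out a short-lived PostgreSQL login.
        return f"postgres/creds/{self.app}"

    @property
    def db_role(self) -> str:
        # PostgreSQL role granted to each dynamic login user.
        return f"{self.app}_role"


paths = VaultPaths("webapp")  # "webapp" is a made-up slug
print(paths.kv_config, paths.db_creds, paths.db_role)
```

Keeping the derivation in one place means a renamed slug changes every path at once, which is the property the naming convention relies on.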
The two auth backends
Vault doesn't trust callers by static token. Each class of consumer proves its identity through a backend matched to where it runs:
- Kubernetes auth: for pods. The Vault Secrets Operator (VSO) and workloads present their Kubernetes ServiceAccount token; Vault validates it against the cluster's API and maps the SA to the Vault role <app>, which carries the runtime policy <app>.
- Gitea OIDC / JWT auth: for CI and OpenTofu. A Gitea Actions workflow obtains an OIDC token; Vault validates it and maps it to the JWT role gitea_cicd_<app>, which carries the CI/ops policy <app>-ops. This is how tofu apply in CI reads and writes the secrets it manages without any pre-shared Vault token.
The split matters: pods get only what they need at runtime (the <app> policy), while CI gets the broader provisioning rights (<app>-ops) needed to create the very secrets the pods will later read.
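To make the two logins concrete, here is a hedged sketch that builds (but does not send) the login requests against Vault's HTTP API. Both backends accept a role and a jwt field on their login endpoint. The mount names kubernetes and jwt are Vault's defaults and are assumed here; the lab's actual mount paths may differ.

```python
import json
import urllib.request


def login_request(vault_addr: str, mount: str, role: str, jwt: str) -> urllib.request.Request:
    """Build a Vault login request for a token-presenting auth mount. Nothing is sent."""
    body = json.dumps({"role": role, "jwt": jwt}).encode()
    return urllib.request.Request(
        url=f"{vault_addr}/v1/auth/{mount}/login",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def pod_login(vault_addr: str, app: str, sa_token: str) -> urllib.request.Request:
    # Runtime path: ServiceAccount token, Vault role <app>.
    # "kubernetes" is the default mount name (assumption).
    return login_request(vault_addr, "kubernetes", app, sa_token)


def ci_login(vault_addr: str, app: str, oidc_token: str) -> urllib.request.Request:
    # CI path: Gitea OIDC token, Vault role gitea_cicd_<app>.
    # "jwt" is the default mount name (assumption).
    return login_request(vault_addr, "jwt", f"gitea_cicd_{app}", oidc_token)
```

The two helpers differ only in mount and role name, which mirrors the split above: the same kind of proof (a signed token), mapped to different policies.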
How VSO delivers secrets to pods
Inside the cluster, the Vault Secrets Operator is the bridge between Vault and Kubernetes. It watches two CRDs:
- VaultAuth: declares how to authenticate to Vault (the Kubernetes auth mount + the <app> role).
- VaultDynamicSecret (and VaultStaticSecret): declares what to fetch (e.g. postgres/creds/<app>) and which Kubernetes Secret to materialise it into. For dynamic secrets, VSO also renews the lease and rotates the Secret before it expires.
The pod then mounts the resulting Kubernetes Secret as it would any other — it never speaks to Vault directly, and never sees a static DB password.
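The shape of those two resources can be sketched as plain dictionaries and emitted as JSON, which kubectl accepts alongside YAML. This is an illustration under assumptions, not the lab's actual charts: the resource names, the destination Secret name <app>-db-creds, and the kubernetes auth mount name are all hypothetical.

```python
import json


def vso_manifests(app: str, namespace: str) -> list:
    """Return a VaultAuth and a VaultDynamicSecret for one app (illustrative)."""
    vault_auth = {
        "apiVersion": "secrets.hashicorp.com/v1beta1",
        "kind": "VaultAuth",
        "metadata": {"name": app, "namespace": namespace},
        "spec": {
            "method": "kubernetes",
            "mount": "kubernetes",  # assumed auth mount name
            "kubernetes": {"role": app, "serviceAccount": app},
        },
    }
    dynamic_secret = {
        "apiVersion": "secrets.hashicorp.com/v1beta1",
        "kind": "VaultDynamicSecret",
        "metadata": {"name": f"{app}-db", "namespace": namespace},
        "spec": {
            "vaultAuthRef": app,       # points at the VaultAuth above
            "mount": "postgres",       # database secrets engine mount
            "path": f"creds/{app}",    # i.e. postgres/creds/<app>
            "destination": {"name": f"{app}-db-creds", "create": True},
        },
    }
    return [vault_auth, dynamic_secret]


for manifest in vso_manifests("webapp", "webapp"):
    print(json.dumps(manifest, indent=2))
```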
The secret flow, end to end
```mermaid
%%{init: {'theme':'base'}}%%
flowchart LR
    subgraph CI["CI / Provisioning path"]
        GHA["Gitea Actions<br/>workflow"]:::src
        TOFU["OpenTofu<br/>tofu apply"]:::proc
    end
    subgraph RT["Runtime path (in-cluster)"]
        VSO["Vault Secrets<br/>Operator (VSO)"]:::proc
        POD["App pod<br/>(ServiceAccount <app>)"]:::proc
    end
    VAULT["Vault<br/>KV v1/v2 · transit · postgres dynamic"]:::store
    GHA -->|"OIDC JWT<br/>role gitea_cicd_<app>"| VAULT
    VAULT -->|"policy <app>-ops<br/>read/write secrets"| TOFU
    TOFU -->|"writes config to<br/>kvv2/<app>/config"| VAULT
    VSO -->|"k8s auth<br/>role <app> (SA token)"| VAULT
    VAULT -->|"dynamic creds<br/>postgres/creds/<app>"| VSO
    VSO -->|"materialises +<br/>renews K8s Secret"| POD
    classDef src fill:#2563eb,stroke:#1e40af,color:#fff
    classDef proc fill:#059669,stroke:#047857,color:#fff
    classDef store fill:#7c3aed,stroke:#6d28d9,color:#fff
```
1. CI path: a Gitea Actions workflow requests an OIDC JWT and presents it to Vault under the role gitea_cicd_<app>. Vault validates the token and grants the <app>-ops policy.
2. With that policy, OpenTofu (tofu apply, running in CI) reads the secrets it needs and writes the app's static config back to kvv2/<app>/config. No pre-shared Vault token is ever stored; the trust is established per-run via OIDC.
3. Runtime path: in the cluster, the Vault Secrets Operator authenticates with the Kubernetes auth backend, presenting the app's ServiceAccount token mapped to the Vault role <app>.
4. Vault issues a short-lived, dynamic PostgreSQL credential from postgres/creds/<app> back to VSO.
5. VSO materialises that credential into a Kubernetes Secret in the app's namespace, then renews the lease and rotates the Secret before it expires.
6. The app pod mounts the Kubernetes Secret like any other. It never talks to Vault, and never holds a long-lived database password.
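From the app's side, the last step of the flow might look like the sketch below: read the mounted Secret as files and build a connection string. It assumes the Secret exposes username and password keys (the field names Vault's database engine returns), and the mount directory, host, and database name are hypothetical.

```python
from pathlib import Path
from urllib.parse import quote


def dsn_from_mounted_secret(secret_dir: str, host: str, database: str) -> str:
    """Build a PostgreSQL DSN from a Kubernetes Secret mounted as files."""
    base = Path(secret_dir)
    # Each Secret key appears as one file; "username" and "password"
    # are assumed key names.
    username = (base / "username").read_text().strip()
    password = (base / "password").read_text().strip()
    # Percent-encode both parts so generated passwords with reserved
    # characters do not break the URL.
    user_part = quote(username, safe="")
    pass_part = quote(password, safe="")
    return f"postgresql://{user_part}:{pass_part}@{host}/{database}"
```

Because VSO rotates the Secret before the lease expires, an app following this pattern would need to re-read the files when it opens new connections rather than caching the DSN for the life of the process.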
The unseal model
Vault encrypts its storage with a master key that is never persisted in usable form. On every start — a fresh deploy, a pod reschedule, or a full cluster recovery — Vault comes up sealed and refuses every request until it is unsealed.
- Shamir config: 1 unseal key, threshold 1 (a single-operator lab, so no key-splitting ceremony).
- Where the key lives: on the control node (the MacBook), at ~/.arcodange/cluster-keys.json. It is not in git, not in Kubernetes, not in Vault.
- Operational consequence: nothing that needs a secret recovers until a human unseals Vault. This is the chokepoint baked into the recovery order: VSO cannot re-auth, dynamic DB creds cannot be issued, and dependent apps cannot start, until the unseal happens. See Storage & recovery for where unseal sits in the tested startup sequence.
Caution
If ~/.arcodange/cluster-keys.json is lost, Vault's data is unrecoverable: there is no second copy of the unseal key and no key-recovery path. Treat that file as the most critical secret in the lab.
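A rough sketch of the unseal step follows, assuming the key file has the layout produced by vault operator init -format=json (an unseal_keys_b64 array). The real layout of cluster-keys.json is not documented here, so treat the field name as an assumption. The request is built but not sent.

```python
import json
import urllib.request
from pathlib import Path


def load_unseal_key(key_file: str) -> str:
    """Read the single unseal key from the key file (field name is assumed)."""
    data = json.loads(Path(key_file).expanduser().read_text())
    keys = data["unseal_keys_b64"]  # assumed: `vault operator init` JSON layout
    if len(keys) != 1:
        # The lab uses 1 key with threshold 1; anything else is unexpected.
        raise ValueError(f"expected exactly 1 unseal key, found {len(keys)}")
    return keys[0]


def unseal_request(vault_addr: str, key: str) -> urllib.request.Request:
    """Build the request for Vault's sys/unseal endpoint. Nothing is sent."""
    return urllib.request.Request(
        url=f"{vault_addr}/v1/sys/unseal",
        data=json.dumps({"key": key}).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

With a threshold of 1, a single successful call is enough to unseal; with a higher threshold the same endpoint would have to be called once per key share.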
Sandbox implications
A production-like sandbox does not share the production Vault. It runs its own Vault instance with its own unseal key and its own policies, so that exercising secret flows, rotating credentials, or testing a broken unseal cannot touch production secrets. Because the <app> join key is environment-relative (see Naming conventions), the sandbox can keep identical role and policy names — gitea_cicd_<app>, <app>, <app>-ops — while remaining fully isolated. The rationale for that separate-Vault, separate-unseal posture is recorded in ADR 0001 — Safe, production-like environment.
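The environment-relative naming can be illustrated with a small lookup: the role and policy names are identical across environments, and only the Vault endpoint changes. Both addresses below are hypothetical placeholders, since this guidebook does not give the real endpoints.

```python
# Hypothetical addresses, for illustration only.
VAULT_ADDRS = {
    "prod": "https://vault.prod.example",
    "sandbox": "https://vault.sandbox.example",
}


def vault_identity(env: str, app: str) -> dict:
    """Return the Vault endpoint and the <app>-derived names for one environment."""
    if env not in VAULT_ADDRS:
        raise ValueError(f"unknown environment: {env!r}")
    return {
        "addr": VAULT_ADDRS[env],         # the only environment-specific value
        "k8s_role": app,                  # Kubernetes auth role
        "runtime_policy": app,            # policy attached to pods
        "ci_role": f"gitea_cicd_{app}",   # JWT auth role for CI
        "ops_policy": f"{app}-ops",       # policy attached to CI
    }


prod = vault_identity("prod", "webapp")
sandbox = vault_identity("sandbox", "webapp")
# Same names, different Vault instance.
print(prod["ci_role"] == sandbox["ci_role"], prod["addr"] != sandbox["addr"])
```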
See also
- Tools brick: where the hashicorp-vault chart, VSO, and the per-app Vault IaC modules are deployed.
- Storage & recovery: Vault unseal as a step in the tested power-cut recovery order.
- Naming conventions: how gitea_cicd_<app>, <app>, and <app>-ops derive from the join key.
- ADR 0001: Safe, production-like environment. Records the sandbox's separate-Vault decision.