# Set up a new tool

> **Status:** ✅ Active
> **Audience:** platform operator + agents (English). For the application-onboarding equivalent see [Set up a new app](new-app.md).
> **Last Updated:** 2026-06-23


## TL;DR

> [!TIP]
> Adding a platform component means dropping a small **wrapper chart** into the `tools` repo and registering it in the app-of-apps. An agent can do the bulk of it: scaffold `tools/<tool>/` (a wrapper `Chart.yaml` that depends on the upstream chart + the local `tool` library chart, the two `helm-chart*.yaml` templates, and a `values.yaml`), add one key under `tools:` in [`tools/chart/values.yaml`](https://gitea.arcodange.lab/arcodange-org/tools/src/branch/main/chart/values.yaml), and lint it locally. The **human approval gate** sits in two places: (1) any Vault/database wiring under `tools/<tool>/iac/` and (2) opening + merging the PR, because ArgoCD auto-syncs the new Application the moment it lands on `main`.


## Scope

This runbook covers adding a **new platform component** (monitoring, cache, security engine, connection pooler, analytics, …) to the [`tools` repo](https://gitea.arcodange.lab/arcodange-org/tools) so the factory ArgoCD `tools` project renders an Application for it and deploys it into the **`tools` namespace**.

Systems touched: Gitea (`tools` repo), ArgoCD (the `tools` AppProject), k3s (the helm-controller that materialises each `HelmChart` CR), and, only for secret-backed tools, Vault + the Vault Secrets Operator (VSO).

This runbook does **not** cover standing up a brand-new business application (its own repo, chart, CI/CD, database). That is the [Set up a new app](new-app.md) runbook. It also does not cover the underlying app-of-apps wiring of the `tools` project itself; read the [tools guidebook](../guidebooks/tools/README.md) for how that works.


## Preconditions

- [ ] Working in a worktree under `.claude/worktrees/<slug>/` of a `tools` repo clone (never the trunk).
- [ ] The tool deploys into the **`tools` namespace** (the `tools` AppProject only permits that destination).
- [ ] You know the **upstream Helm chart** (chart name + repo URL) and a **pinned version**, OR you have decided this tool needs **Kustomize + helm inflation** (charts that require post-render patching, like `clickhouse`/`plausible`).
- [ ] `helm` (with the upstream repo reachable) and, for the Kustomize path, `kustomize` available locally for the lint step.
- [ ] If the tool needs secrets or a database: confidence with the Vault `app_roles` module pattern and the `tofu-apply` CI flow (see the [tools secrets & VSO page](../guidebooks/tools/secrets-and-vso.md) and the [tofu CI apply flow](../guidebooks/factory-provisioning/opentofu/ci-apply-flow.md)).

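The local-tooling precondition can be checked mechanically before starting. The following is a minimal sketch, assuming only that the binaries are named `helm` and `kustomize` on `PATH`; the function name `missing_tools` is hypothetical and not part of the `tools` repo.

```bash
# Print the name of every requested CLI that is not on PATH.
# Returns 0 when all are present, 1 when at least one is missing.
missing_tools() {
  local missing=0 name
  for name in "$@"; do
    if ! command -v "$name" >/dev/null 2>&1; then
      echo "$name"
      missing=1
    fi
  done
  return "$missing"
}

# Example (wrapper-chart shape needs helm; the Kustomize shape also needs kustomize):
#   missing_tools helm kustomize || echo "install the tools listed above first"
```

It does not verify that the upstream Helm repo is reachable; that still has to be confirmed by hand or by the lint step later on.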

## Procedure

1. **[HUMAN]** Choose the tool name `<tool>` (kebab-case) and the deployment shape.

Decide between the two supported shapes:
- **Wrapper chart (default).** A thin Helm chart that depends on the upstream chart at a pinned version and, when `tool.kind` is `HelmChart`, lets the local `tool` library chart emit a k3s `HelmChart` custom resource. Used by [`prometheus`](https://gitea.arcodange.lab/arcodange-org/tools/src/branch/main/prometheus) and [`crowdsec`](https://gitea.arcodange.lab/arcodange-org/tools/src/branch/main/crowdsec).
- **Kustomize + helm inflation.** For charts that need post-render JSON6902 patches or extra `resources/`. Used by [`clickhouse`](https://gitea.arcodange.lab/arcodange-org/tools/src/branch/main/clickhouse) and [`plausible`](https://gitea.arcodange.lab/arcodange-org/tools/src/branch/main/plausible).

Pin the upstream chart **version** now; it goes verbatim into the next step.

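Because the tool name becomes a directory name, an app-of-apps key, and an ingress host, it is worth validating the kebab-case rule before scaffolding. This is a minimal sketch under one assumption of mine: that "kebab-case" means lowercase letters and digits separated by single hyphens, starting with a letter. The function name `is_kebab_case` is hypothetical.

```bash
# Succeed only for names such as "crowdsec" or "hashicorp-vault".
is_kebab_case() {
  local LC_ALL=C   # keep [a-z] from matching uppercase in some locales
  case "$1" in
    '' | -* | *- | *--* | *[!a-z0-9-]*) return 1 ;;  # empty, edge/double hyphen, bad char
    [a-z]*) return 0 ;;
    *) return 1 ;;                                   # e.g. starts with a digit
  esac
}

# Example:
#   is_kebab_case "my-tool" && echo "name ok"
```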

2. **[AGENT]** Scaffold `tools/<tool>/` (wrapper-chart shape).

Create four files. The `Chart.yaml` declares **two** dependencies: the local `tool` library chart (served from the Gitea Helm package registry) and the upstream chart pinned to your chosen version:

```yaml
# tools/<tool>/Chart.yaml
apiVersion: v2
name: <tool>
description: A Helm chart for Kubernetes

dependencies:
  - name: tool
    version: 0.1.0
    repository: https://gitea.arcodange.lab/api/packages/arcodange-org/helm
  - name: <upstream-chart>
    version: <pinned-version>
    repository: https://<upstream-helm-repo>
type: application
version: 0.1.0
```


The two template files are thin wrappers that delegate to the `tool` library (they only render when `tool.kind` is `HelmChart`; under `SubChart` they are inert and the upstream chart is pulled as a normal dependency):

```yaml
# tools/<tool>/templates/helm-chart.yaml
{{- if eq .Values.tool.kind "HelmChart" -}}
{{- include "tool.helm-chart.tpl" . -}}
{{- end -}}
```

```yaml
# tools/<tool>/templates/helm-chart-config.yaml
{{- if eq .Values.tool.kind "HelmChart" -}}
{{- include "tool.helm-chart-config.tpl" . -}}
{{- end -}}
```


The `values.yaml` carries the upstream values under a YAML anchor and re-references it from the `tool:` block. Web-facing tools set an ingress host `<tool>.arcodange.lab`; stateful tools set persistence with the longhorn storage class and resource requests/limits. The shape, taken from [`prometheus/values.yaml`](https://gitea.arcodange.lab/arcodange-org/tools/src/branch/main/prometheus/values.yaml):

```yaml
# tools/<tool>/values.yaml
<upstream-chart>: &<tool>_config
  # ── upstream values go here ──
  # web-facing tools: expose an ingress host
  ingress:
    enabled: true
    hosts:
      - <tool>.arcodange.lab
  # stateful tools: pin storage class + size
  persistence:
    enabled: true
    storageClass: longhorn
    size: 8Gi
  resources:
    requests:
      cpu: 100m
      memory: 256Mi
    limits:
      cpu: 500m
      memory: 512Mi

tool:
  # kind 'SubChart': pull the upstream chart as a dependency and pass it the values below.
  # kind 'HelmChart': let the tool library emit a k3s HelmChart CR instead.
  kind: 'SubChart'
  repo: https://<upstream-helm-repo>
  chart: <upstream-chart>
  version: <pinned-version>
  values: *<tool>_config
```


> [!NOTE]
> Under `tool.kind: 'HelmChart'` the local [`tool` library chart](https://gitea.arcodange.lab/arcodange-org/tools/src/branch/main/tool) emits a `helm.cattle.io/v1` `HelmChart` CR (and an optional `HelmChartConfig`) pinned to `namespace: tools` / `targetNamespace: tools`, and the k3s helm-controller installs the upstream chart. Under `'SubChart'` (the default that prometheus and crowdsec use) the upstream chart is just a Helm dependency rendered in-line. Pick `SubChart` unless you specifically need the helm-controller to own the release.


For the **Kustomize shape** instead, skip the wrapper `Chart.yaml`/templates and create a `kustomization.yaml` that inflates the upstream chart plus any `resources/`, mirroring [`plausible/kustomization.yaml`](https://gitea.arcodange.lab/arcodange-org/tools/src/branch/main/plausible/kustomization.yaml):

```yaml
# tools/<tool>/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: tools

helmCharts:
  - name: <upstream-chart>
    repo: https://<upstream-helm-repo>
    version: <pinned-version>
    releaseName: <tool>
    valuesFile: <tool>Values.yaml
    namespace: tools

resources:
  - resources/ingressroute.yaml
# patches: / patchesJson6902: ← post-render tweaks, see plausible for a worked example
```

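The wrapper-chart scaffold in step 2 is mechanical enough to script. The following is a rough sketch, not the repo's own tooling: the function name `scaffold_tool` and its argument order are hypothetical, and the generated `values.yaml` starts from an empty anchored map (`{}`) that still has to be filled with real upstream values by hand.

```bash
# Sketch: write the four wrapper-chart files under "<root>/<tool>/".
# Args: <root-dir> <tool> <upstream-chart> <pinned-version> <upstream-repo-url>
scaffold_tool() {
  local root="$1" tool="$2" chart="$3" version="$4" repo="$5" tpl
  local dir="$root/$tool"
  mkdir -p "$dir/templates" || return 1

  # Chart.yaml: the local `tool` library plus the pinned upstream chart.
  cat > "$dir/Chart.yaml" <<EOF
apiVersion: v2
name: $tool
description: A Helm chart for Kubernetes

dependencies:
  - name: tool
    version: 0.1.0
    repository: https://gitea.arcodange.lab/api/packages/arcodange-org/helm
  - name: $chart
    version: $version
    repository: $repo
type: application
version: 0.1.0
EOF

  # The two templates differ only in the library template they include.
  for tpl in helm-chart helm-chart-config; do
    cat > "$dir/templates/$tpl.yaml" <<EOF
{{- if eq .Values.tool.kind "HelmChart" -}}
{{- include "tool.$tpl.tpl" . -}}
{{- end -}}
EOF
  done

  # values.yaml: anchored (still empty) upstream values, re-used by `tool:`.
  cat > "$dir/values.yaml" <<EOF
$chart: &${tool}_config {}  # replace {} with the upstream values

tool:
  kind: 'SubChart'
  repo: $repo
  chart: $chart
  version: $version
  values: *${tool}_config
EOF
}

# Example (all argument values are placeholders):
#   scaffold_tool tools my-tool upstream-chart 1.2.3 https://charts.example.invalid
```

The sketch overwrites existing files without asking, so it is only suitable for a fresh `<tool>` directory in a worktree.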

3. **[AGENT]** Register the tool in the app-of-apps.

Add a single key for `<tool>` under `tools:` in [`tools/chart/values.yaml`](https://gitea.arcodange.lab/arcodange-org/tools/src/branch/main/chart/values.yaml):

```yaml
# tools/chart/values.yaml
tools:
  pgbouncer: {}
  hashicorp-vault: {}
  crowdsec: {}
  # …existing entries…
  <tool>: {}
```

The `chart/templates/apps.yaml` template ranges over `.Values.tools` and renders one ArgoCD `Application` per key, with `path: <tool>` and `destination.namespace: tools` under the `tools` AppProject. The key **must match the directory name** you created in step 2. See the [tools guidebook](../guidebooks/tools/README.md) for how the app-of-apps meta-chart drives this.

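The "key must match the directory name" rule in step 3 is easy to get wrong with a typo, so a quick consistency check can help. This is a minimal sketch that assumes the values file has the simple shape shown above (keys indented by exactly two spaces under a top-level `tools:`); it is a line-based `awk` scan, not a real YAML parser, and both function names are hypothetical.

```bash
# Print the keys found directly under "tools:" in an app-of-apps values file.
list_tool_keys() {
  awk '
    /^tools:/ { in_tools = 1; next }
    /^[^ #]/  { in_tools = 0 }          # any other top-level key ends the block
    in_tools && /^  [A-Za-z0-9_-]+:/ {
      key = $1; sub(/:.*$/, "", key); print key
    }
  ' "$1"
}

# Report every key that has no matching directory under <repo-root>.
# Args: <values-file> <repo-root>. Returns 1 if any directory is missing.
check_tool_dirs() {
  local values="$1" root="$2" rc=0 key
  for key in $(list_tool_keys "$values"); do
    if [ ! -d "$root/$key" ]; then
      echo "missing directory for key: $key"
      rc=1
    fi
  done
  return "$rc"
}

# Example, from the root of the tools repo clone:
#   check_tool_dirs chart/values.yaml .
```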

4. **[HUMAN]** If the tool needs **secrets** or a **database**, wire Vault + VSO and a tofu-apply workflow.

This step mutates Vault (creates roles/secrets) and so is gated. Use [`crowdsec`](https://gitea.arcodange.lab/arcodange-org/tools/src/branch/main/crowdsec) (dynamic Postgres role) and [`plausible`](https://gitea.arcodange.lab/arcodange-org/tools/src/branch/main/plausible) (kvv2 static secrets) as the worked examples, and read the [tools secrets & VSO page](../guidebooks/tools/secrets-and-vso.md).

a. Add `tools/<tool>/iac/`: OpenTofu that configures Vault. For a dynamic Postgres role, reuse the shared `app_roles` module exactly as crowdsec does:

```hcl
# tools/<tool>/iac/main.tf
module "app_roles" {
  source                     = "git::ssh://git@192.168.1.202:2222/arcodange-org/tools.git//hashicorp-vault/iac/modules/app_roles?depth=1&ref=main"
  name                       = "<tool>"
  service_account_namespaces = ["tools"]
}
# for kvv2 static config, add vault_kv_secret_v2 resources (see plausible/iac/main.tf)
```

Pair it with a `backend.tf` (GCS state at `prefix = "tools/<tool>/main"`) and a `providers.tf` whose `auth_login_jwt` role is `gitea_cicd_<tool>`, both copied from crowdsec.


b. Add the VSO custom resources to the chart templates so VSO mints a k8s Secret the workload consumes: a `serviceaccount.yaml`, a `VaultAuth` bound to a Vault `kubernetes` role named `<tool>`, and a `VaultDynamicSecret` (or `VaultStaticSecret` for kvv2) pointing at the Vault path:

```yaml
# tools/<tool>/templates/vaultauth.yaml
apiVersion: secrets.hashicorp.com/v1beta1
kind: VaultAuth
metadata:
  name: <tool>
  namespace: {{ .Release.Namespace }}
spec:
  vaultConnectionRef: default
  method: kubernetes
  mount: kubernetes
  kubernetes:
    role: <tool>
    serviceAccount: <tool>
    audiences:
      - vault
```

```yaml
# tools/<tool>/templates/vaultdynamicsecret.yaml
apiVersion: secrets.hashicorp.com/v1beta1
kind: VaultDynamicSecret
metadata:
  name: <tool>-db-credentials
  namespace: {{ .Release.Namespace }}
spec:
  mount: postgres
  path: creds/<tool>
  destination:
    create: true
    name: <tool>-db-credentials
  rolloutRestartTargets:
    - kind: Deployment
      name: <tool>
  vaultAuthRef: <tool>
```


Then reference the VSO-created secret from the workload (env `valueFrom.secretKeyRef`), as crowdsec's `values.yaml` does for `DB_USER`/`DB_PASSWORD`. For the Kustomize shape, add these custom resources as files under `resources/` and list them in `kustomization.yaml` instead of placing them in `templates/`.

c. Add a `.gitea/workflows/<tool>.yaml` that tofu-applies `<tool>/iac` on changes, mirroring [`crowdsec.yaml`](https://gitea.arcodange.lab/arcodange-org/tools/src/branch/main/.gitea/workflows/crowdsec.yaml): a path filter on `'<tool>/**/*.tf'`, a Gitea→Vault JWT auth job, and a `dflook/terraform-apply` step with `path: <tool>/iac`. See the [tofu CI apply flow](../guidebooks/factory-provisioning/opentofu/ci-apply-flow.md) for what that pipeline does end to end.


5. **[AGENT]** Lint and render locally before opening the PR.

For the wrapper-chart shape:

```bash
helm dependency update tools/<tool>
helm lint tools/<tool>
helm template <tool> tools/<tool> | head -n 60
# render the app-of-apps Application for <tool>:
helm template tools-apps tools/chart | grep -A12 "name: <tool>"
```

For the Kustomize shape:

```bash
kustomize build --enable-helm tools/<tool> | head -n 60
```

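Since the lint commands differ by shape, an agent can pick the right path by looking at which entry file the tool directory contains. The following is a sketch only: `detect_shape` and `lint_tool` are hypothetical helper names, and the sketch assumes a directory holding `kustomization.yaml` is always the Kustomize shape and one holding only `Chart.yaml` is the wrapper shape.

```bash
# Print "kustomize" or "wrapper" depending on the entry file in <dir>.
detect_shape() {
  if [ -f "$1/kustomization.yaml" ]; then
    echo kustomize
  elif [ -f "$1/Chart.yaml" ]; then
    echo wrapper
  else
    echo unknown
    return 1
  fi
}

# Run the lint commands from step 5 that match the detected shape.
lint_tool() {
  local dir="$1"
  case "$(detect_shape "$dir")" in
    wrapper)
      helm dependency update "$dir" && helm lint "$dir" ;;
    kustomize)
      kustomize build --enable-helm "$dir" >/dev/null ;;
    *)
      echo "no Chart.yaml or kustomization.yaml in $dir" >&2
      return 1 ;;
  esac
}

# Example:
#   lint_tool tools/<tool>
```

A non-zero exit from `lint_tool` is a reasonable signal to stop before step 6 rather than open a PR that ArgoCD cannot render.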

6. **[HUMAN]** Open a PR on the `tools` repo, get it reviewed, and merge.

```bash
git checkout -b arcodange/<slug>
git add tools/<tool> tools/chart/values.yaml
git commit -m "declare <tool>"
git push -u origin arcodange/<slug>
```

> [!IMPORTANT]
> The `tools` repo is on **Gitea**, not GitHub, so open the PR with the `mcp__gitea__*` tools (load `select:mcp__gitea__pull_request_write` via `ToolSearch`), not `gh`. Once the PR merges to `main`, ArgoCD detects the new key in `chart/values.yaml`, renders the `<tool>` Application, and syncs it automatically.


## Verification

All checks are read-only, so an agent can run them after the PR merges and ArgoCD has reconciled.

```bash
# 1. The ArgoCD Application for <tool> is Synced + Healthy
kubectl --context <ctx> -n argocd get application <tool> \
  -o jsonpath='{.status.sync.status}/{.status.health.status}{"\n"}'
# expected: Synced/Healthy

# 2. The pod is Running in the tools namespace
kubectl --context <ctx> -n tools get pods -l app.kubernetes.io/name=<tool>
# expected: <tool>-… 1/1 Running

# 3. Web-facing tools: the ingress is admitted and the host resolves
kubectl --context <ctx> -n tools get ingress | grep <tool>
curl -sI https://<tool>.arcodange.lab | head -n1   # expected: HTTP/2 200 (or app login redirect)

# 4. Secret-backed tools: VSO created the k8s Secret
kubectl --context <ctx> -n tools get secret <tool>-db-credentials
# expected: the Secret exists with the keys the workload mounts
```

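ArgoCD may need a few reconcile cycles after the merge, so check 1 often has to be repeated. A minimal polling sketch follows; the function name `wait_for_app`, the default of 30 attempts, and the 10-second pause are my assumptions rather than values taken from the platform. It reuses the same read-only `kubectl` query as check 1.

```bash
# Poll until the Application reports Synced/Healthy or attempts run out.
# Args: <kube-context> <application> [attempts] [pause-seconds]
wait_for_app() {
  local ctx="$1" app="$2" attempts="${3:-30}" pause="${4:-10}"
  local status="" i=0
  while [ "$i" -lt "$attempts" ]; do
    # A failing kubectl call (e.g. Application not rendered yet) counts as "not ready".
    status=$(kubectl --context "$ctx" -n argocd get application "$app" \
      -o jsonpath='{.status.sync.status}/{.status.health.status}' 2>/dev/null) || status=""
    if [ "$status" = "Synced/Healthy" ]; then
      echo "$app: $status"
      return 0
    fi
    i=$((i + 1))
    sleep "$pause"
  done
  echo "$app: still ${status:-unknown} after $attempts attempts" >&2
  return 1
}

# Example:
#   wait_for_app <ctx> <tool> && echo "ready for checks 2-4"
```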

## Rollback

- **[HUMAN]** Revert the `tools/chart/values.yaml` entry (remove the `<tool>:` key). On the next sync ArgoCD **prunes** the `<tool>` Application (`prune: true` is set in `apps.yaml`), which removes the deployed workload from the `tools` namespace.
- **[HUMAN]** In a follow-up PR, delete the `tools/<tool>/` directory to remove the wrapper chart / Kustomize source.
- **[HUMAN]** For secret-backed tools, the Vault role/secret created by `tools/<tool>/iac/` is **not** removed by ArgoCD. Destroy it explicitly (`tofu -chdir=tools/<tool>/iac destroy`) or remove the IaC and let the workflow reconcile, and drop the `.gitea/workflows/<tool>.yaml` file.
- For a full cluster-level recovery (power cut, lost quorum), follow CLUSTER_RECOVERY.md.


## References

- [Tools guidebook](../guidebooks/tools/README.md): how the app-of-apps meta-chart turns each `tools:` key into an ArgoCD Application.
- [Tools components](../guidebooks/tools/components.md): the catalogue of platform components and what each provides.
- [Tools secrets & VSO](../guidebooks/tools/secrets-and-vso.md): the Vault `app_roles` + VaultAuth/VaultDynamicSecret pattern used in step 4.
- [Tofu CI apply flow](../guidebooks/factory-provisioning/opentofu/ci-apply-flow.md): what the `<tool>/iac` tofu-apply workflow does end to end.
- Real examples in the `tools` repo: [`prometheus`](https://gitea.arcodange.lab/arcodange-org/tools/src/branch/main/prometheus) and [`crowdsec`](https://gitea.arcodange.lab/arcodange-org/tools/src/branch/main/crowdsec) (wrapper-chart shape), the shared [`tool` library chart](https://gitea.arcodange.lab/arcodange-org/tools/src/branch/main/tool), and [`clickhouse`](https://gitea.arcodange.lab/arcodange-org/tools/src/branch/main/clickhouse)/[`plausible`](https://gitea.arcodange.lab/arcodange-org/tools/src/branch/main/plausible) (Kustomize shape).
- [Set up a new app](new-app.md): the sibling runbook for onboarding a business application (not a platform component).