Add a root AGENTS.md (ecosystem map of factory/tools/cms + agent operating rules + the persona cohort & workflow) and a new vibe/ knowledge base for LLM agents, modeled on tree-docs conventions and the factory house style.

vibe/ folders (each with a README hub + contribution rules):
- ADR/ optimized MADR-lite; canonical home going forward (doc/adr stays historical)
- PRD/ one subfolder per PRD, mandatory STATUS.md, QA strategy for big ones
- investigations/ single INV-NNN-slug.md, or stub + folder w/ notebooks
- guidebooks/ tree-docs maps; lab-ecosystem guidebook of factory+tools+cms
- runbooks/ [AGENT]/[HUMAN] step procedures (EN; doc/runbooks stays FR)
- shareouts/ dated FR handouts (decks/mp4)

Seed content (first ADR + PRD): a safe, production-like environment to rehearse risky changes and recovery without touching real prod; local-only sandbox (k3d + arm64 VMs) with a hard prod/sandbox isolation boundary. Includes INV-001 (prod blast-radius couplings), the ecosystem guidebook, and a FR shareout.

Conventions enforced: no-tombstone rule, breadcrumb spine, bidirectional cross-links, theme:base mermaid (MCP-validated) + ordered-list-after-diagram.

Built with a Workflow + persona cohort; 24 files, zero dead links.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
<Runbook title — imperative, e.g. "Run the local sandbox game-day">
Status: ⬜ Not started
Audience: LLM agents (English). For the human-operator equivalent see the French doc/runbooks.
Last Updated: 2026-06-23
TL;DR
Tip
Scope
<What this runbook covers, and explicitly what it does NOT cover. Name the systems touched (Gitea, Postgres, Vault, k3s, ArgoCD, …) and the <app> or environment in play.>
Preconditions
<Bulleted, verifiable preconditions that must hold before the procedure runs. Examples:>
- Working in a worktree under .claude/worktrees/<slug>/ (never the trunk).
- Access to <Vault role / k3s context / Gitea repo> confirmed.
- <Any backup taken / snapshot exists / CLUSTER_RECOVERY.md unseal key available>.
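The first precondition can be checked mechanically before any step runs. The sketch below is one possible way to do that, not part of the template itself: `in_worktree` is a hypothetical helper name, and the only thing it assumes is the `.claude/worktrees/<slug>/` layout named above.

```shell
# Hypothetical helper (not defined by this repo): succeed only when the given
# directory (default: the current one) lies inside .claude/worktrees/<slug>/.
in_worktree() {
  case "${1:-$PWD}/" in
    */.claude/worktrees/?*/*) return 0 ;;  # some non-empty slug follows worktrees/
    *) return 1 ;;                         # trunk, or the bare worktrees/ directory
  esac
}

# Usage: stop early when run from the trunk.
# in_worktree || { echo "refusing to run outside a worktree" >&2; exit 1; }
```

The check is a pure string match on the path, so it is safe for an [AGENT] step; it does not confirm that the directory is a valid git worktree.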
Procedure
<Ordered steps. Each step is tagged [AGENT] (read-only/safe) or [HUMAN] (prod-mutating, requires explicit approval). Put copy-paste commands in fenced bash blocks owned by the step.>
- [AGENT] <Inspect current state / dry-run; safe, no mutation.>
  # read-only example
  kubectl --context <ctx> get pods -n <app>
- [AGENT] <Generate files, render manifests, run sandbox tests; safe.>
  # safe generation / sandbox example
  tofu -chdir=<path> plan
- [HUMAN]
  # prod-mutating example, only after approval
  tofu -chdir=<path> apply
- [HUMAN] <Any further live mutation, each individually gated.>
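Where a runbook is driven from a script, the [HUMAN] gate could be made explicit in the command itself. The following is a minimal sketch under that assumption; `human_gate` and the literal word `approve` are illustrative choices, not conventions defined by this template.

```shell
# Hypothetical gate (name and keyword are assumptions): run the wrapped command
# only when the operator types the exact word "approve" on stdin.
human_gate() {
  printf '[HUMAN] about to run: %s\nType "approve" to continue: ' "$*" >&2
  answer=""
  read -r answer || true
  if [ "$answer" = "approve" ]; then
    "$@"                                   # approved: execute the command as given
  else
    echo "aborted: no explicit approval" >&2
    return 1                               # anything else leaves prod untouched
  fi
}

# Usage, one gate per mutating command:
# human_gate tofu -chdir=<path> apply
```

Gating each command separately matches the "each individually gated" wording: a single approval never covers more than one mutation.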
Verification
<How to confirm the runbook succeeded. Prefer [AGENT]-runnable, read-only checks with expected output.>
# verification example
kubectl --context <ctx> get application <app> -n argocd -o jsonpath='{.status.sync.status}'
# expected: Synced
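To make a check like the one above self-judging, the observed value can be compared with the expected one in the same step. This is a hedged sketch: `expect_eq` is a hypothetical helper, and the PASS/FAIL wording is an assumption rather than a house convention.

```shell
# Hypothetical helper: compare an observed value ($3) with the expected one ($2)
# under a label ($1); prints PASS/FAIL and returns non-zero on mismatch.
expect_eq() {
  if [ "$2" = "$3" ]; then
    echo "PASS $1: $3"
  else
    echo "FAIL $1: expected '$2', got '$3'"
    return 1
  fi
}

# Usage with the read-only check above:
# expect_eq sync Synced "$(kubectl --context <ctx> get application <app> -n argocd -o jsonpath='{.status.sync.status}')"
```

A non-zero return lets the calling step stop and hand over to the Rollback section instead of continuing on a failed verification.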
Rollback
<How to undo each prod-mutating step if verification fails. Tag each rollback action [AGENT] or [HUMAN] just like the procedure. Reference CLUSTER_RECOVERY.md by name for full power-cut/cluster recovery (it lives outside this repo — name only, no link).>
References
- <Related guidebook page, e.g. Lab ecosystem>
- <Related ADR under doc/adr>
- <Human-operator equivalent under doc/runbooks>