# Dolibarr dedicated backup

A backup strategy **dedicated to Dolibarr**: the accounting data and the issued documents are critical and legally retained for **10 years**, so they warrant more than the generic platform backup.

## Why this exists (the gap it closes)

On 2026-06-30 an audit of the Longhorn external backup found that **the erp documents volume had never been backed up offsite** (`lastBackupAt = never`). Its Longhorn volume is enrolled only in the `default` recurring-job group, but the single backup job (`thrice-a-month-backup`) has `groups=[]`, so it serves *no* group, and the erp volume (along with erp-sandbox) fell through the cracks. Only in-cluster Longhorn replicas protected `/var/www/documents` (issued invoice PDFs, supplier documents, contracts, ECM), and replicas do not survive a cluster loss, data corruption, or a power cut.

This tool backs up **both halves** of Dolibarr state to the existing object store (`s3://arcodange-backup`, GCS via the S3-compatible API), under `erp//`:

| half | how | key |
|---|---|---|
| Postgres DB | `pg_dump -Fc` (restorable) | `erp//db/.dump` |
| documents PVC | `tar -czf` of `/var/www/documents` (RWX, mounted read-only) | `erp//docs/.tar.gz` |

It then prunes to a **tiered retention**: daily for 30 days, monthly for 12 months, yearly for ~10 years.

**Skip-if-unchanged:** each half carries a content fingerprint at `erp//.fp-{db,docs}` and is dumped and uploaded only if that fingerprint **differs** from the last run, so a quiet ERP day re-uploads nothing. The fingerprint covers **durable business content only**:

- the DB side is `count + max(tms)` over every `tms` table *except* volatile ones (`llx_const`, `llx_user`, sessions/cron);
- the documents side excludes `*/temp/*` (Dolibarr's constantly regenerated stats cache) from both the fingerprint *and* the tar.

## Safety (mirrors `ops/sandbox/sandbox-lifecycle.sh`)

- **prod is read-only**: `pg_dump` and `tar` only read; the only writes go to the backup bucket, never to prod. The DB is read with the env's *own* dynamic creds (`vso-db-credentials`); prod and sandbox never cross.
- **S3 creds are never exposed**: the GCS HMAC secret is copied into a *transient* secret in the app namespace (values stay base64) and deleted on exit. The whole in-container script is shipped base64, so no secret is ever printed.

## Usage

```sh
# one-shot backup + prune (run from anywhere; needs kubectl on the lab cluster)
ops/backup/dolibarr-backup.sh backup --env prod
ops/backup/dolibarr-backup.sh backup --env sandbox

# what's in the store
ops/backup/dolibarr-backup.sh list --env prod
```

`chart/files/backup-job.sh` is the in-container logic (env-driven: `BUCKET PREFIX DB PGHOST` plus the mounted DB/S3 creds). It is the single source of truth shared by this orchestrator and the scheduled CronJob (see "Automation" below).

**Status:** the first real prod backup was taken on 2026-06-30 (`erp/prod/db/…` 1.2 MB, `erp/prod/docs/…` 12.5 MB). The full path (dump + tar + GCS upload + retention prune) has been proven end-to-end, live, on the sandbox.

## Restore (manual, for now)

```sh
# DB:
aws s3 cp s3://arcodange-backup/erp//db/.dump - | pg_restore -h -U -d --clean

# docs:
aws s3 cp s3://arcodange-backup/erp//docs/.tar.gz - | tar -C /var/www/documents -xzf -
```

The sandbox iso-prod refresh (`ops/sandbox/sandbox-lifecycle.sh`) is the natural bench for restore drills. A `restore` subcommand is planned next.

## Automation: the CronJob (gated on creds)

The recurring form ships in the chart (`chart/templates/backup-cronjob.yaml`, `backup.enabled=false` by default): a daily **CronJob** (ConfigMap-mounted `backup-job.sh`) with its **own** S3 creds via a `VaultStaticSecret`, with no cross-namespace borrowing of the Longhorn secret. To activate:

1. store the GCS HMAC creds (`AWS_ACCESS_KEY_ID` / `AWS_SECRET_ACCESS_KEY` / `AWS_ENDPOINTS`, same shape as `longhorn-gcs-backup-credentials`) at `kvv2/` (default `erp/backup`);
2. grant the erp `auth` Vault role read on that path (a `tools` change) if its policy doesn't already cover it;
3. set `backup.enabled: true` (and tune `schedule`).

Until then, run the orchestrator above on demand or from a host cron. It works today by borrowing the Longhorn creds transiently.

> The generic Longhorn gap (the orphaned `default` group) should be fixed too, as a
> platform concern, but this dedicated, offsite, 10-year-retention backup is the
> one that matches Dolibarr's legal criticality.
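
## Sketch: tiered retention selection

The tiered retention described above (daily for 30 days, monthly for 12 months, yearly for ~10 years) could be expressed roughly as follows. This is a minimal sketch, not the logic in `chart/files/backup-job.sh`: the function name `retention_keep`, the choice to keep the *oldest* backup of each month and of each year, the `YYYY-MM-DD` input format, and the reliance on GNU `date -d` are all assumptions made for illustration.

```sh
#!/bin/sh
# Hypothetical helper (not part of the repo): decide which backups survive a prune.
# stdin : one backup date per line, formatted YYYY-MM-DD (assumed format)
# $1    : "today", also YYYY-MM-DD
# stdout: the dates to KEEP, sorted ascending
retention_keep() {
    today=$1
    # Tier boundaries. Requires GNU date for the "-d ... ago" syntax (assumption).
    daily_cutoff=$(date -u -d "$today 30 days ago" +%F)
    monthly_cutoff=$(date -u -d "$today 12 months ago" +%F)
    yearly_cutoff=$(date -u -d "$today 10 years ago" +%F)

    # Sorting ascending means the first date seen in a month/year is its oldest.
    sort -u | awk -v d="$daily_cutoff" -v m="$monthly_cutoff" -v y="$yearly_cutoff" '
    {
        s = $0 ""                      # force string comparison (ISO dates sort lexically)
        month = substr(s, 1, 7)
        year  = substr(s, 1, 4)
        first_of_month = !(month in seen_month); seen_month[month] = 1
        first_of_year  = !(year  in seen_year);  seen_year[year]   = 1

        if      (s >= d)                    print s   # daily tier: keep everything
        else if (s >= m && first_of_month)  print s   # monthly tier: one per month
        else if (s >= y && first_of_year)   print s   # yearly tier: one per year
        # anything else is older than every tier, or redundant within one
    }'
}
```

For example, `printf '%s\n' 2026-06-29 2026-05-20 2026-05-02 2024-03-10 2015-01-01 | retention_keep 2026-06-30` keeps `2024-03-10`, `2026-05-02` and `2026-06-29`. Whatever the listing returns beyond that set would be the prune candidates. Keeping the oldest entry per period is only one possible convention; keeping the newest works equally well as long as it is applied consistently.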