Commit Graph

1 Commits

Author SHA1 Message Date
8ec8fde67e feat(ops): dedicated Dolibarr backup (DB + documents → offsite GCS, 10y retention)
The accounting data + issued documents are legally retained 10 years and warrant a
backup dedicated to Dolibarr. An audit found the generic Longhorn external backup
NEVER covered the erp volume (its Longhorn volume sits in the orphaned `default`
recurring-job group; the only job has groups=[] → serves nothing; lastBackupAt=never).
So /var/www/documents (invoice PDFs, supplier pieces, contracts, ECM) had zero
offsite copy — only in-cluster replicas.

ops/backup/dolibarr-backup.sh (orchestrator) + ops/backup/backup-job.sh (in-container
logic, env-driven, single source of truth):
- pg_dump -Fc of the DB + tar of the documents PVC (RWX, read-only mount) ->
  s3://arcodange-backup/erp/<env>/{db,docs}/<ts>, then tiered prune (daily 30d /
  monthly 12m / yearly 10y).
- prod is READ-only (dump+tar read; writes go only to the backup bucket); the DB is
  read with the env's own dynamic creds; the GCS HMAC secret is copied transiently
  (base64, deleted on exit) and never printed; the whole script ships base64.
- fixes the aws-cli v2.23+ default-checksum incompatibility with GCS/S3-compat
  (SignatureDoesNotMatch) via AWS_*_CHECKSUM_*=when_required.

Proven live: sandbox end-to-end (dump+tar+upload+prune, verified in GCS, cleaned up)
and retention logic unit-tested (1100 daily -> 46 kept). The FIRST real prod backup
was taken (erp/prod/db 1.2 MB + erp/prod/docs 12.5 MB) — closing the gap now.

Automation (recurring CronJob in the chart + a dedicated erp Vault policy for its
own S3 creds) is the documented next step; the orchestrator works today on demand.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-06-30 15:32:36 +02:00