---
name: dolibarr-data-snapshot
description: Snapshot the read-only state of the Arcodange Dolibarr instance into a single content-addressable JSON file (status, thirdparties (list + per-id detail), invoices (list + per-id detail + per-id payments), recurring templates, products, bank accounts). Each snapshot includes a `content_hash` (sha256 of the data, EXCLUDING the captured-at timestamp) so two snapshots of identical state hash identically; drift detection is one comparison. Use cases: cohort-review evidence packs, archival before a known-risky change, time-series drift detection between two dates, point-in-time forensics. Use when the user asks "snapshot Dolibarr", "dump the state", "archiver l'ERP", "preuve cohort", "diff entre deux dates". Depends on the `dolibarr` skill. SKIP for one-shot reads (use the other workflow skills directly), for PDF / binary attachments (intentionally excluded, since they would bloat the snapshot), for write-side changes (this is read-only forensics), and for snapshotting NON-Dolibarr state (bank statements, k8s, etc.; those would be sibling skills).
requires:
  bins: ["curl", "jq", "python3"]
  auth: true
---

# dolibarr-data-snapshot: point-in-time JSON dump of Dolibarr read state

**CLI shortcut:** `bin/arcodange snapshot [--out FILE | --print-only]`

One script: [`snapshot.sh`](scripts/snapshot.sh). It pulls every read-only endpoint the `dolibarr-*` family uses and bundles the results into a single JSON file with a content hash. Read-only, no side effects. Depends on the [dolibarr](../dolibarr/SKILL.md) base skill.

## What's in the snapshot

```json
{
  "schema_version": "1",
  "captured_at": "2026-05-28T21:58:50Z",
  "instance": "erp.arcodange.lab",
  "content_hash": "sha256:6b94cd312d55a693d3c533ae6c9a5abef2734dd5bca8d4b1bdd5ca6ea6fc1f9a",
  "data": {
    "status": { ... GET /status ... },
    "thirdparties": {
      "list": [ ... GET /thirdparties ... ],
      "detail": { "1": { ... GET /thirdparties/1 ... }, ... }
    },
    "invoices": {
      "list": [ ... ],
      "detail": { "12": { ... }, ... },
      "payments": { "12": [ ... ], ... }
    },
    "recurring_templates": { "1": { ... GET /invoices/templates/1 ... }, ... },
    "products": [ ... GET /products ... ],
    "bank_accounts": [ ... GET /bankaccounts ... ]
  }
}
```

**`content_hash` is the sha256 of `data` only.** It deliberately excludes `captured_at`, `schema_version`, `instance`, and the hash field itself, so two snapshots taken at different moments but reflecting **identical Dolibarr state** have the same `content_hash`. That's what makes drift detection trivial:

```bash
jq -r .content_hash snap-2026-05.json snap-2026-06.json
# Same hash → no data changed between the two captures.
# Different hash → use `jq` / `diff` to find what moved.
```

## Usage

```bash
./scripts/snapshot.sh                          # writes ./snapshot-YYYY-MM-DDTHHMMSSZ.json
./scripts/snapshot.sh --out /tmp/baseline.json
./scripts/snapshot.sh --print-only             # stdout, no file (pipe-friendly)
./scripts/snapshot.sh --max-template-id 100    # raise the template-probe upper bound
```

Live output from the current Dolibarr instance (captured in [examples/snapshot-summary.txt](examples/snapshot-summary.txt); the actual JSON is too big to commit verbatim, ~246 KB):

```
wrote ./snapshot-2026-05-28T215850Z.json (246186 bytes) sha256:6b94cd312d55a693d3c533ae6c9a5abef2734dd5bca8d4b1bdd5ca6ea6fc1f9a
```

Contents (V1 baseline):

- 10 thirdparties (KissMetrics + 9 others: prospects, suppliers, etc.)
- 5 invoices, all KM (1 credit note ("avoir") + 4 regular)
- 5 per-invoice payment arrays
- 1 recurring template (Kiss Metrics Invoice, frequency=0; see `dolibarr-recurring-templates`)
- 2 products (KM-audit, KM-cloud-devops)
- 3 bank accounts (QONTO, WISE EURO, G.RADUREAU CCA)

## What's intentionally excluded

- **PDF attachments.** `/documents/download` returns base64 bodies up to ~MB each. Including them would multiply the snapshot size by 10× to 100×. Workflow skills (`dolibarr-invoice-audit`) fetch PDFs on demand.
- **`users/info`**: leaks the `ai_agent` account internals. Out of scope for a read-only state dump.
- **`/setup/modules`**: admin-only, not available to `ai_agent`.
- **Anything that requires writing** (cron triggers, etc.).
- **`/payments` list-all**: returns 501 (see base skill catalogue); payment data comes from the per-invoice fetches instead.

## Use cases

### 1. Cohort-review evidence pack

```bash
./scripts/snapshot.sh --out evidence/dolibarr-2026-05-28.json
# Send the file as proof of the billing state at that moment.
# The content_hash fingerprints the data (a sha256 digest, not a cryptographic signature).
```

### 2. Drift detection between two dates

```bash
./scripts/snapshot.sh --out snap-may.json
# ... a month passes ...
./scripts/snapshot.sh --out snap-jun.json
jq -r .content_hash snap-may.json snap-jun.json
# Different → something moved. Find what:
diff <(jq -S .data snap-may.json) <(jq -S .data snap-jun.json) | head -50
```

### 3. Archive before a known-risky change

Before manually firing the next M-N invoice, regenerating the PDF template, or making any UI change with billing consequences:

```bash
./scripts/snapshot.sh --out before-change-$(date -u +%Y%m%d).json
# Make the change ...
./scripts/snapshot.sh --out after-change-$(date -u +%Y%m%d).json
# Diff to confirm only the intended state moved.
```

## Performance

On the current Arcodange instance (5 invoices, 10 thirdparties, 1 template), the snapshot completes in **~2 seconds**, using one HTTP call per top-level resource plus N calls for the per-id fetches. At ~30 thirdparties + ~100 invoices + ~10 templates, expect ~150 calls and ~10 s.

## Out of scope

- **Snapshotting bank statements** (Qonto / Wise CSV exports). Different data source; this would be a sibling skill (`arcodange-bank-snapshot` or similar).
- **Snapshotting Kubernetes state** of the ERP deployment. Sibling skill candidate (`arcodange-k8s-snapshot`).
- **Schema migrations / drift in the Dolibarr DB itself.** That requires `dolibarr-postgres-readonly` or similar; out of scope here.
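
## Sketch: summarising drift between two snapshots

For the drift-detection use case, a raw `diff` of two `jq -S .data` dumps can get noisy once the instance grows. The following is a minimal Python sketch, not part of `snapshot.sh`, that compares `content_hash` first and then lists which paths under `data` differ. It assumes only the top-level layout shown in "What's in the snapshot"; the helper names `load_snapshot`, `changed_paths`, and `drift_report` are hypothetical and introduced here for illustration.

```python
import json


def load_snapshot(path):
    """Read a snapshot JSON file as written by snapshot.sh (layout assumed from this README)."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def changed_paths(old, new, prefix="", max_depth=3):
    """Return dot-separated paths where two JSON values differ.

    Descends into dicts for at most max_depth levels; lists and scalars
    are compared as whole values, so a changed list is reported once.
    """
    if old == new:
        return []
    if isinstance(old, dict) and isinstance(new, dict) and max_depth > 0:
        paths = []
        for key in sorted(set(old) | set(new)):
            child = f"{prefix}.{key}" if prefix else key
            if key not in old:
                paths.append(f"{child} (added)")
            elif key not in new:
                paths.append(f"{child} (removed)")
            else:
                paths.extend(
                    changed_paths(old[key], new[key], child, max_depth - 1)
                )
        return paths
    # Not both dicts, or depth budget exhausted: report this path as changed.
    return [prefix or "(root)"]


def drift_report(old_snapshot, new_snapshot):
    """Return [] when the content hashes match, else the changed paths under `data`."""
    if old_snapshot["content_hash"] == new_snapshot["content_hash"]:
        return []
    return changed_paths(old_snapshot["data"], new_snapshot["data"])
```

Calling `drift_report(load_snapshot("snap-may.json"), load_snapshot("snap-jun.json"))` would return entries such as `invoices.detail.13 (added)` or `invoices.payments.12`. The default depth of 3 stops at the per-id level, which is usually enough to decide where to point a full `diff`.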