add dolibarr-tva-reconciliation, dolibarr-recurring-templates, dolibarr-data-snapshot

V3 bundle — three sibling skills under .claude/skills/, all read-only,
all depending on the dolibarr base skill.

dolibarr-tva-reconciliation:
- tva-by-month.sh: HT + TVA grouped by (year-month × tva_tx), ready
  for CA3 / CA12 transcription.
- tva-line-detail.sh: per-line audit trail with country-based bucket
  assignment (A1 domestic / A4 intra-EU reverse charge / E2 non-EU
  export). Documents the French TVA mental model.
- Today every Arcodange line is E2 (KissMetrics, US, reverse charge under
  259-1° CGI). The skill is ready for the day a French B2B customer is
  invoiced.

dolibarr-recurring-templates:
- list-templates.sh: probes /invoices/templates/{id} since there's no
  list endpoint. Stops after 5 consecutive empty responses.
- inspect-template.sh: full audit per template, with health checks.
- Surfaces that the "Kiss Metrics Invoice" template has frequency=0
  and nb_gen_done=0 — it is NOT auto-firing. Every KM invoice today
  was manually duplicated. Cohort-review implication: the deferred
  9-month cycle depends on Gabriel clicking "Generate" each month,
  not on a Dolibarr cron.

dolibarr-data-snapshot:
- snapshot.sh: bundles every read endpoint the dolibarr-* family uses
  into one JSON with a content_hash (sha256 of data only, excluding
  timestamp — so identical state hashes identically across runs).
- Use cases: cohort evidence packs, drift detection, archival before
  a known-risky UI change.
- V1 baseline summary captured at examples/snapshot-summary.txt
  (the ~246 KB snapshot file itself is intentionally not committed).

Also extends dolibarr/SKILL.md endpoint catalogue with
/invoices/templates/{id} (and its no-list-endpoint quirk + the
id-null sentinel for missing ids), plus links to the three new
sibling skills.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-29 00:01:06 +02:00
parent d34cba3fa0
commit f19b1d2ef2
16 changed files with 1554 additions and 1 deletions


@@ -0,0 +1,114 @@
---
name: dolibarr-data-snapshot
description: Snapshot the read-only state of the Arcodange Dolibarr instance into a single content-addressable JSON file — status, thirdparties (list + per-id detail), invoices (list + per-id detail + per-id payments), recurring templates, products, bank accounts. Each snapshot includes a `content_hash` (sha256 of the data, EXCLUDING the captured-at timestamp) so two snapshots of identical state hash identically — drift detection is one comparison. Use cases: cohort-review evidence packs, archival before a known-risky change, time-series drift detection between two dates, point-in-time forensics. Use when the user asks "snapshot Dolibarr", "dump the state", "archiver l'ERP", "preuve cohort", "diff entre deux dates". Depends on the `dolibarr` skill. SKIP for one-shot reads (use the other workflow skills directly), for PDF / binary attachments (intentionally excluded — would bloat the snapshot), for write-side changes (this is read-only forensics), and for snapshotting NON-Dolibarr state (bank statements, k8s, etc. — those would be sibling skills).
requires:
bins: ["curl", "jq", "python3"]
auth: true
---
# dolibarr-data-snapshot — point-in-time JSON dump of Dolibarr read state
One script: [`snapshot.sh`](scripts/snapshot.sh). Pulls every read-only endpoint the `dolibarr-*` family uses and bundles into a single JSON file with a content hash. Read-only, no side effects.
Depends on the [dolibarr](../dolibarr/SKILL.md) base skill.
## What's in the snapshot
```json
{
  "schema_version": "1",
  "captured_at": "2026-05-28T21:58:50Z",
  "instance": "erp.arcodange.lab",
  "content_hash": "sha256:6b94cd312d55a693d3c533ae6c9a5abef2734dd5bca8d4b1bdd5ca6ea6fc1f9a",
  "data": {
    "status": { ... GET /status ... },
    "thirdparties": { "list": [ ... GET /thirdparties ... ], "detail": { "1": { ... GET /thirdparties/1 ... }, ... } },
    "invoices": { "list": [ ... ], "detail": { "12": { ... }, ... }, "payments": { "12": [ ... ], ... } },
    "recurring_templates": { "1": { ... GET /invoices/templates/1 ... }, ... },
    "products": [ ... GET /products ... ],
    "bank_accounts": [ ... GET /bankaccounts ... ]
  }
}
```
**`content_hash` is the sha256 of `data` only** — it deliberately excludes `captured_at`, `schema_version`, `instance`, and the hash field itself. So two snapshots taken at different moments but reflecting **identical Dolibarr state** have the same `content_hash`. That's what makes drift detection trivial:
```bash
jq -r .content_hash snap-2026-05.json snap-2026-06.json
# Same hash → no data changed between the two captures.
# Different hash → use `jq` / `diff` to find what moved.
```
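Because the hash covers `data` only, it can be recomputed from any snapshot file to check that the file was not altered after capture. The following is a minimal sketch that mirrors the serialization `snapshot.sh` uses (`json.dumps` with `sort_keys=True` and `ensure_ascii=False`, UTF-8 encoded); the function names `compute_content_hash` and `verify_snapshot` are illustrative and not part of the skill.

```python
import hashlib
import json


def compute_content_hash(data):
    """Hash `data` with sorted keys, no ASCII escaping, UTF-8 encoding."""
    serialized = json.dumps(data, sort_keys=True, ensure_ascii=False).encode("utf-8")
    return "sha256:" + hashlib.sha256(serialized).hexdigest()


def verify_snapshot(snapshot):
    """Return True when the stored content_hash matches the recomputed one."""
    return snapshot.get("content_hash") == compute_content_hash(snapshot.get("data"))


if __name__ == "__main__":
    # Tiny stand-in snapshot (hypothetical data, not real Dolibarr output).
    data = {"status": {"ok": True}, "products": []}
    snap = {
        "captured_at": "2026-05-28T21:58:50Z",
        "content_hash": compute_content_hash(data),
        "data": data,
    }
    print(verify_snapshot(snap))  # hash matches the untouched data
    snap["data"]["products"].append({"id": "1"})
    print(verify_snapshot(snap))  # data changed after hashing, so the check fails
```

For a real file, load it with `json.load(open(path))` and pass the result to `verify_snapshot`.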
## Usage
```bash
./scripts/snapshot.sh # writes ./snapshot-YYYY-MM-DDTHHMMSSZ.json
./scripts/snapshot.sh --out /tmp/baseline.json
./scripts/snapshot.sh --print-only # stdout, no file (pipe-friendly)
./scripts/snapshot.sh --max-template-id 100 # raise the template-probe upper bound
```
Live output from the current Dolibarr instance (summarized in [examples/snapshot-summary.txt](examples/snapshot-summary.txt); the actual JSON, ~246 KB, is too big to commit verbatim):
```
wrote ./snapshot-2026-05-28T215850Z.json (246186 bytes)
sha256:6b94cd312d55a693d3c533ae6c9a5abef2734dd5bca8d4b1bdd5ca6ea6fc1f9a
```
Contents (V1 baseline):
- 10 thirdparties (KissMetrics + 9 others — prospects / suppliers / etc.)
- 5 invoices, all KM (1 avoir + 4 regular)
- 5 per-invoice payment arrays
- 1 recurring template (Kiss Metrics Invoice — frequency=0, see `dolibarr-recurring-templates`)
- 2 products (KM-audit, KM-cloud-devops)
- 3 bank accounts (QONTO, WISE EURO, G.RADUREAU CCA)
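The per-section counts above can be derived from a snapshot file rather than counted by hand. This is a minimal sketch, assuming the layout shown under "What's in the snapshot"; `section_counts` is a hypothetical helper, not a script shipped with the skill.

```python
def section_counts(snapshot):
    """Return {section: item count} for the sections in the snapshot layout."""
    data = snapshot.get("data", {})

    def size(value):
        # Lists and id-keyed dicts both count their entries; anything else counts as 0.
        return len(value) if isinstance(value, (list, dict)) else 0

    thirdparties = data.get("thirdparties", {})
    invoices = data.get("invoices", {})
    return {
        "thirdparties.list": size(thirdparties.get("list")),
        "thirdparties.detail": size(thirdparties.get("detail")),
        "invoices.list": size(invoices.get("list")),
        "invoices.detail": size(invoices.get("detail")),
        "invoices.payments": size(invoices.get("payments")),
        "recurring_templates": size(data.get("recurring_templates")),
        "products": size(data.get("products")),
        "bank_accounts": size(data.get("bank_accounts")),
    }


if __name__ == "__main__":
    # Stand-in snapshot with made-up ids; a real one comes from json.load().
    demo = {
        "data": {
            "thirdparties": {"list": [{"id": "1"}], "detail": {"1": {}}},
            "invoices": {"list": [], "detail": {}, "payments": {}},
            "recurring_templates": {"1": {}},
            "products": [{"id": "1"}, {"id": "2"}],
            "bank_accounts": [],
        }
    }
    for section, count in section_counts(demo).items():
        print(f"{section:<24}{count}")
```

A mismatch between a `list` count and its matching `detail` count would suggest that some per-id fetches failed during capture.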
## What's intentionally excluded
- **PDF attachments.** `/documents/download` returns base64 bodies up to ~MB each. Including them would inflate the snapshot size 10× to 100×. Workflow skills (`dolibarr-invoice-audit`) fetch PDFs on demand.
- **`users/info`** — leaks the `ai_agent` account internals. Out of scope for a read-only state dump.
- **`/setup/modules`** — admin-only, not available to `ai_agent`.
- **Anything that requires writing** (cron triggers, etc.).
- **`/payments` list-all** — returns 501 (see base skill catalogue); we get payment data via per-invoice fetches.
## Use cases
### 1. Cohort-review evidence pack
```bash
./scripts/snapshot.sh --out evidence/dolibarr-2026-05-28.json
# Send the file as proof of the billing state at that moment.
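# The content_hash fingerprints the data: the recipient can recompute it to check integrity.
# It is not a cryptographic signature, so it does not prove who produced the file.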
```
### 2. Drift detection between two dates
```bash
./scripts/snapshot.sh --out snap-may.json
# ... a month passes ...
./scripts/snapshot.sh --out snap-jun.json
jq -r .content_hash snap-may.json snap-jun.json
# Different → something moved. Find what:
diff <(jq -S .data snap-may.json) <(jq -S .data snap-jun.json) | head -50
```
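When the raw `diff` output is too noisy, the two `data` trees can be compared structurally to list only the paths that moved. The sketch below is one possible approach, not part of the skill; `changed_paths` and the `status` field in the demo are hypothetical.

```python
def changed_paths(old, new, prefix="", depth=3):
    """Return dotted paths whose values differ, descending up to `depth` dict levels."""
    if old == new:
        return []
    # Stop descending at non-dict values or when the depth budget is used up.
    if depth == 0 or not (isinstance(old, dict) and isinstance(new, dict)):
        return [prefix or "<root>"]
    paths = []
    for key in sorted(set(old) | set(new)):
        child = f"{prefix}.{key}" if prefix else key
        paths.extend(changed_paths(old.get(key), new.get(key), child, depth - 1))
    return paths


if __name__ == "__main__":
    # Stand-in `data` sections with a made-up field, not real Dolibarr output.
    may = {"invoices": {"list": [], "detail": {"12": {"status": "unpaid"}}}, "products": []}
    jun = {"invoices": {"list": [], "detail": {"12": {"status": "paid"}}}, "products": []}
    print(changed_paths(may, jun))
```

With the default depth of 3, a change inside one invoice is reported as `invoices.detail.<id>`, which is usually enough to know which record to inspect with `jq`.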
### 3. Archive before a known-risky change
Before manually firing the next M-N invoice, regenerating the PDF template, or any UI change with billing consequences:
```bash
./scripts/snapshot.sh --out before-change-$(date -u +%Y%m%d).json
# Make the change ...
./scripts/snapshot.sh --out after-change-$(date -u +%Y%m%d).json
# Diff to confirm only the intended state moved.
```
## Performance
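On the current Arcodange instance (5 invoices, 10 thirdparties, 1 template), the snapshot completes in **~2 seconds**: one HTTP call per top-level resource, one per thirdparty, two per invoice (detail + payments), and one per probed template id. At ~30 thirdparties + ~100 invoices + ~10 templates, that pattern gives roughly 250 calls (5 + 30 + 200 + ~15 template probes); extrapolating from the current per-call latency, expect on the order of 15 s.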
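The call count can be estimated from the fetch pattern in `snapshot.sh`. The sketch below assumes template ids are contiguous starting at 1 and that no request is retried; `estimate_calls` is a hypothetical helper, and its defaults mirror the script's `MAX_TEMPLATE_ID=20` and `EMPTY_TPL_TOLERANCE=5`.

```python
def estimate_calls(thirdparties, invoices, templates,
                   max_template_id=20, empty_tolerance=5):
    """Estimate the HTTP calls one snapshot run makes, ignoring retries."""
    top_level = 5                  # /status, /thirdparties, /invoices, /products, /bankaccounts
    per_thirdparty = thirdparties  # one detail fetch per thirdparty
    per_invoice = 2 * invoices     # detail + payments for every invoice
    # Probing walks ids upward and stops after `empty_tolerance` consecutive
    # misses, or when it reaches `max_template_id`, whichever comes first.
    template_probes = min(max_template_id, templates + empty_tolerance)
    return top_level + per_thirdparty + per_invoice + template_probes


if __name__ == "__main__":
    # Current instance: 10 thirdparties, 5 invoices, 1 template.
    print(estimate_calls(thirdparties=10, invoices=5, templates=1))  # 31
```

Invoices dominate the total because each one costs two calls, so invoice volume is the figure to watch as the instance grows.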
## Out of scope
- **Snapshotting bank statements** (Qonto / Wise CSV exports). Different data source — would be a sibling skill (`arcodange-bank-snapshot` or similar).
- **Snapshotting Kubernetes state** of the ERP deployment. Sibling skill candidate (`arcodange-k8s-snapshot`).
- **Schema migrations / drift in the Dolibarr DB itself.** That requires `dolibarr-postgres-readonly` or similar; out of scope here.


@@ -0,0 +1,24 @@
# dolibarr-data-snapshot - V1 baseline summary
#
# The actual JSON snapshot file is ~246 KB and is intentionally NOT committed.
# This file is the structural digest captured at the same moment.
captured_at: 2026-05-28T22:00:15Z
instance: erp.arcodange.lab
content_hash: sha256:6b94cd312d55a693d3c533ae6c9a5abef2734dd5bca8d4b1bdd5ca6ea6fc1f9a
schema_ver: 1
section                 count
------------------------------------------------
status                  1     (dolibarr_version=22.0.4)
thirdparties.list       10
thirdparties.detail     10
invoices.list           5
invoices.detail         5
invoices.payments       5
recurring_templates     1     (ids: ['1'])
products                2
bank_accounts           3
# Stable content_hash check: run snapshot.sh twice quickly. Both content_hash
# values should be identical when Dolibarr state has not changed between runs.


@@ -0,0 +1,174 @@
#!/usr/bin/env bash
# Snapshot the read-only state of the Arcodange Dolibarr into one JSON file.
#
# Usage:
# snapshot.sh [--out PATH] # default: ./snapshot-YYYY-MM-DDTHHMMSSZ.json
# snapshot.sh --print-only # write to stdout instead of a file
# snapshot.sh --max-template-id N # raise the template-probe upper bound (default 20)
#
# The snapshot is content-addressable: it includes a SHA-256 of the
# serialized `data` section only (computed AFTER stable key-sorting, and
# excluding the capture timestamp) so two snapshots of the same state hash
# identically. Useful for:
# - cohort review evidence packs (hash + send)
# - drift detection between dates (diff two snapshots)
# - archival before a known-risky change
#
# What's included (everything the dolibarr-* family reads):
# - status (Dolibarr version)
# - thirdparties (full list + detail)
# - invoices (full list + per-invoice detail + per-invoice payments)
# - recurring invoice templates (probed 1..MAX_TEMPLATE_ID)
# - products
# - bank accounts
#
# Excluded by design:
# - PDF attachments (binary, would bloat the snapshot ~50KB each)
# - users/info (would leak ai_agent details)
# - any non-read endpoints
#
# Requires: curl, jq, python3 (with hashlib — standard lib).
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
DOL_CURL="${SCRIPT_DIR}/../../dolibarr/scripts/dol-curl.sh"
MAX_TEMPLATE_ID=20
EMPTY_TPL_TOLERANCE=5
OUT=""
PRINT_ONLY=0
while [[ $# -gt 0 ]]; do
  case "$1" in
    --out) OUT="$2"; shift 2 ;;
    --print-only) PRINT_ONLY=1; shift ;;
    --max-template-id) MAX_TEMPLATE_ID="$2"; shift 2 ;;
    -h|--help) sed -n '2,20p' "$0" | sed 's/^# \{0,1\}//'; exit 0 ;;
    *) echo "snapshot.sh: unknown arg: $1" >&2; exit 2 ;;
  esac
done
WORK="$(mktemp -d -t dolsnap.XXXXXX)"
trap 'rm -rf "${WORK}"' EXIT
fetch_into() {
  local out_file="$1" path="$2"
  "${DOL_CURL}" "${path}" > "${out_file}" 2>/dev/null || {
    # On HTTP error, dol-curl prints body+stderr; capture body for record.
    "${DOL_CURL}" "${path}" > "${out_file}" 2>&1 || true
  }
}
# 1. Liveness + status
fetch_into "${WORK}/status.json" /status
# 2. Thirdparties (list + detail)
fetch_into "${WORK}/tps_list.json" /thirdparties
TP_IDS=$(python3 -c "
import json,sys
try: d = json.load(open(sys.argv[1]))
except: d = []
if isinstance(d, list): print(' '.join(str(t['id']) for t in d if t.get('id')))
" "${WORK}/tps_list.json")
mkdir -p "${WORK}/tps"
for id in ${TP_IDS}; do fetch_into "${WORK}/tps/${id}.json" "/thirdparties/${id}"; done
# 3. Invoices (list + detail + payments)
fetch_into "${WORK}/inv_list.json" '/invoices?limit=500&sortfield=t.datef&sortorder=ASC'
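INV_IDS=$(python3 -c "
import json,sys
try: d = json.load(open(sys.argv[1]))
except Exception: d = []
# Guard against a non-list error body (e.g. a JSON object), as done for thirdparties.
if isinstance(d, list): print(' '.join(str(i['id']) for i in d if i.get('id')))
" "${WORK}/inv_list.json")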
mkdir -p "${WORK}/inv" "${WORK}/pay"
for id in ${INV_IDS}; do
  fetch_into "${WORK}/inv/${id}.json" "/invoices/${id}"
  fetch_into "${WORK}/pay/${id}.json" "/invoices/${id}/payments"
done
# 4. Recurring templates (probe)
mkdir -p "${WORK}/tpl"
CONSECUTIVE_EMPTY=0
for tid in $(seq 1 "${MAX_TEMPLATE_ID}"); do
  fetch_into "${WORK}/tpl/${tid}.json" "/invoices/templates/${tid}"
  REAL=$(python3 -c "import json,sys
try: d=json.load(open(sys.argv[1])); print('1' if d.get('id') else '0')
except: print('0')" "${WORK}/tpl/${tid}.json")
  if [[ "${REAL}" == "1" ]]; then
    CONSECUTIVE_EMPTY=0
  else
    CONSECUTIVE_EMPTY=$((CONSECUTIVE_EMPTY+1))
    rm "${WORK}/tpl/${tid}.json"
    [[ ${CONSECUTIVE_EMPTY} -ge ${EMPTY_TPL_TOLERANCE} ]] && break
  fi
done
# 5. Products + bank accounts
fetch_into "${WORK}/products.json" /products
fetch_into "${WORK}/bankaccounts.json" /bankaccounts
# 6. Compose the snapshot
python3 - "${WORK}" <<'PY' > "${WORK}/snapshot.json"
import json, os, sys, datetime, hashlib
work = sys.argv[1]
def load(path, default):
    try: return json.load(open(path))
    except (FileNotFoundError, json.JSONDecodeError): return default
def load_dir(dirname):
    out = {}
    full = os.path.join(work, dirname)
    if not os.path.isdir(full): return out
    for fn in sorted(os.listdir(full)):
        if not fn.endswith(".json"): continue
        key = fn[:-len(".json")]
        out[key] = load(os.path.join(full, fn), None)
    return out
data = {
    "status": load(os.path.join(work, "status.json"), {}),
    "thirdparties": {
        "list": load(os.path.join(work, "tps_list.json"), []),
        "detail": load_dir("tps"),
    },
    "invoices": {
        "list": load(os.path.join(work, "inv_list.json"), []),
        "detail": load_dir("inv"),
        "payments": load_dir("pay"),
    },
    "recurring_templates": load_dir("tpl"),
    "products": load(os.path.join(work, "products.json"), []),
    "bank_accounts": load(os.path.join(work, "bankaccounts.json"), []),
}
# content_hash is the sha256 of `data` only — excludes timestamp + metadata,
# so two snapshots of identical Dolibarr state hash identically.
# (Drift detection is then: compare content_hash, done.)
content_serialized = json.dumps(data, sort_keys=True, ensure_ascii=False).encode("utf-8")
content_hash = "sha256:" + hashlib.sha256(content_serialized).hexdigest()
payload = {
    "schema_version": "1",
    "captured_at": datetime.datetime.now(datetime.timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
    "instance": "erp.arcodange.lab",
    "content_hash": content_hash,
    "data": data,
}
print(json.dumps(payload, indent=2, ensure_ascii=False, sort_keys=True))
PY
# 7. Output
if [[ "${PRINT_ONLY}" == "1" ]]; then
  cat "${WORK}/snapshot.json"
else
  if [[ -z "${OUT}" ]]; then
    OUT="./snapshot-$(date -u +%Y-%m-%dT%H%M%SZ).json"
  fi
  cp "${WORK}/snapshot.json" "${OUT}"
  # wc -c is portable. With GNU stat, `stat -f %z FILE` prints filesystem info
  # to stdout before failing, which would pollute SIZE ahead of the fallback.
  SIZE=$(wc -c < "${OUT}" | tr -d '[:space:]')
  HASH=$(python3 -c "import json,sys; print(json.load(open(sys.argv[1]))['content_hash'])" "${OUT}")
  echo "wrote ${OUT} (${SIZE} bytes)"
  echo " ${HASH}"
fi