arcodange-bank-reco: add known-patterns.json catalog + bank-match annotation

V6.1 follow-up to the bank-reco V6 ship. Splits the BANK-ONLY bucket into
"known patterns" (intentional gaps, documented and classified) vs
"unknown" (real action items).

What the catalog covers today:
- FOUREZ Quentin → capital_deposit (initial capital contribution of
  1000 €; notary FOUREZ centralizes the deposit). Maps to Dolibarr
  account 1013.
- URSSAF → social_charges (account 645100)
- MISTRAL.AI, CLAUDE.AI → ai_subscription (account 6262)
- Wise *Plan, qonto_fee → bank_fee (account 627)
- BALANCE_DEPOSIT / FEATURE_CHARGE on Wise → internal_topup (self-funding
  pair, often nets to zero)

Effect on the V6 baseline (Jan-May 2026):
- Before catalog: 8 BANK-ONLY mixed entries (noise + signal)
- After catalog:  7 known + 1 UNKNOWN (the +2147 € KM Wise payment of
  2026-05-29, which genuinely needs a Dolibarr entry)

The catalog is JSON rather than YAML, so the script stays stdlib-only
with no extra dependency. The schema is documented in SKILL.md. Each
entry's pattern is a case-insensitive regex searched against the bank
label and the operation type together. Optional filters: bank, side,
amount_min, amount_max.
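
As a rough illustration of the matching rules above, here is a minimal
standalone sketch. The catalog entries and the two sample movements are
hypothetical; only the key names (pattern, bank, side, amount_min,
amount_max, classification, note) and the filter order follow the script
in this commit. SKILL.md remains the reference for the schema.

```python
import re

# Hypothetical catalog contents, shaped like known-patterns.json.
CATALOG = {
    "patterns": [
        {"pattern": r"urssaf", "classification": "social_charges",
         "note": "account 645100"},
        {"pattern": r"qonto_fee", "bank": "qonto", "side": "debit",
         "classification": "bank_fee", "note": "account 627"},
    ]
}

def match_pattern(mov, patterns):
    """Return the first catalog entry matching the movement, else None."""
    # Label and operation type are searched together, as in the script.
    haystack = (mov.get("label") or "") + " | " + (mov.get("op") or "")
    side = "credit" if mov["sign"] == "+" else "debit"
    for pat in patterns:
        if pat.get("bank") and pat["bank"] != mov["bank"].lower():
            continue
        if pat.get("side") and pat["side"] != side:
            continue
        amin, amax = pat.get("amount_min"), pat.get("amount_max")
        if amin is not None and mov["amount"] < amin:
            continue
        if amax is not None and mov["amount"] > amax:
            continue
        if re.search(pat["pattern"], haystack, re.IGNORECASE):
            return pat
    return None

# Hypothetical movements, for illustration only.
fee = {"bank": "Qonto", "sign": "-", "amount": 9.0,
       "op": "qonto_fee", "label": "Monthly plan"}
other = {"bank": "Wise", "sign": "+", "amount": 2147.0,
         "op": "TRANSFER", "label": "KM"}

print(match_pattern(fee, CATALOG["patterns"])["classification"])
print(match_pattern(other, CATALOG["patterns"]))
```

The first matching entry wins, so more specific patterns would need to
come before broader ones in the file.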

Exit code now reflects only the UNKNOWN bank-only count plus the
dolibarr-only count, so intentional gaps no longer make the verdict
noisy.
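
A small sketch of that verdict rule, assuming each bank-only movement
carries a "known" key as set by the annotation step. The helper name
exit_code and the sample data are hypothetical.

```python
def exit_code(bank_only, dol_only):
    """Return 0 when nothing needs attention, 1 otherwise."""
    # Movements annotated with a catalog entry are intentional gaps.
    bank_unknown = [m for m in bank_only if not m.get("known")]
    fails = len(bank_unknown) + len(dol_only)
    return 0 if fails == 0 else 1

# Hypothetical data: one catalogued movement, then one uncatalogued.
known_only = [{"label": "URSSAF", "known": {"classification": "social_charges"}}]
with_unknown = known_only + [{"label": "KM", "known": None}]

print(exit_code(known_only, []))
print(exit_code(with_unknown, []))
```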

Edit known-patterns.json as new recurring patterns emerge.
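
Before committing a catalog edit, a quick sanity check along these lines
may help. This is a hypothetical helper, not part of the commit: it only
checks what the matching code relies on (a compilable regex, a side of
"credit" or "debit", and a lowercase bank, since the script compares
against the lowercased bank name).

```python
import json
import re

def validate_catalog(text):
    """Return a list of problems found in the catalog JSON (empty if none)."""
    problems = []
    data = json.loads(text)
    for i, pat in enumerate(data.get("patterns", [])):
        if "pattern" not in pat:
            problems.append(f"entry {i}: missing 'pattern'")
            continue
        try:
            re.compile(pat["pattern"], re.IGNORECASE)
        except re.error as e:
            problems.append(f"entry {i}: invalid regex: {e}")
        if pat.get("side") not in (None, "credit", "debit"):
            problems.append(f"entry {i}: side must be 'credit' or 'debit'")
        # An uppercase bank filter would never match a lowercased bank name.
        if pat.get("bank") and pat["bank"] != pat["bank"].lower():
            problems.append(f"entry {i}: bank must be lowercase")
    return problems

good = '{"patterns": [{"pattern": "urssaf", "side": "debit", "bank": "qonto"}]}'
bad = '{"patterns": [{"pattern": "(", "side": "both", "bank": "Wise"}]}'
print(validate_catalog(good))
print(len(validate_catalog(bad)))
```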

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-31 14:06:46 +02:00
parent f398003eae
commit 4b6a5f7529
4 changed files with 170 additions and 16 deletions


@@ -79,9 +79,10 @@ for id in $(python3 -c "import json,sys; print(' '.join(str(r['id']) for r in js
 done
 # --- 4. Match in python ---
-python3 - "${WORK}" "${SINCE}" "${UNTIL}" "${WINDOW}" "${INCLUDE_FEES}" <<'PY'
+PATTERNS_FILE="${SCRIPT_DIR}/../known-patterns.json"
+python3 - "${WORK}" "${SINCE}" "${UNTIL}" "${WINDOW}" "${INCLUDE_FEES}" "${PATTERNS_FILE}" <<'PY'
 import json, sys, os, re, datetime, collections
-work, since, until, window_days, include_fees = sys.argv[1:6]
+work, since, until, window_days, include_fees, patterns_file = sys.argv[1:7]
 window = int(window_days); include_fees = include_fees == "1"
 since_d = datetime.date.fromisoformat(since); until_d = datetime.date.fromisoformat(until)
@@ -156,6 +157,32 @@ for m in [x for x in bank_movs if not x["matched_internal"]]:
p = candidates[0]
m["matched_dol"] = p; p["matched_bank"] = m
+# 4f. Annotate non-matched movements with known-patterns catalog
+patterns = []
+if os.path.isfile(patterns_file):
+    try: patterns = json.load(open(patterns_file)).get("patterns", [])
+    except Exception as e: print(f" /!\\ failed to load {patterns_file}: {e}", file=sys.stderr)
+def match_pattern(mov):
+    # Match against both the bank label AND the operation type — different
+    # banks surface useful info in different fields (Qonto puts the operation
+    # type in `op`, e.g. "qonto_fee"; Wise puts the activity type in `op`,
+    # e.g. "BALANCE_DEPOSIT", and the human title in `label`).
+    haystack = (mov.get("label") or "") + " | " + (mov.get("op") or "")
+    for pat in patterns:
+        if pat.get("bank") and pat["bank"] != mov["bank"].lower(): continue
+        if pat.get("side") and pat["side"] != ("credit" if mov["sign"]=="+" else "debit"): continue
+        amin = pat.get("amount_min"); amax = pat.get("amount_max")
+        if amin is not None and mov["amount"] < amin: continue
+        if amax is not None and mov["amount"] > amax: continue
+        if re.search(pat["pattern"], haystack, re.IGNORECASE):
+            return pat
+    return None
+for m in bank_movs:
+    if m["matched_dol"] or m["matched_internal"]: continue
+    m["known"] = match_pattern(m)
 # --- 5. Render ---
 def fmt_bank(m):
     return f" {m['bank']:<5} {m['date']} {m['sign']:<2}{m['amount']:>9.2f} {m['op'][:18]:<18} {m['label']}"
@@ -180,8 +207,19 @@ for m in sorted(internal, key=lambda m: m["date"]):
print(fmt_bank(m) + f" ↔ {other['bank']} {other['date']} {other['sign']}{other['amount']:.2f}")
print()
-print(f"=== BANK-ONLY ({len(bank_only)} bank movements without Dolibarr counterpart) ===")
-for m in sorted(bank_only, key=lambda m: m["date"]):
+bank_known = [m for m in bank_only if m.get("known")]
+bank_unknown = [m for m in bank_only if not m.get("known")]
+print(f"=== BANK-ONLY — known patterns ({len(bank_known)}, intentional gaps documented in known-patterns.json) ===")
+for m in sorted(bank_known, key=lambda m: m["date"]):
+    k = m["known"]
+    cls = k.get("classification","?")
+    print(fmt_bank(m) + f" [{cls}]")
+    print(f" └─ {k.get('note','')}")
+print()
+print(f"=== BANK-ONLY — unknown ({len(bank_unknown)}, NEEDS attention: missing supplier invoice / unrecorded payment / new pattern) ===")
+for m in sorted(bank_unknown, key=lambda m: m["date"]):
     print(fmt_bank(m))
print()
@@ -190,9 +228,10 @@ for p in sorted(dol_only, key=lambda p: p["date"]):
print(f" {p['side']:<8} {p['date']} {p['amount']:>9.2f} {p['ref']} (fk_account={p['fk_account']})")
print()
-# Verdict
-fails = len(bank_only) + len(dol_only)
+# Verdict: only UNKNOWN bank-only and dolibarr-only count as "needs attention"
+fails = len(bank_unknown) + len(dol_only)
 print("-" * 80)
-print(f"# {len(matched)} matched, {len(internal)} internal, {len(bank_only)} bank-only, {len(dol_only)} dolibarr-only")
+print(f"# {len(matched)} matched, {len(internal)} internal, {len(bank_known)} bank-known, {len(bank_unknown)} bank-UNKNOWN, {len(dol_only)} dolibarr-only")
+print(f"# patterns loaded from {patterns_file}: {len(patterns)} pattern(s)")
sys.exit(0 if fails == 0 else 1)
PY