add arcodange-email-ingest — Zoho Mail → Dolibarr supplier-invoice drafts
V8 — first inbound-side skill. Closes the loop from "bill arrives by email"
to "ready to enter in Dolibarr UI". Read-only at every layer.
What ships:
- arcodange-email-ingest/scripts/zoho-curl.sh
    OAuth wrapper with token cache (50 min TTL, mode 600); avoids hitting
    the Zoho OAuth rate limit on every invocation.
- arcodange-email-ingest/scripts/email-list.sh
    List candidates in /Inbox/books (where the books@ alias auto-routes
    mail). --candidates-only filters on supplier patterns or attachments;
    --all-folders scans everything.
- arcodange-email-ingest/scripts/email-inspect.sh
    Pull message + attachments, run pdftotext on each PDF, heuristically
    extract supplier, ref, dates, totals and VAT rate, then emit a Dolibarr
    supplier-invoice draft JSON.
Architecture choice — Zoho API (not IMAP):
- books@arcodange.fr is an alias of gabrielradureau@arcodange.fr → one OAuth
refresh_token covers everything.
- Gmail folded in via forwarding (arcodange@gmail.com → books@) — no Google
API setup, no app-passwords, no second OAuth flow.
- Token-based auth, no SCA rabbit hole.
V8.0 baseline (in /Inbox/books):
- 3 candidates: Mistral AI invoice (facture), Anthropic Stripe receipt (Fwd Gmail),
INPI payment receipt (Fwd Gmail).
- Heuristic extraction is best-effort: works on amounts/refs for some
templates, misses others (Mistral PDF format, Stripe receipt layout).
- --save-pdf <DIR> lets the operator grab the PDFs for manual entry when
the heuristic falls short.
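As an illustration of the kind of heuristic involved (not the actual logic of
email-inspect.sh, which is not shown here), a minimal sketch that pulls a total
and a VAT rate out of pdftotext output could look like the following. The
function names and regexes are assumptions, and the sketch only handles simple
layouts (for example, no thousands separators).

```shell
#!/usr/bin/env bash
# Hypothetical helpers; names and patterns are illustrative assumptions,
# not code taken from email-inspect.sh.

# Print the first two-decimal amount found on a line mentioning "total",
# normalised to a dot decimal separator. Reads text on stdin.
extract_total() {
  grep -iE 'total' | grep -oE '[0-9]+[.,][0-9]{2}' | head -n 1 | tr ',' '.'
}

# Print the first percentage found on a line mentioning TVA or VAT,
# without the percent sign. Reads text on stdin.
extract_vat_rate() {
  grep -iE 'tva|vat' | grep -oE '[0-9]+([.,][0-9]+)? ?%' | head -n 1 | tr -d ' %' | tr ',' '.'
}
```

Typical use would be piping extracted text in, e.g. `pdftotext invoice.pdf - | extract_total`.
Templates that defeat such patterns are the case --save-pdf is meant for.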
Rate-limit pitfall documented: Zoho OAuth refresh has an aggressive throttle
("too many requests continuously"). The cache file at $TMPDIR/zoho-access-$USER
(mode 600, 50 min TTL) prevents this; on 401 the wrapper auto-refreshes once
and retries.
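The cache-then-retry behaviour described above can be sketched in isolation as
follows. This is a simplified illustration under assumptions, not the wrapper
itself: file_mtime, cache_is_fresh and call_with_retry are hypothetical names,
and the retried command is assumed to print an HTTP status code on stdout.

```shell
#!/usr/bin/env bash
# Sketch only: these helper names are illustrative and do not exist in
# zoho-curl.sh.

# Print a file's mtime as epoch seconds. GNU stat is tried first because
# `stat -f` means "filesystem status" on GNU coreutils; BSD/macOS stat
# rejects -c and falls through to the second form.
file_mtime() {
  stat -c %Y "$1" 2>/dev/null || stat -f %m "$1"
}

# Return 0 when the file exists and is younger than the TTL (in seconds).
cache_is_fresh() {
  local file="$1" ttl="$2" age
  [[ -f "${file}" ]] || return 1
  age=$(( $(date +%s) - $(file_mtime "${file}") ))
  [[ ${age} -lt ${ttl} ]]
}

# Run a command that prints an HTTP status code. On 401, delete the cached
# token so the next lookup performs a real refresh, then run it once more.
call_with_retry() {
  local cache_file="$1"; shift
  local code
  code="$("$@")"
  if [[ "${code}" == "401" ]]; then
    rm -f "${cache_file}"
    code="$("$@")"
  fi
  printf '%s' "${code}"
}
```

Deleting the cache before the second attempt matters: a token younger than the
TTL would otherwise be served from the cache again and rejected a second time.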
V8.1+ ideas (listed as out of scope in SKILL.md):
- mark ingested emails (IMAP flag or Zoho label)
- body text extraction (inline-HTML invoices)
- per-template parsers or LLM-based extraction
- IMAP fallback for non-Zoho mailboxes
CLI: bin/arcodange email {list|inspect|curl} integrated.
Base updates: dolibarr/SKILL.md cross-link, dolibarr/README.md env schema
extended with ZOHO_CLIENT_ID/SECRET/REFRESH_TOKEN/DC.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
126
.claude/skills/arcodange-email-ingest/scripts/zoho-curl.sh
Executable file
@@ -0,0 +1,126 @@
#!/usr/bin/env bash
# Read-only curl wrapper for the Zoho Mail API.
#
# Usage:
#   zoho-curl.sh <path>                 # e.g. zoho-curl.sh /accounts
#   zoho-curl.sh -i <path>              # include curl's -i (response headers)
#   zoho-curl.sh -o file.json <path>    # write body to file
#
# Reads credentials from ../../dolibarr/.env (the shared canonical file).
# Required vars:
#   ZOHO_CLIENT_ID, ZOHO_CLIENT_SECRET, ZOHO_REFRESH_TOKEN, ZOHO_DC (defaults to eu)
#
# Token strategy: the access_token obtained from the refresh_token is cached
# on disk (mode 600) and reused for 50 minutes (Zoho access_tokens live 1h),
# so most invocations never touch the rate-limited OAuth endpoint. On 401
# from the mail API we request a token again and retry once.
#
# Exits non-zero on HTTP >= 400 and writes body to stdout + a short message
# to stderr, same shape as dol-curl.sh / bank-curl.sh.

set -euo pipefail

SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
ENV_FILE="${SCRIPT_DIR}/../../dolibarr/.env"

if [[ ! -f "${ENV_FILE}" ]]; then
  echo "zoho-curl.sh: missing ${ENV_FILE}" >&2
  echo " Required vars: ZOHO_CLIENT_ID, ZOHO_CLIENT_SECRET, ZOHO_REFRESH_TOKEN, ZOHO_DC." >&2
  echo " See arcodange-email-ingest/SKILL.md for the OAuth setup." >&2
  exit 2
fi
set -a; source "${ENV_FILE}"; set +a

: "${ZOHO_CLIENT_ID:?zoho-curl.sh: ZOHO_CLIENT_ID not set in .env}"
: "${ZOHO_CLIENT_SECRET:?zoho-curl.sh: ZOHO_CLIENT_SECRET not set in .env}"
: "${ZOHO_REFRESH_TOKEN:?zoho-curl.sh: ZOHO_REFRESH_TOKEN not set in .env}"
: "${ZOHO_DC:=eu}"

ACCOUNTS_BASE="https://accounts.zoho.${ZOHO_DC}"
MAIL_BASE="https://mail.zoho.${ZOHO_DC}/api"

# Parse pass-through curl args (everything before the last positional)
PASSTHRU=()
while [[ $# -gt 1 ]]; do
  PASSTHRU+=("$1"); shift
done
if [[ $# -lt 1 ]]; then
  echo "zoho-curl.sh: missing API path. Example: zoho-curl.sh /accounts" >&2
  exit 2
fi
API_PATH="$1"

# Cache access_token in the temp directory to avoid hitting OAuth rate limits
# on every zoho-curl invocation. Zoho access_tokens live 1h; we refresh after
# 50 min.
CACHE_FILE="${TMPDIR:-/tmp}/zoho-access-$(whoami)"
CACHE_TTL_SECONDS=$((50 * 60))

get_access_token() {
  if [[ -f "${CACHE_FILE}" ]]; then
    local age
    # GNU stat first: on Linux `stat -f` means "filesystem status" and would
    # print unrelated output; BSD/macOS stat rejects -c and falls through.
    age=$(( $(date +%s) - $(stat -c %Y "${CACHE_FILE}" 2>/dev/null || stat -f %m "${CACHE_FILE}") ))
    if [[ ${age} -lt ${CACHE_TTL_SECONDS} ]]; then
      cat "${CACHE_FILE}"
      return 0
    fi
  fi
  local token
  if ! token=$(curl -sS -X POST "${ACCOUNTS_BASE}/oauth/v2/token" \
      --max-time 15 \
      -d "grant_type=refresh_token" \
      -d "client_id=${ZOHO_CLIENT_ID}" \
      -d "client_secret=${ZOHO_CLIENT_SECRET}" \
      -d "refresh_token=${ZOHO_REFRESH_TOKEN}" \
      | python3 -c "
import json, sys
try: d = json.load(sys.stdin)
except Exception: sys.exit('failed to parse OAuth response')
if 'access_token' not in d:
    sys.exit(f'OAuth refresh failed: {d}')
print(d['access_token'])"); then
    return 1
  fi
  if [[ -z "${token}" ]]; then
    return 1
  fi
  # Store cache only on success; the umask keeps the file private from the
  # moment it is created, and chmod covers a pre-existing file.
  ( umask 077; printf '%s' "${token}" > "${CACHE_FILE}" )
  chmod 600 "${CACHE_FILE}"
  printf '%s' "${token}"
}

do_call() {
  local token="$1"
  local body_file="$2"
  local headers_file="$3"
  curl -sS \
    -H "Authorization: Zoho-oauthtoken ${token}" \
    -H "Accept: application/json" \
    --max-time 30 \
    -o "${body_file}" \
    -D "${headers_file}" \
    -w "%{http_code}" \
    ${PASSTHRU[@]+"${PASSTHRU[@]}"} \
    "${MAIL_BASE}${API_PATH}"
}

ACCESS_TOKEN=$(get_access_token)
[[ -z "${ACCESS_TOKEN}" ]] && { echo "zoho-curl.sh: empty access_token" >&2; exit 1; }

BODY_FILE="$(mktemp -t zohocurl.XXXXXX)"
HEADERS_FILE="$(mktemp -t zohohdr.XXXXXX)"
trap 'rm -f "${BODY_FILE}" "${HEADERS_FILE}"' EXIT

HTTP_CODE=$(do_call "${ACCESS_TOKEN}" "${BODY_FILE}" "${HEADERS_FILE}")

# Retry once on 401 with a fresh token. The cached token is dropped first,
# otherwise get_access_token would hand back the same rejected value.
if [[ "${HTTP_CODE}" == "401" ]]; then
  rm -f "${CACHE_FILE}"
  ACCESS_TOKEN=$(get_access_token)
  HTTP_CODE=$(do_call "${ACCESS_TOKEN}" "${BODY_FILE}" "${HEADERS_FILE}")
fi

cat "${BODY_FILE}"
if [[ "${HTTP_CODE}" -ge 400 ]]; then
  echo "zoho-curl.sh: HTTP ${HTTP_CODE} on ${API_PATH}" >&2
  exit 1
fi