arcodange-email-ingest V8.1: filter calendar invites + newsletter senders #10

Merged
arcodange merged 1 commit from claude/arcodange-email-ingest-v81 into main 2026-05-31 15:18:59 +02:00

Summary

V8.1 filters out the noise that V8.0's --candidates-only let through: calendar invites (Google Calendar subjects such as Invitation: and Updated invitation:, which all carry an .ics attachment, so the hasAttachment heuristic over-matches) and newsletter blast traffic (senders matching updates.<domain>, news@, newsletter@).
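
As a rough illustration of the two exclusions, here is a minimal POSIX shell sketch. The prefixes and sender patterns are the ones listed in the commit message. The helper names (`strip_prefixes`, `is_excluded_subject`, `is_excluded_sender`), the exact regexes, the case handling, and the assumption that the sender is a bare address are hypothetical. The real `email-list.sh` may be structured quite differently.

```sh
#!/bin/sh
# Hypothetical sketch, not the actual email-list.sh implementation.

# Subject prefixes from the commit message (assumed to be matched case-sensitively).
EXCLUDE_PATTERN='^(Invitation|Updated invitation|Canceled event|Accepted|Declined|Tentative|Maybe):'

# Sender patterns from the commit message, assuming a bare address such as a@b.c.
EXCLUDE_SENDER='(@updates\.|^noreply@.*calendar|^news@|^newsletter@)'

# Remove any leading Re:/Fwd:/Tr: prefixes, repeated, in any letter case.
strip_prefixes() {
  printf '%s\n' "$1" | sed -E \
    -e ':a' \
    -e 's/^[[:space:]]*([Rr][Ee]|[Ff][Ww][Dd]|[Tt][Rr])[[:space:]]*:[[:space:]]*//' \
    -e 'ta'
}

# Exit status 0 when the subject looks like a calendar notification.
is_excluded_subject() {
  stripped=$(strip_prefixes "$1")
  printf '%s\n' "$stripped" | grep -Eq "$EXCLUDE_PATTERN"
}

# Exit status 0 when the sender looks like newsletter or calendar traffic.
is_excluded_sender() {
  printf '%s\n' "$1" | grep -Eiq "$EXCLUDE_SENDER"
}
```

Under these assumptions, a message would be dropped before the attachment heuristic runs whenever either function succeeds, for example `is_excluded_subject "$subject" || is_excluded_sender "$sender"`.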

Effect on baseline

| | V8.0 | V8.1 |
|---|---|---|
| `--all-folders --candidates-only` total | 27 (mixed signal + noise) | 12 (actionable) |
| Calendar invites in output | 10 entries | 0 |
| `staying-ahead.ai` newsletters | 6 entries | 0 |
| Real supplier docs retained | yes | yes (Mistral, Anthropic, Darnis F1042, 3× Free Mobile, INPI…) |

--mark-ingested deferred to V8.2

Originally V8.1 was meant to ship this too. The Zoho flag-set endpoint (PUT /api/accounts/{aid}/updatemessage) requires the OAuth scope ZohoMail.messages.UPDATE, which the current refresh_token lacks (it carries READ-only scopes). The fix is for the user to regenerate the refresh_token via the Zoho Self-Client with the extra scope. V8.2 then adds a one-line --mark-ingested flag to email-inspect.sh and a filter in is_candidate() that skips messages whose flagid == flag_info. Documented in SKILL.md.
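
The planned skip in `is_candidate()` might look roughly like the sketch below. The argument order, the `"1"`/`"0"` encoding of `hasAttachment`, and the reduction of the candidate test to the attachment check alone are assumptions made for illustration. Only the `flagid == flag_info` comparison comes from the description above.

```sh
#!/bin/sh
# Hypothetical V8.2 shape of is_candidate(); the real function may take other inputs.
#   $1 = hasAttachment, assumed to be "1" or "0"
#   $2 = flagid as reported for the message
is_candidate() {
  has_attachment=$1
  flagid=$2

  # Messages already flagged as ingested are never candidates.
  if [ "$flagid" = "flag_info" ]; then
    return 1
  fi

  # Otherwise fall back to the attachment heuristic.
  [ "$has_attachment" = "1" ]
}
```

With this ordering, a message flagged by `--mark-ingested` would disappear from `--candidates-only` output on the next run even though it still carries an attachment.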

Test plan

  • bin/arcodange email list --candidates-only → still shows 3 candidates in /Inbox/books
  • bin/arcodange email list --all-folders --candidates-only --limit 50 → ~12 entries, no Invitation: subjects, no updates.staying-ahead.ai senders
  • bin/arcodange email inspect 1775141901205014300 → still works unchanged
  • git diff --cached | grep -F <ZOHO_REFRESH_TOKEN> → no output (verified pre-commit)
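
The last check could be wrapped in a small guard, for instance inside a pre-commit hook. The sketch below is one possible shape. The function name `assert_no_secret` and the idea of reading the token from a `ZOHO_REFRESH_TOKEN` environment variable are assumptions, not something this PR ships.

```sh
#!/bin/sh
# Hypothetical guard: fails when the text on stdin contains the given secret.
# It reads stdin so it can be exercised without a Git repository.
assert_no_secret() {
  secret=$1

  # An empty pattern would make grep -F match every line, so refuse it.
  if [ -z "$secret" ]; then
    echo "assert_no_secret: empty secret, nothing to check" >&2
    return 2
  fi

  # -F treats the secret as a fixed string; -- protects secrets starting with "-".
  if grep -qF -- "$secret"; then
    echo "refusing to commit: secret found in staged changes" >&2
    return 1
  fi
}

# Assumed usage, with the token exported in the environment:
#   git diff --cached | assert_no_secret "$ZOHO_REFRESH_TOKEN"
```

Rejecting an empty secret matters because an unset variable would otherwise make the check fail on every non-empty diff.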
arcodange added 1 commit 2026-05-31 15:18:50 +02:00
email-list.sh gains two hard-exclusion filters (applied before the
candidate test, regardless of attachments):

- EXCLUDE_PATTERN matches subjects starting with Invitation: / Updated
  invitation: / Canceled event: / Accepted: / Declined: / Tentative: /
  Maybe: (after stripping Re:/Fwd:/Tr: prefixes). Filters Google Calendar
  events that always carry an .ics attachment.
- EXCLUDE_SENDER matches updates.<domain>, noreply@*calendar, news@,
  newsletter@. Filters newsletter blast traffic.

Effect on --all-folders --candidates-only baseline: 27 noisy → 12
actionable (calendar invites + the staying-ahead.ai newsletter blast
removed). Real supplier docs intact: Darnis F1042 in /Notification, 3 Free
Mobile factures in /Inbox/abonnements, Mistral + Anthropic in /Inbox/books.

The originally-planned --mark-ingested feature is deferred to V8.2:
flag-set requires the Zoho OAuth scope ZohoMail.messages.UPDATE which our
read-only refresh_token doesn't have. Documented in SKILL.md: once the
user opts in to the wider scope, --mark-ingested becomes a one-line flag
on email-inspect.sh and is_candidate() learns to skip flag_info messages.

Captured the new --all-folders baseline at examples/email-list-all-folders.txt.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
arcodange merged commit 444886b91a into main 2026-05-31 15:18:59 +02:00
arcodange deleted branch claude/arcodange-email-ingest-v81 2026-05-31 15:19:00 +02:00

Reference: arcodange-org/erp#10