Phase 2b — durable Postgres queue + worker (gated on DATABASE_URL)

Adds the async dispatch infrastructure:

- Postgres pool + embedded migration (CREATE TABLE/INDEX IF NOT EXISTS
  gateway_jobs). Auto-applied at boot. lib/pq driver (matches webapp
  convention).
- queue.go: Enqueue (idempotent on UNIQUE(bot_slug, update_id), so a
  Telegram redelivery is a no-op), Pop with FOR UPDATE SKIP LOCKED,
  MarkDone, and MarkFailed with exponential backoff
  (30s → 2m → 10m → 1h → dead at attempt 5).
- worker.go: goroutine that drains the queue, dispatches via the same
  Handler interface as sync, schedules retries on failure, and notifies
  the user once when a job goes to dead.
- BotConfig gains `async: bool`. Registry refuses bots with async=true
  if DATABASE_URL is unset (queue=nil).
- Server: when bot.Async is set, the webhook is acked immediately and the
  update payload is enqueued for the worker.

When DATABASE_URL is unset (current default), queue/worker stay disabled
and only sync handlers (echo, http, auth) work — no breaking change to
the running cluster.

Refs ~/.claude/plans/pour-les-notifications-on-inherited-seal.md § Phase 2.
2026-05-09 14:38:41 +02:00
parent f90d5efdae
commit 799e10dcc2
11 changed files with 445 additions and 21 deletions


@@ -13,10 +13,11 @@ type Server struct {
 	auth *Auth
 	allowlist Allowlist
 	tg *TelegramClient
+	queue Queue
 }
 
-func NewServer(r *Registry, auth *Auth, allow Allowlist, tg *TelegramClient) *Server {
-	return &Server{registry: r, auth: auth, allowlist: allow, tg: tg}
+func NewServer(r *Registry, auth *Auth, allow Allowlist, tg *TelegramClient, queue Queue) *Server {
+	return &Server{registry: r, auth: auth, allowlist: allow, tg: tg, queue: queue}
 }
 
 func (s *Server) Routes() http.Handler {
@@ -103,6 +104,32 @@ func (s *Server) botWebhook(w http.ResponseWriter, r *http.Request) {
 		}
 	}
 
+	// Async dispatch (Phase 2b): enqueue + ack 200 immediately. The worker
+	// drains the queue and runs the handler asynchronously. Use this for
+	// handlers that may exceed Telegram's webhook timeout (~60s) or whose
+	// backend can be temporarily unreachable (e.g. Ollama on a sleeping MacBook).
+	if bot.Async {
+		if s.queue == nil {
+			log.Printf("bot=%s update=%d async requested but no queue configured", slug, update.UpdateID)
+			w.WriteHeader(http.StatusOK)
+			_, _ = fmt.Fprint(w, "{}")
+			return
+		}
+		// Re-marshal the update: the original body would be preferable for the
+		// worker (it preserves unknown fields), but it was already decoded once,
+		// so we re-encode the parsed struct. Fields we don't model are lost; the
+		// lenient-decode tradeoff is documented in feedback memory.
+		payload, _ := json.Marshal(update)
+		if err := EnqueueWithDefaults(r.Context(), s.queue, slug, bot.HandlerType(), update.UpdateID, payload); err != nil {
+			log.Printf("bot=%s update=%d enqueue error: %v", slug, update.UpdateID, err)
+		} else {
+			log.Printf("bot=%s update=%d enqueued (async)", slug, update.UpdateID)
+		}
+		w.WriteHeader(http.StatusOK)
+		_, _ = fmt.Fprint(w, "{}")
+		return
+	}
+
 	if err := bot.Handler.Handle(r.Context(), update, bot); err != nil {
 		log.Printf("bot=%s update=%d handler error: %v", slug, update.UpdateID, err)
 	}