Audit 2026-05-02 (Task 6 Phase A) identified 3 inconsistent formats across the ADR corpus:

- F1 list bullets: `* Status:` / `* Date:` / `* Deciders:` (11 ADRs)
- F2 bold fields: `**Status:**` / `**Date:**` / `**Authors:**` (9 ADRs)
- F3 dedicated section: `## Status\n**Value** ✅` (5 ADRs)

Mixed metadata names (Authors / Deciders / Decision Date / Implementation Date / Implementation Status / Last Updated) and decorative emojis on status values made the corpus hard to scan or template against.

Canonical format adopted (see adr/README.md for full template):

    # NN. Title

    **Status:** <Proposed|Accepted|Implemented|Partially Implemented|
                 Approved|Rejected|Deferred|Deprecated|Superseded by ADR-NNNN>
    **Date:** YYYY-MM-DD
    **Authors:** Name(s)
    [optional **Field:** ... lines]

    ## Context...

Transformations applied (via /tmp/homogenize-adrs.py):

- F1 list bullets → bold fields
- F2 cleanup: `**Deciders:**` → `**Authors:**`, strip status emojis
- F3 sections: `## Status\n**Value** ✅` → `**Status:** Value`
- Strip decorative emojis from `**Status:**` and `**Implementation Status:**`
- Convert any `* Implementation Status:` / `* Last Updated:` / `* Decision Drivers:` / `* Decision Date:` to bold equivalents
- Date typo fix: `2024-04-XX` → `2026-04-XX` for ADRs 0018, 0019 (already noted in PR #17, re-applied here because the branch starts from origin/main before PR #17)
- Normalize multiple blank lines after header (max 1)

21 / 23 ADRs modified. 0010 and 0012 were already conformant. 0011 and 0014 do not exist in the repo (see the README index update).

Body content of each ADR is preserved unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Implement graceful shutdown with readiness endpoints
Status: Accepted
Authors: Gabriel Radureau, AI Agent
Date: 2026-04-03
Context and Problem Statement
We needed to implement a shutdown mechanism for dance-lessons-coach that provides:
- Clean resource cleanup
- Proper handling of in-flight requests
- Kubernetes/service mesh compatibility
- Minimal downtime for users
- Proper orchestration signaling
Decision Drivers
- Need for zero-data-loss shutdowns
- Desire for Kubernetes compatibility
- Requirement for proper resource cleanup
- Need for minimal user impact
- Desire for proper orchestration integration
Considered Options
- Graceful shutdown with readiness endpoints - Kubernetes-style shutdown
- Immediate shutdown - Simple but disruptive
- Delayed shutdown with queue draining - Complex but thorough
- Signal-based shutdown only - Basic graceful shutdown
Decision Outcome
Chosen option: "Graceful shutdown with readiness endpoints" because it provides the best combination of Kubernetes compatibility, proper resource cleanup, minimal user impact, and follows industry best practices for containerized services.
Pros and Cons of the Options
Graceful shutdown with readiness endpoints
- Good, because Kubernetes/service mesh compatible
- Good, because minimal user impact
- Good, because proper resource cleanup
- Good, because follows industry best practices
- Good, because allows proper orchestration
- Bad, because more complex to implement
- Bad, because requires additional endpoints
Immediate shutdown
- Good, because simplest to implement
- Bad, because disruptive to users
- Bad, because can lose in-flight requests
- Bad, because no resource cleanup
Delayed shutdown with queue draining
- Good, because very thorough
- Good, because minimal data loss
- Bad, because very complex
- Bad, because overkill for simple services
Signal-based shutdown only
- Good, because better than immediate shutdown
- Good, because allows some cleanup
- Bad, because not Kubernetes-compatible
- Bad, because still somewhat disruptive
Implementation Details
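import (
	"context"
	"log"
	"net/http"
	"time"
)

// Server fields used below; keeping readyCancel on the struct is an assumption.
type Server struct {
	server      *http.Server
	readyCtx    context.Context
	readyCancel context.CancelFunc
}

// Readiness context management (initReadiness is a hypothetical helper, called once at startup)
func (s *Server) initReadiness() {
	s.readyCtx, s.readyCancel = context.WithCancel(context.Background())
}

// Readiness endpoint handler
func (s *Server) handleReadiness(w http.ResponseWriter, r *http.Request) {
	w.Header().Set("Content-Type", "application/json")
	select {
	case <-s.readyCtx.Done():
		w.WriteHeader(http.StatusServiceUnavailable)
		w.Write([]byte(`{"ready":false}`))
	default:
		w.Write([]byte(`{"ready":true}`))
	}
}

// Shutdown sequence
func (s *Server) shutdown() {
	// Cancel readiness so the endpoint reports 503 and new traffic is routed away
	s.readyCancel()
	// Give in-flight requests up to 30 seconds to complete
	shutdownCtx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
	defer cancel()
	if err := s.server.Shutdown(shutdownCtx); err != nil {
		log.Printf("graceful shutdown did not complete: %v", err)
	}
}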
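The shutdown sequence above still needs something to trigger it. A minimal sketch of one possible wiring follows, assuming the process is stopped with SIGTERM (the signal Kubernetes sends on pod termination) or SIGINT. The `run` helper, its `markNotReady` callback, and the `:8080` address are hypothetical names chosen for illustration; they are not taken from the dance-lessons-coach code.

```go
package main

import (
	"context"
	"errors"
	"log"
	"net/http"
	"os/signal"
	"syscall"
	"time"
)

// run is a hypothetical helper. It serves HTTP on addr until stop is closed,
// then calls markNotReady (where the readiness context would be cancelled)
// and gives in-flight requests up to 30 seconds to finish.
func run(addr string, stop <-chan struct{}, markNotReady func()) error {
	srv := &http.Server{Addr: addr, Handler: http.NewServeMux()}

	errCh := make(chan error, 1)
	go func() { errCh <- srv.ListenAndServe() }()

	select {
	case err := <-errCh:
		return err // the listener failed before shutdown was requested
	case <-stop:
	}

	markNotReady()

	shutdownCtx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
	defer cancel()
	if err := srv.Shutdown(shutdownCtx); err != nil {
		return err
	}
	// After a clean Shutdown, ListenAndServe reports http.ErrServerClosed.
	if err := <-errCh; !errors.Is(err, http.ErrServerClosed) {
		return err
	}
	return nil
}

func main() {
	// The context is cancelled when SIGINT or SIGTERM arrives.
	ctx, cancel := signal.NotifyContext(context.Background(), syscall.SIGINT, syscall.SIGTERM)
	defer cancel()

	err := run(":8080", ctx.Done(), func() { log.Println("readiness cancelled") })
	if err != nil {
		log.Fatal(err)
	}
}
```

Passing the stop condition as a plain channel keeps `run` independent of how shutdown is requested, so the same function can be driven by a signal in production and by a closed channel in a test.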
Links
Monitoring and Verification
# Check readiness during shutdown
while true; do curl -s http://localhost:8080/api/ready | jq -c .; sleep 1; done
# Expected output during shutdown:
# {"ready":true}
# {"ready":true}
# {"ready":false} # When shutdown starts
# {"ready":false}
# ... (connection refused) # When server fully stopped
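The curl loop above observes the readiness transition from outside the process. The same transition can also be checked in-process; the sketch below does so with `net/http/httptest`. It uses a standalone copy of the readiness handler rather than the project's `Server` type, and `newReadiness` and `probe` are hypothetical helper names introduced only for this example.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// newReadiness is a hypothetical constructor. It returns a readiness handler
// and the cancel function that flips the handler to "not ready".
func newReadiness() (http.Handler, context.CancelFunc) {
	readyCtx, readyCancel := context.WithCancel(context.Background())
	h := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		select {
		case <-readyCtx.Done():
			w.WriteHeader(http.StatusServiceUnavailable)
			w.Write([]byte(`{"ready":false}`))
		default:
			w.Write([]byte(`{"ready":true}`))
		}
	})
	return h, readyCancel
}

// probe is a hypothetical helper that calls the handler without opening a
// network socket and returns the status code and response body.
func probe(h http.Handler) (int, string) {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, "/api/ready", nil)
	h.ServeHTTP(rec, req)
	return rec.Code, rec.Body.String()
}

func main() {
	h, readyCancel := newReadiness()

	code, body := probe(h)
	fmt.Println(code, body) // 200 {"ready":true}

	readyCancel() // the first step of the shutdown sequence

	code, body = probe(h)
	fmt.Println(code, body) // 503 {"ready":false}
}
```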