dance-lessons-coach/adr/0005-graceful-shutdown.md
Gabriel Radureau db09d0ace1 📝 docs(adr): homogenize all 23 ADR headers to canonical format
The 2026-05-02 audit (Task 6, Phase A) identified 3 inconsistent
formats across the ADR corpus:
- F1 list bullets : `* Status:` / `* Date:` / `* Deciders:` (11 ADRs)
- F2 bold fields : `**Status:**` / `**Date:**` / `**Authors:**` (9 ADRs)
- F3 dedicated section : `## Status\n**Value** ` (5 ADRs)

Mixed metadata names (Authors / Deciders / Decision Date / Implementation
Date / Implementation Status / Last Updated) and decorative emojis on
status values made the corpus hard to scan or template against.

Canonical format adopted (see adr/README.md for full template):
    # NN. Title

    **Status:** <Proposed|Accepted|Implemented|Partially Implemented|
                  Approved|Rejected|Deferred|Deprecated|Superseded by ADR-NNNN>
    **Date:** YYYY-MM-DD
    **Authors:** Name(s)
    [optional **Field:** ... lines]

    ## Context...

Transformations applied (via /tmp/homogenize-adrs.py):
- F1 list bullets → bold fields
- F2 cleanup : `**Deciders:**` → `**Authors:**`, strip status emojis
- F3 sections : `## Status\n**Value** ` → `**Status:** Value`
- Strip decorative emojis from `**Status:**` and `**Implementation Status:**`
- Convert any `* Implementation Status:` / `* Last Updated:` /
  `* Decision Drivers:` / `* Decision Date:` to bold equivalents
- Date typo fix: `2024-04-XX` → `2026-04-XX` for ADRs 0018, 0019
  (already noted in PR #17, but re-applied here because this branch
  starts from origin/main before PR #17)
- Normalize multiple blank lines after header (max 1)

21 / 23 ADRs modified. 0010 and 0012 already conformed.
0011 and 0014 do not exist in the repo (cf. README index update).

Body content of each ADR is preserved unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 00:27:42 +02:00


Implement graceful shutdown with readiness endpoints

Status: Accepted
Authors: Gabriel Radureau, AI Agent
Date: 2026-04-03

Context and Problem Statement

We needed to implement a shutdown mechanism for dance-lessons-coach that provides:

  • Clean resource cleanup
  • Proper handling of in-flight requests
  • Kubernetes/service mesh compatibility
  • Minimal downtime for users
  • Proper orchestration signaling

Decision Drivers

  • Need for zero-data-loss shutdowns
  • Desire for Kubernetes compatibility
  • Requirement for proper resource cleanup
  • Need for minimal user impact
  • Desire for proper orchestration integration

Considered Options

  • Graceful shutdown with readiness endpoints - Kubernetes-style shutdown
  • Immediate shutdown - Simple but disruptive
  • Delayed shutdown with queue draining - Complex but thorough
  • Signal-based shutdown only - Basic graceful shutdown

Decision Outcome

Chosen option: "Graceful shutdown with readiness endpoints", because it offers the best combination of Kubernetes compatibility, proper resource cleanup, and minimal user impact, and because it follows industry best practices for containerized services.

Pros and Cons of the Options

Graceful shutdown with readiness endpoints

  • Good, because Kubernetes/service mesh compatible
  • Good, because minimal user impact
  • Good, because proper resource cleanup
  • Good, because follows industry best practices
  • Good, because allows proper orchestration
  • Bad, because more complex to implement
  • Bad, because requires additional endpoints

Immediate shutdown

  • Good, because simplest to implement
  • Bad, because disruptive to users
  • Bad, because can lose in-flight requests
  • Bad, because no resource cleanup

Delayed shutdown with queue draining

  • Good, because very thorough
  • Good, because minimal data loss
  • Bad, because very complex
  • Bad, because overkill for simple services

Signal-based shutdown only

  • Good, because better than immediate shutdown
  • Good, because allows some cleanup
  • Bad, because not Kubernetes-compatible
  • Bad, because still somewhat disruptive

Implementation Details

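package server // package name is an assumption for this excerpt

import (
    "context"
    "log"
    "net/http"
    "time"
)

// Readiness context management.
// readyCtx and readyCancel come from context.WithCancel(context.Background())
// when the Server is constructed. Storing readyCancel as a field is an
// assumption; the handler below only requires s.readyCtx.
type Server struct {
    server      *http.Server
    readyCtx    context.Context
    readyCancel context.CancelFunc
}

// Readiness endpoint handler: 200 while ready, 503 once shutdown has started
func (s *Server) handleReadiness(w http.ResponseWriter, r *http.Request) {
    w.Header().Set("Content-Type", "application/json")
    select {
    case <-s.readyCtx.Done():
        w.WriteHeader(http.StatusServiceUnavailable)
        w.Write([]byte(`{"ready":false}`))
    default:
        w.Write([]byte(`{"ready":true}`))
    }
}

// Shutdown sequence
func (s *Server) shutdown() {
    // Cancel readiness - the readiness endpoint now answers 503 so the
    // orchestrator stops routing new requests to this instance
    s.readyCancel()

    // Give in-flight requests at most 30 seconds to complete
    shutdownCtx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
    defer cancel()

    // Graceful server shutdown: close listeners, then wait for active requests
    if err := s.server.Shutdown(shutdownCtx); err != nil {
        log.Printf("graceful shutdown did not complete: %v", err)
    }
}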

Monitoring and Verification

# Check readiness during shutdown
while true; do curl -sS http://localhost:8080/api/ready | jq -c .; sleep 1; done

# Expected output during shutdown:
# {"ready":true}
# {"ready":true}
# {"ready":false}  # When shutdown starts
# {"ready":false}
# ... (connection refused)  # When server fully stopped