# Implement graceful shutdown with readiness endpoints
* Status: Accepted
* Deciders: Gabriel Radureau, AI Agent
* Date: 2026-04-03
## Context and Problem Statement
We needed to implement a shutdown mechanism for DanceLessonsCoach that provides:
- Clean resource cleanup
- Proper handling of in-flight requests
- Kubernetes/service mesh compatibility
- Minimal downtime for users
- Proper orchestration signaling
## Decision Drivers
* Need for zero-data-loss shutdowns
* Desire for Kubernetes compatibility
* Requirement for proper resource cleanup
* Need for minimal user impact
* Desire for proper orchestration integration
## Considered Options
* Graceful shutdown with readiness endpoints - Kubernetes-style shutdown
* Immediate shutdown - Simple but disruptive
* Delayed shutdown with queue draining - Complex but thorough
* Signal-based shutdown only - Basic graceful shutdown
## Decision Outcome
Chosen option: "Graceful shutdown with readiness endpoints", because it offers the best combination of Kubernetes compatibility, proper resource cleanup, and minimal user impact, and it follows established practice for containerized services.
## Pros and Cons of the Options
### Graceful shutdown with readiness endpoints
* Good, because Kubernetes/service mesh compatible
* Good, because minimal user impact
* Good, because proper resource cleanup
* Good, because follows industry best practices
* Good, because allows proper orchestration
* Bad, because more complex to implement
* Bad, because requires additional endpoints
### Immediate shutdown
* Good, because simplest to implement
* Bad, because disruptive to users
* Bad, because can lose in-flight requests
* Bad, because no resource cleanup
### Delayed shutdown with queue draining
* Good, because very thorough
* Good, because minimal data loss
* Bad, because very complex
* Bad, because overkill for simple services
### Signal-based shutdown only
* Good, because better than immediate shutdown
* Good, because allows some cleanup
* Bad, because it gives Kubernetes no readiness signal, so traffic can still be routed to an instance that is shutting down
* Bad, because still somewhat disruptive
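The "can lose in-flight requests" drawback of immediate shutdown can be illustrated with the standard library alone. The following is a minimal sketch, not project code: `inFlightSurvives` is a hypothetical helper, the 200 ms handler delay is arbitrary, and `http.Server.Close` stands in for an immediate shutdown while `http.Server.Shutdown` stands in for the graceful variants.

```go
package main

import (
	"context"
	"fmt"
	"io"
	"net"
	"net/http"
	"time"
)

// inFlightSurvives starts a server with a slow handler, sends one request,
// stops the server while that request is still being handled, and reports
// whether the client nevertheless received the complete response.
func inFlightSurvives(graceful bool) bool {
	ln, err := net.Listen("tcp", "127.0.0.1:0") // ephemeral port
	if err != nil {
		panic(err)
	}
	started := make(chan struct{})
	srv := &http.Server{Handler: http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		close(started)                     // the request is now in flight
		time.Sleep(200 * time.Millisecond) // simulated work
		io.WriteString(w, "done")
	})}
	go srv.Serve(ln)

	result := make(chan bool, 1)
	go func() {
		client := &http.Client{Transport: &http.Transport{DisableKeepAlives: true}}
		resp, err := client.Get("http://" + ln.Addr().String())
		if err != nil {
			result <- false
			return
		}
		defer resp.Body.Close()
		body, err := io.ReadAll(resp.Body)
		result <- err == nil && string(body) == "done"
	}()

	<-started
	if graceful {
		ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
		defer cancel()
		srv.Shutdown(ctx) // waits for the active request to finish
	} else {
		srv.Close() // closes active connections immediately
	}
	return <-result
}

func main() {
	fmt.Println("graceful Shutdown, request completed:", inFlightSurvives(true))
	fmt.Println("immediate Close, request completed:", inFlightSurvives(false))
}
```

Under these assumptions the graceful path should let the request finish, while the immediate path drops it. That difference is what the options above weigh against implementation complexity.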
## Implementation Details
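```go
package server // package name assumed for illustration

import (
	"context"
	"net/http"
	"time"
)

// Server wraps the HTTP server together with its readiness state.
// readyCtx and readyCancel come from
// context.WithCancel(context.Background()) when the server is constructed.
type Server struct {
	server      *http.Server
	readyCtx    context.Context    // cancelled once shutdown begins
	readyCancel context.CancelFunc // flips readiness to "not ready"
}

// handleReadiness reports 200 while serving and 503 once shutdown has started.
func (s *Server) handleReadiness(w http.ResponseWriter, r *http.Request) {
	w.Header().Set("Content-Type", "application/json")
	select {
	case <-s.readyCtx.Done():
		w.WriteHeader(http.StatusServiceUnavailable)
		w.Write([]byte(`{"ready":false}`))
	default:
		w.Write([]byte(`{"ready":true}`))
	}
}

// shutdown marks the server as not ready, then drains in-flight requests.
func (s *Server) shutdown() error {
	// Cancel readiness so the orchestrator stops routing new traffic here.
	s.readyCancel()
	// Give in-flight requests up to 30 seconds to complete.
	shutdownCtx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
	defer cancel()
	// Stop accepting connections and wait for active ones to finish.
	return s.server.Shutdown(shutdownCtx)
}
```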
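The readiness flip can be checked in memory, without binding a port. This is a minimal sketch under stated assumptions: the `readiness` type and the `probe` helper are hypothetical stand-ins for the server's readiness state, not names from the project, and the `/api/ready` path is the one used in the verification section of this ADR.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// readiness is a hypothetical stand-in for the server's readiness state.
type readiness struct {
	ctx    context.Context
	cancel context.CancelFunc
}

func newReadiness() *readiness {
	ctx, cancel := context.WithCancel(context.Background())
	return &readiness{ctx: ctx, cancel: cancel}
}

// ServeHTTP answers 200 until cancel is called, then 503.
func (r *readiness) ServeHTTP(w http.ResponseWriter, _ *http.Request) {
	w.Header().Set("Content-Type", "application/json")
	select {
	case <-r.ctx.Done():
		w.WriteHeader(http.StatusServiceUnavailable)
		fmt.Fprint(w, `{"ready":false}`)
	default:
		fmt.Fprint(w, `{"ready":true}`)
	}
}

// probe issues one in-memory GET and returns the status code and body.
func probe(h http.Handler) (int, string) {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/api/ready", nil))
	return rec.Code, rec.Body.String()
}

func main() {
	r := newReadiness()
	fmt.Println(probe(r)) // before shutdown starts
	r.cancel()            // what the shutdown sequence does first
	fmt.Println(probe(r)) // after shutdown starts
}
```

Using a cancelled context as the readiness flag means the switch is one-way and safe to read from concurrent handlers without extra locking.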
## Links
* [Kubernetes Graceful Shutdown](https://kubernetes.io/blog/2021/04/21/graceful-node-shutdown/)
* [VictoriaMetrics Readiness Patterns](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/README.md#how-to-shut-down)
* [Go HTTP Server Shutdown](https://pkg.go.dev/net/http#Server.Shutdown)
## Monitoring and Verification
```bash
# Check readiness during shutdown
while true; do curl -sS http://localhost:8080/api/ready | jq -c .; sleep 1; done
# Expected output during shutdown:
# {"ready":true}
# {"ready":true}
# {"ready":false} # When shutdown starts
# {"ready":false}
# ... (connection refused) # When server fully stopped
```
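The expected output shows `{"ready":false}` being served for a while before connections are refused. `http.Server.Shutdown` closes the listener as soon as it is called, so that window only exists if there is a pause between cancelling readiness and calling `Shutdown`. The sketch below assumes such a pause; the `drainDelay` parameter and the helpers `serve`, `get`, and `timeline` are hypothetical, and a cancelled context stands in for the termination signal.

```go
package main

import (
	"context"
	"fmt"
	"net"
	"net/http"
	"time"
)

// serve runs until ctx is cancelled, then flips readiness to 503, keeps
// serving for drainDelay so probes can observe it, and finally drains
// in-flight requests with http.Server.Shutdown.
func serve(ctx context.Context, ln net.Listener, drainDelay, timeout time.Duration) error {
	readyCtx, readyCancel := context.WithCancel(context.Background())
	defer readyCancel()

	mux := http.NewServeMux()
	mux.HandleFunc("/api/ready", func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		select {
		case <-readyCtx.Done():
			w.WriteHeader(http.StatusServiceUnavailable)
			fmt.Fprint(w, `{"ready":false}`)
		default:
			fmt.Fprint(w, `{"ready":true}`)
		}
	})
	srv := &http.Server{Handler: mux}
	errCh := make(chan error, 1)
	go func() { errCh <- srv.Serve(ln) }()

	select {
	case err := <-errCh:
		return err // serving failed before shutdown was requested
	case <-ctx.Done():
	}
	readyCancel()          // probes now see 503
	time.Sleep(drainDelay) // assumed pause so the orchestrator can react
	shutdownCtx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()
	return srv.Shutdown(shutdownCtx)
}

// get returns the status code of one probe, or "refused" on a connection error.
func get(url string) string {
	client := &http.Client{
		Transport: &http.Transport{DisableKeepAlives: true},
		Timeout:   time.Second,
	}
	resp, err := client.Get(url)
	if err != nil {
		return "refused"
	}
	resp.Body.Close()
	return fmt.Sprint(resp.StatusCode)
}

// timeline samples readiness before, during, and after the shutdown sequence.
func timeline() []string {
	ln, err := net.Listen("tcp", "127.0.0.1:0") // ephemeral port
	if err != nil {
		panic(err)
	}
	url := "http://" + ln.Addr().String() + "/api/ready"
	ctx, stop := context.WithCancel(context.Background()) // stands in for SIGTERM
	done := make(chan error, 1)
	go func() { done <- serve(ctx, ln, 500*time.Millisecond, time.Second) }()

	out := []string{get(url)} // serving normally
	stop()
	time.Sleep(100 * time.Millisecond)
	out = append(out, get(url)) // inside the drain window
	<-done
	return append(out, get(url)) // listener closed
}

func main() {
	fmt.Println(timeline())
}
```

In a real binary, the context passed to `serve` would typically come from `signal.NotifyContext` with `SIGTERM`, and the drain delay would be tuned to the orchestrator's probe period.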