## Summary

Homogenize all 23 ADRs to a single canonical header format, and rewrite `adr/README.md` to match the actual state of the corpus.

This is **Tâche 7** (Task 7) of the ARCODANGE Phase 1 migration (Claude Code → Mistral Vibe). It is independent from PR #17 (Tâche 6, restructure AGENTS.md); both can merge in any order.

No code changes; only documentation.

## Changes

### 1. Homogenize 21 ADR headers (commit `db09d0a`)

The audit (Tâche 6 Phase A, Mistral intent-router agent, 2026-05-02) had identified **3 inconsistent header formats**:

- **F1** (list bullets: `* Status:` / `* Date:` / `* Deciders:`): 11 ADRs (0001-0008, 0011, 0014, 0023)
- **F2** (bold fields: `**Status:**` / `**Date:**` / `**Authors:**`): 9 ADRs (0009, 0010, 0012, 0013, 0015, 0016, 0017, 0018, 0019)
- **F3** (dedicated section: `## Status\n**Value** ✅`): 5 ADRs (0020, 0021, 0022, 0024, 0025)

In addition, mixed metadata names (Authors / Deciders / Decision Date / Implementation Date / Implementation Status / Last Updated) and decorative emojis on status values made the corpus hard to scan or template against.

**Canonical format adopted** (see `adr/README.md` for the full template):

```markdown
# NN. Title

**Status:** <Proposed | Accepted | Implemented | Partially Implemented | Approved | Rejected | Deferred | Deprecated | Superseded by ADR-NNNN>
**Date:** YYYY-MM-DD
**Authors:** Name(s)
[optional **Field:** ... lines]

## Context...
```

**Transformations applied** (via the `/tmp/homogenize-adrs.py` script; 23 files scanned, 21 modified; 0010 and 0012 already conformed):

- F1 list bullets → bold fields
- F2 cleanup: `**Deciders:**` → `**Authors:**`, strip status emojis
- F3 sections: `## Status\n**Value** ✅` → `**Status:** Value` (single line)
- Strip decorative emojis from `**Status:**` and `**Implementation Status:**`
- Convert `* Last Updated:` / `* Implementation Status:` / `* Decision Drivers:` / `* Decision Date:` to bold
- Date typo fix: `2024-04-XX` → `2026-04-XX` for ADRs 0018, 0019 (off by 2 years in the original)
- Normalize multiple blank lines after the header (max 1)

**ADR body content is preserved unchanged.** Only headers were transformed.

### 2. Rewrite `adr/README.md` (commit `d64ab02`)

The previous README had multiple inconsistencies:

- The index table listed wrong titles for ADRs 0010-0021 (it looked like an aspirational forecast that never matched reality, e.g. "0011 = Trunk-Based Development" although the real 0011 is absent and Trunk-Based Development is actually 0017)
- It listed entries for ADRs 0011 (validation library) and 0014 (gRPC), but **these files do not exist** in the repo
- 0024 (BDD Test Organization) was missing from the detail list
- The template still showed the obsolete F1 format (`* Status:`)
- Decorative emojis appeared on every status entry

Rewrite:

- Index table **regenerated from actual file contents** (title from H1, status from the `**Status:**` line): emoji-free and accurate
- Notes that 0011 / 0014 are not currently in use (reserved)
- Updated template block matches the canonical format
- Status Legend extended with `Approved`, `Partially Implemented`, `Deferred`
- Added a note that 0026 is the next free number for new ADRs

## Test plan

- [x] All 23 ADRs follow `**Status:**` / `**Date:**` / `**Authors:**` (verified via grep)
- [x] No more occurrences of `* Status:` (F1) or `## Status` (F3) in any ADR header
- [x] No more emojis on `**Status:**` lines
- [x] `adr/README.md` index links resolve to existing files (no more 0011 / 0014 dead links)
- [x] Pre-commit hooks pass (`go mod tidy`, `go fmt`, `swag fmt`)

## Migration context

Part of Phase 1 of the ARCODANGE migration from Claude Code to Mistral Vibe. Tâche 7 of the curriculum. Independent from PR #17 (which restructures `AGENTS.md`). The two PRs touch disjoint files, so no merge conflict is expected when both are merged.

🤖 Generated with [Claude Code](https://claude.com/claude-code) (Opus 4.7, 1M context). Mistral Vibe (intent-router agent / mistral-medium-3.5) did the original audit identifying the 3 formats during Tâche 6 Phase A.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Mistral Vibe (devstral-2 / mistral-medium-3.5)
Reviewed-on: #18
Co-authored-by: Gabriel Radureau <arcodange@gmail.com>
Co-committed-by: Gabriel Radureau <arcodange@gmail.com>
10. JWT Secret Retention Policy
Status: Proposed
Context
The dance-lessons-coach application requires a robust JWT secret management system that balances security and user experience. As implemented in ADR-0009, the system supports multiple JWT secrets for graceful rotation. However, the current implementation lacks a clear policy for secret retention and cleanup.
Current State
- ✅ Multiple JWT secrets supported
- ✅ Graceful rotation implemented
- ✅ Backward compatibility maintained
- ❌ No automatic cleanup of old secrets
- ❌ No configurable retention periods
- ❌ No expiration-based secret management
Problem Statement
Without a retention policy:
- Security Risk: Old secrets accumulate indefinitely, increasing attack surface
- Memory Bloat: Unbounded growth of secret storage
- Operational Overhead: Manual cleanup required
- Compliance Issues: May violate security policies requiring regular key rotation
Requirements
- Configurable Retention: Administrators should control how long secrets are retained
- Automatic Cleanup: System should automatically remove expired secrets
- Backward Compatibility: Existing tokens should continue working during retention period
- Sensible Defaults: Should work out-of-the-box with secure defaults
- Performance: Cleanup should not impact runtime performance
Decision
JWT Secret Retention Policy
Implement a configurable retention policy based on JWT TTL (Time-To-Live) with the following components:
1. Configuration Structure
jwt:
  # Token time-to-live (default: 24h)
  ttl: 24h
  # Secret retention configuration
  secret_retention:
    # Retention factor multiplier (default: 2.0)
    # Retention period = JWT TTL × retention_factor
    retention_factor: 2.0
    # Maximum retention period (safety limit, default: 72h)
    max_retention: 72h
    # Cleanup frequency for expired secrets (default: 1h)
    cleanup_interval: 1h
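As a rough illustration, the configuration keys above could map onto Go types like the following. The field paths (`JWT.TTL`, `SecretRetention.RetentionFactor`, and so on) follow the names used by the validation code later in this ADR; the `DefaultJWTConfig` helper is a hypothetical name introduced here, and the actual layout of `pkg/config/config.go` may differ.

```go
package main

import (
	"fmt"
	"time"
)

// SecretRetentionConfig mirrors the jwt.secret_retention keys.
type SecretRetentionConfig struct {
	RetentionFactor float64       // retention_factor
	MaxRetention    time.Duration // max_retention
	CleanupInterval time.Duration // cleanup_interval
}

// JWTConfig mirrors the top-level jwt key.
type JWTConfig struct {
	TTL             time.Duration // ttl
	SecretRetention SecretRetentionConfig
}

// DefaultJWTConfig returns the defaults stated in the YAML comments.
// The function name is an assumption made for this sketch.
func DefaultJWTConfig() JWTConfig {
	return JWTConfig{
		TTL: 24 * time.Hour,
		SecretRetention: SecretRetentionConfig{
			RetentionFactor: 2.0,
			MaxRetention:    72 * time.Hour,
			CleanupInterval: time.Hour,
		},
	}
}

func main() {
	cfg := DefaultJWTConfig()
	fmt.Println("ttl:", cfg.TTL)
	fmt.Println("retention_factor:", cfg.SecretRetention.RetentionFactor)
	fmt.Println("max_retention:", cfg.SecretRetention.MaxRetention)
	fmt.Println("cleanup_interval:", cfg.SecretRetention.CleanupInterval)
}
```

Keeping the defaults in one constructor means a deployment with no JWT section still gets the documented behavior.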
2. Retention Period Calculation
retention_period = min(JWT_TTL × retention_factor, max_retention)
Examples:
- Default (24h TTL, 2.0 factor): min(48h, 72h) = 48h
- Short-lived tokens (1h TTL, 3.0 factor): min(3h, 72h) = 3h
- Long-lived tokens (72h TTL, 2.0 factor): min(144h, 72h) = 72h
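The formula can be sketched in Go as below. The `RetentionPeriod` function name and its parameter order are assumptions for illustration; the ADR itself only names `calculateRetentionPeriod` without showing its body.

```go
package main

import (
	"fmt"
	"time"
)

// RetentionPeriod returns min(ttl × factor, maxRetention).
// Hypothetical helper; not taken from the project's code.
func RetentionPeriod(ttl time.Duration, factor float64, maxRetention time.Duration) time.Duration {
	scaled := time.Duration(float64(ttl) * factor)
	if scaled > maxRetention {
		return maxRetention
	}
	return scaled
}

func main() {
	maxRetention := 72 * time.Hour
	fmt.Println(RetentionPeriod(24*time.Hour, 2.0, maxRetention)) // 48h0m0s
	fmt.Println(RetentionPeriod(1*time.Hour, 3.0, maxRetention))  // 3h0m0s
	fmt.Println(RetentionPeriod(72*time.Hour, 2.0, maxRetention)) // 72h0m0s
}
```

The three calls reproduce the three examples listed above, with the last one showing the `max_retention` cap taking effect.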
3. Secret Lifecycle
graph LR
A[Secret Created] --> B[Active Period]
B --> C{Retention Period}
C -->|Expired| D[Marked for Cleanup]
C -->|Valid| B
D --> E[Automatic Removal]
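One possible reading of this lifecycle, expressed as a small classification function, is sketched below. The state names (`StateActive`, `StateRetained`, `StateExpired`) and the `Classify` function are hypothetical; they assume that the primary secret is always active and that any other secret is judged purely by its age against the retention period.

```go
package main

import (
	"fmt"
	"time"
)

// SecretState is a hypothetical enumeration of the lifecycle stages.
type SecretState int

const (
	StateActive   SecretState = iota // current primary secret
	StateRetained                    // older secret still inside its retention period
	StateExpired                     // past retention, marked for cleanup
)

func (s SecretState) String() string {
	switch s {
	case StateActive:
		return "active"
	case StateRetained:
		return "retained"
	default:
		return "expired"
	}
}

// Classify decides the lifecycle stage from the secret's age.
func Classify(isPrimary bool, age, retention time.Duration) SecretState {
	if isPrimary {
		return StateActive
	}
	if age <= retention {
		return StateRetained
	}
	return StateExpired
}

func main() {
	retention := 48 * time.Hour
	fmt.Println(Classify(true, 100*time.Hour, retention)) // active
	fmt.Println(Classify(false, 10*time.Hour, retention)) // retained
	fmt.Println(Classify(false, 49*time.Hour, retention)) // expired
}
```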
4. Cleanup Process
- Frequency: Configurable interval (default: 1 hour)
- Scope: Remove secrets older than retention period
- Safety: Never remove current primary secret
- Logging: Audit trail of cleanup operations
Implementation Strategy
Phase 1: Configuration Framework
- Extend Config Package (pkg/config/config.go)
  - Add JWT TTL configuration
  - Add secret retention parameters
  - Implement validation
- Environment Variables
# JWT Token TTL
DLC_JWT_TTL=24h
# Secret Retention
DLC_JWT_SECRET_RETENTION_FACTOR=2.0
DLC_JWT_SECRET_MAX_RETENTION=72h
DLC_JWT_SECRET_CLEANUP_INTERVAL=1h
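A minimal sketch of reading these variables with the standard library follows. The helper names (`durationFromEnv`, `floatFromEnv`, `lookupFunc`) are assumptions, and the project may well rely on a configuration library instead; the point is only to show the fallback-to-default behavior.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"time"
)

// lookupFunc has the same signature as os.LookupEnv so that tests can
// inject values without touching the real environment.
type lookupFunc func(key string) (string, bool)

// durationFromEnv parses a Go duration such as "24h", or returns fallback
// when the variable is unset or empty.
func durationFromEnv(lookup lookupFunc, key string, fallback time.Duration) (time.Duration, error) {
	raw, ok := lookup(key)
	if !ok || raw == "" {
		return fallback, nil
	}
	d, err := time.ParseDuration(raw)
	if err != nil {
		return 0, fmt.Errorf("%s: %w", key, err)
	}
	return d, nil
}

// floatFromEnv parses a float such as "2.0", or returns fallback.
func floatFromEnv(lookup lookupFunc, key string, fallback float64) (float64, error) {
	raw, ok := lookup(key)
	if !ok || raw == "" {
		return fallback, nil
	}
	f, err := strconv.ParseFloat(raw, 64)
	if err != nil {
		return 0, fmt.Errorf("%s: %w", key, err)
	}
	return f, nil
}

func main() {
	ttl, err := durationFromEnv(os.LookupEnv, "DLC_JWT_TTL", 24*time.Hour)
	if err != nil {
		fmt.Println("config error:", err)
		os.Exit(1)
	}
	factor, err := floatFromEnv(os.LookupEnv, "DLC_JWT_SECRET_RETENTION_FACTOR", 2.0)
	if err != nil {
		fmt.Println("config error:", err)
		os.Exit(1)
	}
	fmt.Println("ttl:", ttl, "retention factor:", factor)
}
```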
Phase 2: Secret Manager Enhancement
- Enhance JWTSecret Struct
type JWTSecret struct {
    Secret          string
    IsPrimary       bool
    CreatedAt       time.Time
    ExpiresAt       *time.Time // Now properly calculated
    RetentionPeriod time.Duration
}
- Add Expiration Logic
func (m *JWTSecretManager) AddSecret(secret string, isPrimary bool, expiresIn time.Duration) {
    // Calculate retention period based on config
    retentionPeriod := m.calculateRetentionPeriod()
    expiresAt := time.Now().Add(expiresIn)
    m.secrets = append(m.secrets, JWTSecret{
        Secret:          secret,
        IsPrimary:       isPrimary,
        CreatedAt:       time.Now(),
        ExpiresAt:       &expiresAt,
        RetentionPeriod: retentionPeriod,
    })
}
Phase 3: Automatic Cleanup
- Background Cleanup Job
func (m *JWTSecretManager) StartCleanupJob(ctx context.Context, interval time.Duration) {
    ticker := time.NewTicker(interval)
    go func() {
        for {
            select {
            case <-ticker.C:
                m.CleanupExpiredSecrets()
            case <-ctx.Done():
                ticker.Stop()
                return
            }
        }
    }()
}
- Cleanup Implementation
func (m *JWTSecretManager) CleanupExpiredSecrets() {
    now := time.Now()
    var activeSecrets []JWTSecret
    for _, secret := range m.secrets {
        if secret.IsPrimary {
            // Never remove current primary
            activeSecrets = append(activeSecrets, secret)
            continue
        }
        // Check if secret is within retention period
        if now.Sub(secret.CreatedAt) <= secret.RetentionPeriod {
            activeSecrets = append(activeSecrets, secret)
        } else {
            // Log only the masked form; the raw secret must never reach the logs
            log.Info().
                Str("secret", maskSecret(secret.Secret)).
                Msg("Removed expired JWT secret")
        }
    }
    m.secrets = activeSecrets
}
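Because the cleanup job runs in its own goroutine while request handlers read the secret list, access to that list probably needs synchronization. The following is a sketch under that assumption, not the project's implementation: the `mu` field, the `now` parameter, the returned count, and the `demo` helper are all introduced here for illustration.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// JWTSecret is a trimmed copy of the struct above, kept local so the
// sketch is self-contained.
type JWTSecret struct {
	Secret          string
	IsPrimary       bool
	CreatedAt       time.Time
	RetentionPeriod time.Duration
}

// JWTSecretManager guards its secrets with a mutex (an assumption).
type JWTSecretManager struct {
	mu      sync.Mutex
	secrets []JWTSecret
}

// CleanupExpiredSecrets drops non-primary secrets older than their
// retention period and reports how many were removed. Passing now in
// keeps the function deterministic under test.
func (m *JWTSecretManager) CleanupExpiredSecrets(now time.Time) int {
	m.mu.Lock()
	defer m.mu.Unlock()

	kept := make([]JWTSecret, 0, len(m.secrets))
	for _, s := range m.secrets {
		if s.IsPrimary || now.Sub(s.CreatedAt) <= s.RetentionPeriod {
			kept = append(kept, s)
		}
	}
	removed := len(m.secrets) - len(kept)
	m.secrets = kept
	return removed
}

// demo builds a manager with one primary, one retained and one stale secret.
func demo() (removed, remaining int) {
	now := time.Now()
	retention := 48 * time.Hour
	m := &JWTSecretManager{secrets: []JWTSecret{
		{Secret: "current", IsPrimary: true, CreatedAt: now.Add(-100 * time.Hour), RetentionPeriod: retention},
		{Secret: "recent", CreatedAt: now.Add(-10 * time.Hour), RetentionPeriod: retention},
		{Secret: "stale", CreatedAt: now.Add(-60 * time.Hour), RetentionPeriod: retention},
	}}
	removed = m.CleanupExpiredSecrets(now)
	return removed, len(m.secrets)
}

func main() {
	removed, remaining := demo()
	fmt.Printf("removed=%d remaining=%d\n", removed, remaining) // removed=1 remaining=2
}
```

Returning the number of removed secrets also gives the caller a natural value to feed into a counter metric.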
Phase 4: Integration
- Server Initialization
func (s *Server) InitializeJWT() error {
    // Load config
    jwtConfig := s.config.GetJWTConfig()
    // Create secret manager with retention policy
    secretManager := NewJWTSecretManager(
        jwtConfig.Secret,
        WithRetentionFactor(jwtConfig.RetentionFactor),
        WithMaxRetention(jwtConfig.MaxRetention),
    )
    // Start cleanup job
    secretManager.StartCleanupJob(s.ctx, jwtConfig.CleanupInterval)
    return nil
}
Validation
1. Configuration Validation
func (c *Config) ValidateJWTConfig() error {
    if c.JWT.TTL <= 0 {
        return fmt.Errorf("jwt.ttl must be positive")
    }
    if c.JWT.SecretRetention.RetentionFactor < 1.0 {
        return fmt.Errorf("jwt.secret_retention.retention_factor must be ≥ 1.0")
    }
    if c.JWT.SecretRetention.MaxRetention <= 0 {
        return fmt.Errorf("jwt.secret_retention.max_retention must be positive")
    }
    if c.JWT.SecretRetention.CleanupInterval <= 0 {
        return fmt.Errorf("jwt.secret_retention.cleanup_interval must be positive")
    }
    // Ensure max retention is reasonable (720h = 30 days).
    // Go has no duration literals, so the limit is written as 720*time.Hour.
    if c.JWT.SecretRetention.MaxRetention > 720*time.Hour {
        return fmt.Errorf("jwt.secret_retention.max_retention exceeds maximum of 720h")
    }
    return nil
}
2. Runtime Validation
func (m *JWTSecretManager) ValidateSecret(secret string) error {
    // Check minimum length
    if len(secret) < 16 {
        return fmt.Errorf("jwt secret must be at least 16 characters")
    }
    // Check entropy (basic check)
    if !hasSufficientEntropy(secret) {
        return fmt.Errorf("jwt secret must have sufficient entropy")
    }
    return nil
}
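The ADR does not define `hasSufficientEntropy`. One plausible basic check, sketched here under that caveat, estimates Shannon entropy per character and compares it with a threshold. The 3.0 bits-per-character threshold and the `shannonEntropy` helper are assumptions chosen for illustration, not values taken from the project.

```go
package main

import (
	"fmt"
	"math"
)

// minEntropyBits is a hypothetical threshold in bits per character.
const minEntropyBits = 3.0

// shannonEntropy returns the empirical entropy of s in bits per character.
func shannonEntropy(s string) float64 {
	counts := make(map[rune]int)
	total := 0
	for _, r := range s {
		counts[r]++
		total++
	}
	if total == 0 {
		return 0
	}
	var h float64
	for _, c := range counts {
		p := float64(c) / float64(total)
		h -= p * math.Log2(p)
	}
	return h
}

// hasSufficientEntropy rejects secrets made of only a few repeated characters.
func hasSufficientEntropy(secret string) bool {
	return shannonEntropy(secret) >= minEntropyBits
}

func main() {
	fmt.Println(hasSufficientEntropy("aaaaaaaaaaaaaaaa")) // false
	fmt.Println(hasSufficientEntropy("abababababababab")) // false
	fmt.Println(hasSufficientEntropy("Xk9#mP2$vL7@qR4!")) // true
}
```

A character-frequency estimate like this only catches obviously weak values; it says nothing about how the secret was generated, so secrets should still come from a cryptographically secure random source.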
Monitoring and Observability
1. Metrics
// Prometheus metrics
var (
    jwtSecretsActive = prometheus.NewGauge(prometheus.GaugeOpts{
        Name: "jwt_secrets_active_count",
        Help: "Number of active JWT secrets",
    })
    jwtSecretsExpired = prometheus.NewCounter(prometheus.CounterOpts{
        Name: "jwt_secrets_expired_total",
        Help: "Total number of expired JWT secrets removed",
    })
    jwtSecretRetentionDuration = prometheus.NewHistogram(prometheus.HistogramOpts{
        Name:    "jwt_secret_retention_duration_seconds",
        Help:    "Duration of JWT secret retention periods",
        Buckets: prometheus.ExponentialBuckets(3600, 2, 6), // 1h to 32h
    })
)
2. Logging
func (m *JWTSecretManager) logSecretEvent(secret string, event string, details ...interface{}) {
    log.Info().
        Str("secret", maskSecret(secret)).
        Str("event", event).
        Interface("details", details).
        Msg("JWT secret event")
}
func maskSecret(secret string) string {
    // Secrets of 8 characters or fewer would be fully revealed by the
    // 4-character prefix and suffix below, so mask them entirely.
    if len(secret) <= 8 {
        return "****"
    }
    return secret[:4] + "****" + secret[len(secret)-4:]
}
Consequences
Positive
- Enhanced Security: Automatic cleanup reduces attack surface
- Reduced Memory Usage: Prevents unbounded growth of secret storage
- Operational Efficiency: No manual cleanup required
- Compliance Ready: Meets security policy requirements for key rotation
- Flexibility: Configurable to meet different security requirements
Negative
- Complexity: Adds configuration and cleanup logic
- Performance Overhead: Background cleanup job (minimal impact)
- Migration: Existing deployments need configuration updates
- Debugging: More moving parts to troubleshoot
Neutral
- Backward Compatibility: Existing tokens continue to work
- Learning Curve: New configuration options to understand
- Monitoring: Additional metrics to track
Alternatives Considered
Alternative 1: Fixed Retention Period
Proposal: Use fixed retention period (e.g., 48 hours) instead of TTL-based calculation
Rejected Because:
- Less flexible for different use cases
- Doesn't scale with JWT TTL changes
- May be too short for long-lived tokens or too long for short-lived ones
Alternative 2: Manual Cleanup Only
Proposal: Require administrators to manually clean up old secrets
Rejected Because:
- Operational overhead
- Security risk if cleanup is forgotten
- Doesn't scale for frequent rotations
Alternative 3: No Retention (Current State)
Proposal: Keep current behavior with no automatic cleanup
Rejected Because:
- Security concerns with accumulating secrets
- Memory management issues
- Compliance violations
Success Metrics
- Security: No old secrets remain beyond retention period
- Reliability: 99.9% of valid tokens continue to work during rotation
- Performance: Cleanup job completes in <100ms with <1000 secrets
- Adoption: Configuration used in 100% of deployments within 3 months
Migration Plan
Phase 1: Preparation (1 week)
- ✅ Create this ADR
- ✅ Update documentation
- ✅ Add configuration to config package
- ✅ Implement basic retention logic
Phase 2: Testing (2 weeks)
- ✅ Write BDD scenarios for retention
- ✅ Add unit tests for secret manager
- ✅ Test with various TTL/factor combinations
- ✅ Performance testing with large secret counts
Phase 3: Rollout (1 week)
- ✅ Update default configuration
- ✅ Add feature flag for gradual rollout
- ✅ Monitor metrics in staging
- ✅ Gradual production rollout
Phase 4: Optimization (Ongoing)
- ✅ Monitor cleanup performance
- ✅ Adjust defaults based on real-world usage
- ✅ Add alerts for cleanup failures
- ✅ Document troubleshooting guide
References
- ADR-0009: Hybrid Testing Approach
- ADR-0008: BDD Testing
- RFC 7519: JSON Web Tokens
- OWASP Key Management Cheat Sheet
Appendix
Configuration Examples
Development Environment (short retention for testing):
jwt:
  ttl: 1h
  secret_retention:
    retention_factor: 1.5
    max_retention: 3h
    cleanup_interval: 30m
Production Environment (secure defaults):
jwt:
  ttl: 24h
  secret_retention:
    retention_factor: 2.0
    max_retention: 72h
    cleanup_interval: 1h
High-Security Environment (aggressive rotation):
jwt:
  ttl: 8h
  secret_retention:
    retention_factor: 1.5
    max_retention: 24h
    cleanup_interval: 30m
Troubleshooting
Issue: Secrets being removed too quickly
- Check: Retention factor and JWT TTL settings
- Fix: Increase retention_factor or JWT TTL
Issue: Too many old secrets accumulating
- Check: Cleanup job logs and interval
- Fix: Decrease cleanup_interval or retention_factor
Issue: Performance degradation during cleanup
- Check: Number of secrets and cleanup frequency
- Fix: Optimize cleanup algorithm or increase interval
FAQ
Q: What happens to tokens signed with expired secrets?
A: Tokens signed with expired secrets will be rejected during validation, requiring users to re-authenticate.
Q: Can I disable automatic cleanup?
A: Yes, set cleanup_interval to a very high value (e.g., 8760h for 1 year).
Q: How does this affect existing deployments?
A: Existing deployments will use sensible defaults. The feature is backward compatible.
Q: What's the recommended retention factor?
A: Start with 2.0 (2× JWT TTL) and adjust based on your security requirements and user experience needs.
Q: How often should cleanup run?
A: For most deployments, every 1 hour is sufficient. High-volume systems may need more frequent cleanup.
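The first answer above (tokens signed with a removed secret are rejected) can be illustrated with a simplified stand-in for JWT verification. This sketch signs a payload with HMAC-SHA256 and accepts a signature only if one of the currently retained secrets reproduces it; the `sign` and `verifyWithAny` names are hypothetical, and a real deployment would go through a JWT library instead.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// sign returns the hex-encoded HMAC-SHA256 of payload under secret.
func sign(payload, secret string) string {
	mac := hmac.New(sha256.New, []byte(secret))
	mac.Write([]byte(payload))
	return hex.EncodeToString(mac.Sum(nil))
}

// verifyWithAny reports whether any retained secret validates the signature.
// hmac.Equal performs a constant-time comparison.
func verifyWithAny(payload, signature string, secrets []string) bool {
	for _, s := range secrets {
		if hmac.Equal([]byte(sign(payload, s)), []byte(signature)) {
			return true
		}
	}
	return false
}

func main() {
	payload := "user=42"
	oldSig := sign(payload, "old-secret")

	// During the retention period both secrets are still known.
	fmt.Println(verifyWithAny(payload, oldSig, []string{"new-secret", "old-secret"})) // true

	// After cleanup removed the old secret, the same token is rejected.
	fmt.Println(verifyWithAny(payload, oldSig, []string{"new-secret"})) // false
}
```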
Decision Record
Approved By:
Approved Date:
Implemented By:
Implementation Date:
Generated by Mistral Vibe
Co-Authored-By: Mistral Vibe vibe@mistral.ai