## Summary

Homogenize all 23 ADRs to a single canonical header format, and rewrite `adr/README.md` to match the actual state of the corpus.

This is **Tâche 7** of the ARCODANGE Phase 1 migration (Claude Code → Mistral Vibe). Independent from PR #17 (Tâche 6 — restructure AGENTS.md) — both can merge in any order. No code changes; only documentation.

## Changes

### 1. Homogenize 21 ADR headers (commit `db09d0a`)

The audit (Tâche 6 Phase A, Mistral intent-router agent, 2026-05-02) identified **3 inconsistent header formats**:

- **F1** — list bullets (`* Status:` / `* Date:` / `* Deciders:`): 11 ADRs (0001-0008, 0011, 0014, 0023)
- **F2** — bold fields (`**Status:**` / `**Date:**` / `**Authors:**`): 9 ADRs (0009, 0010, 0012, 0013, 0015, 0016, 0017, 0018, 0019)
- **F3** — dedicated section (`## Status\n**Value** ✅`): 5 ADRs (0020, 0021, 0022, 0024, 0025)

On top of that, mixed metadata names (Authors / Deciders / Decision Date / Implementation Date / Implementation Status / Last Updated) and decorative emojis on status values made the corpus hard to scan or template against.

**Canonical format adopted** (see `adr/README.md` for full template):

```markdown
# NN. Title

**Status:** <Proposed | Accepted | Implemented | Partially Implemented | Approved | Rejected | Deferred | Deprecated | Superseded by ADR-NNNN>
**Date:** YYYY-MM-DD
**Authors:** Name(s)
[optional **Field:** ... lines]

## Context
...
```

**Transformations applied** (via a `/tmp/homogenize-adrs.py` script; 23 files scanned, 21 modified — 0010 and 0012 already conformed):

- F1 list bullets → bold fields
- F2 cleanup: `**Deciders:**` → `**Authors:**`, strip status emojis
- F3 sections: `## Status\n**Value** ✅` → `**Status:** Value` (single line)
- Strip decorative emojis from `**Status:**` and `**Implementation Status:**`
- Convert `* Last Updated:` / `* Implementation Status:` / `* Decision Drivers:` / `* Decision Date:` to bold fields
- Date typo fix: `2024-04-XX` → `2026-04-XX` for ADRs 0018 and 0019 (off by two years in the original)
- Normalize multiple blank lines after the header (max 1)

**ADR body content is preserved unchanged.** Only headers were transformed.

### 2. Rewrite `adr/README.md` (commit `d64ab02`)

The previous README had multiple inconsistencies:

- The index table listed wrong titles for ADRs 0010-0021 (it read like an aspirational forecast that never matched reality — e.g. "0011 = Trunk-Based Development", but 0011 is absent and Trunk-Based Development is actually 0017)
- It listed entries for ADRs 0011 (validation library) and 0014 (gRPC), but **these files do not exist** in the repo
- 0024 (BDD Test Organization) was missing from the detail list
- The template still showed the obsolete F1 format (`* Status:`)
- Decorative emojis appeared on every status entry

The rewrite:

- Regenerates the index table **from actual file contents** (title from the H1, status from the `**Status:**` line) — emoji-free and accurate
- Notes that 0011 / 0014 are not currently in use (reserved)
- Updates the template block to match the canonical format
- Extends the Status Legend with `Approved`, `Partially Implemented`, and `Deferred`
- Adds a note that 0026 is the next free number for new ADRs

## Test plan

- [x] All 23 ADRs follow `**Status:**` / `**Date:**` / `**Authors:**` (verified via grep)
- [x] No more occurrences of `* Status:` (F1) or `## Status` (F3) in any ADR header
- [x] No more emojis on `**Status:**` lines
- [x] `adr/README.md` index links resolve to existing files (no more 0011 / 0014 dead links)
- [x] Pre-commit hooks pass (`go mod tidy`, `go fmt`, `swag fmt`)

## Migration context

Part of Phase 1 of the ARCODANGE migration from Claude Code to Mistral Vibe; Tâche 7 of the curriculum. Independent from PR #17 (which restructures `AGENTS.md`). The two PRs touch disjoint files — no merge conflict expected when both are merged.

🤖 Generated with [Claude Code](https://claude.com/claude-code) (Opus 4.7, 1M context). Mistral Vibe (intent-router agent / mistral-medium-3.5) did the original audit identifying the 3 formats during Tâche 6 Phase A.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Mistral Vibe (devstral-2 / mistral-medium-3.5)
Reviewed-on: #18
Co-authored-by: Gabriel Radureau <arcodange@gmail.com>
Co-committed-by: Gabriel Radureau <arcodange@gmail.com>
# ADR 0022: Rate Limiting and Cache Strategy

**Status:** Proposed

## Context
As the dance-lessons-coach application grows and potentially serves multiple users simultaneously, we need to implement rate limiting to:
- Prevent abuse of API endpoints
- Protect against DDoS attacks
- Ensure fair usage across all users
- Maintain system stability under load
- Provide consistent performance
Additionally, we need a caching strategy to:
- Reduce database load for frequently accessed data
- Improve response times for common requests
- Support horizontal scaling with shared cache
- Handle cache invalidation properly
## Decision

We will implement a multi-phase caching and rate limiting strategy with the following components:

### Phase 1: In-Memory Cache with TTL Support

**Library Selection:** We will use `github.com/patrickmn/go-cache` for in-memory caching because:
✅ **Pros:**
- Simple, lightweight, and widely used
- Built-in TTL (Time-To-Live) support
- Thread-safe by default
- No external dependencies
- Good performance for single-instance applications
- Supports automatic expiration
❌ **Cons:**
- Not shared between multiple instances
- Memory-bound (not persistent)
- Limited advanced features
**Implementation Plan:**

```go
type CacheService interface {
	Set(key string, value interface{}, expiration time.Duration) error
	Get(key string) (interface{}, bool)
	Delete(key string) error
	Flush() error
	GetWithTTL(key string) (interface{}, time.Duration, bool)
}

type InMemoryCacheService struct {
	cache           *cache.Cache
	defaultTTL      time.Duration
	cleanupInterval time.Duration
}
```

**Use Cases:**
- JWT token validation results
- User session data
- Frequently accessed greet messages
- API response caching for idempotent endpoints
### Phase 2: Redis-Compatible Shared Cache

**Library Selection:** We will use `github.com/redis/go-redis/v9` with a Redis-compatible alternative backend:

**Primary Choice: Dragonfly** (https://www.dragonflydb.io/)
- Redis-compatible
- Source-available (Business Source License 1.1)
- Written in C++ with a multi-threaded architecture
- Claims up to 25x higher throughput than Redis (vendor benchmark)
- Lower latency
- Drop-in Redis replacement
**Fallback Choice: KeyDB** (https://keydb.dev/)
- Multi-threaded Redis fork
- Open-source (BSD 3-Clause license)
- Better performance than Redis
- Full Redis API compatibility
**Implementation Plan:**

```go
type RedisCacheService struct {
	client     *redis.Client
	defaultTTL time.Duration
	prefix     string
}

func NewRedisCacheService(config *config.CacheConfig) (*RedisCacheService, error) {
	client := redis.NewClient(&redis.Options{
		Addr:     config.Host + ":" + strconv.Itoa(config.Port),
		Password: config.Password,
		DB:       config.Database,
		PoolSize: config.PoolSize,
	})

	// Test connection
	_, err := client.Ping(context.Background()).Result()
	if err != nil {
		return nil, fmt.Errorf("failed to connect to Redis: %w", err)
	}

	return &RedisCacheService{
		client:     client,
		defaultTTL: config.DefaultTTL,
		prefix:     config.Prefix,
	}, nil
}
```
**Configuration:**

```yaml
cache:
  # In-memory cache configuration
  in_memory:
    enabled: true
    default_ttl: 5m
    cleanup_interval: 10m
    max_items: 10000

  # Redis-compatible cache configuration
  redis:
    enabled: false
    host: "localhost"
    port: 6379
    password: ""
    database: 0
    pool_size: 10
    default_ttl: 5m
    prefix: "dlc:"
    use_dragonfly: true  # Set to false to use KeyDB
```
### Phase 3: Rate Limiting Implementation

**Library Selection:** We will use `github.com/ulule/limiter/v3` because:

✅ **Pros:**
- Multiple storage backends (in-memory, Redis, etc.)
- Sliding window algorithm
- Distributed rate limiting support
- Configurable rate limits
- Middleware support for Chi router
- Good performance
**Implementation Plan:**

```go
// Rate limit configuration
type RateLimitConfig struct {
	Enabled          bool     `mapstructure:"enabled"`
	RequestsPerHour  int      `mapstructure:"requests_per_hour"`
	BurstLimit       int      `mapstructure:"burst_limit"`
	IPWhitelist      []string `mapstructure:"ip_whitelist"`
	UseRedis         bool     `mapstructure:"use_redis"`
	RedisPrefix      string   `mapstructure:"redis_prefix"`
	EndpointSpecific map[string]struct {
		RequestsPerHour int `mapstructure:"requests_per_hour"`
		BurstLimit      int `mapstructure:"burst_limit"`
	} `mapstructure:"endpoint_specific"`
}

// Rate limiter service
type RateLimiterService struct {
	limiter *limiter.Limiter
	store   limiter.Store
	config  *RateLimitConfig
}

// memory and sredis are the limiter/v3 store drivers:
// github.com/ulule/limiter/v3/drivers/store/memory and .../store/redis.
func NewRateLimiterService(config *RateLimitConfig, redisClient *redis.Client) (*RateLimiterService, error) {
	var store limiter.Store
	var err error

	// Use Redis if configured, otherwise fall back to the in-memory store.
	if config.UseRedis {
		store, err = sredis.NewStoreWithOptions(redisClient, limiter.StoreOptions{
			Prefix: config.RedisPrefix,
		})
		if err != nil {
			return nil, fmt.Errorf("failed to create rate limiter store: %w", err)
		}
	} else {
		store = memory.NewStore()
	}

	// Create rate limiter
	rate := limiter.Rate{
		Period: time.Hour,
		Limit:  int64(config.RequestsPerHour),
	}

	return &RateLimiterService{
		limiter: limiter.New(store, rate),
		store:   store,
		config:  config,
	}, nil
}
```
**Chi Middleware:**

```go
func RateLimitMiddleware(rl *RateLimiterService) func(http.Handler) http.Handler {
	return func(next http.Handler) http.Handler {
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			// Skip rate limiting for whitelisted IPs
			clientIP := r.Header.Get("X-Real-IP")
			if clientIP == "" {
				clientIP = r.RemoteAddr
			}
			for _, allowedIP := range rl.config.IPWhitelist {
				if clientIP == allowedIP {
					next.ServeHTTP(w, r)
					return
				}
			}

			// Get rate limit context
			ctx, err := rl.limiter.Get(r.Context(), clientIP)
			if err != nil {
				log.Error().Err(err).Str("ip", clientIP).Msg("Rate limit error")
				http.Error(w, "Internal server error", http.StatusInternalServerError)
				return
			}

			// Set rate limit headers on every response
			w.Header().Set("X-RateLimit-Limit", strconv.FormatInt(ctx.Limit, 10))
			w.Header().Set("X-RateLimit-Remaining", strconv.FormatInt(ctx.Remaining, 10))
			w.Header().Set("X-RateLimit-Reset", strconv.FormatInt(ctx.Reset, 10))

			// Reject the request if the limit is exceeded
			// (Context.Reached is a bool in limiter/v3)
			if ctx.Reached {
				http.Error(w, "Too many requests", http.StatusTooManyRequests)
				return
			}

			next.ServeHTTP(w, r)
		})
	}
}
```
### Phase 4: Cache Invalidation Strategy

**Approach:** Hybrid cache invalidation with multiple strategies:

1. **Time-Based Expiration (TTL)**
   - All cache entries have a TTL
   - Automatic expiration prevents stale data
   - Default TTL: 5 minutes for most data

2. **Event-Based Invalidation**
   - Cache keys are invalidated on specific events
   - Example: User data cache invalidated on user update
   - Uses pub/sub pattern for distributed invalidation

3. **Versioned Cache Keys**
   - Cache keys include data version
   - When data changes, version increments
   - Old cache entries naturally expire

4. **Write-Through Caching**
   - Data written to database and cache simultaneously
   - Ensures cache is always up-to-date
   - Used for critical data that must be consistent

**Cache Key Strategy:**

```go
func GetCacheKey(prefix, entityType, entityID string) string {
	return fmt.Sprintf("%s:%s:%s", prefix, entityType, entityID)
}

// Example: "dlc:user:123"
// Example: "dlc:jwt:validation:token_hash"
```
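The versioned-key strategy above can be sketched by folding a version counter into the key: bumping the version on write means stale entries are simply never read again and age out via TTL. A stdlib-only sketch (the `versions` type is hypothetical, for illustration):

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// versions tracks a monotonically increasing version per entity type.
// Bumping it on write invalidates all previously issued keys implicitly.
type versions struct {
	v atomic.Int64
}

// bump is called whenever the underlying data changes.
func (vs *versions) bump() { vs.v.Add(1) }

// key builds a cache key that embeds the current version.
func (vs *versions) key(prefix, entityType, entityID string) string {
	return fmt.Sprintf("%s:v%d:%s:%s", prefix, vs.v.Load(), entityType, entityID)
}
```

The trade-off is that a bump invalidates every entry of that type at once, which is coarse but avoids tracking individual keys.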
## Implementation Phases

### Phase 1: In-Memory Cache (Current Sprint)
- ✅ Research and select in-memory cache library
- ✅ Implement cache interface and in-memory service
- ✅ Add cache configuration to config package
- ✅ Implement basic cache operations (set, get, delete)
- ✅ Add TTL support and automatic cleanup
- ✅ Cache JWT validation results
- ✅ Add cache metrics and monitoring
### Phase 2: Redis-Compatible Cache (Next Sprint)
- ✅ Set up Dragonfly/KeyDB in development environment
- ✅ Implement Redis cache service
- ✅ Add configuration for Redis connection
- ✅ Implement cache fallback strategy (Redis → in-memory)
- ✅ Add health checks for Redis connection
- ✅ Implement distributed cache invalidation
### Phase 3: Rate Limiting (Following Sprint)
- ✅ Research and select rate limiting library
- ✅ Implement rate limiter service
- ✅ Add rate limit configuration
- ✅ Implement Chi middleware for rate limiting
- ✅ Add rate limit headers to responses
- ✅ Implement IP whitelisting
- ✅ Add endpoint-specific rate limits
### Phase 4: Advanced Features (Future)
- ✅ Cache warming for critical data
- ✅ Two-level caching (Redis + in-memory)
- ✅ Cache compression for large objects
- ✅ Rate limit exemptions for admin users
- ✅ Dynamic rate limit adjustment
- ✅ Cache analytics and usage patterns
## Configuration

```yaml
# Cache configuration
cache:
  in_memory:
    enabled: true
    default_ttl: "5m"
    cleanup_interval: "10m"
    max_items: 10000

  redis:
    enabled: false
    host: "localhost"
    port: 6379
    password: ""
    database: 0
    pool_size: 10
    default_ttl: "5m"
    prefix: "dlc:"
    use_dragonfly: true

# Rate limiting configuration
rate_limiting:
  enabled: true
  requests_per_hour: 1000
  burst_limit: 100
  ip_whitelist:
    - "127.0.0.1"
    - "::1"
  endpoint_specific:
    "/api/v1/auth/login":
      requests_per_hour: 100
      burst_limit: 10
    "/api/v1/auth/register":
      requests_per_hour: 50
      burst_limit: 5
```
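Resolving which limit applies to a request is a simple lookup with a global fallback. A sketch under the configuration above (the `limits` struct and `resolveLimit` helper are hypothetical names):

```go
package main

// limits holds the per-hour request budget, mirroring the YAML above.
type limits struct {
	RequestsPerHour int
	BurstLimit      int
}

// resolveLimit returns the endpoint-specific limit when one is configured
// for the request path, and the global default otherwise.
func resolveLimit(global limits, perEndpoint map[string]limits, path string) limits {
	if l, ok := perEndpoint[path]; ok {
		return l
	}
	return global
}
```

With limiter/v3 this would translate into one `*limiter.Limiter` per configured endpoint plus a global one, since each limiter is bound to a single rate.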
## Monitoring and Metrics

**Cache Metrics:**
- Cache hit/miss ratio
- Average cache latency
- Cache size and memory usage
- Eviction rate
- TTL distribution
**Rate Limit Metrics:**
- Requests allowed vs rejected
- Rate limit exceeded events
- Top limited IPs
- Endpoint-specific rate limit usage
**Prometheus Metrics:**

```go
var (
	cacheHits = prometheus.NewCounterVec(prometheus.CounterOpts{
		Name: "cache_hits_total",
		Help: "Number of cache hits",
	}, []string{"cache_type", "entity_type"})

	cacheMisses = prometheus.NewCounterVec(prometheus.CounterOpts{
		Name: "cache_misses_total",
		Help: "Number of cache misses",
	}, []string{"cache_type", "entity_type"})

	// Note: an "ip" label is high-cardinality; consider aggregating or
	// restricting it to a bounded set before shipping to Prometheus.
	rateLimitExceeded = prometheus.NewCounterVec(prometheus.CounterOpts{
		Name: "rate_limit_exceeded_total",
		Help: "Number of rate limit exceeded events",
	}, []string{"endpoint", "ip"})
)
```
## Security Considerations

1. **Cache Security:**
   - Never cache sensitive user data (passwords, tokens)
   - Use separate cache prefixes for different data types
   - Implement cache key hashing for sensitive data
   - Set appropriate TTLs to limit exposure

2. **Rate Limit Security:**
   - Prevent rate limit bypass attacks
   - Use the X-Real-IP header for IP detection, but only trust it when set by a reverse proxy we control
   - Rate limit authentication endpoints
   - Log rate limit violations for security monitoring

3. **Redis Security:**
   - Use authentication if enabled
   - Use TLS for Redis connections
   - Use separate database numbers for different environments
   - Limit Redis commands to prevent abuse
## Performance Considerations

1. **Cache Performance:**
   - Benchmark cache operations
   - Monitor cache latency
   - Optimize cache key size
   - Use appropriate data structures

2. **Rate Limit Performance:**
   - Use an efficient rate limiting algorithm
   - Minimize middleware overhead
   - Cache rate limit decisions
   - Batch rate limit checks where possible

3. **Memory Management:**
   - Set reasonable cache size limits
   - Monitor memory usage
   - Implement cache eviction policies
   - Use memory-efficient data structures
## Migration Strategy

### From No Cache to In-Memory Cache
- Implement cache interface and in-memory service
- Add cache configuration with sensible defaults
- Gradually add caching to critical endpoints
- Monitor cache performance and hit ratios
- Adjust TTLs based on usage patterns
### From In-Memory to Redis Cache
- Set up Dragonfly/KeyDB in development
- Implement Redis cache service
- Add fallback logic (Redis → in-memory)
- Test with both caches enabled
- Gradually migrate to Redis-only
- Monitor distributed cache performance
### From No Rate Limiting to Rate Limiting
- Implement rate limiter with generous limits
- Add monitoring for rate limit events
- Gradually tighten limits based on usage
- Add IP whitelist for critical services
- Implement endpoint-specific limits
- Monitor and adjust as needed
## Alternatives Considered

### Cache Libraries

- `github.com/bluele/gcache`: more features but more complex
- `github.com/allegro/bigcache`: high performance but no per-entry TTL
- `github.com/coocood/freecache`: very fast but limited API
### Redis Alternatives
- Redis Enterprise - Commercial, not open-source
- Memcached - No persistence, simpler protocol
- Couchbase - More complex, document-oriented
### Rate Limiting Libraries

- `golang.org/x/time/rate`: simple but no distributed support
- `github.com/juju/ratelimit`: good but limited features
- Custom implementation: too much development effort
## Success Metrics

1. **Cache Effectiveness:**
   - Cache hit ratio > 80%
   - Average cache latency < 1ms
   - Memory usage within limits

2. **Rate Limiting Effectiveness:**
   - < 1% of legitimate requests blocked
   - Effective protection against abuse
   - No impact on normal usage patterns

3. **System Stability:**
   - Database load reduced by 50%
   - Consistent response times under load
   - No cache-related outages
## Risks and Mitigations

| Risk | Mitigation |
|---|---|
| Cache stampede | Implement cache warming and fallback logic |
| Memory exhaustion | Set reasonable cache size limits and monitor usage |
| Redis failure | Implement fallback to in-memory cache |
| Rate limit false positives | Start with generous limits and monitor |
| Performance degradation | Benchmark before and after implementation |
| Cache inconsistency | Use appropriate invalidation strategies |
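The cache-stampede mitigation in the table can be made concrete: deduplicate concurrent loads of the same key so a miss triggers exactly one backend fetch. This sketch is similar in spirit to `golang.org/x/sync/singleflight` but stdlib-only, and all names (`inflight`, `Do`) are hypothetical:

```go
package main

import "sync"

// inflight deduplicates concurrent loads of the same key so that a cache
// miss triggers exactly one backend fetch instead of a stampede.
type inflight struct {
	mu    sync.Mutex
	calls map[string]*call
}

type call struct {
	wg  sync.WaitGroup
	val interface{}
}

func newInflight() *inflight {
	return &inflight{calls: make(map[string]*call)}
}

// Do runs fn once per key at a time; concurrent callers for the same key
// wait for the first result instead of hitting the backend themselves.
func (g *inflight) Do(key string, fn func() interface{}) interface{} {
	g.mu.Lock()
	if c, ok := g.calls[key]; ok {
		// Someone is already loading this key: wait for their result.
		g.mu.Unlock()
		c.wg.Wait()
		return c.val
	}
	c := &call{}
	c.wg.Add(1)
	g.calls[key] = c
	g.mu.Unlock()

	c.val = fn()
	c.wg.Done()

	g.mu.Lock()
	delete(g.calls, key)
	g.mu.Unlock()
	return c.val
}
```

In production the battle-tested `singleflight` package would be the natural choice; the sketch only shows why the mitigation works.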
## Future Enhancements
- Cache Pre-warming - Load frequently used data at startup
- Two-Level Caching - Local cache + distributed cache
- Cache Compression - For large cache objects
- Dynamic Rate Limits - Adjust based on system load
- User-Specific Rate Limits - Different limits for different user tiers
- Cache Analytics - Detailed usage patterns and optimization
## References
- go-cache documentation
- Dragonfly documentation
- KeyDB documentation
- limiter/v3 documentation
- Chi middleware documentation
## Decision Drivers
- Simplicity - Easy to implement and maintain
- Performance - Minimal impact on response times
- Scalability - Support for horizontal scaling
- Reliability - Graceful degradation on failures
- Open Source - Preference for open-source solutions
- Community - Active development and support
## Conclusion
This ADR proposes a comprehensive caching and rate limiting strategy that will significantly improve the performance, scalability, and reliability of the dance-lessons-coach application. The phased approach allows for gradual implementation and testing, minimizing risk while delivering value at each stage.
The combination of in-memory caching for single-instance deployments and Redis-compatible caching for distributed environments provides flexibility for different deployment scenarios. The rate limiting implementation will protect the application from abuse while maintaining a good user experience.
This strategy aligns with our architectural principles of simplicity, performance, and scalability while using well-established open-source technologies with strong community support.