ADR 0022: Rate Limiting and Cache Strategy
Status
Proposed 🟡
⚠️ Not yet implemented. Gitea issue #13 ("feat: Implement Rate Limiting and Caching Strategy") is open and tracks this work.
go-cache, redis, and ulule/limiter are absent from go.mod. The phase checkboxes below are corrected to reflect actual status.
Context
As the dance-lessons-coach application grows and potentially serves multiple users simultaneously, we need to implement rate limiting to:
- Prevent abuse of API endpoints
- Protect against DDoS attacks
- Ensure fair usage across all users
- Maintain system stability under load
- Provide consistent performance
Additionally, we need a caching strategy to:
- Reduce database load for frequently accessed data
- Improve response times for common requests
- Support horizontal scaling with shared cache
- Handle cache invalidation properly
Decision
We will implement a multi-phase caching and rate limiting strategy with the following components:
Phase 1: In-Memory Cache with TTL Support
Library Selection: We will use github.com/patrickmn/go-cache for in-memory caching because:
✅ Pros:
- Simple, lightweight, and well-maintained
- Built-in TTL (Time-To-Live) support
- Thread-safe by default
- No external dependencies
- Good performance for single-instance applications
- Supports automatic expiration
❌ Cons:
- Not shared between multiple instances
- Memory-bound (not persistent)
- Limited advanced features
Implementation Plan:
type CacheService interface {
    Set(key string, value interface{}, expiration time.Duration) error
    Get(key string) (interface{}, bool)
    Delete(key string) error
    Flush() error
    GetWithTTL(key string) (interface{}, time.Duration, bool)
}

type InMemoryCacheService struct {
    cache           *cache.Cache
    defaultTTL      time.Duration
    cleanupInterval time.Duration
}
Use Cases:
- JWT token validation results
- User session data
- Frequently accessed greet messages
- API response caching for idempotent endpoints
Phase 2: Redis-Compatible Shared Cache
Library Selection: We will use github.com/redis/go-redis/v9 with a Redis-compatible open-source alternative:
Primary Choice: Dragonfly (https://www.dragonflydb.io/)
- Redis-compatible
- Open-source (Apache 2.0 license)
- Written in C++ with multi-threaded architecture
- 25x higher throughput than Redis
- Lower latency
- Drop-in Redis replacement
Fallback Choice: KeyDB (https://keydb.dev/)
- Multi-threaded Redis fork
- Open-source (GPL license)
- Better performance than Redis
- Full Redis API compatibility
Implementation Plan:
type RedisCacheService struct {
    client     *redis.Client
    defaultTTL time.Duration
    prefix     string
}

func NewRedisCacheService(config *config.CacheConfig) (*RedisCacheService, error) {
    client := redis.NewClient(&redis.Options{
        Addr:     config.Host + ":" + strconv.Itoa(config.Port),
        Password: config.Password,
        DB:       config.Database,
        PoolSize: config.PoolSize,
    })

    // Test connection
    _, err := client.Ping(context.Background()).Result()
    if err != nil {
        return nil, fmt.Errorf("failed to connect to Redis: %w", err)
    }

    return &RedisCacheService{
        client:     client,
        defaultTTL: config.DefaultTTL,
        prefix:     config.Prefix,
    }, nil
}
Configuration:
cache:
  # In-memory cache configuration
  in_memory:
    enabled: true
    default_ttl: 5m
    cleanup_interval: 10m
    max_items: 10000

  # Redis-compatible cache configuration
  redis:
    enabled: false
    host: "localhost"
    port: 6379
    password: ""
    database: 0
    pool_size: 10
    default_ttl: 5m
    prefix: "dlc:"
    use_dragonfly: true # Set to false to use KeyDB
Phase 3: Rate Limiting Implementation
Library Selection: We will use github.com/ulule/limiter/v3 because:
✅ Pros:
- Multiple storage backends (in-memory, Redis, etc.)
- Sliding window algorithm
- Distributed rate limiting support
- Configurable rate limits
- Middleware support for Chi router
- Good performance
Implementation Plan:
// Rate limit configuration
type RateLimitConfig struct {
    Enabled          bool     `mapstructure:"enabled"`
    RequestsPerHour  int      `mapstructure:"requests_per_hour"`
    BurstLimit       int      `mapstructure:"burst_limit"`
    IPWhitelist      []string `mapstructure:"ip_whitelist"`
    UseRedis         bool     `mapstructure:"use_redis"`
    RedisPrefix      string   `mapstructure:"redis_prefix"`
    EndpointSpecific map[string]struct {
        RequestsPerHour int `mapstructure:"requests_per_hour"`
        BurstLimit      int `mapstructure:"burst_limit"`
    } `mapstructure:"endpoint_specific"`
}

// Rate limiter service
type RateLimiterService struct {
    limiter *limiter.Limiter
    store   limiter.Store
    config  *RateLimitConfig
}

func NewRateLimiterService(config *RateLimitConfig) (*RateLimiterService, error) {
    var (
        store limiter.Store
        err   error
    )

    // Use Redis if available, otherwise use in-memory
    if config.UseRedis {
        // Redis store from github.com/ulule/limiter/v3/drivers/store/redis
        // (redisClient is a *redis.Client created elsewhere)
        store, err = sredis.NewStoreWithOptions(redisClient, limiter.StoreOptions{
            Prefix: config.RedisPrefix,
            // ... other store options
        })
    } else {
        // In-memory store from github.com/ulule/limiter/v3/drivers/store/memory
        store = memory.NewStore()
    }
    if err != nil {
        return nil, fmt.Errorf("failed to create rate limiter store: %w", err)
    }

    // Create rate limiter
    rate := limiter.Rate{
        Period: time.Hour,
        Limit:  int64(config.RequestsPerHour),
    }

    return &RateLimiterService{
        limiter: limiter.New(store, rate),
        store:   store,
        config:  config,
    }, nil
}
Chi Middleware:
func RateLimitMiddleware(rl *RateLimiterService) func(http.Handler) http.Handler {
    return func(next http.Handler) http.Handler {
        return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
            // Skip rate limiting for whitelisted IPs
            clientIP := r.Header.Get("X-Real-IP")
            if clientIP == "" {
                clientIP = r.RemoteAddr
            }
            for _, allowedIP := range rl.config.IPWhitelist {
                if clientIP == allowedIP {
                    next.ServeHTTP(w, r)
                    return
                }
            }

            // Get rate limit context (named lctx to avoid shadowing the context package)
            lctx, err := rl.limiter.Get(r.Context(), clientIP)
            if err != nil {
                log.Error().Err(err).Str("ip", clientIP).Msg("Rate limit error")
                http.Error(w, "Internal server error", http.StatusInternalServerError)
                return
            }

            // Set rate limit headers on every response
            w.Header().Set("X-RateLimit-Limit", strconv.FormatInt(lctx.Limit, 10))
            w.Header().Set("X-RateLimit-Remaining", strconv.FormatInt(lctx.Remaining, 10))
            w.Header().Set("X-RateLimit-Reset", strconv.FormatInt(lctx.Reset, 10))

            // Reject if the rate limit is exceeded (Reached is a bool in limiter.Context)
            if lctx.Reached {
                http.Error(w, "Too many requests", http.StatusTooManyRequests)
                return
            }

            next.ServeHTTP(w, r)
        })
    }
}
Phase 4: Cache Invalidation Strategy
Approach: Hybrid cache invalidation with multiple strategies:
- Time-Based Expiration (TTL)
  - All cache entries have a TTL
  - Automatic expiration prevents stale data
  - Default TTL: 5 minutes for most data
- Event-Based Invalidation
  - Cache keys are invalidated on specific events
  - Example: User data cache invalidated on user update
  - Uses pub/sub pattern for distributed invalidation
- Versioned Cache Keys
  - Cache keys include data version
  - When data changes, version increments
  - Old cache entries naturally expire
- Write-Through Caching
  - Data written to database and cache simultaneously
  - Ensures cache is always up-to-date
  - Used for critical data that must be consistent
Cache Key Strategy:
func GetCacheKey(prefix, entityType, entityID string) string {
    return fmt.Sprintf("%s:%s:%s", prefix, entityType, entityID)
}

// Example: "dlc:user:123"
// Example: "dlc:jwt:validation:token_hash"
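The versioned cache key strategy described above can be illustrated with a hypothetical helper (GetVersionedCacheKey is not part of the codebase; it simply extends GetCacheKey with a version segment):

```go
package main

import "fmt"

// GetVersionedCacheKey embeds a data version in the key so readers
// never see a stale entry: bumping the version changes the key, and
// the old entry simply ages out via its TTL.
func GetVersionedCacheKey(prefix, entityType, entityID string, version int) string {
	return fmt.Sprintf("%s:%s:%s:v%d", prefix, entityType, entityID, version)
}

func main() {
	fmt.Println(GetVersionedCacheKey("dlc", "user", "123", 1)) // dlc:user:123:v1
	// After the user record changes, increment the version; the next
	// read misses the cache once and repopulates under the new key.
	fmt.Println(GetVersionedCacheKey("dlc", "user", "123", 2)) // dlc:user:123:v2
}
```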
Implementation Phases
Phase 1: In-Memory Cache (Current Sprint)
- ❌ Research and select in-memory cache library
- ❌ Implement cache interface and in-memory service
- ❌ Add cache configuration to config package
- ❌ Implement basic cache operations (set, get, delete)
- ❌ Add TTL support and automatic cleanup
- ❌ Cache JWT validation results
- ❌ Add cache metrics and monitoring
Phase 2: Redis-Compatible Cache (Next Sprint)
- ❌ Set up Dragonfly/KeyDB in development environment
- ❌ Implement Redis cache service
- ❌ Add configuration for Redis connection
- ❌ Implement cache fallback strategy (Redis → in-memory)
- ❌ Add health checks for Redis connection
- ❌ Implement distributed cache invalidation
Phase 3: Rate Limiting (Following Sprint)
- ❌ Research and select rate limiting library
- ❌ Implement rate limiter service
- ❌ Add rate limit configuration
- ❌ Implement Chi middleware for rate limiting
- ❌ Add rate limit headers to responses
- ❌ Implement IP whitelisting
- ❌ Add endpoint-specific rate limits
Phase 4: Advanced Features (Future)
- ❌ Cache warming for critical data
- ❌ Two-level caching (Redis + in-memory)
- ❌ Cache compression for large objects
- ❌ Rate limit exemptions for admin users
- ❌ Dynamic rate limit adjustment
- ❌ Cache analytics and usage patterns
Configuration
# Cache configuration
cache:
  in_memory:
    enabled: true
    default_ttl: "5m"
    cleanup_interval: "10m"
    max_items: 10000

  redis:
    enabled: false
    host: "localhost"
    port: 6379
    password: ""
    database: 0
    pool_size: 10
    default_ttl: "5m"
    prefix: "dlc:"
    use_dragonfly: true

# Rate limiting configuration
rate_limiting:
  enabled: true
  requests_per_hour: 1000
  burst_limit: 100
  ip_whitelist:
    - "127.0.0.1"
    - "::1"
  endpoint_specific:
    "/api/v1/auth/login":
      requests_per_hour: 100
      burst_limit: 10
    "/api/v1/auth/register":
      requests_per_hour: 50
      burst_limit: 5
Monitoring and Metrics
Cache Metrics:
- Cache hit/miss ratio
- Average cache latency
- Cache size and memory usage
- Eviction rate
- TTL distribution
Rate Limit Metrics:
- Requests allowed vs rejected
- Rate limit exceeded events
- Top limited IPs
- Endpoint-specific rate limit usage
Prometheus Metrics:
var (
    cacheHits = prometheus.NewCounterVec(prometheus.CounterOpts{
        Name: "cache_hits_total",
        Help: "Number of cache hits",
    }, []string{"cache_type", "entity_type"})

    cacheMisses = prometheus.NewCounterVec(prometheus.CounterOpts{
        Name: "cache_misses_total",
        Help: "Number of cache misses",
    }, []string{"cache_type", "entity_type"})

    rateLimitExceeded = prometheus.NewCounterVec(prometheus.CounterOpts{
        Name: "rate_limit_exceeded_total",
        Help: "Number of rate limit exceeded events",
    }, []string{"endpoint", "ip"})
)
Security Considerations
- Cache Security:
  - Never cache sensitive user data (passwords, tokens)
  - Use separate cache prefixes for different data types
  - Implement cache key hashing for sensitive data
  - Set appropriate TTLs to limit exposure
- Rate Limit Security:
  - Prevent rate limit bypass attacks
  - Use X-Real-IP header for proper IP detection
  - Implement rate limit for authentication endpoints
  - Log rate limit violations for security monitoring
- Redis Security:
  - Use authentication if enabled
  - Implement TLS for Redis connections
  - Use separate database numbers for different environments
  - Limit Redis commands to prevent abuse
Performance Considerations
- Cache Performance:
  - Benchmark cache operations
  - Monitor cache latency
  - Optimize cache key size
  - Use appropriate data structures
- Rate Limit Performance:
  - Use efficient rate limiting algorithm
  - Minimize middleware overhead
  - Cache rate limit decisions
  - Batch rate limit checks where possible
- Memory Management:
  - Set reasonable cache size limits
  - Monitor memory usage
  - Implement cache eviction policies
  - Use memory-efficient data structures
Migration Strategy
From No Cache to In-Memory Cache
1. Implement cache interface and in-memory service
2. Add cache configuration with sensible defaults
3. Gradually add caching to critical endpoints
4. Monitor cache performance and hit ratios
5. Adjust TTLs based on usage patterns
From In-Memory to Redis Cache
1. Set up Dragonfly/KeyDB in development
2. Implement Redis cache service
3. Add fallback logic (Redis → in-memory)
4. Test with both caches enabled
5. Gradually migrate to Redis-only
6. Monitor distributed cache performance
From No Rate Limiting to Rate Limiting
1. Implement rate limiter with generous limits
2. Add monitoring for rate limit events
3. Gradually tighten limits based on usage
4. Add IP whitelist for critical services
5. Implement endpoint-specific limits
6. Monitor and adjust as needed
Alternatives Considered
Cache Libraries
- github.com/bluele/gcache - More features but more complex
- github.com/allegro/bigcache - High performance but no TTL
- github.com/coocood/freecache - Very fast but limited API
Redis Alternatives
- Redis Enterprise - Commercial, not open-source
- Memcached - No persistence, simpler protocol
- Couchbase - More complex, document-oriented
Rate Limiting Libraries
- golang.org/x/time/rate - Simple but no distributed support
- github.com/juju/ratelimit - Good but limited features
- Custom implementation - Too much development effort
Success Metrics
- Cache Effectiveness:
  - Cache hit ratio > 80%
  - Average cache latency < 1ms
  - Memory usage within limits
- Rate Limiting Effectiveness:
  - < 1% of legitimate requests blocked
  - Effective protection against abuse
  - No impact on normal usage patterns
- System Stability:
  - Reduced database load by 50%
  - Consistent response times under load
  - No cache-related outages
Risks and Mitigations
| Risk | Mitigation |
|---|---|
| Cache stampede | Implement cache warming and fallback logic |
| Memory exhaustion | Set reasonable cache size limits and monitor usage |
| Redis failure | Implement fallback to in-memory cache |
| Rate limit false positives | Start with generous limits and monitor |
| Performance degradation | Benchmark before and after implementation |
| Cache inconsistency | Use appropriate invalidation strategies |
Future Enhancements
- Cache Pre-warming - Load frequently used data at startup
- Two-Level Caching - Local cache + distributed cache
- Cache Compression - For large cache objects
- Dynamic Rate Limits - Adjust based on system load
- User-Specific Rate Limits - Different limits for different user tiers
- Cache Analytics - Detailed usage patterns and optimization
References
- go-cache documentation
- Dragonfly documentation
- KeyDB documentation
- limiter/v3 documentation
- Chi middleware documentation
Decision Drivers
- Simplicity - Easy to implement and maintain
- Performance - Minimal impact on response times
- Scalability - Support for horizontal scaling
- Reliability - Graceful degradation on failures
- Open Source - Preference for open-source solutions
- Community - Active development and support
Conclusion
This ADR proposes a comprehensive caching and rate limiting strategy that will significantly improve the performance, scalability, and reliability of the dance-lessons-coach application. The phased approach allows for gradual implementation and testing, minimizing risk while delivering value at each stage.
The combination of in-memory caching for single-instance deployments and Redis-compatible caching for distributed environments provides flexibility for different deployment scenarios. The rate limiting implementation will protect the application from abuse while maintaining a good user experience.
This strategy aligns with our architectural principles of simplicity, performance, and scalability while using well-established open-source technologies with strong community support.