ADR 0022: Rate Limiting and Cache Strategy

Status

Proposed 🟡

⚠️ Not yet implemented. Gitea issue #13 ("feat: Implement Rate Limiting and Caching Strategy") is open and tracks this work. go-cache, redis, and ulule/limiter are absent from go.mod. The phase checkboxes below are corrected to reflect actual status.

Context

As the dance-lessons-coach application grows and potentially serves multiple users simultaneously, we need to implement rate limiting to:

  1. Prevent abuse of API endpoints
  2. Protect against DDoS attacks
  3. Ensure fair usage across all users
  4. Maintain system stability under load
  5. Provide consistent performance

Additionally, we need a caching strategy to:

  1. Reduce database load for frequently accessed data
  2. Improve response times for common requests
  3. Support horizontal scaling with shared cache
  4. Handle cache invalidation properly

Decision

We will implement a multi-phase caching and rate limiting strategy with the following components:

Phase 1: In-Memory Cache with TTL Support

Library Selection: We will use github.com/patrickmn/go-cache for in-memory caching because:

Pros:

  • Simple, lightweight, and well-maintained
  • Built-in TTL (Time-To-Live) support
  • Thread-safe by default
  • No external dependencies
  • Good performance for single-instance applications
  • Supports automatic expiration

Cons:

  • Not shared between multiple instances
  • Memory-bound (not persistent)
  • Limited advanced features

Implementation Plan:

// CacheService defines the caching operations used throughout the application.
type CacheService interface {
    Set(key string, value interface{}, expiration time.Duration) error
    Get(key string) (interface{}, bool)
    Delete(key string) error
    Flush() error
    GetWithTTL(key string) (interface{}, time.Duration, bool)
}

// InMemoryCacheService implements CacheService using github.com/patrickmn/go-cache.
type InMemoryCacheService struct {
    cache           *cache.Cache
    defaultTTL      time.Duration
    cleanupInterval time.Duration
}
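
Implementing the interface on top of go-cache is then mostly a thin wrapper. The sketch below assumes go-cache's published API (cache.New, Set, Get, Delete, Flush, GetWithExpiration); the constructor name is ours, not from the library:

// Sketch only: backs CacheService with patrickmn/go-cache.
import (
    "time"

    "github.com/patrickmn/go-cache"
)

func NewInMemoryCacheService(defaultTTL, cleanupInterval time.Duration) *InMemoryCacheService {
    return &InMemoryCacheService{
        cache:           cache.New(defaultTTL, cleanupInterval),
        defaultTTL:      defaultTTL,
        cleanupInterval: cleanupInterval,
    }
}

func (s *InMemoryCacheService) Set(key string, value interface{}, expiration time.Duration) error {
    if expiration <= 0 {
        expiration = s.defaultTTL
    }
    s.cache.Set(key, value, expiration) // go-cache's Set cannot fail
    return nil
}

func (s *InMemoryCacheService) Get(key string) (interface{}, bool) {
    return s.cache.Get(key)
}

func (s *InMemoryCacheService) Delete(key string) error {
    s.cache.Delete(key)
    return nil
}

func (s *InMemoryCacheService) Flush() error {
    s.cache.Flush()
    return nil
}

func (s *InMemoryCacheService) GetWithTTL(key string) (interface{}, time.Duration, bool) {
    value, expiresAt, found := s.cache.GetWithExpiration(key)
    if !found {
        return nil, 0, false
    }
    if expiresAt.IsZero() { // item was stored with no expiration
        return value, 0, true
    }
    return value, time.Until(expiresAt), true
}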

Use Cases:

  • JWT token validation results
  • User session data
  • Frequently accessed greet messages
  • API response caching for idempotent endpoints

Phase 2: Redis-Compatible Shared Cache

Library Selection: We will use the github.com/redis/go-redis/v9 client against a Redis-compatible alternative server:

Primary Choice: Dragonfly (https://www.dragonflydb.io/)

  • Redis-compatible
  • Source-available (Business Source License 1.1)
  • Written in C++ with a multi-threaded architecture
  • Vendor benchmarks claim up to 25x higher throughput than Redis
  • Lower latency
  • Drop-in Redis replacement

Fallback Choice: KeyDB (https://keydb.dev/)

  • Multi-threaded Redis fork
  • Open-source (BSD-3-Clause license)
  • Claims better performance than Redis on multi-core hardware
  • Full Redis API compatibility

Implementation Plan:

// RedisCacheService implements CacheService against a Redis-compatible
// server via github.com/redis/go-redis/v9.
type RedisCacheService struct {
    client     *redis.Client
    defaultTTL time.Duration
    prefix     string
}

func NewRedisCacheService(config *config.CacheConfig) (*RedisCacheService, error) {
    client := redis.NewClient(&redis.Options{
        Addr:     config.Host + ":" + strconv.Itoa(config.Port),
        Password: config.Password,
        DB:       config.Database,
        PoolSize: config.PoolSize,
    })

    // Verify the connection before handing out the service
    if _, err := client.Ping(context.Background()).Result(); err != nil {
        return nil, fmt.Errorf("failed to connect to Redis: %w", err)
    }

    return &RedisCacheService{
        client:     client,
        defaultTTL: config.DefaultTTL,
        prefix:     config.Prefix,
    }, nil
}
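
For illustration, Set and Get could sit on top of this client roughly as follows. This is a sketch, not a settled design: JSON serialization and the treatment of transport errors as cache misses are assumptions:

// Sketch: JSON-serialized Set/Get over go-redis v9.
// Assumes: import ("context"; "encoding/json"; "fmt"; "time").
func (s *RedisCacheService) Set(key string, value interface{}, expiration time.Duration) error {
    if expiration <= 0 {
        expiration = s.defaultTTL
    }
    data, err := json.Marshal(value)
    if err != nil {
        return fmt.Errorf("failed to serialize cache value: %w", err)
    }
    return s.client.Set(context.Background(), s.prefix+key, data, expiration).Err()
}

func (s *RedisCacheService) Get(key string) (interface{}, bool) {
    data, err := s.client.Get(context.Background(), s.prefix+key).Bytes()
    if err != nil {
        // redis.Nil (a miss) and transport errors are both reported as misses here
        return nil, false
    }
    var value interface{}
    if err := json.Unmarshal(data, &value); err != nil {
        return nil, false
    }
    return value, true
}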

Configuration:

cache:
  # In-memory cache configuration
  in_memory:
    enabled: true
    default_ttl: 5m
    cleanup_interval: 10m
    max_items: 10000
  
  # Redis-compatible cache configuration
  redis:
    enabled: false
    host: "localhost"
    port: 6379
    password: ""
    database: 0
    pool_size: 10
    default_ttl: 5m
    prefix: "dlc:"
    use_dragonfly: true  # Set to false to use KeyDB
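
The redis block above would map onto something like the struct below in the config package. This shape is an assumption (it matches what NewRedisCacheService consumes); the actual field layout is to be decided during implementation:

// Hypothetical config struct for the redis block above; the in_memory
// block would get a sibling struct. Duration strings like "5m" decode
// via mapstructure's string-to-duration hook.
type CacheConfig struct {
    Enabled      bool          `mapstructure:"enabled"`
    Host         string        `mapstructure:"host"`
    Port         int           `mapstructure:"port"`
    Password     string        `mapstructure:"password"`
    Database     int           `mapstructure:"database"`
    PoolSize     int           `mapstructure:"pool_size"`
    DefaultTTL   time.Duration `mapstructure:"default_ttl"`
    Prefix       string        `mapstructure:"prefix"`
    UseDragonfly bool          `mapstructure:"use_dragonfly"`
}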

Phase 3: Rate Limiting Implementation

Library Selection: We will use github.com/ulule/limiter/v3 because:

Pros:

  • Multiple storage backends (in-memory, Redis, etc.)
  • Sliding window algorithm
  • Distributed rate limiting support
  • Configurable rate limits
  • Middleware support for Chi router
  • Good performance

Implementation Plan:

// Rate limit configuration. UseRedis and RedisPrefix are needed by the
// constructor below.
type RateLimitConfig struct {
    Enabled          bool     `mapstructure:"enabled"`
    RequestsPerHour  int      `mapstructure:"requests_per_hour"`
    BurstLimit       int      `mapstructure:"burst_limit"`
    IPWhitelist      []string `mapstructure:"ip_whitelist"`
    UseRedis         bool     `mapstructure:"use_redis"`
    RedisPrefix      string   `mapstructure:"redis_prefix"`
    EndpointSpecific map[string]struct {
        RequestsPerHour int `mapstructure:"requests_per_hour"`
        BurstLimit      int `mapstructure:"burst_limit"`
    } `mapstructure:"endpoint_specific"`
}

// Rate limiter service
type RateLimiterService struct {
    limiter *limiter.Limiter
    store   limiter.Store
    config  *RateLimitConfig
}

// Store constructors live in the limiter driver packages:
//   memory "github.com/ulule/limiter/v3/drivers/store/memory"
//   sredis "github.com/ulule/limiter/v3/drivers/store/redis"
// The Redis store needs a go-redis client, so the constructor takes one.
func NewRateLimiterService(config *RateLimitConfig, redisClient *redis.Client) (*RateLimiterService, error) {
    var store limiter.Store
    var err error

    // Use Redis if configured, otherwise fall back to the in-memory store
    if config.UseRedis {
        store, err = sredis.NewStoreWithOptions(redisClient, limiter.StoreOptions{
            Prefix: config.RedisPrefix,
        })
        if err != nil {
            return nil, fmt.Errorf("failed to create rate limiter store: %w", err)
        }
    } else {
        store = memory.NewStore()
    }

    // Create rate limiter
    rate := limiter.Rate{
        Period: time.Hour,
        Limit:  int64(config.RequestsPerHour),
    }

    return &RateLimiterService{
        limiter: limiter.New(store, rate),
        store:   store,
        config:  config,
    }, nil
}

Chi Middleware:

func RateLimitMiddleware(rl *RateLimiterService) func(http.Handler) http.Handler {
    return func(next http.Handler) http.Handler {
        return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
            // Resolve the client IP: X-Real-IP when set by a trusted proxy,
            // otherwise RemoteAddr with its port stripped
            clientIP := r.Header.Get("X-Real-IP")
            if clientIP == "" {
                if host, _, err := net.SplitHostPort(r.RemoteAddr); err == nil {
                    clientIP = host
                } else {
                    clientIP = r.RemoteAddr
                }
            }

            // Skip rate limiting for whitelisted IPs
            for _, allowedIP := range rl.config.IPWhitelist {
                if clientIP == allowedIP {
                    next.ServeHTTP(w, r)
                    return
                }
            }

            // Get the rate limit context for this client
            limiterCtx, err := rl.limiter.Get(r.Context(), clientIP)
            if err != nil {
                log.Error().Err(err).Str("ip", clientIP).Msg("Rate limit error")
                http.Error(w, "Internal server error", http.StatusInternalServerError)
                return
            }

            // Set rate limit headers on every response
            w.Header().Set("X-RateLimit-Limit", strconv.FormatInt(limiterCtx.Limit, 10))
            w.Header().Set("X-RateLimit-Remaining", strconv.FormatInt(limiterCtx.Remaining, 10))
            w.Header().Set("X-RateLimit-Reset", strconv.FormatInt(limiterCtx.Reset, 10))

            // Reject the request if the rate limit is exceeded
            // (limiter.Context.Reached is a bool, not a counter)
            if limiterCtx.Reached {
                http.Error(w, "Too many requests", http.StatusTooManyRequests)
                return
            }

            next.ServeHTTP(w, r)
        })
    }
}
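
Wiring the middleware into the router could then look like the sketch below. This assumes the existing Chi setup; cfg.RateLimiting and greetHandler are placeholders, not actual names from the codebase:

// Sketch: hypothetical wiring into the Chi router.
r := chi.NewRouter()

rl, err := NewRateLimiterService(cfg.RateLimiting, nil) // nil client => in-memory store
if err != nil {
    log.Fatal().Err(err).Msg("failed to initialize rate limiter")
}

r.Use(RateLimitMiddleware(rl)) // applies to every route registered below
r.Get("/api/v1/greet", greetHandler)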

Phase 4: Cache Invalidation Strategy

Approach: Hybrid cache invalidation with multiple strategies:

  1. Time-Based Expiration (TTL)

    • All cache entries have a TTL
    • Automatic expiration prevents stale data
    • Default TTL: 5 minutes for most data
  2. Event-Based Invalidation

    • Cache keys are invalidated on specific events
    • Example: User data cache invalidated on user update
    • Uses pub/sub pattern for distributed invalidation
  3. Versioned Cache Keys

    • Cache keys include data version
    • When data changes, version increments
    • Old cache entries naturally expire
  4. Write-Through Caching

    • Data written to database and cache simultaneously
    • Ensures cache is always up-to-date
    • Used for critical data that must be consistent

Cache Key Strategy:

func GetCacheKey(prefix, entityType, entityID string) string {
    return fmt.Sprintf("%s:%s:%s", prefix, entityType, entityID)
}

// Example: "dlc:user:123"
// Example: "dlc:jwt:validation:token_hash"
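
Strategy 3 above (versioned cache keys) could extend this helper. A sketch; where the version counter lives (a database column, a Redis key, ...) is deliberately left open:

// Sketch of versioned cache keys (strategy 3).
func GetVersionedCacheKey(prefix, entityType, entityID string, version int64) string {
    return fmt.Sprintf("%s:%s:v%d:%s", prefix, entityType, version, entityID)
}

// Bumping the version on write makes keys minted under the old version
// unreachable; they then age out via their TTL instead of being deleted.
// Example: "dlc:user:v1:123" becomes "dlc:user:v2:123" after an update.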

Implementation Phases

Phase 1: In-Memory Cache (Current Sprint)

  • Research and select in-memory cache library
  • Implement cache interface and in-memory service
  • Add cache configuration to config package
  • Implement basic cache operations (set, get, delete)
  • Add TTL support and automatic cleanup
  • Cache JWT validation results
  • Add cache metrics and monitoring

Phase 2: Redis-Compatible Cache (Next Sprint)

  • Set up Dragonfly/KeyDB in development environment
  • Implement Redis cache service
  • Add configuration for Redis connection
  • Implement cache fallback strategy (Redis → in-memory)
  • Add health checks for Redis connection
  • Implement distributed cache invalidation

Phase 3: Rate Limiting (Following Sprint)

  • Research and select rate limiting library
  • Implement rate limiter service
  • Add rate limit configuration
  • Implement Chi middleware for rate limiting
  • Add rate limit headers to responses
  • Implement IP whitelisting
  • Add endpoint-specific rate limits

Phase 4: Advanced Features (Future)

  • Cache warming for critical data
  • Two-level caching (Redis + in-memory)
  • Cache compression for large objects
  • Rate limit exemptions for admin users
  • Dynamic rate limit adjustment
  • Cache analytics and usage patterns

Configuration

# Cache configuration
cache:
  in_memory:
    enabled: true
    default_ttl: "5m"
    cleanup_interval: "10m"
    max_items: 10000
  
  redis:
    enabled: false
    host: "localhost"
    port: 6379
    password: ""
    database: 0
    pool_size: 10
    default_ttl: "5m"
    prefix: "dlc:"
    use_dragonfly: true

# Rate limiting configuration
rate_limiting:
  enabled: true
  requests_per_hour: 1000
  burst_limit: 100
  ip_whitelist:
    - "127.0.0.1"
    - "::1"
  endpoint_specific:
    "/api/v1/auth/login":
      requests_per_hour: 100
      burst_limit: 10
    "/api/v1/auth/register":
      requests_per_hour: 50
      burst_limit: 5
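
The mapstructure tags on RateLimitConfig suggest viper-based decoding; if so, loading this block could look like the sketch below (an assumption about the config package, not a settled choice):

// Sketch: decoding the rate_limiting block, assuming viper already reads
// the application's config file elsewhere in the config package.
var rlConfig RateLimitConfig
if err := viper.UnmarshalKey("rate_limiting", &rlConfig); err != nil {
    return nil, fmt.Errorf("failed to load rate limiting config: %w", err)
}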

Monitoring and Metrics

Cache Metrics:

  • Cache hit/miss ratio
  • Average cache latency
  • Cache size and memory usage
  • Eviction rate
  • TTL distribution

Rate Limit Metrics:

  • Requests allowed vs rejected
  • Rate limit exceeded events
  • Top limited IPs
  • Endpoint-specific rate limit usage

Prometheus Metrics:

var (
    cacheHits = prometheus.NewCounterVec(prometheus.CounterOpts{
        Name: "cache_hits_total",
        Help: "Number of cache hits",
    }, []string{"cache_type", "entity_type"})
    
    cacheMisses = prometheus.NewCounterVec(prometheus.CounterOpts{
        Name: "cache_misses_total",
        Help: "Number of cache misses",
    }, []string{"cache_type", "entity_type"})
    
    rateLimitExceeded = prometheus.NewCounterVec(prometheus.CounterOpts{
        Name: "rate_limit_exceeded_total",
        Help: "Number of rate limit exceeded events",
    }, []string{"endpoint", "ip"})
)
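
These collectors still need to be registered once at startup and incremented at the call sites; a sketch (instrumentedGet is a hypothetical wrapper, not existing code):

// Register the collectors with the default registry once at startup.
func init() {
    prometheus.MustRegister(cacheHits, cacheMisses, rateLimitExceeded)
}

// Hypothetical instrumentation wrapper around the cache Get path.
func (s *InMemoryCacheService) instrumentedGet(key, entityType string) (interface{}, bool) {
    value, found := s.Get(key)
    if found {
        cacheHits.WithLabelValues("in_memory", entityType).Inc()
    } else {
        cacheMisses.WithLabelValues("in_memory", entityType).Inc()
    }
    return value, found
}

Note that labeling rate_limit_exceeded_total by ip can produce unbounded label cardinality; this may need revisiting during implementation.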

Security Considerations

  1. Cache Security:

    • Never cache sensitive user data (passwords, tokens)
    • Use separate cache prefixes for different data types
    • Implement cache key hashing for sensitive data (see the sketch after this list)
    • Set appropriate TTLs to limit exposure
  2. Rate Limit Security:

    • Prevent rate limit bypass attacks
    • Use X-Real-IP header for proper IP detection
    • Implement rate limit for authentication endpoints
    • Log rate limit violations for security monitoring
  3. Redis Security:

    • Use authentication if enabled
    • Implement TLS for Redis connections
    • Use separate database numbers for different environments
    • Limit Redis commands to prevent abuse
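
For the key-hashing point above (item 1), cached JWT validation results should be keyed by a digest of the token rather than the token itself. A minimal sketch using SHA-256:

// Sketch: derive a cache key from a token without the token ever
// appearing in the cache.
import (
    "crypto/sha256"
    "encoding/hex"
)

func JWTValidationCacheKey(prefix, token string) string {
    sum := sha256.Sum256([]byte(token))
    return prefix + ":jwt:validation:" + hex.EncodeToString(sum[:])
}

// Example key: "dlc:jwt:validation:<hex digest>"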

Performance Considerations

  1. Cache Performance:

    • Benchmark cache operations
    • Monitor cache latency
    • Optimize cache key size
    • Use appropriate data structures
  2. Rate Limit Performance:

    • Use efficient rate limiting algorithm
    • Minimize middleware overhead
    • Cache rate limit decisions
    • Batch rate limit checks where possible
  3. Memory Management:

    • Set reasonable cache size limits
    • Monitor memory usage
    • Implement cache eviction policies
    • Use memory-efficient data structures

Migration Strategy

From No Cache to In-Memory Cache

  1. Implement cache interface and in-memory service
  2. Add cache configuration with sensible defaults
  3. Gradually add caching to critical endpoints
  4. Monitor cache performance and hit ratios
  5. Adjust TTLs based on usage patterns

From In-Memory to Redis Cache

  1. Set up Dragonfly/KeyDB in development
  2. Implement Redis cache service
  3. Add fallback logic (Redis → in-memory)
  4. Test with both caches enabled
  5. Gradually migrate to Redis-only
  6. Monitor distributed cache performance

From No Rate Limiting to Rate Limiting

  1. Implement rate limiter with generous limits
  2. Add monitoring for rate limit events
  3. Gradually tighten limits based on usage
  4. Add IP whitelist for critical services
  5. Implement endpoint-specific limits
  6. Monitor and adjust as needed

Alternatives Considered

Cache Libraries

  1. github.com/bluele/gcache - More features but more complex
  2. github.com/allegro/bigcache - High performance but no per-entry TTL
  3. github.com/coocood/freecache - Very fast but limited API

Redis Alternatives

  1. Redis Enterprise - Commercial, not open-source
  2. Memcached - No persistence, simpler protocol
  3. Couchbase - More complex, document-oriented

Rate Limiting Libraries

  1. golang.org/x/time/rate - Simple but no distributed support
  2. github.com/juju/ratelimit - Good but limited features
  3. Custom implementation - Too much development effort

Success Metrics

  1. Cache Effectiveness:

    • Cache hit ratio > 80%
    • Average cache latency < 1ms
    • Memory usage within limits
  2. Rate Limiting Effectiveness:

    • < 1% of legitimate requests blocked
    • Effective protection against abuse
    • No impact on normal usage patterns
  3. System Stability:

    • Reduced database load by 50%
    • Consistent response times under load
    • No cache-related outages

Risks and Mitigations

  • Cache stampede: implement cache warming and fallback logic
  • Memory exhaustion: set reasonable cache size limits and monitor usage
  • Redis failure: implement fallback to in-memory cache
  • Rate limit false positives: start with generous limits and monitor
  • Performance degradation: benchmark before and after implementation
  • Cache inconsistency: use appropriate invalidation strategies

Future Enhancements

  1. Cache Pre-warming - Load frequently used data at startup
  2. Two-Level Caching - Local cache + distributed cache
  3. Cache Compression - For large cache objects
  4. Dynamic Rate Limits - Adjust based on system load
  5. User-Specific Rate Limits - Different limits for different user tiers
  6. Cache Analytics - Detailed usage patterns and optimization

Decision Drivers

  1. Simplicity - Easy to implement and maintain
  2. Performance - Minimal impact on response times
  3. Scalability - Support for horizontal scaling
  4. Reliability - Graceful degradation on failures
  5. Open Source - Preference for open-source solutions
  6. Community - Active development and support

Conclusion

This ADR proposes a comprehensive caching and rate limiting strategy that will significantly improve the performance, scalability, and reliability of the dance-lessons-coach application. The phased approach allows for gradual implementation and testing, minimizing risk while delivering value at each stage.

The combination of in-memory caching for single-instance deployments and Redis-compatible caching for distributed environments provides flexibility for different deployment scenarios. The rate limiting implementation will protect the application from abuse while maintaining a good user experience.

This strategy aligns with our architectural principles of simplicity, performance, and scalability while using well-established open-source technologies with strong community support.