
# ADR 0022: Rate Limiting and Cache Strategy
**Status:** Implemented (Phase 1); Phases 2-4 remain Proposed
## Context
As the dance-lessons-coach application grows and potentially serves multiple users simultaneously, we need to implement rate limiting to:
1. **Prevent abuse** of API endpoints
2. **Protect against DDoS attacks**
3. **Ensure fair usage** across all users
4. **Maintain system stability** under load
5. **Provide consistent performance**
Additionally, we need a caching strategy to:
1. **Reduce database load** for frequently accessed data
2. **Improve response times** for common requests
3. **Support horizontal scaling** with shared cache
4. **Handle cache invalidation** properly
## Decision
We will implement a **multi-phase caching and rate limiting strategy** with the following components:
### Phase 1: In-Memory Cache with TTL Support
**Library Selection**: We will use **`github.com/patrickmn/go-cache`** for in-memory caching because:
**Pros:**
- Simple, lightweight, stable, and widely used
- Built-in TTL (Time-To-Live) support
- Thread-safe by default
- No external dependencies
- Good performance for single-instance applications
- Supports automatic expiration
**Cons:**
- Not shared between multiple instances
- Memory-bound (not persistent)
- Limited advanced features
**Implementation Plan:**
```go
type CacheService interface {
    Set(key string, value interface{}, expiration time.Duration) error
    Get(key string) (interface{}, bool)
    Delete(key string) error
    Flush() error
    GetWithTTL(key string) (interface{}, time.Duration, bool)
}

type InMemoryCacheService struct {
    cache           *cache.Cache
    defaultTTL      time.Duration
    cleanupInterval time.Duration
}
```
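A minimal sketch of how `InMemoryCacheService` could satisfy this interface with go-cache (the constructor and the TTL fallback are our assumptions; go-cache's own `Set` cannot fail, so the `error` returns exist only to satisfy the interface):
```go
import (
    "time"

    cache "github.com/patrickmn/go-cache"
)

func NewInMemoryCacheService(defaultTTL, cleanupInterval time.Duration) *InMemoryCacheService {
    return &InMemoryCacheService{
        cache:           cache.New(defaultTTL, cleanupInterval),
        defaultTTL:      defaultTTL,
        cleanupInterval: cleanupInterval,
    }
}

// Set stores a value, falling back to the default TTL when none is given.
func (s *InMemoryCacheService) Set(key string, value interface{}, expiration time.Duration) error {
    if expiration <= 0 {
        expiration = s.defaultTTL
    }
    s.cache.Set(key, value, expiration)
    return nil
}

// Get returns the value and whether it was present and unexpired.
func (s *InMemoryCacheService) Get(key string) (interface{}, bool) {
    return s.cache.Get(key)
}

func (s *InMemoryCacheService) Delete(key string) error {
    s.cache.Delete(key)
    return nil
}

func (s *InMemoryCacheService) Flush() error {
    s.cache.Flush()
    return nil
}

// GetWithTTL converts go-cache's absolute expiration time into a remaining TTL.
func (s *InMemoryCacheService) GetWithTTL(key string) (interface{}, time.Duration, bool) {
    value, expiration, found := s.cache.GetWithExpiration(key)
    if !found {
        return nil, 0, false
    }
    if expiration.IsZero() {
        // A zero expiration time means the entry never expires.
        return value, 0, true
    }
    return value, time.Until(expiration), true
}
```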
**Use Cases:**
- JWT token validation results
- User session data
- Frequently accessed greet messages
- API response caching for idempotent endpoints
### Phase 2: Redis-Compatible Shared Cache
**Library Selection**: We will use **`github.com/redis/go-redis/v9`** with a **Redis-compatible open-source alternative**:
**Primary Choice**: **Dragonfly** (https://www.dragonflydb.io/)
- Redis-compatible
- Source-available under the Business Source License 1.1 (not an OSI-approved open-source license)
- Written in C++ with multi-threaded architecture
- Up to 25x higher throughput than Redis, per the vendor's benchmarks
- Lower latency
- Drop-in Redis replacement
**Fallback Choice**: **KeyDB** (https://keydb.dev/)
- Multi-threaded Redis fork
- Open-source (BSD 3-Clause license)
- Better multi-core performance than stock Redis
- Full Redis API compatibility
**Implementation Plan:**
```go
type RedisCacheService struct {
    client     *redis.Client
    defaultTTL time.Duration
    prefix     string
}

func NewRedisCacheService(config *config.CacheConfig) (*RedisCacheService, error) {
    client := redis.NewClient(&redis.Options{
        Addr:     config.Host + ":" + strconv.Itoa(config.Port),
        Password: config.Password,
        DB:       config.Database,
        PoolSize: config.PoolSize,
    })

    // Test the connection before returning the service
    if _, err := client.Ping(context.Background()).Result(); err != nil {
        return nil, fmt.Errorf("failed to connect to Redis: %w", err)
    }

    return &RedisCacheService{
        client:     client,
        defaultTTL: config.DefaultTTL,
        prefix:     config.Prefix,
    }, nil
}
```
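A sketch of `Set`/`Get` on this service with go-redis v9; the context-aware signatures and `[]byte` values are our assumptions and differ slightly from the Phase 1 interface, which would need reconciling:
```go
import (
    "context"
    "errors"
    "time"

    "github.com/redis/go-redis/v9"
)

// Set writes a value under the service prefix; a zero expiration falls back
// to the configured default TTL.
func (s *RedisCacheService) Set(ctx context.Context, key string, value []byte, expiration time.Duration) error {
    if expiration <= 0 {
        expiration = s.defaultTTL
    }
    return s.client.Set(ctx, s.prefix+key, value, expiration).Err()
}

// Get reads a value; redis.Nil signals a miss rather than a real error.
func (s *RedisCacheService) Get(ctx context.Context, key string) ([]byte, bool, error) {
    data, err := s.client.Get(ctx, s.prefix+key).Bytes()
    if errors.Is(err, redis.Nil) {
        return nil, false, nil // cache miss
    }
    if err != nil {
        return nil, false, err
    }
    return data, true, nil
}
```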
**Configuration:**
```yaml
cache:
  # In-memory cache configuration
  in_memory:
    enabled: true
    default_ttl: 5m
    cleanup_interval: 10m
    max_items: 10000

  # Redis-compatible cache configuration
  redis:
    enabled: false
    host: "localhost"
    port: 6379
    password: ""
    database: 0
    pool_size: 10
    default_ttl: 5m
    prefix: "dlc:"
    use_dragonfly: true  # Set to false to use KeyDB
```
### Phase 3: Rate Limiting Implementation
**Library Selection**: We will use **`github.com/ulule/limiter/v3`** because:
**Pros:**
- Multiple storage backends (in-memory, Redis, etc.)
- Sliding window algorithm
- Distributed rate limiting support
- Configurable rate limits
- Middleware support for Chi router
- Good performance
**Implementation Plan:**
```go
// Rate limit configuration
type RateLimitConfig struct {
    Enabled          bool     `mapstructure:"enabled"`
    RequestsPerHour  int      `mapstructure:"requests_per_hour"`
    BurstLimit       int      `mapstructure:"burst_limit"`
    IPWhitelist      []string `mapstructure:"ip_whitelist"`
    UseRedis         bool     `mapstructure:"use_redis"`
    RedisPrefix      string   `mapstructure:"redis_prefix"`
    EndpointSpecific map[string]struct {
        RequestsPerHour int `mapstructure:"requests_per_hour"`
        BurstLimit      int `mapstructure:"burst_limit"`
    } `mapstructure:"endpoint_specific"`
}

// Rate limiter service
type RateLimiterService struct {
    limiter *limiter.Limiter
    store   limiter.Store
    config  *RateLimitConfig
}

// The store constructors live in the limiter driver sub-packages:
//   memory "github.com/ulule/limiter/v3/drivers/store/memory"
//   sredis "github.com/ulule/limiter/v3/drivers/store/redis"
func NewRateLimiterService(config *RateLimitConfig, redisClient *redis.Client) (*RateLimiterService, error) {
    var store limiter.Store
    var err error

    // Use Redis if configured, otherwise fall back to the in-memory store
    if config.UseRedis {
        store, err = sredis.NewStoreWithOptions(redisClient, limiter.StoreOptions{
            Prefix: config.RedisPrefix,
        })
        if err != nil {
            return nil, fmt.Errorf("failed to create rate limiter store: %w", err)
        }
    } else {
        store = memory.NewStore()
    }

    // Create the rate limiter with a global requests-per-hour rate
    rate := limiter.Rate{
        Period: time.Hour,
        Limit:  int64(config.RequestsPerHour),
    }

    return &RateLimiterService{
        limiter: limiter.New(store, rate),
        store:   store,
        config:  config,
    }, nil
}
```
**Chi Middleware:**
```go
func RateLimitMiddleware(rl *RateLimiterService) func(http.Handler) http.Handler {
    return func(next http.Handler) http.Handler {
        return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
            // Skip rate limiting for whitelisted IPs
            clientIP := r.Header.Get("X-Real-IP")
            if clientIP == "" {
                clientIP = r.RemoteAddr
            }
            for _, allowedIP := range rl.config.IPWhitelist {
                if clientIP == allowedIP {
                    next.ServeHTTP(w, r)
                    return
                }
            }

            // Get the rate limit context for this client
            limiterCtx, err := rl.limiter.Get(r.Context(), clientIP)
            if err != nil {
                log.Error().Err(err).Str("ip", clientIP).Msg("Rate limit error")
                http.Error(w, "Internal server error", http.StatusInternalServerError)
                return
            }

            // Set rate limit headers on every response
            w.Header().Set("X-RateLimit-Limit", strconv.FormatInt(limiterCtx.Limit, 10))
            w.Header().Set("X-RateLimit-Remaining", strconv.FormatInt(limiterCtx.Remaining, 10))
            w.Header().Set("X-RateLimit-Reset", strconv.FormatInt(limiterCtx.Reset, 10))

            // Reject the request once the limit has been reached
            // (Context.Reached is a bool in limiter/v3)
            if limiterCtx.Reached {
                http.Error(w, "Too many requests", http.StatusTooManyRequests)
                return
            }

            next.ServeHTTP(w, r)
        })
    }
}
```
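Wiring the middleware into the Chi router could then look like this (handler names are illustrative):
```go
r := chi.NewRouter()

// Whitelisting and the global limit are handled inside the middleware;
// endpoint-specific limits would need a per-route variant.
r.Use(RateLimitMiddleware(rateLimiterService))

r.Get("/api/v1/greet", greetHandler) // hypothetical handler
```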
### Phase 4: Cache Invalidation Strategy
**Approach**: Hybrid cache invalidation with multiple strategies:
1. **Time-Based Expiration (TTL)**
   - All cache entries have a TTL
   - Automatic expiration prevents stale data
   - Default TTL: 5 minutes for most data
2. **Event-Based Invalidation**
   - Cache keys are invalidated on specific events
   - Example: user data cache invalidated on user update
   - Uses a pub/sub pattern for distributed invalidation
3. **Versioned Cache Keys**
   - Cache keys include a data version
   - When the data changes, the version increments
   - Old cache entries naturally expire (see the sketch after the cache key strategy below)
4. **Write-Through Caching**
   - Data is written to the database and cache simultaneously
   - Ensures the cache is always up to date
   - Used for critical data that must be consistent
**Cache Key Strategy:**
```go
func GetCacheKey(prefix, entityType, entityID string) string {
return fmt.Sprintf("%s:%s:%s", prefix, entityType, entityID)
}
// Example: "dlc:user:123"
// Example: "dlc:jwt:validation:token_hash"
```
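A minimal sketch of the versioned-key variant (where the version counter lives is left open; any shared counter works):
```go
// getVersionedCacheKey embeds the entity's current version in the key, so a
// version bump on write makes readers miss and fall through to the database.
func getVersionedCacheKey(prefix, entityType, entityID string, version int64) string {
    return fmt.Sprintf("%s:%s:%s:v%d", prefix, entityType, entityID, version)
}

// Example: "dlc:user:123:v7" before an update, "dlc:user:123:v8" after;
// the v7 entry is never read again and expires via its TTL.
```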
## Implementation Phases
### Phase 1: In-Memory Cache (Current Sprint)
- ✅ Research and select in-memory cache library
- ✅ Implement cache interface and in-memory service
- ✅ Add cache configuration to config package
- ✅ Implement basic cache operations (set, get, delete)
- ✅ Add TTL support and automatic cleanup
- ✅ Cache JWT validation results
- ✅ Add cache metrics and monitoring
### Phase 2: Redis-Compatible Cache (Next Sprint)
- ⬜ Set up Dragonfly/KeyDB in development environment
- ⬜ Implement Redis cache service
- ⬜ Add configuration for Redis connection
- ⬜ Implement cache fallback strategy (Redis → in-memory)
- ⬜ Add health checks for Redis connection
- ⬜ Implement distributed cache invalidation
### Phase 3: Rate Limiting (Following Sprint)
- ⬜ Research and select rate limiting library
- ⬜ Implement rate limiter service
- ⬜ Add rate limit configuration
- ⬜ Implement Chi middleware for rate limiting
- ⬜ Add rate limit headers to responses
- ⬜ Implement IP whitelisting
- ⬜ Add endpoint-specific rate limits
### Phase 4: Advanced Features (Future)
- ⬜ Cache warming for critical data
- ⬜ Two-level caching (Redis + in-memory)
- ⬜ Cache compression for large objects
- ⬜ Rate limit exemptions for admin users
- ⬜ Dynamic rate limit adjustment
- ⬜ Cache analytics and usage patterns
## Configuration
```yaml
# Cache configuration
cache:
  in_memory:
    enabled: true
    default_ttl: "5m"
    cleanup_interval: "10m"
    max_items: 10000

  redis:
    enabled: false
    host: "localhost"
    port: 6379
    password: ""
    database: 0
    pool_size: 10
    default_ttl: "5m"
    prefix: "dlc:"
    use_dragonfly: true

# Rate limiting configuration
rate_limiting:
  enabled: true
  requests_per_hour: 1000
  burst_limit: 100
  ip_whitelist:
    - "127.0.0.1"
    - "::1"
  endpoint_specific:
    "/api/v1/auth/login":
      requests_per_hour: 100
      burst_limit: 10
    "/api/v1/auth/register":
      requests_per_hour: 50
      burst_limit: 5
```
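A sketch of the Go structs this YAML could decode into (names other than `RateLimitConfig` are our assumptions; the duration strings decode into `time.Duration` when viper's string-to-duration decode hook is enabled). Under this layout, `NewRedisCacheService` above would receive the `Redis` sub-struct rather than the whole `CacheConfig`:
```go
import "time"

// CacheConfig mirrors the `cache:` block above.
type CacheConfig struct {
    InMemory InMemoryConfig `mapstructure:"in_memory"`
    Redis    RedisConfig    `mapstructure:"redis"`
}

type InMemoryConfig struct {
    Enabled         bool          `mapstructure:"enabled"`
    DefaultTTL      time.Duration `mapstructure:"default_ttl"`
    CleanupInterval time.Duration `mapstructure:"cleanup_interval"`
    MaxItems        int           `mapstructure:"max_items"`
}

type RedisConfig struct {
    Enabled      bool          `mapstructure:"enabled"`
    Host         string        `mapstructure:"host"`
    Port         int           `mapstructure:"port"`
    Password     string        `mapstructure:"password"`
    Database     int           `mapstructure:"database"`
    PoolSize     int           `mapstructure:"pool_size"`
    DefaultTTL   time.Duration `mapstructure:"default_ttl"`
    Prefix       string        `mapstructure:"prefix"`
    UseDragonfly bool          `mapstructure:"use_dragonfly"`
}
```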
## Monitoring and Metrics
**Cache Metrics:**
- Cache hit/miss ratio
- Average cache latency
- Cache size and memory usage
- Eviction rate
- TTL distribution
**Rate Limit Metrics:**
- Requests allowed vs rejected
- Rate limit exceeded events
- Top limited IPs
- Endpoint-specific rate limit usage
**Prometheus Metrics:**
```go
var (
    cacheHits = prometheus.NewCounterVec(prometheus.CounterOpts{
        Name: "cache_hits_total",
        Help: "Number of cache hits",
    }, []string{"cache_type", "entity_type"})

    cacheMisses = prometheus.NewCounterVec(prometheus.CounterOpts{
        Name: "cache_misses_total",
        Help: "Number of cache misses",
    }, []string{"cache_type", "entity_type"})

    rateLimitExceeded = prometheus.NewCounterVec(prometheus.CounterOpts{
        Name: "rate_limit_exceeded_total",
        Help: "Number of rate limit exceeded events",
    }, []string{"endpoint", "ip"})
)
```
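The collectors still need registering and incrementing at the call sites; a minimal sketch (label values are illustrative):
```go
func init() {
    // Expose the counters on the default registry backing /metrics.
    prometheus.MustRegister(cacheHits, cacheMisses, rateLimitExceeded)
}

// Example call sites:
//   cacheHits.WithLabelValues("in_memory", "user").Inc()
//   cacheMisses.WithLabelValues("redis", "jwt").Inc()
//   rateLimitExceeded.WithLabelValues("/api/v1/auth/login", clientIP).Inc()
```
Note that a raw client IP as a label value can blow up metric cardinality; bucketing or truncating the `ip` label may be safer in production.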
## Security Considerations
1. **Cache Security:**
   - Never cache sensitive user data (passwords, tokens)
   - Use separate cache prefixes for different data types
   - Implement cache key hashing for sensitive data (see the sketch after this list)
   - Set appropriate TTLs to limit exposure
2. **Rate Limit Security:**
   - Prevent rate limit bypass attacks
   - Trust the X-Real-IP header for IP detection only when it is set by a trusted reverse proxy
   - Apply rate limits to authentication endpoints
   - Log rate limit violations for security monitoring
3. **Redis Security:**
   - Require authentication on the Redis-compatible server
   - Use TLS for Redis connections
   - Use separate database numbers for different environments
   - Restrict available Redis commands to prevent abuse
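For the key-hashing point above, a minimal sketch (SHA-256 and the key layout are our choice, not an established project convention):
```go
import (
    "crypto/sha256"
    "encoding/hex"
)

// hashedCacheKey keeps raw tokens or emails out of cache keys by storing
// only a hex-encoded SHA-256 digest of the sensitive part.
func hashedCacheKey(prefix, entityType, sensitive string) string {
    sum := sha256.Sum256([]byte(sensitive))
    return prefix + ":" + entityType + ":" + hex.EncodeToString(sum[:])
}

// Example: hashedCacheKey("dlc", "jwt", rawToken) -> "dlc:jwt:<64 hex chars>"
```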
## Performance Considerations
1. **Cache Performance:**
   - Benchmark cache operations
   - Monitor cache latency
   - Keep cache keys short
   - Use appropriate data structures
2. **Rate Limit Performance:**
   - Use an efficient rate-limiting algorithm
   - Minimize middleware overhead
   - Cache rate limit decisions
   - Batch rate limit checks where possible
3. **Memory Management:**
   - Set reasonable cache size limits
   - Monitor memory usage
   - Implement cache eviction policies
   - Use memory-efficient data structures
## Migration Strategy
### From No Cache to In-Memory Cache
1. Implement cache interface and in-memory service
2. Add cache configuration with sensible defaults
3. Gradually add caching to critical endpoints
4. Monitor cache performance and hit ratios
5. Adjust TTLs based on usage patterns
### From In-Memory to Redis Cache
1. Set up Dragonfly/KeyDB in development
2. Implement Redis cache service
3. Add fallback logic (Redis → in-memory); see the sketch after this list
4. Test with both caches enabled
5. Gradually migrate to Redis-only
6. Monitor distributed cache performance
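A sketch of the fallback logic from step 3 (type names and the read path are illustrative; the write path and the signature mismatch between the two services are left open):
```go
// FallbackCacheService reads from Redis first and degrades to the local
// in-memory cache when Redis errors out or misses.
type FallbackCacheService struct {
    primary   *RedisCacheService
    secondary CacheService // the Phase 1 in-memory service
}

func (s *FallbackCacheService) Get(ctx context.Context, key string) (interface{}, bool) {
    if data, found, err := s.primary.Get(ctx, key); err == nil && found {
        return data, true
    }
    // Redis unavailable or key absent: try the local cache
    return s.secondary.Get(key)
}
```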
### From No Rate Limiting to Rate Limiting
1. Implement rate limiter with generous limits
2. Add monitoring for rate limit events
3. Gradually tighten limits based on usage
4. Add IP whitelist for critical services
5. Implement endpoint-specific limits
6. Monitor and adjust as needed
## Alternatives Considered
### Cache Libraries
1. **`github.com/bluele/gcache`** - More features but more complex
2. **`github.com/allegro/bigcache`** - High performance but no per-entry TTL (only a global eviction window)
3. **`github.com/coocood/freecache`** - Very fast but limited API
### Redis Alternatives
1. **Redis Enterprise** - Commercial, not open-source
2. **Memcached** - No persistence, simpler protocol
3. **Couchbase** - More complex, document-oriented
### Rate Limiting Libraries
1. **`golang.org/x/time/rate`** - Simple but no distributed support
2. **`github.com/juju/ratelimit`** - Good but limited features
3. **Custom implementation** - Too much development effort
## Success Metrics
1. **Cache Effectiveness:**
   - Cache hit ratio > 80%
   - Average cache latency < 1ms
   - Memory usage within configured limits
2. **Rate Limiting Effectiveness:**
   - < 1% of legitimate requests blocked
   - Effective protection against abuse
   - No impact on normal usage patterns
3. **System Stability:**
   - Database load reduced by 50%
   - Consistent response times under load
   - No cache-related outages
## Risks and Mitigations
| Risk | Mitigation |
|------|------------|
| Cache stampede | Implement cache warming and fallback logic |
| Memory exhaustion | Set reasonable cache size limits and monitor usage |
| Redis failure | Implement fallback to in-memory cache |
| Rate limit false positives | Start with generous limits and monitor |
| Performance degradation | Benchmark before and after implementation |
| Cache inconsistency | Use appropriate invalidation strategies |
## Future Enhancements
1. **Cache Pre-warming** - Load frequently used data at startup
2. **Two-Level Caching** - Local cache + distributed cache
3. **Cache Compression** - For large cache objects
4. **Dynamic Rate Limits** - Adjust based on system load
5. **User-Specific Rate Limits** - Different limits for different user tiers
6. **Cache Analytics** - Detailed usage patterns and optimization
## References
- [go-cache documentation](https://github.com/patrickmn/go-cache)
- [Dragonfly documentation](https://www.dragonflydb.io/docs)
- [KeyDB documentation](https://keydb.dev/)
- [limiter/v3 documentation](https://github.com/ulule/limiter)
- [Chi middleware documentation](https://github.com/go-chi/chi)
## Decision Drivers
1. **Simplicity** - Easy to implement and maintain
2. **Performance** - Minimal impact on response times
3. **Scalability** - Support for horizontal scaling
4. **Reliability** - Graceful degradation on failures
5. **Open Source** - Preference for open-source solutions
6. **Community** - Active development and support
## Conclusion
This ADR proposes a comprehensive caching and rate limiting strategy that will significantly improve the performance, scalability, and reliability of the dance-lessons-coach application. The phased approach allows for gradual implementation and testing, minimizing risk while delivering value at each stage.
The combination of in-memory caching for single-instance deployments and Redis-compatible caching for distributed environments provides flexibility for different deployment scenarios. The rate limiting implementation will protect the application from abuse while maintaining a good user experience.
This strategy aligns with our architectural principles of simplicity, performance, and scalability while using well-established open-source technologies with strong community support.