diff --git a/adr/0023-config-hot-reloading.md b/adr/0023-config-hot-reloading.md
new file mode 100644
index 0000000..47f6e79
--- /dev/null
+++ b/adr/0023-config-hot-reloading.md
@@ -0,0 +1,267 @@
+# Config Hot Reloading Strategy
+
+* Status: Proposed
+* Deciders: Gabriel Radureau, AI Agent
+* Date: 2026-04-05
+
+## Context and Problem Statement
+
+The dance-lessons-coach application currently loads configuration once at startup using Viper, which supports file-based configuration, environment variables, and defaults. However, the current implementation does not support runtime configuration changes without restarting the application.
+
+We need to determine whether and how to implement config hot reloading: the ability to detect changes to the optional `config.yaml` file and apply those changes without requiring a full application restart.
+
+## Decision Drivers
+
+* **Development convenience**: Hot reloading would allow developers to change configuration without restarting the server during development
+* **Production flexibility**: Ability to adjust certain configuration parameters without downtime
+* **Complexity**: Hot reloading adds significant complexity to the codebase
+* **Safety**: Some configuration changes require careful handling to avoid runtime errors
+* **Viper capabilities**: Viper already supports file watching through `viper.WatchConfig()`
+* **Configuration scope**: Not all configuration parameters can or should be hot-reloaded
+
+## Considered Options
+
+### Option 1: Full Hot Reloading with Viper WatchConfig
+
+Implement comprehensive hot reloading using Viper's built-in `WatchConfig()` functionality to monitor the config file and automatically reload when changes are detected.
+
+### Option 2: Selective Hot Reloading
+
+Only allow hot reloading for specific configuration sections that are safe to change at runtime (e.g., logging level, feature flags) while requiring restart for others (e.g., server host/port, database credentials).
+
+### Option 3: Manual Reload Endpoint
+
+Add an admin endpoint (e.g., `POST /api/admin/reload-config`) that triggers configuration reload when called, giving explicit control over when reloading happens.
+
+### Option 4: No Hot Reloading
+
+Maintain the current approach of loading configuration only at startup, requiring application restart for any configuration changes.
+
+## Decision Outcome
+
+Chosen option: **"Selective Hot Reloading"** because it provides the benefits of runtime configuration changes while maintaining safety and control. This approach:
+
+* Allows safe configuration changes without restart
+* Prevents dangerous runtime changes to critical parameters
+* Leverages Viper's existing capabilities
+* Provides a clear boundary between hot-reloadable and non-hot-reloadable settings
+
+## Implementation Strategy
+
+### Hot-Reloadable Configuration
+
+The following configuration parameters will support hot reloading:
+
+* **Logging level** (`logging.level`)
+* **Feature flags** (`api.v2_enabled`)
+* **Telemetry sampling** (`telemetry.sampler.type`, `telemetry.sampler.ratio`)
+* **JWT TTL** (`auth.jwt.ttl`)
+
+### Non-Hot-Reloadable Configuration
+
+These parameters will require application restart:
+
+* **Server settings** (`server.host`, `server.port`)
+* **Database credentials** (`database.*`)
+* **JWT secret** (`auth.jwt_secret`)
+* **Admin credentials** (`auth.admin_master_password`)
+
+### Implementation Plan
+
+```go
+// Add to config package
+type ConfigManager struct {
+    config     *Config
+    viper      *viper.Viper
+    changeChan chan struct{}
+    stopChan   chan struct{}
+}
+
+func NewConfigManager() (*ConfigManager, error) {
+    // Initialize Viper and load initial config
+    // Start file watcher if config file exists
+}
+
+func (cm *ConfigManager) StartWatching() {
+    if cm.viper != nil {
+        // Register the callback before starting the watcher so no event is missed
+        cm.viper.OnConfigChange(func(e fsnotify.Event) {
+            cm.handleConfigChange()
+        })
+        cm.viper.WatchConfig()
+    }
+}
+
+func (cm *ConfigManager) handleConfigChange() {
+    // Reload only safe configuration sections
+    // Update logging level if changed
+    // Update feature flags if changed
+    // Notify other components of changes
+
+    log.Info().Msg("Configuration reloaded (partial)")
+}
+
+// Safe getter methods that work with hot reloading
+func (cm *ConfigManager) GetLogLevel() string {
+    // Return current value, potentially updated via hot reload
+}
+```
+
+### Configuration File Monitoring
+
+```go
+// In main application setup
+func main() {
+    configManager, err := config.NewConfigManager()
+    if err != nil {
+        log.Fatal().Err(err).Msg("Failed to initialize config")
+    }
+
+    // Start watching for config changes
+    configManager.StartWatching()
+
+    // Use configManager throughout application instead of direct config access
+}
+```
+
+## Pros and Cons of the Options
+
+### Option 1: Full Hot Reloading with Viper WatchConfig
+
+* **Good**: Maximum flexibility for configuration changes
+* **Good**: Leverages Viper's built-in capabilities
+* **Good**: Well suited to the development workflow
+* **Bad**: High risk of runtime errors from unsafe changes
+* **Bad**: Complex to implement safely
+* **Bad**: Hard to debug configuration-related issues
+
+### Option 2: Selective Hot Reloading (Chosen)
+
+* **Good**: Safe approach with clear boundaries
+* **Good**: Balances flexibility and stability
+* **Good**: Easier to implement and maintain than full hot reloading
+* **Good**: Clear documentation of what can be changed
+* **Bad**: More complex than no hot reloading
+* **Bad**: Requires careful design of config access patterns
+
+### Option 3: Manual Reload Endpoint
+
+* **Good**: Explicit control over when reloading happens
+* **Good**: Can be secured with authentication
+* **Good**: Well suited to production environments
+* **Bad**: Less convenient for development
+* **Bad**: Requires additional API endpoint management
+* **Bad**: Still needs the same safety considerations as automatic reloading
+
+### Option 4: No Hot Reloading
+
+* **Good**: Simplest approach
+* **Good**: No risk of runtime configuration errors
+* **Good**: Easier to reason about application state
+* **Bad**: Requires restart for any configuration change
+* **Bad**: Less flexible for production adjustments
+* **Bad**: Slower development iteration
+
+## Configuration Change Handling
+
+### Safe Change Pattern
+
+```go
+// Example: Logging level change
+func (cm *ConfigManager) handleConfigChange() {
+    // Get new config values
+    newConfig := &Config{}
+    if err := cm.viper.Unmarshal(newConfig); err != nil {
+        log.Error().Err(err).Msg("Failed to unmarshal new config")
+        return
+    }
+
+    // Apply safe changes
+    if newConfig.Logging.Level != cm.config.Logging.Level {
+        if err := cm.applyLogLevelChange(newConfig.Logging.Level); err != nil {
+            log.Error().Err(err).Msg("Failed to apply log level change")
+        }
+    }
+
+    // Update other safe parameters...
+}
+
+func (cm *ConfigManager) applyLogLevelChange(newLevel string) error {
+    // Validate new level; an unknown level leaves the current one in place
+    level, err := zerolog.ParseLevel(newLevel)
+    if err != nil {
+        return fmt.Errorf("invalid log level %q: %w", newLevel, err)
+    }
+
+    // Apply change
+    zerolog.SetGlobalLevel(level)
+    cm.config.Logging.Level = newLevel
+
+    log.Info().Str("new_level", newLevel).Msg("Log level updated")
+    return nil
+}
+```
+
+### Error Handling
+
+* Invalid configuration changes are logged but don't crash the application
+* Failed changes revert to previous known-good values
+* Critical errors during reload trigger application shutdown
+* All changes are logged for audit purposes
+
+## Links
+
+* [Viper WatchConfig Documentation](https://github.com/spf13/viper#watching-and-re-reading-config-files)
+* [Viper OnConfigChange](https://github.com/spf13/viper#example-of-watching-a-config-file)
+* [ADR-0006: Configuration Management](0006-configuration-management.md)
+
+## Configuration File Example with Hot-Reloadable Settings
+
+```yaml
+# config.yaml - Only the settings marked below can be hot-reloaded
+server:
+  host: "0.0.0.0"  # Requires restart
+  port: 8080       # Requires restart
+
+logging:
+  level: "info"  # Can be changed without restart
+  json: false
+  output: ""
+
+api:
+  v2_enabled: false  # Can be changed without restart
+
+telemetry:
+  enabled: false
+  sampler:
+    type: "parentbased_always_on"  # Can be changed without restart
+    ratio: 1.0                     # Can be changed without restart
+```
+
+## Migration Plan
+
+1. **Phase 1**: Implement a ConfigManager wrapper around the existing config
+2. **Phase 2**: Add selective hot reloading for logging level
+3. **Phase 3**: Extend to feature flags and telemetry settings
+4. **Phase 4**: Add documentation and examples
+5. **Phase 5**: Update all components to use ConfigManager instead of direct config access
+
+## Monitoring and Observability
+
+* Log all configuration changes with timestamps
+* Include previous and new values in change logs
+* Add metrics for configuration reload events
+* Provide an admin endpoint to view the current configuration
+
+## Security Considerations
+
+* Config file permissions should be restrictive
+* Hot reloading should be disabled in production by default
+* Configuration changes should be audited
+* Sensitive parameters should never be hot-reloadable
+
+## Future Enhancements
+
+* Configuration change webhooks
+* Configuration versioning and rollback
+* Configuration validation before applying changes
+* Multi-file configuration support
\ No newline at end of file
diff --git a/adr/README.md b/adr/README.md
index 9f0b55f..d193039 100644
--- a/adr/README.md
+++ b/adr/README.md
@@ -81,6 +81,7 @@ Chosen option: "[Option 1]" because [justification]
 * [0020-docker-build-strategy.md](0020-docker-build-strategy.md) - Docker Build Strategy: Traditional vs Buildx
 * [0021-jwt-secret-retention-policy.md](0021-jwt-secret-retention-policy.md) - JWT Secret Retention Policy with Configurable TTL and Retention
 * [0022-rate-limiting-cache-strategy.md](0022-rate-limiting-cache-strategy.md) - Rate Limiting and Cache Strategy with Multi-Phase Implementation
+* [0023-config-hot-reloading.md](0023-config-hot-reloading.md) - Config Hot Reloading Strategy

 ## How to Add a New ADR