📝 docs: add ADR-0023 for config hot reloading strategy
Adds Architecture Decision Record 0023 proposing selective hot reloading for configuration changes. The ADR analyzes different approaches and recommends implementing hot reloading only for safe parameters like logging level, feature flags, and telemetry settings while requiring restart for critical parameters like server settings and credentials.

The ADR includes:
- Problem statement and decision drivers
- Analysis of 4 different approaches
- Detailed implementation strategy
- Safety considerations and error handling
- Migration plan and future enhancements

Generated by Mistral Vibe.
Co-Authored-By: Mistral Vibe <vibe@mistral.ai>
264
adr/0023-config-hot-reloading.md
Normal file
@@ -0,0 +1,264 @@
# Config Hot Reloading Strategy

* Status: Proposed
* Deciders: Gabriel Radureau, AI Agent
* Date: 2026-04-05

## Context and Problem Statement

The dance-lessons-coach application currently loads configuration once at startup using Viper, which supports file-based configuration, environment variables, and defaults. The current implementation does not support runtime configuration changes: any change requires restarting the application.

We need to determine whether and how to implement config hot reloading, that is, the ability to detect changes to the optional `config.yaml` file and apply them without a full application restart.

## Decision Drivers

* **Development convenience**: Hot reloading would allow developers to change configuration without restarting the server during development
* **Production flexibility**: Ability to adjust certain configuration parameters without downtime
* **Complexity**: Hot reloading adds significant complexity to the codebase
* **Safety**: Some configuration changes require careful handling to avoid runtime errors
* **Viper capabilities**: Viper already supports file watching through `viper.WatchConfig()`
* **Configuration scope**: Not all configuration parameters can or should be hot-reloaded

## Considered Options

### Option 1: Full Hot Reloading with Viper WatchConfig

Implement comprehensive hot reloading using Viper's built-in `WatchConfig()` functionality to monitor the config file and automatically reload when changes are detected.

### Option 2: Selective Hot Reloading

Only allow hot reloading for specific configuration sections that are safe to change at runtime (e.g., logging level, feature flags) while requiring restart for others (e.g., server host/port, database credentials).

### Option 3: Manual Reload Endpoint

Add an admin endpoint (e.g., `POST /api/admin/reload-config`) that triggers configuration reload when called, giving explicit control over when reloading happens.

### Option 4: No Hot Reloading

Maintain the current approach of loading configuration only at startup, requiring application restart for any configuration changes.

## Decision Outcome

Chosen option: **"Selective Hot Reloading"** because it provides the benefits of runtime configuration changes while maintaining safety and control. This approach:

* Allows safe configuration changes without restart
* Prevents dangerous runtime changes to critical parameters
* Leverages Viper's existing capabilities
* Provides a clear boundary between hot-reloadable and non-hot-reloadable settings

## Implementation Strategy

### Hot-Reloadable Configuration

The following configuration parameters will support hot reloading:

* **Logging level** (`logging.level`)
* **Feature flags** (`api.v2_enabled`)
* **Telemetry sampling** (`telemetry.sampler.type`, `telemetry.sampler.ratio`)
* **JWT TTL** (`auth.jwt.ttl`)

### Non-Hot-Reloadable Configuration

These parameters will require application restart:

* **Server settings** (`server.host`, `server.port`)
* **Database credentials** (`database.*`)
* **JWT secret** (`auth.jwt_secret`)
* **Admin credentials** (`auth.admin_master_password`)
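
### Implementation Plan

```go
// Add to the config package.
package config

import (
	"github.com/fsnotify/fsnotify"
	"github.com/rs/zerolog/log"
	"github.com/spf13/viper"
)

// ConfigManager wraps the existing Config struct and applies safe changes at runtime.
type ConfigManager struct {
	config     *Config
	viper      *viper.Viper
	changeChan chan struct{}
	stopChan   chan struct{}
}

// NewConfigManager initializes Viper and loads the initial config.
// Defaults and environment variable bindings are omitted for brevity.
func NewConfigManager() (*ConfigManager, error) {
	v := viper.New()
	v.SetConfigName("config")
	v.SetConfigType("yaml")
	v.AddConfigPath(".") // assumption: config.yaml sits in the working directory

	// config.yaml is optional, so a missing file is not an error.
	if err := v.ReadInConfig(); err != nil {
		if _, ok := err.(viper.ConfigFileNotFoundError); !ok {
			return nil, err
		}
	}

	cfg := &Config{}
	if err := v.Unmarshal(cfg); err != nil {
		return nil, err
	}

	return &ConfigManager{
		config:     cfg,
		viper:      v,
		changeChan: make(chan struct{}, 1),
		stopChan:   make(chan struct{}),
	}, nil
}

// StartWatching starts the file watcher if a config file was loaded.
func (cm *ConfigManager) StartWatching() {
	if cm.viper == nil || cm.viper.ConfigFileUsed() == "" {
		return
	}
	// Register the callback before starting the watcher.
	cm.viper.OnConfigChange(func(e fsnotify.Event) {
		cm.handleConfigChange()
	})
	cm.viper.WatchConfig()
}

func (cm *ConfigManager) handleConfigChange() {
	// Reload only safe configuration sections
	// Update logging level if changed
	// Update feature flags if changed
	// Notify other components of changes

	log.Info().Msg("Configuration reloaded (partial)")
}

// Safe getter methods that work with hot reloading
func (cm *ConfigManager) GetLogLevel() string {
	// Return current value, potentially updated via hot reload.
	// Viper runs the change callback on its own goroutine, so real code must synchronize access.
	return cm.config.Logging.Level
}
```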

### Configuration File Monitoring

```go
// In main application setup
func main() {
	configManager, err := config.NewConfigManager()
	if err != nil {
		log.Fatal().Err(err).Msg("Failed to initialize config")
	}

	// Start watching for config changes
	configManager.StartWatching()

	// Use configManager throughout application instead of direct config access
}
```

## Pros and Cons of the Options

### Option 1: Full Hot Reloading with Viper WatchConfig

* **Good**: Maximum flexibility for configuration changes
* **Good**: Leverages Viper's built-in capabilities
* **Good**: Speeds up the development workflow
* **Bad**: High risk of runtime errors from unsafe changes
* **Bad**: Complex to implement safely
* **Bad**: Hard to debug configuration-related issues

### Option 2: Selective Hot Reloading (Chosen)

* **Good**: Safe approach with clear boundaries
* **Good**: Balances flexibility and stability
* **Good**: Easier to implement and maintain than full hot reloading
* **Good**: Clear documentation of what can be changed
* **Bad**: More complex than no hot reloading
* **Bad**: Requires careful design of config access patterns

### Option 3: Manual Reload Endpoint

* **Good**: Explicit control over when reloading happens
* **Good**: Can be secured with authentication
* **Good**: Well suited to production environments
* **Bad**: Less convenient for development
* **Bad**: Requires additional API endpoint management
* **Bad**: Still needs the same safety considerations as automatic reloading

### Option 4: No Hot Reloading

* **Good**: Simplest approach
* **Good**: No risk of runtime configuration errors
* **Good**: Easier to reason about application state
* **Bad**: Requires restart for any configuration change
* **Bad**: Less flexible for production adjustments
* **Bad**: Slower development iteration

## Configuration Change Handling

### Safe Change Pattern

```go
// Example: Logging level change
func (cm *ConfigManager) handleConfigChange() {
	// Decode into a fresh struct so a failed decode leaves cm.config untouched
	newConfig := &Config{}
	if err := cm.viper.Unmarshal(newConfig); err != nil {
		log.Error().Err(err).Msg("Failed to unmarshal new config")
		return
	}

	// Apply safe changes
	if newConfig.Logging.Level != cm.config.Logging.Level {
		if err := cm.applyLogLevelChange(newConfig.Logging.Level); err != nil {
			log.Error().Err(err).Msg("Failed to apply log level change")
		}
	}

	// Update other safe parameters...
}

func (cm *ConfigManager) applyLogLevelChange(newLevel string) error {
	// Validate new level; on error the previous level stays in effect
	level, err := zerolog.ParseLevel(newLevel)
	if err != nil {
		return err
	}

	// Apply change
	zerolog.SetGlobalLevel(level)
	cm.config.Logging.Level = newLevel

	log.Info().Str("new_level", newLevel).Msg("Log level updated")
	return nil
}
```

### Error Handling

* Invalid configuration changes are logged but don't crash the application
* Failed changes revert to previous known-good values
* Critical errors during reload trigger application shutdown
* All changes are logged for audit purposes

## Links

* [Viper WatchConfig Documentation](https://github.com/spf13/viper#watching-and-re-reading-config-files)
* [Viper OnConfigChange](https://github.com/spf13/viper#example-of-watching-a-config-file)
* [ADR-0006: Configuration Management](0006-configuration-management.md)

## Configuration File Example with Hot-Reloadable Settings

```yaml
# config.yaml - Only the settings marked below can be hot-reloaded
server:
  host: "0.0.0.0" # Requires restart
  port: 8080      # Requires restart

logging:
  level: "info" # Can be changed without restart
  json: false
  output: ""

api:
  v2_enabled: false # Can be changed without restart

telemetry:
  enabled: false
  sampler:
    type: "parentbased_always_on" # Can be changed without restart
    ratio: 1.0                    # Can be changed without restart
```

## Migration Plan

1. **Phase 1**: Implement ConfigManager wrapper around existing config
2. **Phase 2**: Add selective hot reloading for logging level
3. **Phase 3**: Extend to feature flags and telemetry settings
4. **Phase 4**: Add documentation and examples
5. **Phase 5**: Update all components to use ConfigManager instead of direct config access

## Monitoring and Observability

* Log all configuration changes with timestamps
* Include previous and new values in change logs
* Add metrics for configuration reload events
* Provide admin endpoint to view current configuration

## Security Considerations

* Config file permissions should be restrictive
* Hot reloading should be disabled in production by default
* Configuration changes should be audited
* Sensitive parameters should never be hot-reloadable

## Future Enhancements

* Configuration change webhooks
* Configuration versioning and rollback
* Configuration validation before applying changes
* Multi-file configuration support
@@ -81,6 +81,7 @@ Chosen option: "[Option 1]" because [justification]
* [0020-docker-build-strategy.md](0020-docker-build-strategy.md) - Docker Build Strategy: Traditional vs Buildx
* [0021-jwt-secret-retention-policy.md](0021-jwt-secret-retention-policy.md) - JWT Secret Retention Policy with Configurable TTL and Retention
* [0022-rate-limiting-cache-strategy.md](0022-rate-limiting-cache-strategy.md) - Rate Limiting and Cache Strategy with Multi-Phase Implementation
* [0023-config-hot-reloading.md](0023-config-hot-reloading.md) - Config Hot Reloading Strategy

## How to Add a New ADR