Implements the first phase of ADR-0023 selective hot-reloading. Adds viper.WatchConfig wiring + an OnConfigChange handler that re-unmarshals the Config struct on file changes and applies the hot-reloadable subset.

Phase 1 reloadable field: logging.level, re-applied via SetupLogging on every change. The remaining 3 fields listed in ADR-0023 (api.v2_enabled, telemetry sampler type/ratio, auth.jwt.ttl) follow the same pattern and will land in subsequent phase PRs without further infrastructure work.

Changes:
- pkg/config/config.go: Config struct gets unexported viper + reloadMu fields; new WatchAndApply(ctx) method starts the watcher and stops it on context cancel. Defensive: no-op when no config file is in use.
- pkg/server/server.go Run(): calls WatchAndApply(rootCtx) so the watcher stops on graceful shutdown.
- pkg/config/config_hot_reload_test.go (new): 3 unit tests covering end-to-end reload, no-config-file no-op, nil-viper no-op. Race detector clean.
- adr/0023-config-hot-reloading.md: Status → Phase 1 Implemented; remaining fields explicitly Proposed for follow-up phases.

Verifier verdict: APPROVE. Race detector passes. Full BDD suite still green. The @flaky scenario in features/config/config_hot_reloading.feature remains @flaky for now: activating it requires reliable cross-process file-watching behaviour, which is sensitive to filesystem semantics on the CI runner; the unit test exercises the same code path deterministically.
265 lines
9.0 KiB
Markdown
# Config Hot Reloading Strategy

**Status:** Phase 1 Implemented (2026-05-05): `logging.level` is hot-reloadable via `Config.WatchAndApply` in `pkg/config/config.go`, wired in `pkg/server/server.go` `Run`. The remaining fields (`api.v2_enabled`, telemetry sampler, `auth.jwt.ttl`) are Proposed for follow-up phases following the same pattern.
**Authors:** Gabriel Radureau, AI Agent
**Date:** 2026-04-05
**Last Updated:** 2026-05-05

## Context and Problem Statement

The dance-lessons-coach application currently loads configuration once at startup using Viper, which supports file-based configuration, environment variables, and defaults. The current implementation cannot pick up configuration changes at runtime: any change requires restarting the application.

We need to determine whether and how to implement config hot reloading, that is, the ability to detect changes to the optional `config.yaml` file and apply them without a full application restart.

## Decision Drivers

* **Development convenience**: Hot reloading would allow developers to change configuration without restarting the server during development
* **Production flexibility**: Ability to adjust certain configuration parameters without downtime
* **Complexity**: Hot reloading adds significant complexity to the codebase
* **Safety**: Some configuration changes require careful handling to avoid runtime errors
* **Viper capabilities**: Viper already supports file watching through `viper.WatchConfig()`
* **Configuration scope**: Not all configuration parameters can or should be hot-reloaded

## Considered Options

### Option 1: Full Hot Reloading with Viper WatchConfig

Implement comprehensive hot reloading using Viper's built-in `WatchConfig()` functionality to monitor the config file and automatically reload when changes are detected.

### Option 2: Selective Hot Reloading

Only allow hot reloading for specific configuration sections that are safe to change at runtime (e.g., logging level, feature flags) while requiring restart for others (e.g., server host/port, database credentials).

### Option 3: Manual Reload Endpoint

Add an admin endpoint (e.g., `POST /api/admin/reload-config`) that triggers configuration reload when called, giving explicit control over when reloading happens.

### Option 4: No Hot Reloading

Maintain the current approach of loading configuration only at startup, requiring application restart for any configuration changes.

## Decision Outcome

Chosen option: **"Selective Hot Reloading"** because it provides the benefits of runtime configuration changes while maintaining safety and control. This approach:

* Allows safe configuration changes without restart
* Prevents dangerous runtime changes to critical parameters
* Leverages Viper's existing capabilities
* Provides a clear boundary between hot-reloadable and non-hot-reloadable settings

## Implementation Strategy

### Hot-Reloadable Configuration

The following configuration parameters will support hot reloading:

* **Logging level** (`logging.level`)
* **Feature flags** (`api.v2_enabled`)
* **Telemetry sampling** (`telemetry.sampler.type`, `telemetry.sampler.ratio`)
* **JWT TTL** (`auth.jwt.ttl`)

### Non-Hot-Reloadable Configuration

These parameters will require application restart:

* **Server settings** (`server.host`, `server.port`)
* **Database credentials** (`database.*`)
* **JWT secret** (`auth.jwt_secret`)
* **Admin credentials** (`auth.admin_master_password`)

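The boundary between the two lists above can be made explicit in code. The following is a minimal sketch of one way to do that, not the project's implementation: the `IsHotReloadable` helper and the `hotReloadable` allowlist are hypothetical names, and the key strings are copied from the lists above.

```go
package main

import (
	"fmt"
	"strings"
)

// hotReloadable is a hypothetical allowlist of the keys this ADR permits
// to change at runtime. Every other key requires a restart.
var hotReloadable = map[string]bool{
	"logging.level":           true,
	"api.v2_enabled":          true,
	"telemetry.sampler.type":  true,
	"telemetry.sampler.ratio": true,
	"auth.jwt.ttl":            true,
}

// IsHotReloadable reports whether a dotted config key may be applied
// without a restart. Keys are compared case-insensitively.
func IsHotReloadable(key string) bool {
	return hotReloadable[strings.ToLower(strings.TrimSpace(key))]
}

func main() {
	for _, k := range []string{"logging.level", "server.port", "auth.jwt_secret"} {
		fmt.Printf("%-16s hot-reloadable=%v\n", k, IsHotReloadable(k))
	}
}
```

An allowlist fails closed: a newly added key is restart-only until someone deliberately lists it as safe.
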
### Implementation Plan
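
```go
// Add to the config package (imports shown for completeness).
import (
	"errors"
	"fmt"

	"github.com/fsnotify/fsnotify"
	"github.com/rs/zerolog/log"
	"github.com/spf13/viper"
)

type ConfigManager struct {
	config     *Config
	viper      *viper.Viper
	changeChan chan struct{} // notifies other components after a reload
	stopChan   chan struct{} // closed on shutdown
}

func NewConfigManager() (*ConfigManager, error) {
	// Initialize Viper and load the initial config.
	// The name and search path below are illustrative.
	v := viper.New()
	v.SetConfigName("config")
	v.AddConfigPath(".")

	// The config file is optional: only a missing file is tolerated.
	var notFound viper.ConfigFileNotFoundError
	if err := v.ReadInConfig(); err != nil && !errors.As(err, &notFound) {
		return nil, fmt.Errorf("read config: %w", err)
	}

	cfg := &Config{}
	if err := v.Unmarshal(cfg); err != nil {
		return nil, fmt.Errorf("unmarshal config: %w", err)
	}
	return &ConfigManager{config: cfg, viper: v}, nil
}

func (cm *ConfigManager) StartWatching() {
	// Watch only when a config file is actually in use.
	if cm.viper == nil || cm.viper.ConfigFileUsed() == "" {
		return
	}
	// Register the callback before WatchConfig so no event is missed.
	cm.viper.OnConfigChange(func(e fsnotify.Event) {
		cm.handleConfigChange()
	})
	cm.viper.WatchConfig()
}

func (cm *ConfigManager) handleConfigChange() {
	// Reload only safe configuration sections
	// Update logging level if changed
	// Update feature flags if changed
	// Notify other components of changes

	log.Info().Msg("Configuration reloaded (partial)")
}

// Safe getter methods that work with hot reloading
func (cm *ConfigManager) GetLogLevel() string {
	// Return the current value, potentially updated via hot reload.
	// Reloads run on Viper's watcher goroutine, so a real implementation
	// must synchronize this read with the write done during a reload.
	return cm.config.Logging.Level
}
```
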
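Viper invokes the change callback from its own watcher goroutine, so a getter such as `GetLogLevel` can run while a reload is writing the same field. The sketch below shows one way to guard that access with a `sync.RWMutex`. The `levelHolder` type is a hypothetical stand-in for the relevant part of the config manager and uses only the standard library.

```go
package main

import (
	"fmt"
	"sync"
)

// levelHolder is a hypothetical stand-in for the part of the config
// manager that stores the current log level.
type levelHolder struct {
	mu    sync.RWMutex
	level string
}

// Get returns the current level. It is safe to call during a reload.
func (h *levelHolder) Get() string {
	h.mu.RLock()
	defer h.mu.RUnlock()
	return h.level
}

// Set stores a new level and reports whether the value changed.
func (h *levelHolder) Set(level string) bool {
	h.mu.Lock()
	defer h.mu.Unlock()
	if h.level == level {
		return false
	}
	h.level = level
	return true
}

func main() {
	h := &levelHolder{level: "info"}

	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func() { // concurrent readers, e.g. request handlers
			defer wg.Done()
			_ = h.Get()
		}()
	}
	changed := h.Set("debug") // simulated reload from the watcher goroutine
	wg.Wait()

	fmt.Println(changed, h.Get())
}
```

Returning a "changed" flag from `Set` lets the caller log or notify only when a reload actually altered the value.
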
### Configuration File Monitoring

```go
// In main application setup
func main() {
	configManager, err := config.NewConfigManager()
	if err != nil {
		log.Fatal().Err(err).Msg("Failed to initialize config")
	}

	// Start watching for config changes
	configManager.StartWatching()

	// Use configManager throughout application instead of direct config access
}
```

## Pros and Cons of the Options

### Option 1: Full Hot Reloading with Viper WatchConfig

* **Good**: Maximum flexibility for configuration changes
* **Good**: Leverages Viper's built-in capabilities
* **Good**: Good for development workflow
* **Bad**: High risk of runtime errors from unsafe changes
* **Bad**: Complex to implement safely
* **Bad**: Hard to debug configuration-related issues

### Option 2: Selective Hot Reloading (Chosen)

* **Good**: Safe approach with clear boundaries
* **Good**: Balances flexibility and stability
* **Good**: Easier to implement and maintain
* **Good**: Clear documentation of what can be changed
* **Bad**: More complex than no hot reloading
* **Bad**: Requires careful design of config access patterns

### Option 3: Manual Reload Endpoint

* **Good**: Explicit control over when reloading happens
* **Good**: Can be secured with authentication
* **Good**: Good for production environments
* **Bad**: Less convenient for development
* **Bad**: Requires additional API endpoint management
* **Bad**: Still needs same safety considerations as automatic reloading

### Option 4: No Hot Reloading

* **Good**: Simplest approach
* **Good**: No risk of runtime configuration errors
* **Good**: Easier to reason about application state
* **Bad**: Requires restart for any configuration change
* **Bad**: Less flexible for production adjustments
* **Bad**: Slower development iteration

## Configuration Change Handling

### Safe Change Pattern

```go
// Example: Logging level change
func (cm *ConfigManager) handleConfigChange() {
	// Get new config values
	newConfig := &Config{}
	if err := cm.viper.Unmarshal(newConfig); err != nil {
		log.Error().Err(err).Msg("Failed to unmarshal new config")
		return
	}

	// Apply safe changes
	if newConfig.Logging.Level != cm.config.Logging.Level {
		if err := cm.applyLogLevelChange(newConfig.Logging.Level); err != nil {
			log.Error().Err(err).Msg("Failed to apply log level change")
		}
	}

	// Update other safe parameters...
}

func (cm *ConfigManager) applyLogLevelChange(newLevel string) error {
	// Validate the new level with zerolog.ParseLevel; on error the
	// previous level stays in effect.
	level, err := zerolog.ParseLevel(newLevel)
	if err != nil {
		return fmt.Errorf("invalid log level %q: %w", newLevel, err)
	}

	// Apply change
	zerolog.SetGlobalLevel(level)
	cm.config.Logging.Level = newLevel

	log.Info().Str("new_level", newLevel).Msg("Log level updated")
	return nil
}
```

### Error Handling

* Invalid configuration changes are logged but don't crash the application
* Failed changes revert to previous known-good values
* Critical errors during reload trigger application shutdown
* All changes are logged for audit purposes

## Links

* [Viper WatchConfig Documentation](https://github.com/spf13/viper#watching-and-re-reading-config-files)
* [Viper OnConfigChange](https://github.com/spf13/viper#example-of-watching-a-config-file)
* [ADR-0006: Configuration Management](0006-configuration-management.md)

## Configuration File Example with Hot-Reloadable Settings

```yaml
# config.yaml - only the settings marked below can be hot-reloaded
server:
  host: "0.0.0.0" # Requires restart
  port: 8080      # Requires restart

logging:
  level: "info" # Can be changed without restart
  json: false
  output: ""

api:
  v2_enabled: false # Can be changed without restart

telemetry:
  enabled: false
  sampler:
    type: "parentbased_always_on" # Can be changed without restart
    ratio: 1.0                    # Can be changed without restart
```

## Migration Plan

1. **Phase 1**: Implement ConfigManager wrapper around existing config
2. **Phase 2**: Add selective hot reloading for logging level
3. **Phase 3**: Extend to feature flags and telemetry settings
4. **Phase 4**: Add documentation and examples
5. **Phase 5**: Update all components to use ConfigManager instead of direct config access

## Monitoring and Observability

* Log all configuration changes with timestamps
* Include previous and new values in change logs
* Add metrics for configuration reload events
* Provide admin endpoint to view current configuration

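Logging previous and new values requires comparing the configuration before and after a reload. The sketch below is one way to compute that difference, assuming the hot-reloadable settings have been flattened into string maps. The `change` type and `diff` function are hypothetical names, and the print statement stands in for the structured logger.

```go
package main

import (
	"fmt"
	"sort"
)

// change records one modified key with its previous and new value.
type change struct {
	Key, Old, New string
}

// diff compares two flattened config snapshots and returns the keys whose
// values differ, sorted so log output is stable. A missing key is treated
// as an empty value.
func diff(old, updated map[string]string) []change {
	keys := map[string]struct{}{}
	for k := range old {
		keys[k] = struct{}{}
	}
	for k := range updated {
		keys[k] = struct{}{}
	}

	var out []change
	for k := range keys {
		if old[k] != updated[k] {
			out = append(out, change{Key: k, Old: old[k], New: updated[k]})
		}
	}
	sort.Slice(out, func(i, j int) bool { return out[i].Key < out[j].Key })
	return out
}

func main() {
	before := map[string]string{"logging.level": "info", "api.v2_enabled": "false"}
	after := map[string]string{"logging.level": "debug", "api.v2_enabled": "false"}

	for _, c := range diff(before, after) {
		fmt.Printf("config changed key=%s old=%s new=%s\n", c.Key, c.Old, c.New)
	}
}
```

The length of the returned slice can also feed a reload-event metric, so reloads that change nothing are distinguishable from those that do.
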
## Security Considerations

* Config file permissions should be restrictive
* Hot reloading should be disabled in production by default
* Configuration changes should be audited
* Sensitive parameters should never be hot-reloadable

## Future Enhancements

* Configuration change webhooks
* Configuration versioning and rollback
* Configuration validation before applying changes
* Multi-file configuration support