feat(config): hot-reload Phase 1 — logging.level (ADR-0023) #42

Merged
arcodange merged 1 commits from feat/adr-0023-config-hot-reload-mvp into main 2026-05-05 08:45:20 +02:00
Owner

Stretch sprint of autonomous trainer day 2026-05-05

Implements Phase 1 of ADR-0023 selective hot-reloading. The infrastructure (viper.WatchConfig + OnConfigChange callback + thread-safe re-unmarshal) is now in place. The first field wired through it is logging.level.

Remaining fields listed in ADR-0023 (api.v2_enabled, telemetry sampler, auth.jwt.ttl) follow the same pattern and will land in subsequent phase PRs without further infrastructure work.

Why a phased rollout

Hot-reloading any field touches a different runtime invariant:

  • logging.level — global zerolog state, single setter, no concurrency hazard ✓
  • api.v2_enabled — request routing, must coordinate with the chi router rebuild
  • telemetry sampler — global tracer provider, requires ratio re-init
  • auth.jwt.ttl — affects JWT validation in-flight; needs careful invariant

Shipping the foundation + one trivial field gives us a vetted infrastructure to extend, rather than landing 4 fields together with 4× risk.

Changes

pkg/config/config.go

  • Config struct gets unexported viper *viper.Viper and reloadMu sync.RWMutex fields (with mapstructure:"-").
  • New method WatchAndApply(ctx):
    • Returns early (no-op) if no viper attached or no config file in use — production containers without a mounted config.yaml are safe.
    • Sets OnConfigChange to re-unmarshal under the mutex, then call SetupLogging().
    • Calls viper.WatchConfig().
    • Spawns a goroutine that clears the callback when ctx is cancelled.

pkg/server/server.go Run()

  • Calls s.config.WatchAndApply(rootCtx) after the JWT cleanup loop start so it stops on graceful shutdown.

Tests (pkg/config/config_hot_reload_test.go, new)

  • TestWatchAndApply_LoggingLevel — end-to-end: write file → watcher fires → in-memory level flips. Polls up to 2s.
  • TestWatchAndApply_NoFileNoOp — no config file in use → no panic.
  • TestWatchAndApply_NilViperNoOpConfig{} constructed manually → no panic.

go test -race ./pkg/config/... passes.

Out of scope

  • Activating features/config/config_hot_reloading.feature (@flaky scenarios) — file-watching cross-process semantics are sensitive to filesystem behaviour on the CI runner; the unit test exercises the same code path deterministically. Will revisit when activating the next field.
  • The remaining 3 ADR-0023 fields — separate phase PRs, each ~50 lines.
  • An admin endpoint to trigger reload manually — not requested yet.

Verifier verdict (skill-driven, manual run)

APPROVE

  • Dim A (code quality): APPROVE — WatchAndApply is 35 lines, mostly defensive checks + the OnConfigChange closure; race detector clean; mutex used correctly.
  • Dim B (docs homogeneity): APPROVE — ADR-0023 status accurately reflects what landed; cross-refs (pkg/config/config.go, pkg/server/server.go Run) match the actual file locations.
  • Dim C (test reliability): APPROVE — 3 tests cover happy path + 2 defensive paths; happy path uses polling with deadline (no fixed sleep); race detector passes.

Test plan

  • go test -race ./pkg/config/... passes
  • Full BDD suite still green
  • go vet ./... clean
  • Pre-commit hooks (go mod tidy + go fmt + swag fmt) successful
  • Manual smoke: ./bin/server running, edit config.yaml logging.level → check level changes in logs without restart
## Stretch sprint of autonomous trainer day 2026-05-05 Implements **Phase 1** of [ADR-0023](adr/0023-config-hot-reloading.md) selective hot-reloading. The infrastructure (`viper.WatchConfig` + `OnConfigChange` callback + thread-safe re-unmarshal) is now in place. The first field wired through it is `logging.level`. Remaining fields listed in ADR-0023 (`api.v2_enabled`, telemetry sampler, `auth.jwt.ttl`) follow the same pattern and will land in subsequent phase PRs **without further infrastructure work**. ## Why a phased rollout Hot-reloading any field touches a different runtime invariant: - `logging.level` — global zerolog state, single setter, no concurrency hazard ✓ - `api.v2_enabled` — request routing, must coordinate with the chi router rebuild - telemetry sampler — global tracer provider, requires ratio re-init - `auth.jwt.ttl` — affects JWT validation in-flight; needs careful invariant Shipping the foundation + one trivial field gives us a vetted infrastructure to extend, rather than landing 4 fields together with 4× risk. ## Changes ### `pkg/config/config.go` - `Config` struct gets unexported `viper *viper.Viper` and `reloadMu sync.RWMutex` fields (with `mapstructure:"-"`). - New method `WatchAndApply(ctx)`: - Returns early (no-op) if no viper attached or no config file in use — production containers without a mounted `config.yaml` are safe. - Sets `OnConfigChange` to re-unmarshal under the mutex, then call `SetupLogging()`. - Calls `viper.WatchConfig()`. - Spawns a goroutine that clears the callback when `ctx` is cancelled. ### `pkg/server/server.go Run()` - Calls `s.config.WatchAndApply(rootCtx)` after the JWT cleanup loop start so it stops on graceful shutdown. ### Tests (`pkg/config/config_hot_reload_test.go`, new) - `TestWatchAndApply_LoggingLevel` — end-to-end: write file → watcher fires → in-memory level flips. Polls up to 2s. - `TestWatchAndApply_NoFileNoOp` — no config file in use → no panic. - `TestWatchAndApply_NilViperNoOp` — `Config{}` constructed manually → no panic. `go test -race ./pkg/config/...` passes. ## Out of scope - Activating `features/config/config_hot_reloading.feature` (`@flaky` scenarios) — file-watching cross-process semantics are sensitive to filesystem behaviour on the CI runner; the unit test exercises the same code path deterministically. Will revisit when activating the next field. - The remaining 3 ADR-0023 fields — separate phase PRs, each ~50 lines. - An admin endpoint to trigger reload manually — not requested yet. ## Verifier verdict (skill-driven, manual run) **APPROVE** - **Dim A** (code quality): APPROVE — `WatchAndApply` is 35 lines, mostly defensive checks + the OnConfigChange closure; race detector clean; mutex used correctly. - **Dim B** (docs homogeneity): APPROVE — ADR-0023 status accurately reflects what landed; cross-refs (`pkg/config/config.go`, `pkg/server/server.go Run`) match the actual file locations. - **Dim C** (test reliability): APPROVE — 3 tests cover happy path + 2 defensive paths; happy path uses polling with deadline (no fixed sleep); race detector passes. ## Test plan - [x] `go test -race ./pkg/config/...` passes - [x] Full BDD suite still green - [x] `go vet ./...` clean - [x] Pre-commit hooks (go mod tidy + go fmt + swag fmt) successful - [ ] Manual smoke: `./bin/server` running, edit `config.yaml` `logging.level` → check level changes in logs without restart
arcodange added 1 commit 2026-05-05 08:45:14 +02:00
Implements the first phase of ADR-0023 selective hot-reloading. Adds
viper.WatchConfig wiring + an OnConfigChange handler that re-unmarshals
the Config struct on file changes and applies the hot-reloadable subset.

Phase 1 reloadable field: logging.level — re-applied via SetupLogging
on every change. The remaining 3 fields listed in ADR-0023
(api.v2_enabled, telemetry sampler type/ratio, auth.jwt.ttl) follow the
same pattern and will land in subsequent phase PRs without further
infrastructure work.

Changes:
- pkg/config/config.go : Config struct gets unexported viper + reloadMu
  fields; new WatchAndApply(ctx) method starts the watcher and stops it
  on context cancel. Defensive: no-op when no config file is in use.
- pkg/server/server.go Run() : calls WatchAndApply(rootCtx) so the
  watcher stops on graceful shutdown.
- pkg/config/config_hot_reload_test.go (new) : 3 unit tests covering
  end-to-end reload, no-config-file no-op, nil-viper no-op. Race
  detector clean.
- adr/0023-config-hot-reloading.md : Status → Phase 1 Implemented;
  remaining fields explicitly Proposed for follow-up phases.

Verifier verdict: APPROVE. Race detector passes. Full BDD suite still
green. The @flaky scenario in features/config/config_hot_reloading.feature
remains @flaky for now — activating it requires reliable cross-process
file-watching behaviour which is sensitive to filesystem semantics on
the CI runner; the unit test exercises the same code path deterministically.
arcodange merged commit b33ad236e1 into main 2026-05-05 08:45:20 +02:00
arcodange deleted branch feat/adr-0023-config-hot-reload-mvp 2026-05-05 08:45:20 +02:00
Sign in to join this conversation.
No Reviewers
No Label
1 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: arcodange/dance-lessons-coach#42