## Summary

Homogenize all 23 ADRs to a single canonical header format, and rewrite `adr/README.md` to match the actual state of the corpus.

This is **Tâche 7** of the ARCODANGE Phase 1 migration (Claude Code → Mistral Vibe). It is independent of PR #17 (Tâche 6, restructure AGENTS.md); the two can merge in any order.

No code changes; only documentation.

## Changes

### 1. Homogenize 21 ADR headers (commit `db09d0a`)

The audit (Tâche 6 Phase A, Mistral intent-router agent, 2026-05-02) identified **3 inconsistent header formats**:

- **F1**, list bullets (`* Status:` / `* Date:` / `* Deciders:`): 11 ADRs (0001-0008, 0011, 0014, 0023)
- **F2**, bold fields (`**Status:**` / `**Date:**` / `**Authors:**`): 9 ADRs (0009, 0010, 0012, 0013, 0015, 0016, 0017, 0018, 0019)
- **F3**, dedicated section (`## Status\n**Value** ✅`): 5 ADRs (0020, 0021, 0022, 0024, 0025)

In addition, mixed metadata names (Authors / Deciders / Decision Date / Implementation Date / Implementation Status / Last Updated) and decorative emojis on status values made the corpus hard to scan or template against.

**Canonical format adopted** (see `adr/README.md` for the full template):

```markdown
# NN. Title

**Status:** <Proposed | Accepted | Implemented | Partially Implemented | Approved | Rejected | Deferred | Deprecated | Superseded by ADR-NNNN>
**Date:** YYYY-MM-DD
**Authors:** Name(s)
[optional **Field:** ... lines]

## Context...
```

**Transformations applied** (via the `/tmp/homogenize-adrs.py` script; 23 files scanned, 21 modified; 0010 and 0012 already conformed):

- F1 list bullets → bold fields
- F2 cleanup: `**Deciders:**` → `**Authors:**`, strip status emojis
- F3 sections: `## Status\n**Value** ✅` → `**Status:** Value` (single line)
- Strip decorative emojis from `**Status:**` and `**Implementation Status:**`
- Convert `* Last Updated:` / `* Implementation Status:` / `* Decision Drivers:` / `* Decision Date:` to bold
- Date typo fix: `2024-04-XX` → `2026-04-XX` for ADRs 0018, 0019 (off by 2 years in the original)
- Normalize multiple blank lines after the header (max 1)

**ADR body content is preserved unchanged.** Only headers were transformed.

### 2. Rewrite `adr/README.md` (commit `d64ab02`)

The previous README had multiple inconsistencies:

- The index table listed wrong titles for ADRs 0010-0021 (it looked like an aspirational forecast that never matched reality; e.g. "0011 = Trunk-Based Development", but the real 0011 is absent and Trunk-Based Development is actually 0017)
- It listed entries for ADRs 0011 (validation library) and 0014 (gRPC), but **these files do not exist** in the repo
- 0024 (BDD Test Organization) was missing from the detail list
- The template still showed the obsolete F1 format (`* Status:`)
- Decorative emojis appeared on every status entry

Rewrite:

- Index table **regenerated from actual file contents** (title from H1, status from the `**Status:**` line), emoji-free and accurate
- Notes that 0011 / 0014 are not currently in use (reserved)
- Updated template block matches the canonical format
- Status Legend extended with `Approved`, `Partially Implemented`, `Deferred`
- Added a note that 0026 is the next free number for new ADRs

## Test plan

- [x] All 23 ADRs follow `**Status:**` / `**Date:**` / `**Authors:**` (verified via grep)
- [x] No more occurrences of `* Status:` (F1) or `## Status` (F3) in any ADR header
- [x] No more emojis on `**Status:**` lines
- [x] `adr/README.md` index links resolve to existing files (no more 0011 / 0014 dead links)
- [x] Pre-commit hooks pass (`go mod tidy`, `go fmt`, `swag fmt`)

## Migration context

Part of Phase 1 of the ARCODANGE migration from Claude Code to Mistral Vibe. Tâche 7 of the curriculum. Independent of PR #17 (which restructures `AGENTS.md`). The two PRs touch disjoint files, so no merge conflict is expected when both are merged.

🤖 Generated with [Claude Code](https://claude.com/claude-code) (Opus 4.7, 1M context). Mistral Vibe (intent-router agent / mistral-medium-3.5) did the original audit identifying the 3 formats during Tâche 6 Phase A.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Mistral Vibe (devstral-2 / mistral-medium-3.5)
Reviewed-on: #18
Co-authored-by: Gabriel Radureau <arcodange@gmail.com>
Co-committed-by: Gabriel Radureau <arcodange@gmail.com>

152 lines
4.4 KiB
Markdown
# Integrate OpenTelemetry for distributed tracing

**Status:** Accepted
**Authors:** Gabriel Radureau, AI Agent
**Date:** 2026-04-04

## Context and Problem Statement

We needed to add observability to dance-lessons-coach that provides:
- Distributed tracing capabilities
- Performance monitoring
- Request flow visualization
- Integration with existing monitoring systems
- Minimal impact on application performance

## Decision Drivers

* Need for distributed tracing in microservices architecture
* Desire for performance monitoring
* Requirement for request flow visualization
* Need for integration with monitoring tools
* Desire for minimal performance impact

## Considered Options

* OpenTelemetry - CNCF standard for observability
* Jaeger client - Direct Jaeger integration
* Zipkin - Alternative tracing system
* Custom solution - Build our own tracing

## Decision Outcome

Chosen option: "OpenTelemetry" because it provides industry-standard observability, good performance, flexibility for multiple backends, and is becoming the standard for distributed tracing.

## Pros and Cons of the Options

### OpenTelemetry

* Good, because CNCF standard with broad industry adoption
* Good, because supports multiple tracing backends (Jaeger, Zipkin, etc.)
* Good, because good performance characteristics
* Good, because active development and community
* Good, because vendor-neutral
* Bad, because more complex setup
* Bad, because larger dependency footprint

### Jaeger client

* Good, because direct integration with Jaeger
* Good, because simpler setup
* Bad, because vendor-locked to Jaeger
* Bad, because less flexible for future changes

### Zipkin

* Good, because established tracing system
* Good, because good ecosystem
* Bad, because less feature-rich than OpenTelemetry
* Bad, because declining popularity

### Custom solution

* Good, because tailored to our needs
* Good, because no external dependencies
* Bad, because time-consuming to develop
* Bad, because need to maintain ourselves
* Bad, because likely less feature-rich

## Implementation Approach

### Middleware-only approach

We chose a middleware-only approach using `otelhttp.NewHandler` rather than manual instrumentation:

```go
// In pkg/server/server.go
func (s *Server) getAllMiddlewares() []func(http.Handler) http.Handler {
	// Base middlewares applied to every request.
	middlewares := []func(http.Handler) http.Handler{
		middleware.StripSlashes,
		middleware.Recoverer,
	}

	// Tracing is opt-in: wrap the handler with otelhttp only when enabled.
	if s.withOTEL {
		middlewares = append(middlewares, func(next http.Handler) http.Handler {
			return otelhttp.NewHandler(next, "")
		})
	}

	return middlewares
}
```

### Benefits of middleware approach

* **Clean separation**: Tracing logic separate from business logic
* **Consistent instrumentation**: All endpoints automatically traced
* **Easy to enable/disable**: Single configuration flag
* **Maintainable**: No tracing boilerplate in service code
* **Upgradable**: Easy to change tracing implementation

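To make the "single configuration flag" point concrete, here is a minimal, standard-library-only sketch of how a conditional middleware list can be composed and applied. It is not the project's code: `tagMiddleware` is a hypothetical stand-in for `otelhttp.NewHandler` that only sets a response header, and `buildChain`, `apply`, and `appliedMiddlewares` are illustrative names.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// Middleware wraps an http.Handler with additional behaviour.
type Middleware func(http.Handler) http.Handler

// tagMiddleware is a hypothetical stand-in for a real wrapper such as
// otelhttp.NewHandler: it records its name in a response header so the
// effect of the wrapper is observable.
func tagMiddleware(name string) Middleware {
	return func(next http.Handler) http.Handler {
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			w.Header().Add("X-Middleware", name)
			next.ServeHTTP(w, r)
		})
	}
}

// buildChain mirrors the shape of getAllMiddlewares: a fixed base list plus
// an optional tracing wrapper controlled by one flag.
func buildChain(withOTEL bool) []Middleware {
	chain := []Middleware{tagMiddleware("recoverer")}
	if withOTEL {
		chain = append(chain, tagMiddleware("tracing"))
	}
	return chain
}

// apply wraps h so that the first middleware in the list runs outermost.
func apply(h http.Handler, chain []Middleware) http.Handler {
	for i := len(chain) - 1; i >= 0; i-- {
		h = chain[i](h)
	}
	return h
}

// appliedMiddlewares sends one request through the chain and reports which
// wrappers ran, in execution order.
func appliedMiddlewares(withOTEL bool) []string {
	final := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	})
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, "/health", nil)
	apply(final, buildChain(withOTEL)).ServeHTTP(rec, req)
	return rec.Header().Values("X-Middleware")
}

func main() {
	fmt.Println(appliedMiddlewares(false)) // tracing wrapper absent
	fmt.Println(appliedMiddlewares(true))  // tracing wrapper present
}
```

The business handler (`final`) is identical in both runs; only the chain changes, which is the separation the list above describes.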
## Configuration

```yaml
# config.yaml
telemetry:
  enabled: true
  otlp_endpoint: "localhost:4317"
  service_name: "dance-lessons-coach"
  insecure: true
  sampler:
    type: "parentbased_always_on"
    ratio: 1.0
```

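The configuration keys above, together with the `DLC_TELEMETRY_ENABLED` variable used when starting the server, suggest a settings struct with an environment override. The sketch below shows one way this could look using only the standard library. The struct, its field names, `applyEnv`, and `Validate` are assumptions for illustration; the ADR does not show how the application actually loads its configuration.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"strconv"
)

// TelemetryConfig is a hypothetical struct mirroring the YAML keys above.
type TelemetryConfig struct {
	Enabled      bool
	OTLPEndpoint string
	ServiceName  string
	Insecure     bool
	SamplerType  string
	SamplerRatio float64
}

// exampleConfig returns the values from the YAML example, except that
// telemetry starts disabled so the environment override is visible.
func exampleConfig() TelemetryConfig {
	return TelemetryConfig{
		Enabled:      false,
		OTLPEndpoint: "localhost:4317",
		ServiceName:  "dance-lessons-coach",
		Insecure:     true,
		SamplerType:  "parentbased_always_on",
		SamplerRatio: 1.0,
	}
}

// applyEnv overrides Enabled from DLC_TELEMETRY_ENABLED when it is set.
// The lookup function is injected so the logic can be exercised without
// touching the real process environment.
func applyEnv(cfg TelemetryConfig, lookup func(string) (string, bool)) (TelemetryConfig, error) {
	if raw, ok := lookup("DLC_TELEMETRY_ENABLED"); ok {
		enabled, err := strconv.ParseBool(raw)
		if err != nil {
			return cfg, fmt.Errorf("DLC_TELEMETRY_ENABLED: %w", err)
		}
		cfg.Enabled = enabled
	}
	return cfg, nil
}

// Validate rejects settings that cannot work at runtime.
func (c TelemetryConfig) Validate() error {
	if c.SamplerRatio < 0 || c.SamplerRatio > 1 {
		return fmt.Errorf("sampler ratio %v is outside [0, 1]", c.SamplerRatio)
	}
	if c.Enabled && c.OTLPEndpoint == "" {
		return errors.New("telemetry enabled but otlp_endpoint is empty")
	}
	return nil
}

func main() {
	cfg, err := applyEnv(exampleConfig(), os.LookupEnv)
	if err != nil {
		fmt.Println("config error:", err)
		return
	}
	if err := cfg.Validate(); err != nil {
		fmt.Println("config error:", err)
		return
	}
	fmt.Printf("%+v\n", cfg)
}
```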
## Jaeger Integration

```bash
# Start Jaeger with OTLP support
docker run -d --name jaeger \
  -e COLLECTOR_OTLP_ENABLED=true \
  -p 16686:16686 \
  -p 4317:4317 \
  jaegertracing/all-in-one:latest

# Start server with OpenTelemetry
DLC_TELEMETRY_ENABLED=true ./scripts/start-server.sh start

# View traces at http://localhost:16686
```

## Links

* [OpenTelemetry GitHub](https://github.com/open-telemetry/opentelemetry-go)
* [OpenTelemetry Documentation](https://opentelemetry.io/docs/instrumentation/go/)
* [Jaeger Documentation](https://www.jaegertracing.io/docs/)
* [OTLP Specification](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/protocol/otlp.md)

## Sampler Types Supported

* `always_on` - Sample all traces
* `always_off` - Sample no traces
* `traceidratio` - Sample a fixed fraction of traces, selected by trace ID
* `parentbased_always_on` - Follow the parent span's sampling decision; sample every trace that has no parent
* `parentbased_always_off` - Follow the parent span's sampling decision; sample no trace that has no parent
* `parentbased_traceidratio` - Follow the parent span's sampling decision; apply the trace ID ratio to traces that have no parent

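The decision logic behind these sampler names can be sketched without the OpenTelemetry SDK. The code below is a simplified illustration, not the SDK's implementation: `parentState`, `ratioSampled`, and `shouldSample` are hypothetical names, and the ratio check (comparing the last 8 bytes of the trace ID against a bound derived from the ratio) is a simplified rendering of the trace-ID-ratio idea.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"strings"
)

// parentState describes what is known about the calling span.
type parentState int

const (
	noParent parentState = iota
	parentSampled
	parentNotSampled
)

// ratioSampled makes a deterministic decision from the trace ID: the last
// 8 bytes are read as an integer and compared against ratio * 2^63.
func ratioSampled(traceID [16]byte, ratio float64) bool {
	if ratio >= 1 {
		return true
	}
	if ratio <= 0 {
		return false
	}
	bound := uint64(ratio * (1 << 63))
	x := binary.BigEndian.Uint64(traceID[8:16]) >> 1
	return x < bound
}

// shouldSample resolves one of the sampler type strings listed above.
// A "parentbased_" prefix means: follow the parent's decision when a parent
// exists, otherwise fall back to the named root sampler.
func shouldSample(samplerType string, parent parentState, traceID [16]byte, ratio float64) (bool, error) {
	base := strings.TrimPrefix(samplerType, "parentbased_")
	parentBased := base != samplerType

	var rootDecision bool
	switch base {
	case "always_on":
		rootDecision = true
	case "always_off":
		rootDecision = false
	case "traceidratio":
		rootDecision = ratioSampled(traceID, ratio)
	default:
		return false, fmt.Errorf("unknown sampler type %q", samplerType)
	}

	if parentBased && parent != noParent {
		return parent == parentSampled, nil
	}
	return rootDecision, nil
}

func main() {
	var traceID [16]byte // all-zero ID, used only for demonstration
	for _, name := range []string{"always_on", "always_off", "parentbased_always_on"} {
		sampled, err := shouldSample(name, parentNotSampled, traceID, 1.0)
		fmt.Println(name, sampled, err)
	}
}
```

Because the ratio decision depends only on the trace ID, every service that sees the same trace reaches the same verdict, which keeps traces from being partially recorded.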
## Performance Considerations

* OpenTelemetry adds minimal overhead when disabled
* Sampling can be used to reduce overhead in production
* Tracing data is sent asynchronously to minimize impact
* Context propagation is efficient using Go's context package
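
As an illustration of why asynchronous export keeps request latency low, the following standard-library sketch shows the general pattern: the request path performs a non-blocking enqueue, and a background goroutine exports in batches. This is not how the OpenTelemetry SDK is implemented (its batch processor is more elaborate); `tryEnqueue` and `startExporter` are hypothetical names, and spans are reduced to plain strings.

```go
package main

import "fmt"

// tryEnqueue never blocks the caller: when the buffer is full the span is
// dropped and false is returned, trading completeness for latency.
func tryEnqueue(queue chan<- string, span string) bool {
	select {
	case queue <- span:
		return true
	default:
		return false
	}
}

// startExporter drains the queue in a background goroutine and hands spans
// to export in batches of at most batchSize. The returned function blocks
// until the queue has been closed and fully drained.
func startExporter(queue <-chan string, batchSize int, export func([]string)) (wait func()) {
	done := make(chan struct{})
	go func() {
		defer close(done)
		batch := make([]string, 0, batchSize)
		for span := range queue {
			batch = append(batch, span)
			if len(batch) == batchSize {
				export(batch)
				batch = make([]string, 0, batchSize)
			}
		}
		if len(batch) > 0 {
			export(batch) // flush the remainder on shutdown
		}
	}()
	return func() { <-done }
}

func main() {
	queue := make(chan string, 2) // deliberately small to show a drop
	for _, name := range []string{"GET /a", "GET /b", "GET /c"} {
		if !tryEnqueue(queue, name) {
			fmt.Println("dropped:", name)
		}
	}
	close(queue)

	wait := startExporter(queue, 10, func(batch []string) {
		fmt.Println("exported:", batch)
	})
	wait()
}
```

The request handler only pays for a channel send; network I/O to the collector happens entirely in the background goroutine.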