dance-lessons-coach/adr/0007-opentelemetry-integration.md

Integrate OpenTelemetry for distributed tracing

  • Status: Accepted
  • Deciders: Gabriel Radureau, AI Agent
  • Date: 2026-04-04

Context and Problem Statement

We needed to add observability to DanceLessonsCoach that provides:

  • Distributed tracing capabilities
  • Performance monitoring
  • Request flow visualization
  • Integration with existing monitoring systems
  • Minimal impact on application performance

Decision Drivers

  • Need for distributed tracing in a microservices architecture
  • Desire for performance monitoring
  • Requirement for request flow visualization
  • Need for integration with monitoring tools
  • Desire for minimal performance impact

Considered Options

  • OpenTelemetry - CNCF standard for observability
  • Jaeger client - Direct Jaeger integration
  • Zipkin - Alternative tracing system
  • Custom solution - Build our own tracing

Decision Outcome

Chosen option: "OpenTelemetry", because it is a vendor-neutral CNCF standard, performs well, supports multiple tracing backends (Jaeger, Zipkin, and others), and is becoming the default choice for distributed tracing.

Pros and Cons of the Options

OpenTelemetry

  • Good, because CNCF standard with broad industry adoption
  • Good, because supports multiple tracing backends (Jaeger, Zipkin, etc.)
  • Good, because good performance characteristics
  • Good, because active development and community
  • Good, because vendor-neutral
  • Bad, because more complex setup
  • Bad, because larger dependency footprint

Jaeger client

  • Good, because direct integration with Jaeger
  • Good, because simpler setup
  • Bad, because vendor-locked to Jaeger
  • Bad, because less flexible for future changes

Zipkin

  • Good, because established tracing system
  • Good, because good ecosystem
  • Bad, because less feature-rich than OpenTelemetry
  • Bad, because declining popularity

Custom solution

  • Good, because tailored to our needs
  • Good, because no external dependencies
  • Bad, because time-consuming to develop
  • Bad, because need to maintain ourselves
  • Bad, because likely less feature-rich

Implementation Approach

Middleware-only approach

We chose a middleware-only approach using otelhttp.NewHandler rather than manual instrumentation:

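// In pkg/server/server.go
// Assumed imports: net/http, the chi middleware package, and
// go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp.

// getAllMiddlewares returns the middleware chain in the order it is applied.
func (s *Server) getAllMiddlewares() []func(http.Handler) http.Handler {
    middlewares := []func(http.Handler) http.Handler{
        middleware.StripSlashes, // strip trailing slashes from request paths
        middleware.Recoverer,    // recover from panics in handlers
    }

    // Tracing is opt-in: the handler is wrapped only when telemetry is enabled.
    if s.withOTEL {
        middlewares = append(middlewares, func(next http.Handler) http.Handler {
            // The second argument is the operation name; it is left empty here.
            return otelhttp.NewHandler(next, "")
        })
    }

    return middlewares
}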

Benefits of middleware approach

  • Clean separation: Tracing logic separate from business logic
  • Consistent instrumentation: All endpoints automatically traced
  • Easy to enable/disable: Single configuration flag
  • Maintainable: No tracing boilerplate in service code
  • Upgradable: Easy to change tracing implementation
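
The enable/disable behaviour described above can be illustrated with a minimal, standard-library-only sketch. It is not the project's code: `tracingStub` is a hypothetical stand-in for `otelhttp.NewHandler` that only sets a marker header, and `buildMiddlewares`, `chain`, and `tracedHeader` are names invented for this example.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// Middleware has the same func(http.Handler) http.Handler shape used by the server.
type Middleware func(http.Handler) http.Handler

// tracingStub is a hypothetical stand-in for otelhttp.NewHandler. It only
// sets a marker header so the effect of the flag can be observed.
func tracingStub(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("X-Traced", "true")
		next.ServeHTTP(w, r)
	})
}

// buildMiddlewares mirrors the conditional append: the tracing middleware
// is part of the chain only when the flag is set.
func buildMiddlewares(withOTEL bool) []Middleware {
	var mws []Middleware
	if withOTEL {
		mws = append(mws, tracingStub)
	}
	return mws
}

// chain wraps h so that the first middleware in the slice runs outermost.
func chain(h http.Handler, mws []Middleware) http.Handler {
	for i := len(mws) - 1; i >= 0; i-- {
		h = mws[i](h)
	}
	return h
}

// tracedHeader sends one request through the chain and returns the marker header.
func tracedHeader(withOTEL bool) string {
	hello := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "ok")
	})
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, "/", nil)
	chain(hello, buildMiddlewares(withOTEL)).ServeHTTP(rec, req)
	return rec.Header().Get("X-Traced")
}

func main() {
	fmt.Printf("enabled=%q disabled=%q\n", tracedHeader(true), tracedHeader(false))
}
```

The business handler (`hello`) is identical in both cases; only the chain around it changes, which is the separation the middleware approach is meant to give.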

Configuration

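# config.yaml
telemetry:
  enabled: true
  otlp_endpoint: "localhost:4317"  # 4317 is the default OTLP/gRPC port
  service_name: "DanceLessonsCoach"
  insecure: true                   # disables TLS to the collector
  sampler:
    type: "parentbased_always_on"
    ratio: 1.0                     # relevant only to the traceidratio samplers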
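
As a rough illustration of how these settings might be represented and checked in Go, the sketch below mirrors the YAML keys in a struct, applies the `DLC_TELEMETRY_ENABLED` override, and validates the sampler settings. The type and function names (`TelemetryConfig`, `SamplerConfig`, `applyEnv`, `validate`) are hypothetical, and the project's actual configuration loading (see ADR 0006) may work differently.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"strconv"
)

// SamplerConfig and TelemetryConfig are hypothetical types that mirror
// the YAML keys shown above.
type SamplerConfig struct {
	Type  string
	Ratio float64
}

type TelemetryConfig struct {
	Enabled      bool
	OTLPEndpoint string
	ServiceName  string
	Insecure     bool
	Sampler      SamplerConfig
}

// samplerTypes lists the sampler names this ADR documents as supported.
var samplerTypes = map[string]bool{
	"always_on": true, "always_off": true, "traceidratio": true,
	"parentbased_always_on": true, "parentbased_always_off": true,
	"parentbased_traceidratio": true,
}

// exampleTelemetryConfig returns the values from the YAML example.
func exampleTelemetryConfig() TelemetryConfig {
	return TelemetryConfig{
		Enabled:      true,
		OTLPEndpoint: "localhost:4317",
		ServiceName:  "DanceLessonsCoach",
		Insecure:     true,
		Sampler:      SamplerConfig{Type: "parentbased_always_on", Ratio: 1.0},
	}
}

// applyEnv overrides Enabled from DLC_TELEMETRY_ENABLED when it is set.
// The lookup function is injected so the override can be tested.
func (c *TelemetryConfig) applyEnv(lookup func(string) (string, bool)) error {
	v, ok := lookup("DLC_TELEMETRY_ENABLED")
	if !ok {
		return nil
	}
	enabled, err := strconv.ParseBool(v)
	if err != nil {
		return fmt.Errorf("DLC_TELEMETRY_ENABLED: %w", err)
	}
	c.Enabled = enabled
	return nil
}

// validate checks the settings only when telemetry is enabled.
func (c TelemetryConfig) validate() error {
	if !c.Enabled {
		return nil
	}
	if c.OTLPEndpoint == "" {
		return errors.New("telemetry.otlp_endpoint is required when telemetry is enabled")
	}
	if !samplerTypes[c.Sampler.Type] {
		return fmt.Errorf("unknown sampler type %q", c.Sampler.Type)
	}
	if c.Sampler.Ratio < 0 || c.Sampler.Ratio > 1 {
		return fmt.Errorf("sampler ratio %v is outside [0, 1]", c.Sampler.Ratio)
	}
	return nil
}

func main() {
	cfg := exampleTelemetryConfig()
	if err := cfg.applyEnv(os.LookupEnv); err != nil {
		fmt.Println("env error:", err)
		return
	}
	fmt.Println("enabled:", cfg.Enabled, "validation error:", cfg.validate())
}
```

Failing fast on an unknown sampler type or an out-of-range ratio keeps misconfiguration from silently changing how many traces are recorded.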

Jaeger Integration

# Start Jaeger with OTLP support
docker run -d --name jaeger \
  -e COLLECTOR_OTLP_ENABLED=true \
  -p 16686:16686 \
  -p 4317:4317 \
  jaegertracing/all-in-one:latest

# Start server with OpenTelemetry
DLC_TELEMETRY_ENABLED=true ./scripts/start-server.sh start

# View traces at http://localhost:16686

Sampler Types Supported

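  • always_on - Sample every trace
  • always_off - Sample no traces
  • traceidratio - Sample a fixed fraction of traces, chosen deterministically from the trace ID
  • parentbased_always_on - Follow the parent span's sampling decision; sample every trace that has no parent
  • parentbased_always_off - Follow the parent span's sampling decision; sample no trace that has no parent
  • parentbased_traceidratio - Follow the parent span's sampling decision; apply the ratio to traces that have no parent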
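
The decision each sampler type makes can be sketched in plain Go. This is a simplified illustration of the semantics only, not the OpenTelemetry SDK's implementation (the real samplers live in `go.opentelemetry.io/otel/sdk/trace`); `shouldSample` and `ratioSampled` are hypothetical names, and the parent's decision is modelled as a `*bool` that is nil when there is no parent span.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"strings"
)

// ratioSampled makes a deterministic decision from the trace ID: the same
// trace ID always yields the same answer for a given ratio.
func ratioSampled(traceID [16]byte, ratio float64) bool {
	if ratio >= 1 {
		return true
	}
	if ratio <= 0 {
		return false
	}
	// Use the low 8 bytes as a 63-bit number and compare it to ratio * 2^63.
	x := binary.BigEndian.Uint64(traceID[8:16]) >> 1
	return x < uint64(ratio*(1<<63))
}

// shouldSample returns the sampling decision for one span.
// parentSampled is nil for a root span (no parent).
func shouldSample(samplerType string, ratio float64, traceID [16]byte, parentSampled *bool) (bool, error) {
	root, parentBased := strings.CutPrefix(samplerType, "parentbased_")

	var rootDecision bool
	switch root {
	case "always_on":
		rootDecision = true
	case "always_off":
		rootDecision = false
	case "traceidratio":
		rootDecision = ratioSampled(traceID, ratio)
	default:
		return false, fmt.Errorf("unknown sampler type %q", samplerType)
	}

	// Parent-based samplers defer to the parent when there is one and
	// fall back to the root sampler otherwise.
	if parentBased && parentSampled != nil {
		return *parentSampled, nil
	}
	return rootDecision, nil
}

func main() {
	var traceID [16]byte // all zeros, for illustration only
	notSampled := false
	got, err := shouldSample("parentbased_always_on", 1.0, traceID, &notSampled)
	fmt.Println(got, err)
}
```

In the `main` example the parent span was not sampled, so `parentbased_always_on` does not sample the child either; the "always on" part applies only to root spans.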

Performance Considerations

  • When telemetry is disabled, the otelhttp middleware is not added to the chain, so OpenTelemetry adds minimal overhead
  • Sampling can be used to reduce overhead in production
  • Trace data is exported asynchronously, off the request path, to minimize impact
  • Context propagation is efficient using Go's context package