ADR 0020: Docker Build Strategy - Traditional vs Buildx

Status

Accepted

Context

The DanceLessonsCoach CI/CD pipeline initially used Docker Buildx (docker buildx build --push) for building and pushing Docker cache images. However, this approach encountered several issues:

Issues with Buildx Approach

  1. TLS Certificate Problems: Buildx had difficulty with self-signed certificates, requiring complex workaround steps
  2. Performance Concerns: Buildx setup and execution were significantly slower than expected
  3. Complexity: Buildx introduced additional complexity without providing immediate benefits
  4. Reliability Issues: Buildx builds were less reliable in the GitHub Actions environment

Working Solution Analysis

The working webapp CI/CD pipeline uses the traditional docker build + docker push approach:

# Working approach from webapp
- name: Build and push image to Gitea Container Registry
  run: |-
    docker build -t app .
    docker tag app gitea.arcodange.lab/${{ github.repository }}:$TAG
    docker push gitea.arcodange.lab/${{ github.repository }}:$TAG

This approach is simpler, more reliable, and works consistently with self-signed certificates.

Decision

Replace Docker Buildx with traditional docker build + push for the CI/CD pipeline and implement a two-stage Docker build strategy.

Implementation

1. Build Cache Strategy

# Build cache using traditional docker build
- name: Build and push Docker cache image
  if: steps.check_cache.outputs.cache_hit == 'false'
  run: |
    IMAGE_NAME="${{ env.CI_REGISTRY }}/${{ env.GITEA_ORG }}/${{ env.GITEA_REPO }}-build-cache:${{ steps.calculate_hash.outputs.deps_hash }}"
    echo "Building cache image: $IMAGE_NAME"
    
    # Build the image using traditional docker build
    docker build \
      --file Dockerfile.build \
      --tag "$IMAGE_NAME" \
      .
    
    # Push the image
    docker push "$IMAGE_NAME"
    
    echo "✅ Build cache image pushed successfully"
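The step above references two earlier steps (check_cache and calculate_hash) that are not shown here. A minimal sketch of how they might work, assuming the key is derived from the dependency manifests go.mod and go.sum; the file names, the 16-character truncation, and the manifest-inspect probe are assumptions, not the actual pipeline code:

```shell
#!/bin/sh
# Sketch of a hypothetical "calculate_hash" step: derive a deterministic
# cache key from the dependency manifests, so the cache image is rebuilt
# only when dependencies change.
DEPS_HASH=$(cat go.mod go.sum | sha256sum | cut -c1-16)
echo "deps_hash=$DEPS_HASH" >> "${GITHUB_OUTPUT:-/dev/stdout}"

# A "check_cache" step could then probe the registry without pulling:
#   if docker manifest inspect \
#       "$CI_REGISTRY/$GITEA_ORG/$GITEA_REPO-build-cache:$DEPS_HASH" \
#       > /dev/null 2>&1; then
#     echo "cache_hit=true" >> "$GITHUB_OUTPUT"
#   else
#     echo "cache_hit=false" >> "$GITHUB_OUTPUT"
#   fi
```

Because the hash depends only on go.mod and go.sum, unrelated source changes reuse the existing cache image.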

2. Production Build Strategy

# Production build using Dockerfile.prod
- name: Build and push Docker image
  if: github.ref == 'refs/heads/main'
  run: |
    source VERSION
    IMAGE_VERSION="$MAJOR.$MINOR.$PATCH${PRERELEASE:+-$PRERELEASE}"
    
    TAGS="$IMAGE_VERSION latest ${{ github.sha }}"
    echo "Building Docker image with tags: $TAGS"
    
    # Use the production Dockerfile that leverages the build cache
    docker build -t dance-lessons-coach -f Dockerfile.prod .
    
    for TAG in $TAGS; do
      IMAGE_NAME="${{ env.CI_REGISTRY }}/${{ env.GITEA_ORG }}/${{ env.GITEA_REPO }}:$TAG"
      echo "Tagging and pushing: $IMAGE_NAME"
      docker tag dance-lessons-coach "$IMAGE_NAME"
      docker push "$IMAGE_NAME"
    done
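The step above sources a VERSION file; its exact contents are not shown in this ADR. A plausible shape, with illustrative values, that works with the tag expansion used in the pipeline:

```shell
#!/bin/sh
# Hypothetical VERSION file contents (values illustrative):
MAJOR=1
MINOR=4
PATCH=2
PRERELEASE=""              # set to e.g. "rc1" for pre-release builds

# Same expansion as the pipeline step: ${PRERELEASE:+-$PRERELEASE}
# appends "-$PRERELEASE" only when PRERELEASE is non-empty.
IMAGE_VERSION="$MAJOR.$MINOR.$PATCH${PRERELEASE:+-$PRERELEASE}"
echo "$IMAGE_VERSION"   # → 1.4.2 (or 1.4.2-rc1 if PRERELEASE=rc1)
```

With an empty PRERELEASE this yields a plain semver tag; the same image is then also tagged latest and with the commit SHA by the loop above.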

3. Dockerfile Structure

Dockerfile.build - Build environment with all dependencies:

FROM golang:1.26.1-alpine AS builder

# Install build dependencies
RUN apk add --no-cache git bash curl make gcc musl-dev bc grep sed jq ca-certificates

# Install Go tools
RUN go install github.com/swaggo/swag/cmd/swag@latest

WORKDIR /workspace

# Copy and verify dependencies (WORKDIR must be set first so the
# module files land in /workspace rather than the image root)
COPY go.mod go.sum ./
RUN go mod download && go mod verify

Dockerfile.prod - Minimal production image:

# Use the build cache image (dependencies pre-downloaded) as the builder
FROM gitea.arcodange.lab/arcodange/dance-lessons-coach-build-cache:latest AS builder

# Compile the binary inside the cached build environment
COPY . .
RUN go build -o dance-lessons-coach ./cmd/server

# Final minimal image
FROM alpine:3.18

WORKDIR /app

# Install minimal dependencies
RUN apk add --no-cache ca-certificates tzdata

# Copy binary from builder
COPY --from=builder /workspace/dance-lessons-coach /app/dance-lessons-coach

# Copy configuration
COPY config.yaml /app/config.yaml

# Set permissions and entrypoint
RUN chmod +x /app/dance-lessons-coach
ENV TZ=UTC
EXPOSE 8080
ENTRYPOINT ["/app/dance-lessons-coach"]

Dockerfile - Development Dockerfile (kept for local development):

# Multi-stage build for development
FROM golang:1.26.1-alpine AS builder
WORKDIR /app
COPY go.mod go.sum ./
RUN go mod download
COPY . ./
RUN go build -o /dance-lessons-coach ./cmd/server

FROM alpine:3.18
WORKDIR /app
RUN apk add --no-cache ca-certificates tzdata
COPY --from=builder /dance-lessons-coach /app/dance-lessons-coach
COPY config.yaml /app/config.yaml
RUN chmod +x /app/dance-lessons-coach
ENV TZ=UTC
EXPOSE 8080
ENTRYPOINT ["/app/dance-lessons-coach"]

Benefits

CI/CD Pipeline Benefits

  1. Simplicity: Traditional approach is easier to understand and debug
  2. Reliability: Consistent behavior across different environments
  3. Certificate Handling: Works seamlessly with self-signed certificates
  4. Performance: Faster execution without Buildx overhead
  5. Compatibility: Better compatibility with GitHub Actions environment

Two-Stage Build Benefits

  1. Separation of Concerns: Clear separation between build environment and production runtime
  2. Optimized Production Image: Minimal Alpine-based image with only necessary dependencies
  3. Reusable Build Cache: Build environment can be reused across multiple CI runs
  4. Faster CI Execution: Pre-built build cache reduces CI execution time
  5. Consistent Builds: All builds use the same build environment

Development vs Production Clarity

  1. Development Dockerfile: Full build environment for local development
  2. Production Dockerfile: Minimal runtime environment for deployment
  3. Build Cache Dockerfile: Optimized build environment for CI/CD
  4. Clear Documentation: Each Dockerfile has a specific purpose

Trade-offs

What We Lose

  1. Multi-platform builds: Cannot build for multiple architectures simultaneously
  2. BuildKit caching: Less sophisticated caching mechanism
  3. Advanced features: No secret mounting, SSH agents, etc.
  4. Parallel processing: Slower builds without Buildx optimizations

What We Gain

  1. Stability: More reliable CI/CD pipeline
  2. Simplicity: Easier to maintain and troubleshoot
  3. Consistency: Matches proven patterns from working projects
  4. Faster feedback: Quicker build times in practice

Rationale

  1. Current Needs: We don't need multi-platform builds or advanced BuildKit features
  2. Simple Dockerfile: Our Dockerfile.build doesn't require Buildx-specific features
  3. Proven Pattern: Traditional approach works reliably in production (webapp project)
  4. CI Stability: Reliability is more important than advanced features for CI/CD

Future Considerations

When to Reconsider Buildx

  1. Multi-platform needs: If we need ARM/AMD64 builds simultaneously
  2. Complex builds: If Dockerfile requires BuildKit-specific features
  3. Performance optimization: If build times become unacceptable
  4. Certificate issues resolved: If Docker Buildx improves self-signed certificate handling

Migration Path

If we need to reintroduce Buildx in the future:

  1. Fix certificate issues properly at the Docker daemon level
  2. Test thoroughly in staging environment
  3. Monitor performance impact
  4. Document benefits clearly for the specific use case

Alternatives Considered

Option 1: Keep Buildx with Certificate Workaround

  • Complex setup with questionable reliability
  • Slow performance in GitHub Actions
  • Ongoing maintenance burden

Option 2: Use Insecure Registry Flag

docker buildx build --allow security.insecure --push .
  • Security concerns
  • Note: --allow security.insecure only grants build-time entitlements; trusting a self-signed registry would additionally require marking it insecure in the BuildKit configuration
  • Not recommended for production
  • Temporary workaround, not a solution

Option 3: Traditional Docker Build + Push (CHOSEN)

  • Simple and reliable
  • Proven in production
  • Better performance in practice
  • Easy to maintain

Decision Outcome

Chosen Option: Traditional docker build + push (Option 3)

This decision prioritizes CI/CD reliability and simplicity over advanced features we don't currently need. The traditional approach has been proven to work consistently in our environment and matches the successful pattern from the webapp project.

Success Metrics

  1. CI/CD reliability: No TLS certificate failures
  2. Build consistency: Predictable build times
  3. Maintenance: Reduced complexity and debugging time
  4. Compatibility: Works across all target environments

Approved by: @arcodange
Date: 2026-04-07
Supersedes: None
Superseded by: None