Compare commits
21 Commits
5bfb08abc7
...
feature/re
| SHA1 |
|---|
| 29272b8fba |
| acebea353b |
| 732eee7586 |
| 88a934dfd2 |
| 41ee8c56ac |
| c17fb4f9b4 |
| 73a3af1552 |
| 8bae62c28e |
| 5eec64e5e8 |
| 5de703468f |
| be0a31a525 |
| b2e5c034c3 |
| 77344c8858 |
| 31af8bed07 |
| c1e628f339 |
| 30af706590 |
| 10f25c23e0 |
| e2adb3bc9f |
| a17eebc8f2 |
| 52a4ce4139 |
| 69e7c44eb2 |
234
.gitea/workflows/README.md
Normal file
@@ -0,0 +1,234 @@
# CI/CD Workflow Architecture

## 🗺️ Overview

The dance-lessons-coach project uses a **multi-workflow architecture** for better separation of concerns, maintainability, and flexibility.

## 📁 Workflow Files

### 1. `ci-cd.yaml` - Main CI/CD Pipeline

**Purpose**: Run tests, build binaries, and generate documentation

**Triggers**:
- Push to `main`, `ci/**`, `feature/**`, `fix/**`, `refactor/**` branches
- Pull requests to `main` branch
- Manual workflow dispatch

**Jobs**:
1. **build-cache** - Build and cache Docker build environment
2. **ci-pipeline** - Run tests, build binaries, generate Swagger docs
3. **trigger-docker-push** - Trigger separate Docker workflow on main branch

**Key Features**:
- Runs in container environment with all build tools
- Generates Swagger documentation
- Runs BDD and unit tests with PostgreSQL
- Updates badges and version information
- Triggers Docker workflow only on main branch
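
The badge update depends on a single coverage percentage taken from `go tool cover -func` output. A minimal sketch of that extraction follows. `extract_total_coverage` is a hypothetical helper name, and the sample report is made up for illustration.

```bash
# Sketch: pull the total percentage out of a `go tool cover -func` report.
# extract_total_coverage is a hypothetical helper, not part of the repository.
extract_total_coverage() {
  # The summary line starts with "total:" and ends with a value such as "81.4%".
  # Print that last field without the percent sign, then stop.
  awk '$1 == "total:" { gsub(/%/, "", $NF); print $NF; exit }' "$1"
}

# Demo with a made-up report so the sketch runs without Go installed.
sample=$(mktemp)
cat > "$sample" <<'EOF'
example.com/app/pkg/greet/greet.go:12:    Greet    100.0%
example.com/app/pkg/server/server.go:40:  Start    62.5%
total:                                    (statements)  81.4%
EOF
extract_total_coverage "$sample"   # prints 81.4
rm -f "$sample"
```

The workflow itself uses `grep -oP` for this step. The sketch uses `awk` instead so that it also runs where `grep` lacks PCRE support.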

### 2. `docker-push.yaml` - Docker Image Publishing

**Purpose**: Build and push Docker images to registry

**Triggers**:
- `workflow_dispatch` only (no push or pull-request triggers)
- Dispatched by `ci-cd.yaml` after CI passes on the main branch, or manually

**Jobs**:
1. **docker-push** - Build production Docker image and push to registry

**Key Features**:
- Runs on host environment (access to Docker daemon)
- Uses dependency hash from build-cache
- Builds minimal Alpine-based production image
- Pushes multiple tags (version, latest, commit SHA)
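
The three tags can be derived as in the sketch below. It assumes the `VERSION` file is a list of shell assignments (`MAJOR`, `MINOR`, `PATCH`, and an optional `PRERELEASE`), which is what `source VERSION` in the workflow implies. `image_tags` is a hypothetical helper name.

```bash
# Sketch: derive the version, latest, and commit-SHA tags.
# image_tags is a hypothetical helper; the VERSION file layout is an assumption.
# The body runs in a subshell so the sourced variables do not leak to the caller.
image_tags() (
  PRERELEASE=""
  # $1 = path to the VERSION file (include a slash, e.g. ./VERSION)
  . "$1"
  # Append "-<prerelease>" only when PRERELEASE is non-empty.
  version="$MAJOR.$MINOR.$PATCH${PRERELEASE:+-$PRERELEASE}"
  # $2 = commit SHA
  printf '%s latest %s\n' "$version" "$2"
)
```

A call such as `image_tags ./VERSION "$(git rev-parse HEAD)"` prints the tags on one line, ready for a `for TAG in ...` loop like the one in the workflow.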

## 🔧 Architecture Benefits

### 1. Clear Separation of Concerns
- **CI/CD Pipeline**: Testing and artifact generation
- **Docker Publishing**: Image building and registry operations

### 2. Proper Environment Isolation
- **CI jobs run in container**: Consistent build environment
- **Docker jobs run on host**: Access to Docker daemon

### 3. Flexible Testing
- Can trigger Docker workflow independently for testing
- No complex conditional logic in main workflow
- Easier to debug and maintain

### 4. Better Security
- Docker operations isolated in separate workflow
- Clear dependency between test success and deployment
- Manual trigger capability for emergency situations

## 🚀 Usage Examples

### Trigger Full CI/CD Pipeline
```bash
# Automatically triggered on push to main branch
# Or manually:
./scripts/gitea-client.sh trigger-workflow arcodange dance-lessons-coach ci-cd.yaml main
```

### Trigger Docker Push Manually
```bash
# Get dependency hash from build-cache job first
DEPS_HASH="abc123def456"

# Trigger Docker workflow manually
./scripts/gitea-client.sh trigger-workflow arcodange dance-lessons-coach docker-push.yaml main --deps_hash "$DEPS_HASH"
```

### Workflow Dispatch Parameters (docker-push.yaml)
- `deps_hash` (required): Dependency hash from build-cache job
- `ref` (optional): Git reference (branch/tag); defaults to the ref that triggered the workflow
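
For reference, the dispatch request that `ci-cd.yaml` sends can be sketched as below. The endpoint path and the `ref`-only payload mirror the `curl` call in the `trigger-docker-push` job. `dispatch_url` and `dispatch_payload` are hypothetical helper names, and the payload helper assumes the ref contains no characters that need JSON escaping.

```bash
# Sketch: build the Gitea workflow-dispatch request.
# dispatch_url and dispatch_payload are hypothetical helpers.
dispatch_url() {
  # $1 = base URL with trailing slash, $2 = org, $3 = repo, $4 = workflow file
  printf '%sapi/v1/repos/%s/%s/actions/workflows/%s/dispatches' "$1" "$2" "$3" "$4"
}

dispatch_payload() {
  # $1 = git ref, e.g. refs/heads/main (assumed free of quotes and backslashes)
  printf '{"ref":"%s"}' "$1"
}

# Usage (needs network access and a token in GITEA_TOKEN):
# curl -X POST \
#   -H "Authorization: token $GITEA_TOKEN" \
#   -H "Content-Type: application/json" \
#   "$(dispatch_url https://gitea.arcodange.lab/ arcodange dance-lessons-coach docker-push.yaml)" \
#   -d "$(dispatch_payload refs/heads/main)"
```

Keeping URL and payload construction in small functions makes it possible to check the request without calling the API.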

## 🔗 Workflow Dependencies

```mermaid
graph TD
    A[Push to main] --> B[ci-cd.yaml]
    B --> C[build-cache job]
    C --> D[ci-pipeline job]
    D --> E[trigger-docker-push job]
    E --> F[docker-push.yaml]
    F --> G[docker-push job]
    G --> H[Docker Registry]
```

## 📋 Best Practices

### 1. Always Run CI First
- Docker workflow should only be triggered after CI passes
- Maintains quality gate before deployment

### 2. Use Dependency Hash
- Ensures consistent builds across workflows
- Pass hash from build-cache to docker-push
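
The hash itself can be computed as in the sketch below, which follows the one-liner in the `build-cache` job (there applied to `go.mod`, `go.sum`, and `docker/Dockerfile.build`). `deps_hash` as a function name is an assumption for illustration, and the sketch needs `sha256sum` on the runner.

```bash
# Sketch: 12-character dependency hash over a list of files.
# deps_hash is a hypothetical helper wrapping the workflow's one-liner.
deps_hash() {
  # Hash every file, then hash that listing, so a change to any input
  # changes the result. Keep only the first 12 hex characters.
  sha256sum "$@" | sha256sum | cut -d' ' -f1 | head -c 12
}
```

File names are part of the hashed listing, so call it from the repository root with the same relative paths each time, for example `deps_hash go.mod go.sum docker/Dockerfile.build`.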

### 3. Manual Testing
- Use separate Docker workflow for testing image builds
- Avoids polluting main branch with test images

### 4. Monitor Both Workflows
- CI/CD workflow for test results and artifacts
- Docker workflow for image build and push status

## 🎯 Docker Build Strategy Decision

### 🏆 Chosen Approach: Attempt 2 (Standard Dockerfile)

After extensive testing of multiple approaches, we selected **Attempt 2** as the optimal Docker build strategy.

#### ⚡ Why Attempt 2 Won:

**1. Simplicity (about 54% smaller workflow)**
- 73 lines vs 158 lines for Attempt 1, the largest alternative
- No inline Dockerfile generation
- Standard `docker build -f docker/Dockerfile .` command

**2. Better Performance**
- No artifact/cache action overhead
- Natural Docker layer caching works optimally
- Faster execution without complex variable substitutions

**3. Superior Reliability**
- Proven standard Docker build process
- Easier to debug and maintain
- Fewer moving parts = fewer failures

**4. Better Maintainability**
- Uses standard Dockerfile (easier to understand)
- No complex YAML templating
- Clear separation of concerns

#### 🗑️ Why We Rejected Other Approaches:

**Attempt 1 (Inline Dockerfile):**
- Complex YAML templating
- Harder to debug and maintain
- No significant performance benefit

**Attempt 3 (Build Cache Image):**
- Added complexity with cache management
- Slower due to artifact actions overhead
- More prone to cache invalidation issues

**Attempt 4 (Template File):**
- Added unnecessary file management
- No clear advantage over standard Dockerfile
- More complex workflow

### 📊 Approach Comparison:

| Approach | Lines of Code | Complexity | Reliability | Maintainability |
|----------|---------------|------------|-------------|-----------------|
| **Attempt 2** | 73 | Low | High | Excellent |
| Attempt 1 | 158 | High | Medium | Poor |
| Attempt 3 | 125 | Medium | Medium | Fair |
| Attempt 4 | 110 | Medium | High | Good |

### 🔧 Implementation Details:

**Standard Dockerfile Approach:**
```yaml
- name: Build and push Docker image
  run: |
    docker build -t dance-lessons-coach -f docker/Dockerfile .
    docker tag dance-lessons-coach "$IMAGE_NAME"
    docker push "$IMAGE_NAME"
```

**Key Benefits:**
- Uses multi-stage builds for optimization
- Standard Docker layer caching works naturally
- Easy to understand and modify
- Proven reliability in production

## 🎯 Future Enhancements

### Potential Improvements:
- Add workflow status badges to README
- Implement workflow chaining with outputs
- Add matrix builds for multiple architectures
- Implement canary deployment workflow
- Add rollback capability

### Architecture Considerations:
- Keep workflows focused on single responsibilities
- Maintain clear separation between test and deploy
- Document all workflow triggers and conditions
- Monitor workflow execution times and optimize

## 📝 Maintenance

### Adding New Jobs:
- Add to appropriate workflow based on responsibility
- CI-related jobs → `ci-cd.yaml`
- Docker-related jobs → `docker-push.yaml`

### Modifying Triggers:
- Update trigger conditions in respective workflow files
- Test changes thoroughly before merging

### Debugging:
- Check workflow logs in Gitea Actions
- Use `gitea-client.sh diagnose-job` for detailed analysis
- Monitor workflow dependencies and execution order

## 🔒 Security

### Secrets Management:
- Docker registry credentials stored in Gitea secrets
- Never hardcode credentials in workflow files
- Use a Gitea API token (`GITEA_TOKEN`, falling back to `PACKAGES_TOKEN`) for workflow dispatch

### Access Control:
- Only authorized users can trigger workflows
- Manual approval required for production deployments
- Audit logs available for all workflow executions

This architecture provides a clean, maintainable, and secure CI/CD pipeline that scales well with project growth while maintaining clear separation of concerns.
@@ -27,6 +27,12 @@ on:
|
||||
branches:
|
||||
- main
|
||||
types: [opened, synchronize, reopened, labeled]
|
||||
# Only run PR CI if the commit doesn't already have passing branch CI
|
||||
if: |
|
||||
github.event_name == 'pull_request' &&
|
||||
(github.event.action == 'opened' ||
|
||||
github.event.action == 'synchronize' ||
|
||||
github.event.action == 'reopened')
|
||||
paths-ignore:
|
||||
- 'README.md'
|
||||
- 'doc/**'
|
||||
@@ -51,35 +57,191 @@ env:
|
||||
CI_REGISTRY: "gitea.arcodange.lab"
|
||||
|
||||
jobs:
|
||||
ci-pipeline:
|
||||
name: CI Pipeline
|
||||
runs-on: ubuntu-latest
|
||||
|
||||
build-cache:
|
||||
name: Build Docker Cache
|
||||
runs-on: ubuntu-latest-ca
|
||||
if: "!contains(github.event.head_commit.message, '[skip ci]') && github.actor != 'ci-bot'"
|
||||
outputs:
|
||||
deps_hash: ${{ steps.calculate_hash.outputs.deps_hash }}
|
||||
cache_hit: ${{ steps.check_cache.outputs.cache_hit }}
|
||||
steps:
|
||||
- name: Checkout code
|
||||
uses: actions/checkout@v4
|
||||
|
||||
- name: Set up Go
|
||||
uses: actions/setup-go@v4
|
||||
- name: Calculate dependency hash
|
||||
id: calculate_hash
|
||||
run: |
|
||||
# Calculate hash of go.mod + go.sum + Dockerfile.build (inline, no script needed)
|
||||
DEPS_HASH=$(sha256sum go.mod go.sum docker/Dockerfile.build | sha256sum | cut -d' ' -f1 | head -c 12)
|
||||
echo "Dependency hash: $DEPS_HASH"
|
||||
echo "deps_hash=$DEPS_HASH" >> $GITHUB_OUTPUT
|
||||
|
||||
- name: Check for existing cache (optimized with fallback)
|
||||
id: check_cache
|
||||
run: |
|
||||
# Check if image exists in registry using optimized approach with fallback
|
||||
IMAGE_NAME="${{ env.CI_REGISTRY }}/${{ env.GITEA_ORG }}/${{ env.GITEA_REPO }}-build-cache:${{ steps.calculate_hash.outputs.deps_hash }}"
|
||||
|
||||
# Fast check using docker manifest inspect (lighter than pull)
|
||||
echo "🔍 Checking cache: $IMAGE_NAME"
|
||||
|
||||
# Try manifest inspect first (fastest method, but experimental)
|
||||
if docker manifest inspect "$IMAGE_NAME" >/dev/null 2>&1; then
|
||||
echo "✅ Cache hit - using existing build cache (manifest inspect)"
|
||||
echo "cache_hit=true" >> $GITHUB_OUTPUT
|
||||
else
|
||||
# Fallback to docker pull if manifest inspect fails (more reliable)
|
||||
echo "⚠️ Manifest inspect failed, falling back to docker pull..."
|
||||
if docker pull "$IMAGE_NAME" >/dev/null 2>&1; then
|
||||
echo "✅ Cache hit - using existing build cache (fallback: docker pull)"
|
||||
echo "cache_hit=true" >> $GITHUB_OUTPUT
|
||||
else
|
||||
echo "⚠️ Cache miss - will build new cache image"
|
||||
echo "cache_hit=false" >> $GITHUB_OUTPUT
|
||||
fi
|
||||
fi
|
||||
|
||||
- name: Login to Gitea Container Registry
|
||||
if: steps.check_cache.outputs.cache_hit == 'false'
|
||||
uses: docker/login-action@v3
|
||||
with:
|
||||
go-version: '1.26.1'
|
||||
cache: true
|
||||
registry: ${{ env.CI_REGISTRY }}
|
||||
username: ${{ github.actor }}
|
||||
password: ${{ secrets.PACKAGES_TOKEN }}
|
||||
|
||||
- name: Install dependencies
|
||||
run: go mod tidy
|
||||
|
||||
# SINGLE swag installation - reused for all steps
|
||||
- name: Install swag (once)
|
||||
run: go install github.com/swaggo/swag/cmd/swag@latest
|
||||
|
||||
- name: Build and push Docker cache image
|
||||
if: steps.check_cache.outputs.cache_hit == 'false'
|
||||
run: |
|
||||
IMAGE_NAME="${{ env.CI_REGISTRY }}/${{ env.GITEA_ORG }}/${{ env.GITEA_REPO }}-build-cache:${{ steps.calculate_hash.outputs.deps_hash }}"
|
||||
echo "Building cache image: $IMAGE_NAME"
|
||||
|
||||
# Build the image using traditional docker build
|
||||
docker build \
|
||||
--file docker/Dockerfile.build \
|
||||
--tag "$IMAGE_NAME" \
|
||||
.
|
||||
|
||||
# Push the image
|
||||
docker push "$IMAGE_NAME"
|
||||
|
||||
echo "✅ Build cache image pushed successfully"
|
||||
|
||||
ci-pipeline:
|
||||
name: CI Pipeline
|
||||
needs: build-cache
|
||||
runs-on: ubuntu-latest-ca
|
||||
# Skip conditions: standard skip ci + actor check + respect skip_ci input
|
||||
if: "!contains(github.event.head_commit.message, '[skip ci]') && github.actor != 'ci-bot' && (!github.event.inputs.skip_ci || github.event.inputs.skip_ci == 'false')"
|
||||
|
||||
container:
|
||||
image: ${{ env.CI_REGISTRY }}/${{ env.GITEA_ORG }}/${{ env.GITEA_REPO }}-build-cache:${{ needs.build-cache.outputs.deps_hash }}
|
||||
|
||||
services:
|
||||
postgres:
|
||||
image: postgres:15
|
||||
env:
|
||||
POSTGRES_USER: postgres
|
||||
POSTGRES_PASSWORD: postgres
|
||||
POSTGRES_DB: dance_lessons_coach_bdd_test
|
||||
|
||||
steps:
|
||||
- name: Checkout code
|
||||
uses: actions/checkout@v4
|
||||
|
||||
- name: Set database environment variables
|
||||
run: |
|
||||
echo "DLC_DATABASE_HOST=postgres" >> $GITHUB_ENV
|
||||
echo "DLC_DATABASE_PORT=5432" >> $GITHUB_ENV
|
||||
echo "DLC_DATABASE_USER=$POSTGRES_USER" >> $GITHUB_ENV
|
||||
echo "DLC_DATABASE_PASSWORD=$POSTGRES_PASSWORD" >> $GITHUB_ENV
|
||||
echo "DLC_DATABASE_NAME=$POSTGRES_DB" >> $GITHUB_ENV
|
||||
echo "DLC_DATABASE_SSL_MODE=disable" >> $GITHUB_ENV
|
||||
|
||||
- name: Restore Swagger Docs Cache
|
||||
id: cache-swagger-restore
|
||||
uses: actions/cache/restore@v5
|
||||
with:
|
||||
path: |
|
||||
pkg/server/docs/docs.go
|
||||
pkg/server/docs/swagger.json
|
||||
pkg/server/docs/swagger.yaml
|
||||
key: swagger-docs-${{ hashFiles('cmd/server/main.go', 'pkg/greet/*.go', 'pkg/server/*.go', 'go.mod') }}
|
||||
restore-keys: |
|
||||
swagger-docs-
|
||||
|
||||
- name: Generate Swagger Docs
|
||||
run: cd pkg/server && go generate
|
||||
if: steps.cache-swagger-restore.outputs.cache-hit != 'true'
|
||||
run: go generate ./pkg/server
|
||||
|
||||
- name: Save Swagger Docs Cache
|
||||
if: steps.cache-swagger-restore.outputs.cache-hit != 'true'
|
||||
id: cache-swagger-save
|
||||
uses: actions/cache/save@v5
|
||||
with:
|
||||
path: |
|
||||
pkg/server/docs/docs.go
|
||||
pkg/server/docs/swagger.json
|
||||
pkg/server/docs/swagger.yaml
|
||||
key: ${{ steps.cache-swagger-restore.outputs.cache-primary-key }}
|
||||
|
||||
- name: Build all packages
|
||||
run: go build ./...
|
||||
|
||||
|
||||
- name: Run tests with coverage
|
||||
run: go test ./... -cover -v
|
||||
- name: Wait for PostgreSQL to be ready
|
||||
run: |
|
||||
echo "Waiting for PostgreSQL to be ready..."
|
||||
for i in {1..30}; do
|
||||
if pg_isready -h postgres -p 5432 -U postgres -d dance_lessons_coach_bdd_test; then
|
||||
echo "✅ PostgreSQL is ready!"
|
||||
break
|
||||
fi
|
||||
echo "Waiting for PostgreSQL... ($i/30)"
|
||||
sleep 2
|
||||
done
|
||||
|
||||
# Verify PostgreSQL is accessible
|
||||
if ! pg_isready -h postgres -p 5432 -U postgres -d dance_lessons_coach_bdd_test; then
|
||||
echo "❌ PostgreSQL failed to start"
|
||||
exit 1
|
||||
fi
|
||||
|
||||
- name: Run BDD tests with strict validation and coverage
|
||||
run: |
|
||||
echo "Running BDD tests with strict validation and coverage..."
|
||||
# Use the run-bdd-tests.sh script which fails on undefined/pending steps
|
||||
# In CI environment, PostgreSQL is already running as a service
|
||||
export DLC_DATABASE_HOST=postgres
|
||||
export DLC_DATABASE_PORT=5432
|
||||
export DLC_DATABASE_USER=postgres
|
||||
export DLC_DATABASE_PASSWORD=postgres
|
||||
export DLC_DATABASE_NAME=dance_lessons_coach_bdd_test
|
||||
export DLC_DATABASE_SSL_MODE=disable
|
||||
./scripts/run-bdd-tests.sh
|
||||
|
||||
# Generate BDD coverage report
|
||||
go tool cover -func=coverage.out > bdd_coverage.txt
|
||||
|
||||
# Extract BDD coverage percentage and set as environment variable
|
||||
BDD_COVERAGE=$(grep "total:" bdd_coverage.txt | grep -oP '\d+\.\d+' | head -1)
|
||||
echo "BDD Coverage: ${BDD_COVERAGE}%"
|
||||
echo "DLC_BDD_COVERAGE=${BDD_COVERAGE}%" >> $GITHUB_ENV
|
||||
|
||||
- name: Run unit tests with coverage
|
||||
run: |
|
||||
echo "Running unit tests with PostgreSQL service..."
|
||||
# Run unit tests excluding BDD tests (already run above)
|
||||
go test ./pkg/... ./cmd/... -coverprofile=unit_coverage.out -v
|
||||
|
||||
# Generate unit coverage report
|
||||
go tool cover -func=unit_coverage.out > unit_coverage.txt
|
||||
|
||||
# Extract unit test coverage percentage and set as environment variable
|
||||
UNIT_COVERAGE=$(grep "total:" unit_coverage.txt | grep -oP '\d+\.\d+' | head -1)
|
||||
echo "Unit Coverage: ${UNIT_COVERAGE}%"
|
||||
echo "DLC_UNIT_COVERAGE=${UNIT_COVERAGE}%" >> $GITHUB_ENV
|
||||
|
||||
- name: Run go fmt
|
||||
run: go fmt ./...
|
||||
@@ -99,80 +261,67 @@ jobs:
|
||||
# path: pkg/server/docs/swagger.json
|
||||
# retention-days: 1
|
||||
|
||||
# Version management and Docker build (main branch only)
|
||||
- name: Version management and Docker build
|
||||
if: github.ref == 'refs/heads/main'
|
||||
# Badge and version updates - multiple commits, single push
|
||||
# All documentation updates happen in one step with single push at the end
|
||||
- name: Update badges and version (multiple commits, single push)
|
||||
if: always() && github.actor != 'ci-bot'
|
||||
run: |
|
||||
# Analyze last commit message
|
||||
LAST_COMMIT=$(git log -1 --pretty=%B | head -1)
|
||||
VERSION_BUMPED="false"
|
||||
echo "🎯 Updating badges and version..."
|
||||
echo "BDD Coverage: ${DLC_BDD_COVERAGE:-Not set}"
|
||||
echo "Unit Coverage: ${DLC_UNIT_COVERAGE:-Not set}"
|
||||
|
||||
# Automatic version bump based on commit type
|
||||
if echo "$LAST_COMMIT" | grep -q "^✨ feat:"; then
|
||||
echo "🎯 Feature commit detected - bumping MINOR version"
|
||||
./scripts/version-bump.sh minor
|
||||
VERSION_BUMPED="true"
|
||||
elif echo "$LAST_COMMIT" | grep -q "^🐛 fix:"; then
|
||||
echo "🐛 Fix commit detected - bumping PATCH version"
|
||||
./scripts/version-bump.sh patch
|
||||
VERSION_BUMPED="true"
|
||||
elif echo "$LAST_COMMIT" | grep -q "BREAKING CHANGE"; then
|
||||
echo "💥 Breaking change detected - bumping MAJOR version"
|
||||
./scripts/version-bump.sh major
|
||||
VERSION_BUMPED="true"
|
||||
else
|
||||
echo "⏭️ No automatic version bump needed"
|
||||
# Configure git
|
||||
git config user.name "CI Bot"
|
||||
git config user.email "ci@arcodange.fr"
|
||||
|
||||
# Extract coverage values (remove % sign)
|
||||
BDD_COV=${DLC_BDD_COVERAGE%"%"}
|
||||
UNIT_COV=${DLC_UNIT_COVERAGE%"%"}
|
||||
|
||||
# Update BDD coverage badge if value is set (use --no-push to avoid race conditions)
|
||||
if [ -n "$BDD_COV" ]; then
|
||||
echo "📊 Updating BDD coverage badge to ${BDD_COV}%"
|
||||
./scripts/ci-update-coverage-badge.sh "$BDD_COV" "bdd" --no-push
|
||||
fi
|
||||
|
||||
# Update swagger version regardless of bump
|
||||
source VERSION
|
||||
NEW_VERSION="$MAJOR.$MINOR.$PATCH${PRERELEASE:+-$PRERELEASE}"
|
||||
sed -i "s|// @version [0-9.]*|// @version $NEW_VERSION|" cmd/server/main.go
|
||||
# Update Unit coverage badge if value is set (use --no-push to avoid race conditions)
|
||||
if [ -n "$UNIT_COV" ]; then
|
||||
echo "📊 Updating Unit coverage badge to ${UNIT_COV}%"
|
||||
./scripts/ci-update-coverage-badge.sh "$UNIT_COV" "unit" --no-push
|
||||
fi
|
||||
|
||||
# Commit version changes if bumped
|
||||
if [ "$VERSION_BUMPED" = "true" ]; then
|
||||
git config --global user.name "CI Bot"
|
||||
git config --global user.email "ci@arcodange.fr"
|
||||
git add VERSION cmd/server/main.go README.md
|
||||
git commit -m "chore: auto version bump [skip ci]" || echo "No changes to commit"
|
||||
# Check for version bump on main branch
|
||||
if [ "${{ github.ref }}" = "refs/heads/main" ]; then
|
||||
echo "🔖 Checking for version bump..."
|
||||
./scripts/ci-version-bump.sh "${{ github.event.head_commit.message }}" --no-push
|
||||
fi
|
||||
|
||||
# Single push for all commits (this is the ONLY push in the entire workflow)
|
||||
if [ -n "$(git status --porcelain)" ]; then
|
||||
echo "💾 Changes detected, pushing all commits..."
|
||||
git push
|
||||
echo "🎉 Successfully pushed all updates"
|
||||
else
|
||||
echo "ℹ️ No changes to push"
|
||||
fi
|
||||
|
||||
- name: Login to Gitea Container Registry
|
||||
if: github.ref == 'refs/heads/main'
|
||||
uses: docker/login-action@v3
|
||||
with:
|
||||
registry: ${{ env.CI_REGISTRY }}
|
||||
username: ${{ github.actor }}
|
||||
password: ${{ secrets.PACKAGES_TOKEN }}
|
||||
|
||||
- name: Set up Docker Buildx
|
||||
if: github.ref == 'refs/heads/main'
|
||||
uses: docker/setup-buildx-action@v3
|
||||
|
||||
- name: Build and push Docker image
|
||||
if: github.ref == 'refs/heads/main'
|
||||
|
||||
# Trigger Docker push workflow on main branch
|
||||
trigger-docker-push:
|
||||
name: Trigger Docker Push
|
||||
needs: [build-cache, ci-pipeline]
|
||||
runs-on: ubuntu-latest-ca
|
||||
if: "!contains(github.event.head_commit.message, '[skip ci]') && github.actor != 'ci-bot' && github.ref == 'refs/heads/main'"
|
||||
|
||||
steps:
|
||||
- name: Trigger Docker Push Workflow
|
||||
run: |
|
||||
source VERSION
|
||||
IMAGE_VERSION="$MAJOR.$MINOR.$PATCH${PRERELEASE:+-$PRERELEASE}"
|
||||
|
||||
TAGS="$IMAGE_VERSION latest ${{ github.sha }}"
|
||||
echo "Building Docker image with tags: $TAGS"
|
||||
docker build -t dance-lessons-coach .
|
||||
|
||||
for TAG in $TAGS; do
|
||||
IMAGE_NAME="${{ env.CI_REGISTRY }}/${{ env.GITEA_ORG }}/${{ env.GITEA_REPO }}:$TAG"
|
||||
echo "Tagging and pushing: $IMAGE_NAME"
|
||||
docker tag dance-lessons-coach "$IMAGE_NAME"
|
||||
docker push "$IMAGE_NAME"
|
||||
done
|
||||
|
||||
- name: Show published images
|
||||
if: github.ref == 'refs/heads/main'
|
||||
run: |
|
||||
source VERSION
|
||||
IMAGE_VERSION="$MAJOR.$MINOR.$PATCH${PRERELEASE:+-$PRERELEASE}"
|
||||
echo "📦 Published Docker images:"
|
||||
echo " - ${{ env.CI_REGISTRY }}/${{ env.GITEA_ORG }}/${{ env.GITEA_REPO }}:$IMAGE_VERSION"
|
||||
echo " - ${{ env.CI_REGISTRY }}/${{ env.GITEA_ORG }}/${{ env.GITEA_REPO }}:latest"
|
||||
echo " - ${{ env.CI_REGISTRY }}/${{ env.GITEA_ORG }}/${{ env.GITEA_REPO }}:${{ github.sha }}"
|
||||
echo "🚀 Triggering Docker Push workflow..."
|
||||
curl -X POST \
|
||||
-H "Authorization: token ${{ secrets.GITEA_TOKEN || secrets.PACKAGES_TOKEN }}" \
|
||||
-H "Content-Type: application/json" \
|
||||
"${{ env.GITEA_INTERNAL }}api/v1/repos/${{ env.GITEA_ORG }}/${{ env.GITEA_REPO }}/actions/workflows/docker-push.yaml/dispatches" \
|
||||
-d '{"ref":"${{ github.ref }}"}'
|
||||
echo "✅ Docker Push workflow triggered successfully!"
|
||||
|
||||
73
.gitea/workflows/docker-push.yaml
Normal file
@@ -0,0 +1,73 @@
---
# dance-lessons-coach Docker Push Workflow
# Separate workflow for Docker image building and pushing
# Can be triggered manually or by CI/CD workflow

name: Docker Push

on:
  # Manual trigger for testing or production
  workflow_dispatch:
    inputs:
      ref:
        description: 'Git reference (branch/tag)'
        required: false
        type: string
        default: ''

# Environment variables
env:
  GITEA_INTERNAL: "https://gitea.arcodange.lab/"
  GITEA_EXTERNAL: "https://gitea.arcodange.fr/"
  GITEA_ORG: "arcodange"
  GITEA_REPO: "dance-lessons-coach"
  CI_REGISTRY: "gitea.arcodange.lab"

jobs:
  docker-push:
    name: Docker Push
    runs-on: ubuntu-latest-ca

    steps:
      - name: Checkout code
        uses: actions/checkout@v4
        with:
          ref: ${{ github.event.inputs.ref || github.ref }}

      - name: Login to Gitea Container Registry
        uses: docker/login-action@v3
        with:
          registry: ${{ env.CI_REGISTRY }}
          username: ${{ github.actor }}
          password: ${{ secrets.PACKAGES_TOKEN }}

      - name: Build and push Docker image
        run: |
          source VERSION
          IMAGE_VERSION="$MAJOR.$MINOR.$PATCH${PRERELEASE:+-$PRERELEASE}"

          TAGS="$IMAGE_VERSION latest ${{ github.sha }}"
          echo "Building Docker image with tags: $TAGS"

          # Build using the standard Dockerfile (Attempt 2 - simplest approach)
          docker build -t dance-lessons-coach -f docker/Dockerfile .

          for TAG in $TAGS; do
            IMAGE_NAME="${{ env.CI_REGISTRY }}/${{ env.GITEA_ORG }}/${{ env.GITEA_REPO }}:$TAG"
            echo "Tagging and pushing: $IMAGE_NAME"
            docker tag dance-lessons-coach "$IMAGE_NAME"
            docker push "$IMAGE_NAME"
          done

      - name: Show published images
        run: |
          source VERSION
          IMAGE_VERSION="$MAJOR.$MINOR.$PATCH${PRERELEASE:+-$PRERELEASE}"
          echo "📦 Published Docker images:"
          echo "  - ${{ env.CI_REGISTRY }}/${{ env.GITEA_ORG }}/${{ env.GITEA_REPO }}:$IMAGE_VERSION"
          echo "  - ${{ env.CI_REGISTRY }}/${{ env.GITEA_ORG }}/${{ env.GITEA_REPO }}:latest"
          echo "  - ${{ env.CI_REGISTRY }}/${{ env.GITEA_ORG }}/${{ env.GITEA_REPO }}:${{ github.sha }}"
@@ -1,6 +1,6 @@
|
||||
# Git Hooks for DanceLessonsCoach
|
||||
# Git Hooks for dance-lessons-coach
|
||||
|
||||
This directory contains Git hooks for the DanceLessonsCoach project.
|
||||
This directory contains Git hooks for the dance-lessons-coach project.
|
||||
|
||||
## Available Hooks
|
||||
|
||||
|
||||
8
.gitignore
vendored
@@ -23,6 +23,14 @@ server.pid
|
||||
*.log
|
||||
pkg/server/docs/
|
||||
|
||||
# BDD test files
|
||||
features/**/*-config.yaml
|
||||
test-config.yaml
|
||||
test-v2-config.yaml
|
||||
|
||||
# CI/CD runner configuration
|
||||
config/runner
|
||||
.runner
|
||||
coverage.txt
|
||||
trigger.txt
|
||||
test_trigger.txt
|
||||
|
||||
@@ -1,16 +1,16 @@
|
||||
---
|
||||
name: bdd-testing
|
||||
description: Behavior-Driven Development testing for DanceLessonsCoach using Godog. Use when creating or running BDD tests, implementing new features with BDD, or validating API endpoints through Gherkin scenarios.
|
||||
description: Behavior-Driven Development testing for dance-lessons-coach using Godog. Use when creating or running BDD tests, implementing new features with BDD, or validating API endpoints through Gherkin scenarios.
|
||||
license: MIT
|
||||
metadata:
|
||||
author: DanceLessonsCoach Team
|
||||
author: dance-lessons-coach Team
|
||||
version: "1.0.0"
|
||||
based-on: pkg/bdd implementation
|
||||
---
|
||||
|
||||
# BDD Testing for DanceLessonsCoach
|
||||
# BDD Testing for dance-lessons-coach
|
||||
|
||||
Behavior-Driven Development testing framework using Godog for the DanceLessonsCoach project. This skill provides comprehensive guidance for creating, running, and maintaining BDD tests that validate API endpoints and system behavior.
|
||||
Behavior-Driven Development testing framework using Godog for the dance-lessons-coach project. This skill provides comprehensive guidance for creating, running, and maintaining BDD tests that validate API endpoints and system behavior.
|
||||
|
||||
## Key Concepts
|
||||
|
||||
|
||||
@@ -2,7 +2,7 @@
|
||||
|
||||
## What Was Created
|
||||
|
||||
A comprehensive `bdd_testing` skill that encapsulates all our BDD testing knowledge and experience from the DanceLessonsCoach project.
|
||||
A comprehensive `bdd_testing` skill that encapsulates all our BDD testing knowledge and experience from the dance-lessons-coach project.
|
||||
|
||||
## Directory Structure
|
||||
|
||||
@@ -268,7 +268,7 @@ The skill has been validated:
|
||||
|
||||
## Conclusion
|
||||
|
||||
This `bdd_testing` skill represents the culmination of our BDD testing journey for DanceLessonsCoach. It captures:
|
||||
This `bdd_testing` skill represents the culmination of our BDD testing journey for dance-lessons-coach. It captures:
|
||||
|
||||
1. **All our hard-won knowledge** about Godog and BDD testing
|
||||
2. **Proven patterns** that work reliably
|
||||
@@ -283,7 +283,7 @@ The skill ensures that:
|
||||
- **Knowledge** is preserved and shared
|
||||
- **Debugging** is systematic and efficient
|
||||
|
||||
With this skill, the DanceLessonsCoach project has a robust, well-documented BDD testing framework that can scale with the project and support team growth.
|
||||
With this skill, the dance-lessons-coach project has a robust, well-documented BDD testing framework that can scale with the project and support team growth.
|
||||
|
||||
**Next Steps:**
|
||||
1. Use this skill for all new BDD feature development
|
||||
|
||||
@@ -2,7 +2,7 @@
|
||||
package steps
|
||||
|
||||
import (
|
||||
"DanceLessonsCoach/pkg/bdd/testserver"
|
||||
"dance-lessons-coach/pkg/bdd/testserver"
|
||||
"fmt"
|
||||
"strings"
|
||||
|
||||
|
||||
@@ -1,4 +1,4 @@
|
||||
# BDD Best Practices for DanceLessonsCoach
|
||||
# BDD Best Practices for dance-lessons-coach
|
||||
|
||||
Based on our implementation experience with Godog and the existing `pkg/bdd` codebase.
|
||||
|
||||
|
||||
@@ -1,6 +1,6 @@
|
||||
# BDD Testing Debugging Guide
|
||||
|
||||
Comprehensive guide to debugging BDD tests for DanceLessonsCoach.
|
||||
Comprehensive guide to debugging BDD tests for dance-lessons-coach.
|
||||
|
||||
## Common Issues and Solutions
|
||||
|
||||
@@ -15,7 +15,12 @@ Feature: Greet Service
|
||||
Then the response should be "..." # ??? UNDEFINED STEP
|
||||
```
|
||||
|
||||
**Root Cause:** Step patterns don't match Godog's exact expectations.
|
||||
**Root Cause:** Step patterns don't match Godog's exact expectations. Godog is very particular about regex escaping.
|
||||
|
||||
**Common Pattern Issues:**
|
||||
- `\"` vs `\\"` (single vs double escaping)
|
||||
- Exact quote handling in JSON patterns
|
||||
- Parameter capture group syntax
|
||||
|
||||
**Debugging Steps:**
|
||||
|
||||
@@ -28,25 +33,30 @@ Feature: Greet Service
|
||||
```
|
||||
You can implement step definitions for the undefined steps with these snippets:
|
||||
|
||||
func theServerIsRunning() error {
|
||||
func theResponseShouldBe(arg1, arg2 string) error {
|
||||
return godog.ErrPending
|
||||
}
|
||||
|
||||
func iRequestTheDefaultGreeting() error {
|
||||
return godog.ErrPending
|
||||
func InitializeScenario(ctx *godog.ScenarioContext) {
|
||||
ctx.Step(`^the response should be "{\\"([^"]*)\\":\\"([^"]*)\\"}"$`, theResponseShouldBe)
|
||||
}
|
||||
```
|
||||
|
||||
3. **Compare with your implementation:**
|
||||
```go
|
||||
// ❌ Wrong pattern
|
||||
ctx.Step(`^the server is running$`, sc.theServerIsRunning)
|
||||
// ❌ Wrong pattern (single escaping)
|
||||
ctx.Step(`^the response should be "{\"([^"]*)\":\"([^"]*)\"}"$`, sc.commonSteps.theResponseShouldBe)
|
||||
|
||||
// ✅ Correct pattern (matches Godog's suggestion)
|
||||
ctx.Step(`^the server is running$`, sc.theServerIsRunning)
|
||||
// ✅ Correct pattern (double escaping - matches Godog's suggestion)
|
||||
ctx.Step(`^the response should be "{\\"([^"]*)\\":\\"([^"]*)\\"}"$`, sc.commonSteps.theResponseShouldBe)
|
||||
```
|
||||
|
||||
**Solution:** Use Godog's EXACT regex patterns.
|
||||
**Key Insight:** Godog expects `\\"` (two backslashes + quote) for escaped quotes in JSON patterns, not `\"` (one backslash + quote).
|
||||
|
||||
**Solution:** Use Godog's EXACT regex patterns, paying special attention to:
|
||||
- JSON escaping: `\\"` not `\"`
|
||||
- Parameter names: Use `arg1, arg2` as suggested
|
||||
- Capture groups: Match Godog's exact regex syntax
|
||||
|
||||
### 2. JSON Comparison Failures
|
||||
|
||||
|
||||
@@ -87,4 +87,10 @@ Godog's step matching is **very specific by design**:
|
||||
- It provides exact patterns to ensure consistency
|
||||
- Following its suggestions guarantees your steps will be recognized
|
||||
|
||||
**Remember**: The "undefined" warnings are Godog telling you exactly how to fix your step definitions!
|
||||
**Remember**: The "undefined" warnings are Godog telling you exactly how to fix your step definitions!
|
||||
## Critical Pattern Fix
|
||||
|
||||
**File:** `pkg/bdd/steps/steps.go`
|
||||
**Line:** 80
|
||||
**Issue:** Step pattern must use double escaping (2 backslashes + quote), not single escaping (1 backslash + quote)
|
||||
**Pattern:** `^the response should be "{\\"([^"]*)\\":\\"([^"]*)\\"}"$`
|
||||
|
||||
@@ -345,13 +345,16 @@ resp, err := testClient.Do(req)
|
||||
// pkg/bdd/bdd_test.go
|
||||
func TestBDD(t *testing.T) {
|
||||
suite := godog.TestSuite{
|
||||
Name: "DanceLessonsCoach BDD Tests",
|
||||
Name: "dance-lessons-coach BDD Tests",
|
||||
TestSuiteInitializer: bdd.InitializeTestSuite,
|
||||
ScenarioInitializer: bdd.InitializeScenario,
|
||||
Options: &godog.Options{
|
||||
Format: "progress",
|
||||
Paths: []string{"."},
|
||||
TestingT: t,
|
||||
TestingT: t,
|
||||
Strict: true,
|
||||
Randomize: -1,
|
||||
StopOnFailure: true,
|
||||
// Enable parallel execution
|
||||
Concurrency: 4, // Number of parallel scenarios
|
||||
},
|
||||
|
||||
@@ -5,7 +5,7 @@
|
||||
|
||||
set -e
|
||||
|
||||
echo "🧪 Running BDD tests for DanceLessonsCoach..."
|
||||
echo "🧪 Running BDD tests for dance-lessons-coach..."
|
||||
echo "============================================"
|
||||
|
||||
# Run tests with verbose output
|
||||
|
||||
@@ -3,7 +3,7 @@ name: changelog-manager
|
||||
description: A skill to help agents properly maintain and utilize AGENT_CHANGELOG.md for tracking contributions and decisions
|
||||
license: MIT
|
||||
metadata:
|
||||
author: DanceLessonsCoach Team
|
||||
author: dance-lessons-coach Team
|
||||
version: "1.0.0"
|
||||
role: Documentation Assistant
|
||||
purpose: Maintain consistent, useful changelog entries
|
||||
|
||||
@@ -3,7 +3,7 @@ name: commit-message
|
||||
description: Helps create proper Gitmoji commit messages following the Common Gitmoji Reference from AGENTS.md. Use when creating commits to ensure consistent, visual commit messages. Includes Git hooks for automatic code formatting and dependency management.
|
||||
license: MIT
|
||||
metadata:
|
||||
author: DanceLessonsCoach Team
|
||||
author: dance-lessons-coach Team
|
||||
version: "1.1.0"
|
||||
based-on: AGENTS.md Common Gitmoji Reference
|
||||
---
|
||||
@@ -52,7 +52,7 @@ git commit -m "✨ feat: implement BDD testing framework"
|
||||
|
||||
### Issue References
|
||||
```bash
|
||||
# When closing an issue
|
||||
# When closing a single issue
|
||||
git commit -m "✨ feat: implement workflow optimization (closes #2)"
|
||||
|
||||
# When fixing a bug
|
||||
@@ -63,6 +63,14 @@ git commit -m "📝 docs: update workflow documentation (related to #2)"
|
||||
|
||||
# When referencing for context
|
||||
git commit -m "♻️ refactor: clean up CI code (see #3)"
|
||||
|
||||
# For PR merges closing multiple issues (USE SEPARATE LINES!)
|
||||
git commit -m "✨ merge: implement authentication system
|
||||
|
||||
Closes #4
|
||||
Closes #5
|
||||
Closes #6
|
||||
Refs #7, #8"
|
||||
```
|
||||
|
||||
### Bug Fix
|
||||
@@ -115,7 +123,7 @@ The suggestions are just helpful reminders, never requirements.
|
||||
🔍 Checking for relevant issues...
|
||||
📋 Found 1 open issue(s):
|
||||
#2: Optimize Gitea Workflow for Main Branch
|
||||
https://gitea.arcodange.lab/arcodange/DanceLessonsCoach/issues/2
|
||||
https://gitea.arcodange.lab/arcodange/dance-lessons-coach/issues/2
|
||||
|
||||
💡 Suggested commit message formats:
|
||||
- closes #<number> (when issue is fully resolved)
|
||||
@@ -139,6 +147,29 @@ Example: ✨ feat: implement workflow (closes #2)
|
||||
**GitHub/Gitea Compatible:**
|
||||
These formats are recognized by both GitHub and Gitea to automatically close issues.
|
||||
|
||||
### ⚠️ IMPORTANT: Multiple Issue Closing
|
||||
|
||||
**For PR merge commits that close multiple issues, use SEPARATE lines:**
|
||||
|
||||
```markdown
|
||||
✨ merge: implement authentication system
|
||||
|
||||
Closes #4
|
||||
Closes #5 ← Use separate lines!
|
||||
Closes #6 ← This ensures ALL issues are closed
|
||||
Refs #7, #8
|
||||
```
|
||||
|
||||
**❌ Avoid this (only closes first issue):**
|
||||
```markdown
|
||||
✨ merge: implement authentication system
|
||||
|
||||
Closes #4, #5, #6 ← Only #4 gets closed!
|
||||
Refs #7, #8
|
||||
```
|
||||
|
||||
**Why this matters:** GitHub and Gitea typically apply a closing keyword only to the issue reference that immediately follows it, so `Closes #4, #5, #6` closes only #4. Repeating the keyword on a separate line for each issue ensures ALL referenced issues are properly closed.
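
The one-keyword-per-line rule can be applied mechanically when a merge closes several issues. The following is a minimal sketch, not part of the project's hooks; the function name `build_closing_message` is hypothetical and only prints the message text.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the repository): print a commit message
# with one "Closes #N" line per issue number passed after the subject.
build_closing_message() {
  local subject="$1"
  shift
  # Subject line followed by a blank line, as Git expects.
  printf '%s\n\n' "$subject"
  local issue
  for issue in "$@"; do
    printf 'Closes #%s\n' "$issue"
  done
}

# Example: prints the subject, a blank line, then three separate closing lines.
build_closing_message "✨ merge: implement authentication system" 4 5 6
```

The output could be passed to `git commit -F -` so that the message is never assembled by hand.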
|
||||
|
||||
## Git Hooks for Code Quality
|
||||
|
||||
The project includes Git hooks that automatically run before commits to ensure code quality:
|
||||
@@ -254,7 +285,7 @@ echo "$commit_message" | grep -E "^[🎨✨🐛📝🔧♻️🚀🔒📦🔥
|
||||
```bash
|
||||
#!/bin/sh
|
||||
|
||||
# DanceLessonsCoach pre-commit hook
|
||||
# dance-lessons-coach pre-commit hook
|
||||
# Runs go mod tidy and go fmt before allowing commits
|
||||
|
||||
echo "Running pre-commit hooks..."
|
||||
|
||||
@@ -1,6 +1,6 @@
|
||||
# Git Hooks for DanceLessonsCoach
|
||||
# Git Hooks for dance-lessons-coach
|
||||
|
||||
This directory contains Git hooks for the DanceLessonsCoach project.
|
||||
This directory contains Git hooks for the dance-lessons-coach project.
|
||||
|
||||
## Available Hooks
|
||||
|
||||
|
||||
@@ -1,6 +1,6 @@
|
||||
#!/bin/sh
|
||||
|
||||
# DanceLessonsCoach pre-commit hook
|
||||
# dance-lessons-coach pre-commit hook
|
||||
# Runs go mod tidy, go fmt, and suggests issue references before allowing commits
|
||||
|
||||
echo "Running pre-commit hooks..."
|
||||
|
||||
@@ -25,7 +25,7 @@ fi
|
||||
echo "🔍 Checking for relevant issues..."
|
||||
|
||||
# Get list of open issues
|
||||
ISSUES_JSON=$($GITEA_CLIENT list-issues arcodange DanceLessonsCoach open 2>/dev/null || echo "[]")
|
||||
ISSUES_JSON=$($GITEA_CLIENT list-issues arcodange dance-lessons-coach open 2>/dev/null || echo "[]")
|
||||
|
||||
# Check if we got valid JSON
|
||||
if [ "$ISSUES_JSON" = "[]" ] || [ -z "$ISSUES_JSON" ]; then
|
||||
|
||||
@@ -12,6 +12,9 @@ The Gitea-Client skill provides comprehensive API access to Gitea repositories,
|
||||
|
||||
**Commands:**
|
||||
```bash
|
||||
# List available workflows
|
||||
gitea-client list-workflows <owner> <repo>
|
||||
|
||||
# List recent workflow jobs
|
||||
gitea-client list-jobs <owner> <repo> <workflow_id> [limit]
|
||||
|
||||
@@ -26,23 +29,68 @@ gitea-client list-workflow-jobs <owner> <repo> <workflow_run_id>
|
||||
|
||||
# Wait for job completion
|
||||
gitea-client wait-job <owner> <repo> <job_id> [timeout]
|
||||
|
||||
# Monitor workflow run until completion (with automatic updates)
|
||||
gitea-client monitor-workflow <owner> <repo> <workflow_run_id> [interval_seconds]
|
||||
|
||||
# Diagnose failed job with automatic error analysis
|
||||
gitea-client diagnose-job <owner> <repo> <job_id>
|
||||
|
||||
# Get summary of recent workflow runs
|
||||
gitea-client recent-workflows <owner> <repo> [limit] [status_filter]
|
||||
```
|
||||
|
||||
**Example Workflow:**
|
||||
```bash
|
||||
# 1. Find recent failed jobs
|
||||
gitea-client list-jobs arcodange dance-lessons-coach 5 10
|
||||
# 1. Get summary of recent workflows
|
||||
gitea-client recent-workflows arcodange dance-lessons-coach 5
|
||||
|
||||
# 2. Check status of specific job
|
||||
# 2. Monitor a specific workflow run until completion
|
||||
gitea-client monitor-workflow arcodange dance-lessons-coach 415 30
|
||||
|
||||
# 3. Diagnose a failed job automatically
|
||||
gitea-client diagnose-job arcodange dance-lessons-coach 759
|
||||
|
||||
# 4. List available workflows to get workflow IDs
|
||||
gitea-client list-workflows arcodange dance-lessons-coach
|
||||
|
||||
# 5. Check status of specific job
|
||||
gitea-client job-status arcodange dance-lessons-coach 706
|
||||
|
||||
# 3. Fetch logs for debugging
|
||||
# 6. Fetch logs for debugging
|
||||
gitea-client job-logs arcodange dance-lessons-coach 706 job_706_logs.txt
|
||||
|
||||
# 4. Analyze logs
|
||||
# 7. Analyze logs manually
|
||||
grep -i "error\|fail" job_706_logs.txt
|
||||
```
|
||||
|
||||
**Advanced Monitoring Example:**
|
||||
```bash
|
||||
# Monitor workflow and automatically diagnose if it fails
|
||||
WORKFLOW_ID=415
|
||||
TIMEOUT=300
|
||||
SECONDS_ELAPSED=0
|
||||
|
||||
while [ $SECONDS_ELAPSED -lt $TIMEOUT ]; do
|
||||
STATUS=$(gitea-client job-status arcodange dance-lessons-coach $WORKFLOW_ID | jq -r '.status')
|
||||
CONCLUSION=$(gitea-client job-status arcodange dance-lessons-coach $WORKFLOW_ID | jq -r '.conclusion // empty')
|
||||
|
||||
echo "[$(date)] Status: $STATUS, Conclusion: ${CONCLUSION:-not completed}"
|
||||
|
||||
if [[ "$CONCLUSION" == "failure" ]]; then
|
||||
echo "Job failed! Running diagnosis..."
|
||||
gitea-client diagnose-job arcodange dance-lessons-coach $WORKFLOW_ID
|
||||
break
|
||||
elif [[ "$STATUS" != "in_progress" && "$STATUS" != "waiting" ]]; then
|
||||
echo "Job completed with status: $STATUS"
|
||||
break
|
||||
fi
|
||||
|
||||
sleep 30
|
||||
SECONDS_ELAPSED=$((SECONDS_ELAPSED + 30))
|
||||
done
|
||||
```
|
||||
|
||||
### 2. Pull Request Management
|
||||
|
||||
**Scenario:** Monitor and comment on PRs during CI/CD
|
||||
@@ -404,4 +452,79 @@ curl -s https://gitea.arcodange.lab/swagger.v1.json | \
|
||||
- **GitHub Actions**: https://docs.github.com/en/actions
|
||||
- **jq Manual**: https://stedolan.github.io/jq/manual/
|
||||
|
||||
This reference guide provides comprehensive examples for using the gitea-client skill in real-world scenarios, covering job monitoring, PR management, issue tracking, and API discovery with practical, copy-paste-ready examples.
|
||||
This reference guide provides comprehensive examples for using the gitea-client skill in real-world scenarios, covering job monitoring, PR management, issue tracking, and API discovery with practical, copy-paste-ready examples.
|
||||
|
||||
## 🎯 Real-World Use Cases from dance-lessons-coach
|
||||
|
||||
### CI/CD Pipeline Debugging
|
||||
|
||||
**Scenario**: TLS certificate verification failures were blocking all CI/CD progress.
|
||||
|
||||
**Solution**: Replaced Docker Buildx with traditional docker build + push.
|
||||
|
||||
```bash
|
||||
# Before (Failed)
|
||||
# ERROR: failed to build: failed to solve: failed to push
|
||||
# tls: failed to verify certificate: x509: certificate signed by unknown authority
|
||||
|
||||
# After (Working)
|
||||
gitea-client diagnose-job arcodange dance-lessons-coach 766
|
||||
# Result: Building cache image: gitea.arcodange.lab/... (no TLS errors)
|
||||
|
||||
# Monitor the fix
|
||||
gitea-client monitor-workflow arcodange dance-lessons-coach 418 30
|
||||
```
|
||||
|
||||
### Automated CI Monitoring
|
||||
|
||||
```bash
|
||||
# Monitor workflow and auto-diagnose failures
|
||||
WORKFLOW_ID=418
|
||||
TIMEOUT=300
|
||||
SECONDS_ELAPSED=0
|
||||
|
||||
while [ $SECONDS_ELAPSED -lt $TIMEOUT ]; do
|
||||
STATUS=$(gitea-client job-status arcodange dance-lessons-coach $WORKFLOW_ID | jq -r '.status')
|
||||
CONCLUSION=$(gitea-client job-status arcodange dance-lessons-coach $WORKFLOW_ID | jq -r '.conclusion // empty')
|
||||
|
||||
echo "[$(date)] Status: $STATUS, Conclusion: ${CONCLUSION:-not completed}"
|
||||
|
||||
if [[ "$CONCLUSION" == "failure" ]]; then
|
||||
echo "❌ Workflow failed! Running diagnosis..."
|
||||
gitea-client diagnose-job arcodange dance-lessons-coach $WORKFLOW_ID
|
||||
break
|
||||
elif [[ "$STATUS" != "in_progress" && "$STATUS" != "waiting" ]]; then
|
||||
echo "✅ Workflow completed: $STATUS"
|
||||
break
|
||||
fi
|
||||
|
||||
sleep 30
|
||||
SECONDS_ELAPSED=$((SECONDS_ELAPSED + 30))
|
||||
done
|
||||
```
|
||||
|
||||
### PR Management Automation
|
||||
|
||||
```bash
|
||||
# Automated PR triage based on CI results
|
||||
OPEN_PRS=$(gitea-client list-prs arcodange dance-lessons-coach | jq -r '.[] | select(.state == "open") | .number')
|
||||
|
||||
for pr in $OPEN_PRS; do
|
||||
PR_DETAILS=$(gitea-client pr-status arcodange dance-lessons-coach $pr)
|
||||
BRANCH=$(echo "$PR_DETAILS" | jq -r '.head.ref')
|
||||
|
||||
# Find related workflows
|
||||
WORKFLOWS=$(gitea-client recent-workflows arcodange dance-lessons-coach 5 | grep "$BRANCH" || echo "")
|
||||
|
||||
if [ -n "$WORKFLOWS" ]; then
|
||||
LATEST_WORKFLOW=$(echo "$WORKFLOWS" | head -1 | cut -d':' -f1)
|
||||
CONCLUSION=$(gitea-client job-status arcodange dance-lessons-coach $LATEST_WORKFLOW | jq -r '.conclusion')
|
||||
|
||||
if [ "$CONCLUSION" = "failure" ]; then
|
||||
gitea-client comment-pr arcodange dance-lessons-coach $pr "⚠️ CI Failed - Check workflow $LATEST_WORKFLOW"
|
||||
elif [ "$CONCLUSION" = "success" ]; then
|
||||
gitea-client comment-pr arcodange dance-lessons-coach $pr "✅ CI Passed - Ready for review!"
|
||||
fi
|
||||
fi
|
||||
done
|
||||
```
|
||||
@@ -40,6 +40,18 @@ Create a token in Gitea:
|
||||
|
||||
## Commands
|
||||
|
||||
### List Workflows
|
||||
|
||||
```bash
|
||||
skill gitea-client list-workflows <owner> <repo>
|
||||
```
|
||||
|
||||
List available workflows for a repository.
|
||||
|
||||
**Arguments:**
|
||||
- `owner`: Repository owner
|
||||
- `repo`: Repository name
|
||||
|
||||
### List Jobs
|
||||
|
||||
```bash
|
||||
@@ -151,6 +163,80 @@ gitea-client list-workflow-jobs arcodange dance-lessons-coach 351 | jq '.jobs[]
|
||||
gitea-client list-workflow-jobs arcodange dance-lessons-coach 350
|
||||
```
|
||||
|
||||
### Monitor Workflow Run
|
||||
|
||||
```bash
|
||||
skill gitea-client monitor-workflow <owner> <repo> <workflow_run_id> [interval_seconds]
|
||||
```
|
||||
|
||||
Monitor a workflow run until completion with automatic updates.
|
||||
|
||||
**Arguments:**
|
||||
- `owner`: Repository owner
|
||||
- `repo`: Repository name
|
||||
- `workflow_run_id`: Workflow run ID
|
||||
- `interval_seconds`: Update interval in seconds (default: 30)
|
||||
|
||||
**Example:**
|
||||
```bash
|
||||
# Monitor workflow run 415 with 30-second updates
|
||||
gitea-client monitor-workflow arcodange dance-lessons-coach 415 30
|
||||
|
||||
# Monitor with faster updates (10 seconds)
|
||||
gitea-client monitor-workflow arcodange dance-lessons-coach 415 10
|
||||
```
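
Monitoring "until completion" comes down to a stop condition on the run status. The sketch below assumes that `queued`, `in_progress`, and `waiting` are the only non-terminal status values; the function name `is_terminal_status` is hypothetical and is not a gitea-client command.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeed (exit 0) when a status means the run is over.
# Assumption: queued, in_progress and waiting are the only "still running" states.
is_terminal_status() {
  case "$1" in
    queued|in_progress|waiting) return 1 ;;
    *) return 0 ;;
  esac
}

# Example: classify a few status values the way a polling loop would.
for status in queued in_progress completed; do
  if is_terminal_status "$status"; then
    echo "$status: finished, stop polling"
  else
    echo "$status: still running, keep polling"
  fi
done
```

Note that an empty or unknown status is treated as terminal here, which stops the loop rather than polling forever on an API error.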
|
||||
|
||||
### Diagnose Failed Job
|
||||
|
||||
```bash
|
||||
skill gitea-client diagnose-job <owner> <repo> <job_id>
|
||||
```
|
||||
|
||||
Diagnose a failed job with automatic error analysis.
|
||||
|
||||
**Arguments:**
|
||||
- `owner`: Repository owner
|
||||
- `repo`: Repository name
|
||||
- `job_id`: Job ID
|
||||
|
||||
**Features:**
|
||||
- Shows job details (status, conclusion, timestamps)
|
||||
- Displays last 50 lines of logs
|
||||
- Automatically extracts and highlights error messages
|
||||
- Shows workflow run context
|
||||
|
||||
**Example:**
|
||||
```bash
|
||||
# Diagnose failed job 759
|
||||
gitea-client diagnose-job arcodange dance-lessons-coach 759
|
||||
```
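
The error-extraction feature can be approximated on a log file that was already downloaded with `job-logs`. This is a minimal sketch under that assumption; the function name `extract_errors` and the keyword list are illustrative, not the skill's guaranteed behavior.

```bash
#!/usr/bin/env bash
# Hypothetical helper: show the last N log lines that look like errors.
# Arguments: log file path, optional maximum number of lines (default 10).
extract_errors() {
  local log_file="$1"
  local max_lines="${2:-10}"
  # Case-insensitive match on common failure keywords, keeping only the tail.
  grep -iE 'error|fail|panic|exception' "$log_file" | tail -n "$max_lines"
}

# Example with a throwaway log file.
sample_log="$(mktemp)"
printf '%s\n' 'step 1 ok' 'ERROR: build failed' 'step 3 ok' 'panic: nil pointer' > "$sample_log"
extract_errors "$sample_log"
rm -f "$sample_log"
```

Keyword matching is deliberately broad, so lines such as "0 errors" will also show up; treat the output as a starting point for reading the full log.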
|
||||
|
||||
### Get Recent Workflows Summary
|
||||
|
||||
```bash
|
||||
skill gitea-client recent-workflows <owner> <repo> [limit] [status_filter]
|
||||
```
|
||||
|
||||
Get a summary of recent workflow runs.
|
||||
|
||||
**Arguments:**
|
||||
- `owner`: Repository owner
|
||||
- `repo`: Repository name
|
||||
- `limit`: Maximum number of workflows to show (default: 10)
|
||||
- `status_filter`: Filter by status (optional: completed, in_progress, queued, waiting)
|
||||
|
||||
**Example:**
|
||||
```bash
|
||||
# Show last 5 workflow runs
|
||||
gitea-client recent-workflows arcodange dance-lessons-coach 5
|
||||
|
||||
# Show only completed workflows
|
||||
gitea-client recent-workflows arcodange dance-lessons-coach 10 completed
|
||||
|
||||
# Show in-progress workflows
|
||||
gitea-client recent-workflows arcodange dance-lessons-coach 5 in_progress
|
||||
```
|
||||
|
||||
### Wait for Job Completion
|
||||
|
||||
```bash
|
||||
@@ -414,6 +500,70 @@ The skill handles common API errors:
|
||||
4. **Logging**: Redirect output to files for debugging
|
||||
5. **Timeouts**: Use reasonable timeouts for wait operations
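
As an illustration of the timeout advice, a wait loop can be bounded by an elapsed-time counter. This is a minimal sketch; `wait_with_timeout` is a hypothetical name, and the check command passed to it is whatever the caller chooses.

```bash
#!/usr/bin/env bash
# Hypothetical helper: re-run a check command until it succeeds or time runs out.
# Arguments: timeout in seconds, polling interval in seconds, then the command.
wait_with_timeout() {
  local timeout="$1"
  local interval="$2"
  shift 2
  local elapsed=0
  while (( elapsed < timeout )); do
    if "$@"; then
      return 0
    fi
    sleep "$interval"
    elapsed=$(( elapsed + interval ))
  done
  # The check never succeeded within the allowed time.
  return 1
}

# Example: the check succeeds immediately, so no sleep happens.
if wait_with_timeout 3 1 test -d /; then
  echo "condition met"
else
  echo "timed out"
fi
```

The interval must be at least 1 second; with an interval of 0 the counter never advances and a failing check would loop forever.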
|
||||
|
||||
## Enhanced Workflow Monitoring with New Commands
|
||||
|
||||
### Complete CI Debugging Workflow with New Commands
|
||||
|
||||
```bash
|
||||
# 1. Get summary of recent workflows to identify issues
|
||||
gitea-client recent-workflows arcodange dance-lessons-coach 10
|
||||
|
||||
# 2. Monitor a specific workflow run until completion
|
||||
gitea-client monitor-workflow arcodange dance-lessons-coach 415 30
|
||||
|
||||
# 3. If workflow fails, automatically diagnose all failed jobs
|
||||
WORKFLOW_ID=415
|
||||
WORKFLOW_STATUS=$(gitea-client job-status arcodange dance-lessons-coach $WORKFLOW_ID | jq -r '.status')
|
||||
WORKFLOW_CONCLUSION=$(gitea-client job-status arcodange dance-lessons-coach $WORKFLOW_ID | jq -r '.conclusion')
|
||||
|
||||
if [ "$WORKFLOW_CONCLUSION" = "failure" ]; then
|
||||
echo "Workflow failed! Diagnosing all jobs..."
|
||||
|
||||
# Get all jobs in the workflow
|
||||
JOBS=$(gitea-client list-workflow-jobs arcodange dance-lessons-coach $WORKFLOW_ID | jq -r '.jobs[] | select(.conclusion == "failure") | .id')
|
||||
|
||||
# Diagnose each failed job
|
||||
for job_id in $JOBS; do
|
||||
echo "Diagnosing job $job_id:"
|
||||
gitea-client diagnose-job arcodange dance-lessons-coach $job_id
|
||||
echo "========================================"
|
||||
done
|
||||
fi
|
||||
|
||||
# 4. Advanced monitoring with automatic diagnosis
|
||||
WORKFLOW_ID=415
|
||||
TIMEOUT=300
|
||||
SECONDS_ELAPSED=0
|
||||
|
||||
while [ $SECONDS_ELAPSED -lt $TIMEOUT ]; do
|
||||
STATUS=$(gitea-client job-status arcodange dance-lessons-coach $WORKFLOW_ID | jq -r '.status')
|
||||
CONCLUSION=$(gitea-client job-status arcodange dance-lessons-coach $WORKFLOW_ID | jq -r '.conclusion // empty')
|
||||
|
||||
echo "[$(date)] Status: $STATUS, Conclusion: ${CONCLUSION:-not completed}"
|
||||
|
||||
if [[ "$CONCLUSION" == "failure" ]]; then
|
||||
echo "Workflow failed! Running automatic diagnosis..."
|
||||
gitea-client diagnose-job arcodange dance-lessons-coach $WORKFLOW_ID
|
||||
|
||||
# Find PR and comment
|
||||
PR_NUMBER=$(gitea-client list-prs arcodange dance-lessons-coach | \
|
||||
jq -r '.[] | select(.head.ref == "feature/user-authentication-bdd") | .number')
|
||||
|
||||
if [ -n "$PR_NUMBER" ]; then
|
||||
gitea-client comment-pr arcodange dance-lessons-coach $PR_NUMBER \
|
||||
"⚠️ CI Workflow $WORKFLOW_ID failed. See diagnosis above for details."
|
||||
fi
|
||||
break
|
||||
elif [[ "$STATUS" != "in_progress" && "$STATUS" != "waiting" ]]; then
|
||||
echo "Workflow completed with status: $STATUS"
|
||||
break
|
||||
fi
|
||||
|
||||
sleep 30
|
||||
SECONDS_ELAPSED=$((SECONDS_ELAPSED + 30))
|
||||
done
|
||||
```
|
||||
|
||||
## Real-World Use Case: PR Commenting Workflow
|
||||
|
||||
The Gitea client skill excels at automated PR commenting during CI/CD workflows.
|
||||
|
||||
@@ -52,6 +52,20 @@ api_request() {
|
||||
fi
|
||||
}
|
||||
|
||||
# List workflows
|
||||
cmd_list_workflows() {
|
||||
local owner="$1"
|
||||
local repo="$2"
|
||||
|
||||
if [[ -z "$owner" || -z "$repo" ]]; then
|
||||
echo "Usage: $0 list-workflows <owner> <repo>" >&2
|
||||
exit 1
|
||||
fi
|
||||
|
||||
local endpoint="/repos/${owner}/${repo}/actions/workflows"
|
||||
api_request "GET" "$endpoint"
|
||||
}
|
||||
|
||||
# List jobs
|
||||
cmd_list_jobs() {
|
||||
local owner="$1"
|
||||
@@ -189,6 +203,31 @@ cmd_wait_job() {
|
||||
}
|
||||
|
||||
# Comment on PR
|
||||
# Create a pull request
|
||||
cmd_create_pr() {
|
||||
local owner="$1"
|
||||
local repo="$2"
|
||||
local title="$3"
|
||||
local body="$4"
|
||||
local head="$5"
|
||||
local base="${6:-main}"
|
||||
|
||||
if [[ -z "$owner" || -z "$repo" || -z "$title" || -z "$head" ]]; then
|
||||
echo "Usage: $0 create-pr <owner> <repo> <title> <body> <head_branch> [base_branch]" >&2
|
||||
exit 1
|
||||
fi
|
||||
|
||||
local endpoint="/repos/${owner}/${repo}/pulls"
|
||||
local data
|
||||
data=$(jq -n \
|
||||
--arg title "$title" \
|
||||
--arg body "$body" \
|
||||
--arg head "$head" \
|
||||
--arg base "$base" \
|
||||
'{title: $title, body: $body, head: $head, base: $base}')
|
||||
api_request "POST" "$endpoint" "$data"
|
||||
}
|
||||
|
||||
cmd_comment_pr() {
|
||||
local owner="$1"
|
||||
local repo="$2"
|
||||
@@ -201,7 +240,8 @@ cmd_comment_pr() {
|
||||
fi
|
||||
|
||||
local endpoint="/repos/${owner}/${repo}/issues/${pr_number}/comments"
|
||||
local data="{\"body\": \"${comment}\"}"
|
||||
local data
|
||||
data=$(jq -n --arg body "$comment" '{body: $body}')
|
||||
api_request "POST" "$endpoint" "$data"
|
||||
}
|
||||
|
||||
@@ -226,12 +266,17 @@ main() {
|
||||
shift || true
|
||||
|
||||
case "$command" in
|
||||
list-workflows) cmd_list_workflows "$@" ;;
|
||||
list-jobs) cmd_list_jobs "$@" ;;
|
||||
job-status) cmd_job_status "$@" ;;
|
||||
job-logs) cmd_job_logs "$@" ;;
|
||||
action-logs) cmd_action_logs "$@" ;;
|
||||
list-workflow-jobs) cmd_list_workflow_jobs "$@" ;;
|
||||
wait-job) cmd_wait_job "$@" ;;
|
||||
monitor-workflow) cmd_monitor_workflow "$@" ;;
|
||||
diagnose-job) cmd_diagnose_job "$@" ;;
|
||||
recent-workflows) cmd_recent_workflows "$@" ;;
|
||||
create-pr) cmd_create_pr "$@" ;;
|
||||
comment-pr) cmd_comment_pr "$@" ;;
|
||||
pr-status) cmd_pr_status "$@" ;;
|
||||
list-issues) cmd_list_issues "$@" ;;
|
||||
@@ -241,16 +286,22 @@ main() {
|
||||
list-wiki) cmd_list_wiki "$@" ;;
|
||||
create-wiki) cmd_create_wiki "$@" ;;
|
||||
get-wiki) cmd_get_wiki "$@" ;;
|
||||
trigger-workflow) cmd_trigger_workflow "$@" ;;
|
||||
*)
|
||||
echo "Usage: $0 <command> [args...]" >&2
|
||||
echo "" >&2
|
||||
echo "Commands:" >&2
|
||||
echo " list-workflows <owner> <repo>" >&2
|
||||
echo " list-jobs <owner> <repo> <workflow_id> [limit]" >&2
|
||||
echo " job-status <owner> <repo> <job_id>" >&2
|
||||
echo " job-logs <owner> <repo> <job_id> [output_file]" >&2
|
||||
echo " action-logs <owner> <repo> <action_job_id> [output_file]" >&2
|
||||
echo " list-workflow-jobs <owner> <repo> <workflow_run_id>" >&2
|
||||
echo " wait-job <owner> <repo> <job_id> [timeout]" >&2
|
||||
echo " monitor-workflow <owner> <repo> <workflow_run_id> [interval_seconds]" >&2
|
||||
echo " diagnose-job <owner> <repo> <job_id>" >&2
|
||||
echo " recent-workflows <owner> <repo> [limit] [status_filter]" >&2
|
||||
echo " create-pr <owner> <repo> <title> <body> <head_branch> [base_branch]" >&2
|
||||
echo " comment-pr <owner> <repo> <pr_number> <comment>" >&2
|
||||
echo " pr-status <owner> <repo> <pr_number>" >&2
|
||||
echo " list-issues <owner> <repo> [state]" >&2
|
||||
@@ -260,6 +311,7 @@ main() {
|
||||
echo " list-wiki <owner> <repo>" >&2
|
||||
echo " create-wiki <owner> <repo> <title> <content> [message]" >&2
|
||||
echo " get-wiki <owner> <repo> <page_name>" >&2
|
||||
echo " trigger-workflow <owner> <repo> <workflow_file> <branch>" >&2
|
||||
exit 1
|
||||
;;
|
||||
esac
|
||||
@@ -386,7 +438,140 @@ cmd_get_wiki() {
|
||||
fi
|
||||
|
||||
local endpoint="/repos/$owner/$repo/wiki/page/$page_name"
|
||||
api_request "GET" "$endpoint"
|
||||
local response=$(api_request "GET" "$endpoint")
|
||||
|
||||
# Extract and decode the content_base64 field
|
||||
local content_b64=$(echo "$response" | jq -r '.content_base64')
|
||||
if [[ "$content_b64" != "null" && -n "$content_b64" ]]; then
|
||||
echo "$content_b64" | base64 --decode
|
||||
else
|
||||
echo "$response"
|
||||
fi
|
||||
}
|
||||
|
||||
# Trigger workflow
|
||||
cmd_trigger_workflow() {
|
||||
local owner="$1"
|
||||
local repo="$2"
|
||||
local workflow_file="$3"
|
||||
local branch="$4"
|
||||
|
||||
if [[ -z "$owner" || -z "$repo" || -z "$workflow_file" || -z "$branch" ]]; then
|
||||
echo "Usage: $0 trigger-workflow <owner> <repo> <workflow_file> <branch>" >&2
|
||||
exit 1
|
||||
fi
|
||||
|
||||
local endpoint="/repos/${owner}/${repo}/actions/workflows/${workflow_file}/dispatches"
|
||||
local data; data=$(jq -n --arg ref "$branch" '{ref: $ref}')
|
||||
|
||||
echo "Triggering workflow: ${workflow_file} on branch: ${branch}"
|
||||
api_request "POST" "$endpoint" "$data"
|
||||
echo "Workflow triggered successfully!"
|
||||
}
|
||||
|
||||
# Monitor workflow run until completion
|
||||
cmd_monitor_workflow() {
|
||||
local owner="$1"
|
||||
local repo="$2"
|
||||
local workflow_run_id="$3"
|
||||
local interval="${4:-30}"
|
||||
|
||||
if [[ -z "$owner" || -z "$repo" || -z "$workflow_run_id" ]]; then
|
||||
echo "Usage: $0 monitor-workflow <owner> <repo> <workflow_run_id> [interval_seconds]" >&2
|
||||
exit 1
|
||||
fi
|
||||
|
||||
echo "Monitoring workflow run $workflow_run_id (interval: ${interval}s)..."
|
||||
echo "Press Ctrl+C to stop monitoring"
|
||||
|
||||
while true; do
|
||||
local endpoint="/repos/${owner}/${repo}/actions/runs/${workflow_run_id}"
|
||||
local status=$(api_request "GET" "$endpoint" | jq -r '.status')
|
||||
local conclusion=$(api_request "GET" "$endpoint" | jq -r '.conclusion // empty')
|
||||
local updated_at=$(api_request "GET" "$endpoint" | jq -r '.updated_at')
|
||||
|
||||
echo "[$(date +'%Y-%m-%d %H:%M:%S')] Status: $status, Conclusion: ${conclusion:-not completed}, Updated: $updated_at"
|
||||
|
||||
# List jobs in this workflow
|
||||
local jobs_endpoint="/repos/${owner}/${repo}/actions/runs/${workflow_run_id}/jobs"
|
||||
local jobs=$(api_request "GET" "$jobs_endpoint")
|
||||
echo "Jobs:"
|
||||
echo "$jobs" | jq -r '.jobs[] | " \(.id): \(.name) - \(.status) \(if .conclusion then "(\(.conclusion))" else "" end)"'
|
||||
|
||||
# Check if workflow is completed
|
||||
if [[ "$status" != "queued" && "$status" != "in_progress" && "$status" != "waiting" ]]; then
|
||||
echo "Workflow run $workflow_run_id has completed with status: $status and conclusion: ${conclusion:-none}"
|
||||
break
|
||||
fi
|
||||
|
||||
sleep "$interval"
|
||||
done
|
||||
}
|
||||
|
||||
# Diagnose failed job
|
||||
cmd_diagnose_job() {
|
||||
local owner="$1"
|
||||
local repo="$2"
|
||||
local job_id="$3"
|
||||
|
||||
if [[ -z "$owner" || -z "$repo" || -z "$job_id" ]]; then
|
||||
echo "Usage: $0 diagnose-job <owner> <repo> <job_id>" >&2
|
||||
exit 1
|
||||
fi
|
||||
|
||||
echo "Diagnosing job $job_id..."
|
||||
|
||||
# Get job details
|
||||
local job_endpoint="/repos/${owner}/${repo}/actions/jobs/${job_id}"
|
||||
local job_details=$(api_request "GET" "$job_endpoint")
|
||||
|
||||
echo "Job Details:"
|
||||
echo "$job_details" | jq '. | {id, name, status, conclusion, started_at, completed_at, runner_name}'
|
||||
|
||||
# Get job logs
|
||||
local logs_endpoint="/repos/${owner}/${repo}/actions/jobs/${job_id}/logs"
|
||||
echo -e "\nLast 50 lines of logs:"
|
||||
api_request "GET" "$logs_endpoint" | tail -50
|
||||
|
||||
# Look for errors
|
||||
echo -e "\nError analysis:"
|
||||
api_request "GET" "$logs_endpoint" | grep -i "error\|fail\|panic\|exception" | tail -10
|
||||
|
||||
# Get workflow run details
|
||||
local run_id=$(echo "$job_details" | jq -r '.run_id')
|
||||
local run_endpoint="/repos/${owner}/${repo}/actions/runs/${run_id}"
|
||||
local run_details=$(api_request "GET" "$run_endpoint")
|
||||
|
||||
echo -e "\nWorkflow Run Details:"
|
||||
echo "$run_details" | jq '. | {id, display_title, status, conclusion, head_branch, head_sha}'
|
||||
}
|
||||
|
||||
# Get recent workflow runs summary
|
||||
cmd_recent_workflows() {
|
||||
local owner="$1"
|
||||
local repo="$2"
|
||||
local limit="${3:-10}"
|
||||
local status_filter="${4:-}"
|
||||
|
||||
if [[ -z "$owner" || -z "$repo" ]]; then
|
||||
echo "Usage: $0 recent-workflows <owner> <repo> [limit] [status_filter]" >&2
|
||||
echo "Status filter options: all, completed, in_progress, queued, waiting" >&2
|
||||
exit 1
|
||||
fi
|
||||
|
||||
local endpoint="/repos/${owner}/${repo}/actions/runs?limit=${limit}"
|
||||
if [[ -n "$status_filter" ]]; then
|
||||
endpoint="$endpoint&status=$status_filter"
|
||||
fi
|
||||
|
||||
local workflows=$(api_request "GET" "$endpoint")
|
||||
|
||||
echo "Recent Workflow Runs (showing $limit most recent):"
|
||||
echo "$workflows" | jq -r '.workflow_runs[] | "\(.id): \(.display_title) - \(.status) \(if .conclusion then "(\(.conclusion))" else "" end) - \(.updated_at)"'
|
||||
|
||||
# Show summary statistics
|
||||
echo -e "\nSummary:"
|
||||
echo "$workflows" | jq -r '.workflow_runs | group_by(.conclusion) | .[] | " \(.[0].conclusion // "in_progress"): \(length)"'
|
||||
}
|
||||
|
||||
main "$@"
|
||||
|
||||
@@ -3,7 +3,7 @@ name: product-owner-assistant
|
||||
description: A skill for managing Gitea issues, organizing them into Epics and User Stories, and facilitating product backlog refinement
|
||||
license: MIT
|
||||
metadata:
|
||||
author: DanceLessonsCoach Team
|
||||
author: dance-lessons-coach Team
|
||||
version: "1.0.0"
|
||||
dependencies:
|
||||
- gitea-client
|
||||
|
||||
@@ -2,7 +2,7 @@
|
||||
|
||||
## ✅ What We've Created
|
||||
|
||||
A comprehensive **Product Owner Assistant** skill for the DanceLessonsCoach project that enables effective agile product management using Gitea issues and wiki.
|
||||
A comprehensive **Product Owner Assistant** skill for the dance-lessons-coach project that enables effective agile product management using Gitea issues and wiki.
|
||||
|
||||
## 🎯 Key Components
|
||||
|
||||
|
||||
@@ -6,7 +6,7 @@
|
||||
set -e
|
||||
|
||||
# Configuration
|
||||
SKILL_DIR="/Users/gabrielradureau/Work/Vibe/DanceLessonsCoach/.vibe/skills/product-owner-assistant"
|
||||
SKILL_DIR="/Users/gabrielradureau/Work/Vibe/dance-lessons-coach/.vibe/skills/product-owner-assistant"
|
||||
DATA_DIR="$SKILL_DIR/data"
|
||||
GITEA_CLIENT="skill gitea-client"
|
||||
|
||||
|
||||
@@ -5,7 +5,7 @@
|
||||
set -e
|
||||
|
||||
# Configuration
|
||||
SKILL_DIR="/Users/gabrielradureau/Work/Vibe/DanceLessonsCoach/.vibe/skills/product-owner-assistant"
|
||||
SKILL_DIR="/Users/gabrielradureau/Work/Vibe/dance-lessons-coach/.vibe/skills/product-owner-assistant"
|
||||
GITEA_API="https://gitea.arcodange.lab/api/v1"
|
||||
OWNER="arcodange"
|
||||
REPO="dance-lessons-coach"
|
||||
|
||||
@@ -2,7 +2,7 @@
|
||||
|
||||
## 🎯 Overview
|
||||
|
||||
This document describes the standardized workflow for implementing user stories in the DanceLessonsCoach project. The workflow follows a test-driven development approach with clear phases and deliverables.
|
||||
This document describes the standardized workflow for implementing user stories in the dance-lessons-coach project. The workflow follows a test-driven development approach with clear phases and deliverables.
|
||||
|
||||
## 🔄 Workflow Diagram
|
||||
|
||||
@@ -89,7 +89,7 @@ Feature: User Persistence
|
||||
|
||||
```bash
|
||||
# Run BDD tests
|
||||
cd /Users/gabrielradureau/Work/Vibe/DanceLessonsCoach
|
||||
cd /Users/gabrielradureau/Work/Vibe/dance-lessons-coach
|
||||
godog features/user-persistence.feature
|
||||
|
||||
# Expected: Test fails with "pending" or "undefined" steps
|
||||
|
||||
@@ -3,7 +3,7 @@ name: skill-creator
|
||||
description: Creates and manages Mistral Vibe skills following the Agent Skills specification. Use when you need to create new skills, validate existing ones, or maintain skill consistency across projects.
|
||||
license: MIT
|
||||
metadata:
|
||||
author: DanceLessonsCoach Team
|
||||
author: dance-lessons-coach Team
|
||||
version: "1.0.0"
|
||||
---
|
||||
|
||||
|
||||
@@ -121,4 +121,4 @@ The skill_creator has been tested with:
|
||||
- **Compliance**: Automatic validation ensures specification compliance
|
||||
- **Maintainability**: Clear structure makes skills easier to update
|
||||
|
||||
The skill_creator provides a solid foundation for building a library of high-quality, specification-compliant skills for the DanceLessonsCoach project.
|
||||
The skill_creator provides a solid foundation for building a library of high-quality, specification-compliant skills for the dance-lessons-coach project.
|
||||
@@ -6,7 +6,7 @@
|
||||
|
||||
## 📋 Overview
|
||||
|
||||
This skill provides comprehensive guidance and automation for managing OpenAPI/Swagger documentation in the DanceLessonsCoach project. It captures our best practices, tagging strategies, and automation patterns for maintaining high-quality API documentation.
|
||||
This skill provides comprehensive guidance and automation for managing OpenAPI/Swagger documentation in the dance-lessons-coach project. It captures our best practices, tagging strategies, and automation patterns for maintaining high-quality API documentation.
|
||||
|
||||
## 🎯 Key Features
|
||||
|
||||
@@ -145,6 +145,6 @@ Found a better way? Have a new pattern?
|
||||
|
||||
---
|
||||
|
||||
**Maintained by:** DanceLessonsCoach Team
|
||||
**Maintained by:** dance-lessons-coach Team
|
||||
**License:** MIT
|
||||
**Status:** Actively developed
|
||||
@@ -1,16 +1,16 @@
|
||||
---
|
||||
name: swagger-documentation
|
||||
description: Manage and optimize OpenAPI/Swagger documentation for DanceLessonsCoach
|
||||
description: Manage and optimize OpenAPI/Swagger documentation for dance-lessons-coach
|
||||
license: MIT
|
||||
metadata:
|
||||
author: DanceLessonsCoach Team
|
||||
author: dance-lessons-coach Team
|
||||
version: "1.0.0"
|
||||
---
|
||||
|
||||
# Swagger Documentation Skill
|
||||
|
||||
**Name:** `swagger-documentation`
|
||||
**Purpose:** Manage and optimize OpenAPI/Swagger documentation for DanceLessonsCoach
|
||||
**Purpose:** Manage and optimize OpenAPI/Swagger documentation for dance-lessons-coach
|
||||
**Version:** 1.0.0
|
||||
|
||||
## 🎯 Skill Objectives
|
||||
@@ -200,7 +200,7 @@ func (s *Server) handleHealth(w http.ResponseWriter, r *http.Request) {
|
||||
- [swaggo/swag Documentation](https://github.com/swaggo/swag#declaration)
|
||||
- [OpenAPI 2.0 Specification](https://swagger.io/specification/v2/)
|
||||
|
||||
### DanceLessonsCoach Specific
|
||||
### dance-lessons-coach Specific
|
||||
- [ADR 0013: OpenAPI/Swagger Toolchain](adr/0013-openapi-swagger-toolchain.md)
|
||||
- [AGENTS.md OpenAPI Section](#openapi-documentation)
|
||||
- [Current Implementation](pkg/greet/api_v1.go)
|
||||
@@ -303,6 +303,6 @@ fi
|
||||
|
||||
---
|
||||
|
||||
**Maintainers**: DanceLessonsCoach Team
|
||||
**Maintainers**: dance-lessons-coach Team
|
||||
**License**: MIT
|
||||
**Status**: Active
|
||||
@@ -1,4 +1,4 @@
|
||||
# DanceLessonsCoach YAML Lint Configuration
|
||||
# dance-lessons-coach YAML Lint Configuration
|
||||
# More practical limits for CI/CD workflow files
|
||||
|
||||
extends: default
|
||||
|
||||
32
AGENT_CHANGELOG.md
Normal file
@@ -0,0 +1,32 @@
|
||||
# AGENT_CHANGELOG
|
||||
|
||||
Ordered record of the structuring decisions and actions taken by AI agents (Claude Code, Mistral Vibe, others) on the `dance-lessons-coach` project. It complements [`CHANGELOG.md`](CHANGELOG.md), which covers user-facing product changes.
|
||||
|
||||
**Why this file exists**: it was referenced in the project's guiding documentation (see AGENTS.md) but was initially missing from the repo. It was created as part of Task 6 of the Claude → Mistral Vibe migration curriculum (ARCODANGE Phase 1).
|
||||
|
||||
## Convention
|
||||
|
||||
One entry per structuring decision or action taken by an AI agent. Format:
|
||||
|
||||
```
|
||||
## YYYY-MM-DD — <Agent> — <Short title>
|
||||
|
||||
**Context**: 1-3 lines on why this action was taken
|
||||
**Decision/Action**: what was done
|
||||
**Consequence**: impact on the project (files, conventions, workflows)
|
||||
**Reference**: commit hash, Gitea PR, ADR, issue (where applicable)
|
||||
```
|
||||
|
||||
Entries that need no discussion (typo fixes, formatting, minor dependency bumps) are **not** logged here; the Git commit already covers them. This file keeps only the decisions whose **why** deserves a record.
|
||||
|
||||
---
|
||||
|
||||
## 2026-05-02 — Mistral Vibe (intent-router) + Claude Code (Opus 4.7) — AGENT_CHANGELOG.md initialization
|
||||
|
||||
**Context**: Task 6 of the ARCODANGE Phase 1 migration curriculum (see `~/.vibe/plans/migration-claude-vers-mistral-phase-1.md`). The `AGENT_CHANGELOG.md` file was mentioned in the project's guiding documentation but did not exist, a friction point identified by the Phase A audit.
|
||||
|
||||
**Decision/Action**: initialize the file with a clear convention and link to it from `AGENTS.md` (Task 6 Phase C).
|
||||
|
||||
**Consequence**: any agent that makes a structuring decision on the project must add a dated entry here. This makes AI choices traceable beyond Git commits.
|
||||
|
||||
**Reference**: Task 6 of the migration plan. See also `~/.vibe/plans/task-6-phase-a-results.md` for the full context of the ongoing restructuring.
|
||||
57
CHANGELOG.md
Normal file
@@ -0,0 +1,57 @@
|
||||
# Changelog
|
||||
|
||||
Notable user-facing changes to `dance-lessons-coach`. Format inspired by [Keep a Changelog](https://keepachangelog.com/), versioning follows [Semantic Versioning 2.0.0](https://semver.org/) (see [`documentation/version-management-guide.md`](documentation/version-management-guide.md)).
|
||||
|
||||
The historical phases of foundational development (Phase 1 to Phase 9) are documented in [`documentation/HISTORY.md`](documentation/HISTORY.md).
|
||||
|
||||
## [Unreleased]
|
||||
|
||||
### Added
|
||||
|
||||
_(items pending release; move to a versioned section when tagged)_
|
||||
|
||||
### Changed
|
||||
|
||||
### Fixed
|
||||
|
||||
---
|
||||
|
||||
## 2026-04-05 — Architecture Documentation
|
||||
|
||||
- ✅ Added comprehensive ADR directory with 9 decision records
|
||||
- ✅ Enhanced Zerolog vs Zap analysis in logging ADR
|
||||
- ✅ Updated `README.md` and `AGENTS.md` with ADR references
|
||||
- ✅ Documented hybrid testing approach
|
||||
- ✅ Added BDD testing decision record
|
||||
|
||||
## 2026-04-04 — Observability & Testing
|
||||
|
||||
- ✅ OpenTelemetry integration with Jaeger
|
||||
- ✅ Middleware-only tracing approach
|
||||
- ✅ Comprehensive telemetry configuration
|
||||
- ✅ BDD testing framework setup
|
||||
- ✅ Hybrid testing strategy documentation
|
||||
|
||||
## 2026-04-03 — Production Readiness
|
||||
|
||||
- ✅ Graceful shutdown with readiness endpoints
|
||||
- ✅ Configuration management with Viper
|
||||
- ✅ JSON logging configuration
|
||||
- ✅ File output logging support
|
||||
- ✅ Comprehensive error handling
|
||||
|
||||
## 2026-04-02 — Web API Foundation
|
||||
|
||||
- ✅ Chi router integration
|
||||
- ✅ Versioned API endpoints (`/api/v1`)
|
||||
- ✅ Health and readiness endpoints
|
||||
- ✅ JSON responses with proper headers
|
||||
- ✅ Interface-based design patterns
|
||||
|
||||
## 2026-04-01 — Project Foundation
|
||||
|
||||
- ✅ Go 1.26.1 environment setup
|
||||
- ✅ Project structure with `cmd/` and `pkg/`
|
||||
- ✅ Core Greet service implementation
|
||||
- ✅ CLI interface
|
||||
- ✅ Unit tests with table-driven approach
|
||||
@@ -1,6 +1,6 @@
|
||||
# Contributing to DanceLessonsCoach
|
||||
# Contributing to dance-lessons-coach
|
||||
|
||||
Thank you for your interest in contributing to DanceLessonsCoach! This guide will help you set up your development environment and understand our contribution process.
|
||||
Thank you for your interest in contributing to dance-lessons-coach! This guide will help you set up your development environment and understand our contribution process.
|
||||
|
||||
## 📋 Table of Contents
|
||||
|
||||
@@ -24,8 +24,8 @@ Thank you for your interest in contributing to DanceLessonsCoach! This guide wil
|
||||
|
||||
```bash
|
||||
# Clone the repository
|
||||
git clone https://gitea.arcodange.lab/arcodange/DanceLessonsCoach.git
|
||||
cd DanceLessonsCoach
|
||||
git clone https://gitea.arcodange.lab/arcodange/dance-lessons-coach.git
|
||||
cd dance-lessons-coach
|
||||
|
||||
# Install dependencies
|
||||
go mod tidy
|
||||
@@ -260,7 +260,7 @@ Major architectural decisions are documented in the `adr/` directory. Please rev
|
||||
|
||||
## 🤖 AI Agent Contributions
|
||||
|
||||
AI agents play a crucial role in maintaining and improving DanceLessonsCoach. This section provides guidance for AI agents on how to effectively contribute.
|
||||
AI agents play a crucial role in maintaining and improving dance-lessons-coach. This section provides guidance for AI agents on how to effectively contribute.
|
||||
|
||||
### Key Files and Directories
|
||||
|
||||
@@ -342,7 +342,7 @@ AI agents play a crucial role in maintaining and improving DanceLessonsCoach. Th
|
||||
|
||||
## 📜 License
|
||||
|
||||
By contributing to DanceLessonsCoach, you agree that your contributions will be licensed under the MIT License.
|
||||
By contributing to dance-lessons-coach, you agree that your contributions will be licensed under the MIT License.
|
||||
|
||||
---
|
||||
|
||||
@@ -350,7 +350,7 @@ By contributing to DanceLessonsCoach, you agree that your contributions will be
|
||||
=======
|
||||
## 🤖 AI Agent Contributions
|
||||
|
||||
AI agents play a crucial role in maintaining and improving DanceLessonsCoach. This section provides guidance for AI agents on how to effectively contribute.
|
||||
AI agents play a crucial role in maintaining and improving dance-lessons-coach. This section provides guidance for AI agents on how to effectively contribute.
|
||||
|
||||
### Key Files and Directories
|
||||
|
||||
@@ -432,7 +432,7 @@ AI agents play a crucial role in maintaining and improving DanceLessonsCoach. Th
|
||||
|
||||
## 📜 License
|
||||
|
||||
By contributing to DanceLessonsCoach, you agree that your contributions will be licensed under the MIT License.
|
||||
By contributing to dance-lessons-coach, you agree that your contributions will be licensed under the MIT License.
|
||||
|
||||
---
|
||||
|
||||
|
||||
379
README.md
@@ -1,361 +1,98 @@
|
||||
# DanceLessonsCoach
|
||||
# dance-lessons-coach
|
||||
|
||||
[](https://gitea.arcodange.fr/arcodange/DanceLessonsCoach)
|
||||
[](https://goreportcard.com/report/github.com/arcodange/DanceLessonsCoach)
|
||||
[](https://gitea.arcodange.fr/arcodange/DanceLessonsCoach/releases)
|
||||
[](https://gitea.arcodange.fr/arcodange/dance-lessons-coach/actions/workflows/ci-cd.yaml)
|
||||
[](https://gitea.arcodange.fr/arcodange/dance-lessons-coach/releases)
|
||||
[](LICENSE)
|
||||
|
||||
A Go project demonstrating idiomatic package structure, CLI implementation, and JSON API with Chi router.
|
||||
=======
|
||||
Go web service demonstrating idiomatic package structure, versioned JSON API, and production-ready features.
|
||||
|
||||
## Features
|
||||
|
||||
- Greet function with default behavior
|
||||
- Command-line interface
|
||||
- JSON API with versioned endpoints
|
||||
- Chi router integration
|
||||
- Zerolog for high-performance logging
|
||||
- Viper for configuration management
|
||||
- Graceful shutdown with context
|
||||
- Readiness endpoint for Kubernetes/service mesh integration
|
||||
- OpenTelemetry integration with Jaeger support
|
||||
- OpenAPI/Swagger documentation
|
||||
- Unit tests
|
||||
- Go 1.26.1 compatible
|
||||
- Versioned JSON API (`/api/v1`, `/api/v2`)
|
||||
- Chi router with graceful shutdown
|
||||
- Zerolog structured logging (console and JSON modes)
|
||||
- Viper configuration (file + env vars)
|
||||
- Readiness endpoint for Kubernetes / service mesh
|
||||
- OpenTelemetry / Jaeger distributed tracing
|
||||
- OpenAPI / Swagger UI (embedded in binary)
|
||||
- PostgreSQL user service with JWT auth
|
||||
- BDD + unit tests
|
||||
|
||||
## Installation
|
||||
## Quick Start
|
||||
|
||||
```bash
|
||||
# Clone the repository
|
||||
git clone https://gitea.arcodange.lab/arcodange/dance-lessons-coach.git
|
||||
cd dance-lessons-coach
|
||||
|
||||
# Build all binaries
|
||||
./scripts/build.sh
|
||||
|
||||
# Use the new Cobra CLI
|
||||
./bin/dance-lessons-coach --help
|
||||
|
||||
# Or use the legacy greet CLI
|
||||
go run ./cmd/greet
|
||||
./scripts/build.sh # produces ./bin/server and ./bin/greet
|
||||
./scripts/start-server.sh start
|
||||
```
|
||||
|
||||
## CI/CD Pipeline
|
||||
|
||||
DanceLessonsCoach includes a portable CI/CD pipeline using GitHub Actions syntax:
|
||||
|
||||
### Features
|
||||
- ✅ **Multi-platform**: Works on Gitea, GitHub, and GitLab
|
||||
- ✅ **Build & Test**: Automated Go builds and tests
|
||||
- ✅ **Linting**: Code quality checks with `go fmt` and `go vet`
|
||||
- ✅ **Version Management**: Automatic version detection
|
||||
- ✅ **Portable**: Uses standard GitHub Actions workflow format
|
||||
|
||||
### Workflow File
|
||||
```yaml
|
||||
# .github/workflows/main.yml
|
||||
jobs:
|
||||
build-test:
|
||||
runs-on: ubuntu-latest
|
||||
steps:
|
||||
- uses: actions/checkout@v4
|
||||
- uses: actions/setup-go@v4
|
||||
with:
|
||||
go-version: '1.26.1'
|
||||
- run: go build ./...
|
||||
- run: go test ./... -cover
|
||||
|
||||
lint-format:
|
||||
runs-on: ubuntu-latest
|
||||
steps:
|
||||
- run: go fmt ./...
|
||||
- run: go vet ./...
|
||||
```bash
|
||||
curl http://localhost:8080/api/health
|
||||
curl http://localhost:8080/api/v1/greet/Alice
|
||||
```
|
||||
|
||||
### Setup Instructions
|
||||
1. **Gitea**: Enable GitHub Actions compatibility in repo settings
|
||||
2. **GitHub**: Push to mirror repository (workflow runs automatically)
|
||||
3. **GitLab**: Convert workflow to `.gitlab-ci.yml` or use compatibility mode
|
||||
Stop: `./scripts/start-server.sh stop`
|
||||
|
||||
**See [ADR 0016](adr/0016-ci-cd-pipeline-design.md) for complete CI/CD design and [STATUS_BADGES.md](STATUS_BADGES.md) for badge setup.**
|
||||
## Greet CLI
|
||||
|
||||
```bash
|
||||
go run ./cmd/greet # Hello world!
|
||||
go run ./cmd/greet Alice # Hello Alice!
|
||||
```
|
||||
|
||||
## Configuration
|
||||
|
||||
Basic configuration options:
|
||||
All options are available via `config.yaml` or `DLC_*` environment variables.
|
||||
|
||||
```bash
|
||||
# Start with default configuration
|
||||
./scripts/start-server.sh start
|
||||
| Env var | Default | Description |
|
||||
|---------|---------|-------------|
|
||||
| `DLC_SERVER_PORT` | `8080` | Listening port |
|
||||
| `DLC_SERVER_HOST` | `0.0.0.0` | Bind address |
|
||||
| `DLC_LOGGING_JSON` | `false` | JSON log format |
|
||||
| `DLC_LOGGING_OUTPUT` | stderr | Log file path |
|
||||
| `DLC_SHUTDOWN_TIMEOUT` | `30s` | Graceful shutdown window |
|
||||
| `DLC_API_V2_ENABLED` | `false` | Enable `/api/v2` routes |
|
||||
| `DLC_CONFIG_FILE` | `./config.yaml` | Override config path |
|
||||
|
||||
# Custom port
|
||||
export DLC_SERVER_PORT=9090
|
||||
./scripts/start-server.sh start
|
||||
See `config.example.yaml` for a full template.
|
||||
|
||||
# JSON logging
|
||||
export DLC_LOGGING_JSON=true
|
||||
./scripts/start-server.sh start
|
||||
```
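The `DLC_*` variables in the table above can be read with nothing but the standard library. The following is a minimal sketch, not the project's loader (the README states that configuration goes through Viper). The `Settings` struct, `loadSettings`, and the choice to fall back to the default when a value fails to parse are assumptions made for illustration; only the variable names and defaults come from the table.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"time"
)

// Settings holds a subset of the options listed in the table (hypothetical type).
type Settings struct {
	Port            int
	Host            string
	JSONLogs        bool
	ShutdownTimeout time.Duration
	APIV2Enabled    bool
}

// lookupFunc has the same shape as os.LookupEnv so a fake environment can be injected.
type lookupFunc func(key string) (string, bool)

// loadSettings starts from the documented defaults and overrides them with
// DLC_* values. Assumption: unparsable values are ignored, not reported.
func loadSettings(lookup lookupFunc) Settings {
	s := Settings{Port: 8080, Host: "0.0.0.0", ShutdownTimeout: 30 * time.Second}
	if v, ok := lookup("DLC_SERVER_PORT"); ok {
		if p, err := strconv.Atoi(v); err == nil {
			s.Port = p
		}
	}
	if v, ok := lookup("DLC_SERVER_HOST"); ok && v != "" {
		s.Host = v
	}
	if v, ok := lookup("DLC_LOGGING_JSON"); ok {
		if b, err := strconv.ParseBool(v); err == nil {
			s.JSONLogs = b
		}
	}
	if v, ok := lookup("DLC_SHUTDOWN_TIMEOUT"); ok {
		if d, err := time.ParseDuration(v); err == nil {
			s.ShutdownTimeout = d
		}
	}
	if v, ok := lookup("DLC_API_V2_ENABLED"); ok {
		if b, err := strconv.ParseBool(v); err == nil {
			s.APIV2Enabled = b
		}
	}
	return s
}

func main() {
	s := loadSettings(os.LookupEnv)
	fmt.Printf("listening on %s:%d (json logs: %v, v2: %v, shutdown: %s)\n",
		s.Host, s.Port, s.JSONLogs, s.APIV2Enabled, s.ShutdownTimeout)
}
```

Passing the lookup function as a parameter keeps the sketch testable without mutating the process environment.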
|
||||
## API
|
||||
|
||||
**See [AGENTS.md](AGENTS.md#configuration-management) for comprehensive configuration guide including:**
|
||||
- File-based configuration
|
||||
- Environment variables
|
||||
- Configuration priority rules
|
||||
- OpenTelemetry setup
|
||||
- Advanced scenarios
|
||||
|
||||
## Usage
|
||||
|
||||
### New Cobra CLI (Recommended)
|
||||
|
||||
```bash
|
||||
# Show help
|
||||
./bin/dance-lessons-coach --help
|
||||
|
||||
# Show version
|
||||
./bin/dance-lessons-coach version
|
||||
|
||||
# Greet someone
|
||||
./bin/dance-lessons-coach greet John
|
||||
|
||||
# Start server
|
||||
./bin/dance-lessons-coach server
|
||||
```
|
||||
|
||||
### Legacy CLI (Deprecated)
|
||||
|
||||
```bash
|
||||
# Default greeting
|
||||
go run ./cmd/greet
|
||||
# Output: Hello world!
|
||||
|
||||
# Custom greeting
|
||||
go run ./cmd/greet John
|
||||
# Output: Hello John!
|
||||
```
|
||||
|
||||
### Web Server
|
||||
|
||||
**Using the server control script (recommended):**
|
||||
|
||||
```bash
|
||||
# Start the server
|
||||
./scripts/start-server.sh start
|
||||
|
||||
# Test API endpoints
|
||||
./scripts/start-server.sh test
|
||||
|
||||
# Access OpenAPI documentation
|
||||
# Swagger UI: http://localhost:8080/swagger/
|
||||
# OpenAPI spec: http://localhost:8080/swagger/doc.json
|
||||
|
||||
# Stop the server
|
||||
./scripts/start-server.sh stop
|
||||
```
|
||||
|
||||
**Manual server management:**
|
||||
|
||||
```bash
|
||||
# Start the server
|
||||
go run ./cmd/server
|
||||
|
||||
# Test API endpoints
|
||||
curl http://localhost:8080/api/health
|
||||
# Output: {"status":"healthy"}
|
||||
|
||||
curl http://localhost:8080/api/ready
|
||||
# Output: {"ready":true}
|
||||
|
||||
curl http://localhost:8080/api/v1/greet
|
||||
# Output: {"message":"Hello world!"}
|
||||
|
||||
curl http://localhost:8080/api/v1/greet/John
|
||||
# Output: {"message":"Hello John!"}
|
||||
```
|
||||
| Method | Path | Description |
|
||||
|--------|------|-------------|
|
||||
| GET | `/api/health` | Liveness check |
|
||||
| GET | `/api/ready` | Readiness check (503 during shutdown) |
|
||||
| GET | `/api/version` | Version info (`?format=plain\|full\|json`) |
|
||||
| GET | `/api/v1/greet/` | Default greeting |
|
||||
| GET | `/api/v1/greet/{name}` | Named greeting |
|
||||
| POST | `/api/v2/greet` | V2 greeting with validation |
|
||||
| GET | `/swagger/` | Swagger UI |
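The readiness row in the table above (200 normally, 503 during shutdown) can be sketched with the standard library alone. This is an illustrative assumption about how such a handler might look, not the project's implementation, which uses the Chi router: the `readiness` type, the `probe` helper, and the `{"ready":false}` body are hypothetical.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// readiness tracks whether the server should still receive traffic (hypothetical type).
type readiness struct {
	shuttingDown atomic.Bool
}

// ServeHTTP answers 200 while serving and 503 once shutdown has begun.
func (r *readiness) ServeHTTP(w http.ResponseWriter, _ *http.Request) {
	w.Header().Set("Content-Type", "application/json")
	if r.shuttingDown.Load() {
		w.WriteHeader(http.StatusServiceUnavailable)
		fmt.Fprint(w, `{"ready":false}`) // assumed body for the not-ready case
		return
	}
	fmt.Fprint(w, `{"ready":true}`)
}

// probe issues an in-memory GET against h and returns the response status code.
func probe(h http.Handler) int {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/api/ready", nil))
	return rec.Code
}

func main() {
	r := &readiness{}
	fmt.Println(probe(r)) // 200 while serving
	r.shuttingDown.Store(true)
	fmt.Println(probe(r)) // 503 once shutdown has started
}
```

Flipping the flag before the HTTP server starts draining lets an orchestrator stop routing new requests while in-flight ones complete.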
|
||||
|
||||
## Testing
|
||||
|
||||
```bash
|
||||
# Run all tests
|
||||
go test ./...
|
||||
|
||||
# Run specific package tests
|
||||
go test ./pkg/greet/
|
||||
go test ./... # unit + integration tests
|
||||
./scripts/test-graceful-shutdown.sh # lifecycle + JSON logging validation
|
||||
./scripts/test-opentelemetry.sh # tracing end-to-end
|
||||
```
|
||||
|
||||
## CI/CD
|
||||
## Gitea Client
|
||||
|
||||
DanceLessonsCoach includes a comprehensive CI/CD pipeline with multiple testing options:
|
||||
AI agent helper script at `.vibe/skills/gitea-client/scripts/gitea-client.sh`.
|
||||
|
||||
### Local Testing (No Gitea Required)
|
||||
Auth setup:
|
||||
```bash
|
||||
# Validate workflow structure
|
||||
./scripts/cicd.sh validate
|
||||
|
||||
# Test workflow steps locally
|
||||
./scripts/cicd.sh test-simple
|
||||
echo "your_token" > ~/.gitea_token
|
||||
chmod 600 ~/.gitea_token
|
||||
export GITEA_API_TOKEN_FILE="$HOME/.gitea_token"
|
||||
```
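The token setup above can be mirrored in Go for tools that call the Gitea API directly. This is a hedged sketch, not a port of `gitea-client.sh`: `resolveToken` is a hypothetical helper, and the precedence shown (the `GITEA_API_TOKEN` variable first, then the file named by `GITEA_API_TOKEN_FILE`) is an assumption, since the diff names both variables but does not state an order.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"strings"
)

// resolveToken returns the token from GITEA_API_TOKEN when set, otherwise
// from the file named by GITEA_API_TOKEN_FILE. The precedence is an assumption.
func resolveToken(getenv func(string) string, readFile func(string) ([]byte, error)) (string, error) {
	if tok := strings.TrimSpace(getenv("GITEA_API_TOKEN")); tok != "" {
		return tok, nil
	}
	path := getenv("GITEA_API_TOKEN_FILE")
	if path == "" {
		return "", errors.New("neither GITEA_API_TOKEN nor GITEA_API_TOKEN_FILE is set")
	}
	data, err := readFile(path)
	if err != nil {
		return "", fmt.Errorf("reading token file %s: %w", path, err)
	}
	// `echo "your_token" > file` appends a newline, so trim surrounding whitespace.
	tok := strings.TrimSpace(string(data))
	if tok == "" {
		return "", fmt.Errorf("token file %s is empty", path)
	}
	return tok, nil
}

func main() {
	tok, err := resolveToken(os.Getenv, os.ReadFile)
	if err != nil {
		fmt.Fprintln(os.Stderr, "gitea auth:", err)
		os.Exit(1)
	}
	// Print only the length so the secret never reaches the terminal or logs.
	fmt.Printf("token loaded (%d characters)\n", len(tok))
}
```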
|
||||
|
||||
### Gitea Integration
|
||||
```bash
|
||||
# Test local setup with Gitea configuration
|
||||
./scripts/cicd.sh test-local
|
||||
|
||||
# Check pipeline status on Gitea
|
||||
./scripts/cicd.sh check-status
|
||||
```
|
||||
|
||||
### Full CI/CD Testing
|
||||
```bash
|
||||
# Test with docker compose (requires Gitea runner)
|
||||
./scripts/cicd.sh test-docker
|
||||
```
|
||||
|
||||
**See [adr/0016-ci-cd-pipeline-design.md](adr/0016-ci-cd-pipeline-design.md) for complete CI/CD architecture.**
|
||||
|
||||
## Project Structure
|
||||
|
||||
```
|
||||
DanceLessonsCoach/
|
||||
├── adr/ # Architecture Decision Records
|
||||
├── cmd/ # Entry points (greet CLI, server)
|
||||
├── pkg/ # Core packages (config, greet, server, telemetry)
|
||||
│ └── server/docs/ # Generated OpenAPI documentation (gitignored)
|
||||
├── config.yaml # Configuration file
|
||||
├── scripts/ # Management scripts
|
||||
└── go.mod # Go module definition
|
||||
```
|
||||
|
||||
**See [AGENTS.md](AGENTS.md#project-structure) for detailed structure and component explanations.**
|
||||
```
|
||||
|
||||
## Development
|
||||
|
||||
### Generate OpenAPI Documentation
|
||||
|
||||
The project uses [swaggo/swag](https://github.com/swaggo/swag) to generate OpenAPI/Swagger documentation from code annotations:
|
||||
|
||||
```bash
|
||||
# Generate documentation
|
||||
go generate ./pkg/server/
|
||||
|
||||
# This creates:
|
||||
# - pkg/server/docs/docs.go (swagger template)
|
||||
# - pkg/server/docs/swagger.json (OpenAPI spec)
|
||||
# - pkg/server/docs/swagger.yaml (YAML version)
|
||||
```
|
||||
|
||||
**Note:** `pkg/server/docs/` is gitignored. Documentation is embedded in the binary at build time.
|
||||
|
||||
### Documentation Annotations
|
||||
|
||||
Add swagger annotations to handlers and models:
|
||||
|
||||
```go
|
||||
// @Summary Get personalized greeting
|
||||
// @Description Returns a greeting with the specified name
|
||||
// @Tags greet
|
||||
// @Accept json
|
||||
// @Produce json
|
||||
// @Param name path string true "Name to greet"
|
||||
// @Success 200 {object} GreetResponse "Successful response"
|
||||
// @Failure 400 {object} ErrorResponse "Invalid name parameter"
|
||||
// @Router /v1/greet/{name} [get]
|
||||
func (h *apiV1GreetHandler) handleGreetPath(w http.ResponseWriter, r *http.Request) {
|
||||
// handler implementation
|
||||
}
|
||||
```
|
||||
Get a token at https://gitea.arcodange.lab → Profile → Settings → Applications.
|
||||
|
||||
## Architecture
|
||||
|
||||
This project uses Architecture Decision Records (ADRs) to document key technical choices. See [adr/](adr/) for complete documentation including decisions on Go 1.26.1, Chi router, Zerolog, OpenTelemetry, interface-based design, graceful shutdown, configuration management, testing strategies, and OpenAPI documentation.
|
||||
|
||||
**Adding new decisions?** See [adr/README.md](adr/README.md) for guidelines.
|
||||
|
||||
## Gitea Integration
|
||||
|
||||
DanceLessonsCoach includes AI agent skills for Gitea integration to monitor CI/CD jobs and interact with pull requests.
|
||||
|
||||
### Gitea Client Skill Setup
|
||||
|
||||
The Gitea client skill enables AI agents to:
|
||||
- Monitor CI/CD job status
|
||||
- Fetch job logs for debugging
|
||||
- Comment on pull requests
|
||||
- Track PR status
|
||||
|
||||
**Setup Instructions:**
|
||||
|
||||
1. **Create a Personal Access Token:**
|
||||
- Log in to https://gitea.arcodange.lab
|
||||
- Go to Profile → Settings → Applications
|
||||
- Generate token with `read:repository`, `write:repository`, and `read:user` scopes
|
||||
|
||||
2. **Configure Authentication:**
|
||||
```bash
|
||||
# Option 1: Environment variable
|
||||
export GITEA_API_TOKEN="your_token"
|
||||
|
||||
# Option 2: Token file (recommended)
|
||||
echo "your_token" > ~/.gitea_token
|
||||
chmod 600 ~/.gitea_token
|
||||
export GITEA_API_TOKEN_FILE="$HOME/.gitea_token"
|
||||
```
|
||||
|
||||
3. **Add to shell configuration:**
|
||||
```bash
|
||||
echo 'export GITEA_API_TOKEN_FILE="$HOME/.gitea_token"' >> ~/.bashrc
|
||||
source ~/.bashrc
|
||||
```
|
||||
|
||||
**Usage Examples:**
|
||||
```bash
|
||||
# List recent jobs
|
||||
.vibe/skills/gitea-client/scripts/gitea-client.sh list-jobs owner repo workflow_id 5
|
||||
|
||||
# Wait for job completion
|
||||
.vibe/skills/gitea-client/scripts/gitea-client.sh wait-job owner repo job_id 300
|
||||
|
||||
# Comment on PR
|
||||
.vibe/skills/gitea-client/scripts/gitea-client.sh comment-pr owner repo 42 "Build completed!"
|
||||
```
|
||||
|
||||
**Documentation:** See [.vibe/skills/gitea-client/README.md](.vibe/skills/gitea-client/README.md) for complete setup and usage guide.
|
||||
|
||||
## 🤖 AI Agent Usage
|
||||
|
||||
### Quick Launch Commands
|
||||
|
||||
**Programmer Agent** (for code implementation, testing, CI/CD):
|
||||
```bash
|
||||
vibe start --agent dancelessonscoachprogrammer
|
||||
```
|
||||
|
||||
**Product Owner Agent** (for requirements, interviews, documentation):
|
||||
```bash
|
||||
vibe start --agent dancelessonscoach-product-owner
|
||||
```
|
||||
|
||||
### Full Documentation
|
||||
|
||||
For complete agent usage guide including:
|
||||
- Agent selection guidance
|
||||
- Common workflow examples
|
||||
- Configuration reference
|
||||
- Best practices
|
||||
- Troubleshooting tips
|
||||
|
||||
See: [AGENT_USAGE_GUIDE.md](documentation/AGENT_USAGE_GUIDE.md)
|
||||
|
||||
### Gitmoji Cheatsheet
|
||||
|
||||
Quick reference for commit messages:
|
||||
- **📝 `:memo:` docs** - Documentation
|
||||
- **✨ `:sparkles:` feat** - New feature
|
||||
- **🐛 `:bug:` fix** - Bug fix
|
||||
- **♻️ `:recycle:` refactor** - Code refactoring
|
||||
- **🔧 `:wrench:` chore** - Build/config changes
|
||||
|
||||
Full cheatsheet: [GITMOJI_CHEATSHEET.md](documentation/GITMOJI_CHEATSHEET.md)
|
||||
Key decisions are documented in [adr/](adr/). See [AGENTS.md](AGENTS.md) for the full development reference (commands, config, ADR index, commit conventions).
|
||||
|
||||
## License
|
||||
|
||||
|
||||
2
VERSION
@@ -1,4 +1,4 @@
|
||||
# DanceLessonsCoach Version
|
||||
# dance-lessons-coach Version
|
||||
|
||||
# Current Version (Semantic Versioning)
|
||||
MAJOR=1
|
||||
|
||||
@@ -6,7 +6,7 @@
|
||||
|
||||
## Context and Problem Statement
|
||||
|
||||
We needed to choose a Go version for the DanceLessonsCoach project that provides:
|
||||
We needed to choose a Go version for the dance-lessons-coach project that provides:
|
||||
- Stability and long-term support
|
||||
- Access to modern language features
|
||||
- Good ecosystem compatibility
|
||||
|
||||
@@ -6,7 +6,7 @@
|
||||
|
||||
## Context and Problem Statement
|
||||
|
||||
We needed to choose an HTTP router for the DanceLessonsCoach web service that provides:
|
||||
We needed to choose an HTTP router for the dance-lessons-coach web service that provides:
|
||||
- Good performance characteristics
|
||||
- Flexible routing capabilities
|
||||
- Middleware support
|
||||
|
||||
@@ -6,7 +6,7 @@
|
||||
|
||||
## Context and Problem Statement
|
||||
|
||||
We needed to choose a logging library for DanceLessonsCoach that provides:
|
||||
We needed to choose a logging library for dance-lessons-coach that provides:
|
||||
- High performance with minimal overhead
|
||||
- Structured logging capabilities
|
||||
- Multiple output formats (console, JSON)
|
||||
@@ -94,7 +94,7 @@ Chosen option: "Zerolog" because it provides excellent performance, clean API, g
|
||||
| With fields | 3 alloc | 4 alloc |
|
||||
| Complex | 5 alloc | 6 alloc |
|
||||
|
||||
### Real-World Impact for DanceLessonsCoach
|
||||
### Real-World Impact for dance-lessons-coach
|
||||
|
||||
* **Performance**: <1μs difference per request - negligible impact
|
||||
* **Memory**: Zerolog's better allocation profile helps in long-running services
|
||||
|
||||
@@ -6,7 +6,7 @@
|
||||
|
||||
## Context and Problem Statement
|
||||
|
||||
We needed to choose a design pattern for DanceLessonsCoach that provides:
|
||||
We needed to choose a design pattern for dance-lessons-coach that provides:
|
||||
- Good testability and mocking capabilities
|
||||
- Flexibility for future changes
|
||||
- Clear separation of concerns
|
||||
|
||||
@@ -6,7 +6,7 @@
|
||||
|
||||
## Context and Problem Statement
|
||||
|
||||
We needed to implement a shutdown mechanism for DanceLessonsCoach that provides:
|
||||
We needed to implement a shutdown mechanism for dance-lessons-coach that provides:
|
||||
- Clean resource cleanup
|
||||
- Proper handling of in-flight requests
|
||||
- Kubernetes/service mesh compatibility
|
||||
|
||||
@@ -6,7 +6,7 @@
|
||||
|
||||
## Context and Problem Statement
|
||||
|
||||
We needed a configuration management solution for DanceLessonsCoach that provides:
|
||||
We needed a configuration management solution for dance-lessons-coach that provides:
|
||||
- Support for multiple configuration sources (files, environment variables, defaults)
|
||||
- Configuration validation
|
||||
- Type-safe configuration loading
|
||||
|
||||
@@ -6,7 +6,7 @@
|
||||
|
||||
## Context and Problem Statement
|
||||
|
||||
We needed to add observability to DanceLessonsCoach that provides:
|
||||
We needed to add observability to dance-lessons-coach that provides:
|
||||
- Distributed tracing capabilities
|
||||
- Performance monitoring
|
||||
- Request flow visualization
|
||||
@@ -105,7 +105,7 @@ func (s *Server) getAllMiddlewares() []func(http.Handler) http.Handler {
|
||||
telemetry:
|
||||
enabled: true
|
||||
otlp_endpoint: "localhost:4317"
|
||||
service_name: "DanceLessonsCoach"
|
||||
service_name: "dance-lessons-coach"
|
||||
insecure: true
|
||||
sampler:
|
||||
type: "parentbased_always_on"
|
||||
|
||||
@@ -4,9 +4,11 @@
|
||||
* Deciders: Gabriel Radureau, AI Agent
|
||||
* Date: 2026-04-05
|
||||
|
||||
> **⚠️ Structure superseded by ADR-0024.** The framework decision (Godog, in-process test server) remains valid. However, the flat `features/` layout and single `steps.go` file described here were replaced by a modular per-domain structure. See ADR-0024 for the current organisation: `features/{auth,greet,health,jwt,config}/` with domain-specific step files and per-domain `*_test.go` runners. The `cd features && godog` execution pattern is also outdated — each domain now uses `go test`.
|
||||
|
||||
## Context and Problem Statement
|
||||
|
||||
We needed to add behavioral testing to DanceLessonsCoach that provides:
|
||||
We needed to add behavioral testing to dance-lessons-coach that provides:
|
||||
- User-centric test scenarios
|
||||
- Living documentation
|
||||
- Integration testing capabilities
|
||||
|
||||
@@ -1,14 +1,15 @@
|
||||
# Combine BDD and Swagger-based testing
|
||||
# BDD Testing with OpenAPI Documentation
|
||||
|
||||
* Status: ✅ Partially Implemented (BDD + Documentation only)
|
||||
* Status: Accepted
|
||||
* Deciders: Gabriel Radureau, AI Agent
|
||||
* Date: 2026-04-05
|
||||
* Last Updated: 2026-04-05
|
||||
* Implementation Status: BDD testing and OpenAPI documentation completed, SDK generation deferred
|
||||
* Last Updated: 2026-04-12
|
||||
|
||||
> **⚠️ Title corrected.** This ADR was originally named "Combine BDD and Swagger-based testing" with the intent of eventually adding SDK-generated BDD tests as a second layer ("hybrid"). That second layer was deferred and has no concrete plan. The actual architecture is **BDD direct-HTTP testing + OpenAPI documentation via swaggo** — calling it "hybrid" is misleading. SDK generation remains a possible future enhancement but is not tracked by any open issue.
|
||||
|
||||
## Context and Problem Statement
|
||||
|
||||
We need to establish a comprehensive testing strategy for DanceLessonsCoach that provides:
|
||||
We need to establish a comprehensive testing strategy for dance-lessons-coach that provides:
|
||||
- Behavioral verification through BDD
|
||||
- API documentation through Swagger/OpenAPI
|
||||
- Client SDK validation
|
||||
|
||||
@@ -6,7 +6,7 @@
|
||||
|
||||
## Context
|
||||
|
||||
The DanceLessonsCoach application needed to add a new API version (v2) that provides different greeting behavior while maintaining backward compatibility with the existing v1 API. The v2 API should only be available when explicitly enabled via a feature flag.
|
||||
The dance-lessons-coach application needed to add a new API version (v2) that provides different greeting behavior while maintaining backward compatibility with the existing v1 API. The v2 API should only be available when explicitly enabled via a feature flag.
|
||||
|
||||
## Decision
|
||||
|
||||
|
||||
36
adr/0011-validation-library-selection.md
Normal file
@@ -0,0 +1,36 @@
|
||||
# 11. Validation Library Selection
|
||||
|
||||
* Status: Accepted
|
||||
* Deciders: Gabriel Radureau, AI Agent
|
||||
* Date: 2026-04-05
|
||||
* Implementation Date: 2026-04-05
|
||||
|
||||
## Context and Problem Statement
|
||||
|
||||
The dance-lessons-coach application needs input validation for API request bodies and configuration values. We need a library that integrates well with Go structs and provides clear error messages.
|
||||
|
||||
## Decision Drivers
|
||||
|
||||
* Struct-tag-based validation to avoid boilerplate
|
||||
* Good error messages with field-level detail
|
||||
* Active maintenance and wide adoption
|
||||
* Compatibility with existing interface-based design
|
||||
|
||||
## Considered Options
|
||||
|
||||
* `github.com/go-playground/validator/v10` — struct-tag driven, widely adopted
|
||||
* `github.com/asaskevich/govalidator` — tag-based but less expressive
|
||||
* Manual validation — full control, no dependency, high boilerplate
|
||||
|
||||
## Decision Outcome
|
||||
|
||||
Chosen option: **`go-playground/validator/v10`** because it is the de facto standard in the Go ecosystem, supports struct-tag annotations, provides field-level error detail, and integrates cleanly with our interface-based design.
|
||||
|
||||
## Implementation
|
||||
|
||||
`github.com/go-playground/validator/v10 v10.30.2` is present in `go.mod`.
|
||||
The `pkg/validation/` package wraps the validator for reuse across handlers.
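
To illustrate the struct-tag approach chosen above, here is a minimal sketch built only on the standard library's `reflect` package. It is not the `go-playground/validator` API and not the contents of `pkg/validation/`: the `validate` tag name, the `required` and `min=` rules, the `FieldError` type, and the `GreetRequest` struct are all hypothetical and exist only to show how tag-driven validation yields field-level errors.

```go
package main

import (
	"fmt"
	"reflect"
	"strconv"
	"strings"
)

// FieldError is a hypothetical field-level validation error.
type FieldError struct {
	Field string
	Rule  string
}

func (e FieldError) Error() string {
	return fmt.Sprintf("field %s failed rule %q", e.Field, e.Rule)
}

// GreetRequest is an assumed request body, used only for illustration.
type GreetRequest struct {
	Name string `validate:"required,min=2"`
}

// Validate walks the string fields of a struct (or pointer to struct) and
// applies the rules listed in each field's `validate` tag.
// Only "required" and "min=N" are supported in this sketch.
func Validate(v any) []FieldError {
	var errs []FieldError
	rv := reflect.ValueOf(v)
	if rv.Kind() == reflect.Pointer {
		rv = rv.Elem()
	}
	if rv.Kind() != reflect.Struct {
		return errs
	}
	rt := rv.Type()
	for i := 0; i < rt.NumField(); i++ {
		field := rt.Field(i)
		tag := field.Tag.Get("validate")
		if tag == "" || field.Type.Kind() != reflect.String {
			continue
		}
		value := rv.Field(i).String()
		for _, rule := range strings.Split(tag, ",") {
			switch {
			case rule == "required":
				if value == "" {
					errs = append(errs, FieldError{Field: field.Name, Rule: rule})
				}
			case strings.HasPrefix(rule, "min="):
				n, err := strconv.Atoi(strings.TrimPrefix(rule, "min="))
				if err == nil && len(value) < n {
					errs = append(errs, FieldError{Field: field.Name, Rule: rule})
				}
			}
		}
	}
	return errs
}

func main() {
	for _, e := range Validate(GreetRequest{Name: "A"}) {
		fmt.Println(e)
	}
}
```

The real library covers far more rules and types; the point here is only that the rules live next to the field definitions and each failure names the offending field.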
|
||||
|
||||
## Links
|
||||
|
||||
* [go-playground/validator GitHub](https://github.com/go-playground/validator)
|
||||
@@ -6,7 +6,7 @@
|
||||
|
||||
## Context
|
||||
|
||||
The DanceLessonsCoach project implemented Git hooks to automatically run `go fmt` and `go mod tidy` before commits. Initially, the `go fmt` hook was configured to format **all Go files** in the repository, regardless of their staged status.
|
||||
The dance-lessons-coach project implemented Git hooks to automatically run `go fmt` and `go mod tidy` before commits. Initially, the `go fmt` hook was configured to format **all Go files** in the repository, regardless of their staged status.
|
||||
|
||||
During implementation review, concerns were raised about this approach:
|
||||
|
||||
|
||||
@@ -9,7 +9,7 @@
|
||||
|
||||
## Context
|
||||
|
||||
The DanceLessonsCoach project requires comprehensive API documentation and testing capabilities. As the API evolves with v1 and v2 endpoints, we need a robust OpenAPI/Swagger toolchain to:
|
||||
The dance-lessons-coach project requires comprehensive API documentation and testing capabilities. As the API evolves with v1 and v2 endpoints, we need a robust OpenAPI/Swagger toolchain to:
|
||||
|
||||
1. **Document APIs**: Generate interactive API documentation
|
||||
2. **Test APIs**: Enable automated API testing
|
||||
@@ -166,9 +166,9 @@ import (
|
||||
// Chi adapter would be needed
|
||||
)
|
||||
|
||||
// @title DanceLessonsCoach API
|
||||
// @title dance-lessons-coach API
|
||||
// @version 1.0
|
||||
// @description API for DanceLessonsCoach service
|
||||
// @description API for dance-lessons-coach service
|
||||
// @host localhost:8080
|
||||
// @BasePath /api
|
||||
func main() {
|
||||
@@ -328,71 +328,9 @@ After thorough evaluation and implementation, we've successfully integrated swag
|
||||
go install github.com/swaggo/swag/cmd/swag@latest
|
||||
|
||||
# 2. Add swagger metadata to main.go
|
||||
// @title DanceLessonsCoach API
|
||||
// @title dance-lessons-coach API
|
||||
// @version 1.0
|
||||
// @description API for DanceLessonsCoach service
|
||||
// @host localhost:8080
|
||||
// @BasePath /api
|
||||
package main
|
||||
```
|
||||
|
||||
### Swag Formatting Integration
|
||||
|
||||
To ensure consistent swagger comment formatting, we've integrated `swag fmt` into our workflow:
|
||||
|
||||
#### Git Hooks
|
||||
Added to `.git/hooks/pre-commit`:
|
||||
```bash
|
||||
# Run swag fmt to format swagger comments
|
||||
echo "Running swag fmt..."
|
||||
if command -v swag >/dev/null 2>&1; then
|
||||
swag fmt
|
||||
if [ $? -ne 0 ]; then
|
||||
echo "ERROR: swag fmt failed"
|
||||
exit 1
|
||||
fi
|
||||
else
|
||||
echo "swag not installed, skipping swag fmt"
|
||||
fi
|
||||
```
|
||||
|
||||
#### CI/CD Integration
|
||||
Added to `.gitea/workflows/go-ci-cd.yaml` lint-format job:
|
||||
```yaml
|
||||
- name: Install swag
|
||||
run: go install github.com/swaggo/swag/cmd/swag@latest
|
||||
|
||||
- name: Run swag fmt
|
||||
run: swag fmt
|
||||
```
|
||||
|
||||
#### Benefits
|
||||
- **Consistent Formatting**: Automatic formatting of swagger comments
|
||||
- **Pre-Commit Validation**: Catches issues before commit
|
||||
- **CI/CD Enforcement**: Ensures formatting in all pull requests
|
||||
- **Team Consistency**: Everyone follows the same rules
|
||||
- **Automatic Fixes**: Issues are fixed automatically
|
||||
|
||||
#### Usage
|
||||
```bash
|
||||
# Format swagger comments manually
|
||||
swag fmt
|
||||
|
||||
# Format is automatically run in:
|
||||
# - pre-commit hook
|
||||
# - CI/CD lint-format job
|
||||
```
|
||||
=======
|
||||
### Final Implementation
|
||||
|
||||
```bash
|
||||
# 1. Install swaggo
|
||||
go install github.com/swaggo/swag/cmd/swag@latest
|
||||
|
||||
# 2. Add swagger metadata to main.go
|
||||
// @title DanceLessonsCoach API
|
||||
// @version 1.0
|
||||
// @description API for DanceLessonsCoach service
|
||||
// @description API for dance-lessons-coach service
|
||||
// @host localhost:8080
|
||||
// @BasePath /api
|
||||
package main
|
||||
@@ -525,7 +463,7 @@ s.router.Get("/swagger/*", httpSwagger.WrapHandler)
|
||||
# 2. Create OpenAPI spec (openapi.yaml)
|
||||
# openapi: 3.0.3
|
||||
# info:
|
||||
# title: DanceLessonsCoach API
|
||||
# title: dance-lessons-coach API
|
||||
# version: 1.0.0
|
||||
|
||||
# 3. Generate server types
|
||||
@@ -654,9 +592,9 @@ go install github.com/deepmap/oapi-codegen/cmd/oapi-codegen@latest
|
||||
# 2. Create OpenAPI spec (openapi.yaml)
|
||||
openapi: 3.0.3
|
||||
info:
|
||||
title: DanceLessonsCoach API
|
||||
title: dance-lessons-coach API
|
||||
version: 1.0.0
|
||||
description: API for DanceLessonsCoach service
|
||||
description: API for dance-lessons-coach service
|
||||
servers:
|
||||
- url: http://localhost:8080/api
|
||||
description: Development server
|
||||
|
||||
44
adr/0014-grpc-adoption-strategy.md
Normal file
@@ -0,0 +1,44 @@
|
||||
# 14. gRPC Adoption Strategy
|
||||
|
||||
* Status: Rejected / Deferred
|
||||
* Deciders: Gabriel Radureau, AI Agent
|
||||
* Date: 2026-04-05
|
||||
|
||||
## Context and Problem Statement
|
||||
|
||||
As the API grows, gRPC was evaluated as an alternative or complement to REST for internal service communication. The question was whether to adopt gRPC alongside the existing Chi REST API.
|
||||
|
||||
## Decision Drivers
|
||||
|
||||
* Performance of inter-service communication
|
||||
* Type safety via Protocol Buffers
|
||||
* Streaming support
|
||||
* Team familiarity and operational overhead
|
||||
|
||||
## Considered Options
|
||||
|
||||
* **Hybrid REST/gRPC** — add gRPC endpoints alongside existing REST endpoints
|
||||
* **REST only** — maintain current Chi router approach
|
||||
* **gRPC-first with transcoding** — use bufbuild/connect for unified REST+gRPC
|
||||
|
||||
## Decision Outcome
|
||||
|
||||
Chosen option: **REST only (deferred)**. gRPC adoption is not warranted at the current scale. The application has a small number of endpoints, a single-binary deployment model, and no internal service mesh that would benefit from gRPC's efficiency.
|
||||
|
||||
### Reasons for deferral
|
||||
|
||||
1. **No inter-service communication today** — the application is a single binary; gRPC's main benefit (efficient binary RPC between services) does not apply
|
||||
2. **Complexity cost** — adding Protobuf toolchain, code generation, and a second transport layer would significantly increase cognitive overhead
|
||||
3. **Chi router commitment** — the REST API is well-designed with OpenAPI documentation; introducing gRPC in parallel creates dual-maintenance burden
|
||||
4. **Team capacity** — limited bandwidth for large architectural changes
|
||||
|
||||
## When to reconsider
|
||||
|
||||
* Application evolves into multiple services that need efficient internal RPC
|
||||
* Streaming use cases emerge (real-time lesson progress, etc.)
|
||||
* External consumers explicitly require gRPC endpoints
|
||||
|
||||
## Links
|
||||
|
||||
* [ADR-0002: Chi Router](0002-chi-router.md)
|
||||
* [ADR-0013: OpenAPI/Swagger Toolchain](0013-openapi-swagger-toolchain.md)
|
||||
@@ -8,7 +8,7 @@
|
||||
|
||||
## Context
|
||||
|
||||
As DanceLessonsCoach grows, we need a more robust and maintainable CLI structure. Currently, we use simple flag parsing (`--version`), but this approach has limitations:
|
||||
As dance-lessons-coach grows, we need a more robust and maintainable CLI structure. Currently, we use simple flag parsing (`--version`), but this approach has limitations:
|
||||
|
||||
1. **Limited scalability**: Adding more commands/flags becomes messy
|
||||
2. **Poor user experience**: No built-in help, completion, or validation
|
||||
@@ -51,10 +51,10 @@ We will adopt **Cobra** as our CLI framework. Cobra is a mature, widely-used lib
|
||||
```go
|
||||
var rootCmd = &cobra.Command{
|
||||
Use: "dance-lessons-coach",
|
||||
Short: "DanceLessonsCoach - API server and CLI tools",
|
||||
Long: `DanceLessonsCoach provides greeting services and API management.
|
||||
Short: "dance-lessons-coach - API server and CLI tools",
|
||||
Long: `dance-lessons-coach provides greeting services and API management.
|
||||
|
||||
To begin working with DanceLessonsCoach, run:
|
||||
To begin working with dance-lessons-coach, run:
|
||||
dance-lessons-coach server --help`,
|
||||
SilenceUsage: true,
|
||||
}
|
||||
@@ -69,7 +69,7 @@ var versionCmd = &cobra.Command{
|
||||
|
||||
var serverCmd = &cobra.Command{
|
||||
Use: "server",
|
||||
Short: "Start the DanceLessonsCoach server",
|
||||
Short: "Start the dance-lessons-coach server",
|
||||
Run: func(cmd *cobra.Command, args []string) {
|
||||
// Load config and start server
|
||||
cfg, err := config.LoadConfig()
|
||||
@@ -116,7 +116,7 @@ func main() {
|
||||
|
||||
**Current Commands:**
|
||||
- `version`: Print version information
|
||||
- `server`: Start the DanceLessonsCoach server
|
||||
- `server`: Start the dance-lessons-coach server
|
||||
- `greet [name]`: Greet someone by name
|
||||
- `help`: Built-in help system
|
||||
- `completion`: Shell completion scripts (automatic)
|
||||
@@ -222,7 +222,7 @@ dance-lessons-coach config validate
|
||||
|
||||
---
|
||||
|
||||
**Status:** Proposed
|
||||
**Next Review:** 2026-04-12
|
||||
**Status:** Accepted
|
||||
**Implementation Date:** 2026-04-05
|
||||
**Implementation Owner:** Arcodange Team
|
||||
**Approvers Needed:** @gabrielradureau
|
||||
**Approved by:** @gabrielradureau
|
||||
@@ -1,14 +1,14 @@
|
||||
# 16. CI/CD Pipeline Design for Multi-Platform Compatibility
|
||||
|
||||
**Date:** 2026-04-05
|
||||
**Status:** 🟡 Proposed
|
||||
**Status:** ✅ Accepted
|
||||
**Authors:** Arcodange Team
|
||||
**Decision Date:** TBD
|
||||
**Implementation Status:** Not Started
|
||||
**Decision Date:** 2026-04-08
|
||||
**Implementation Status:** ✅ Completed
|
||||
|
||||
## Context
|
||||
|
||||
DanceLessonsCoach requires a robust CI/CD pipeline that:
|
||||
dance-lessons-coach requires a robust CI/CD pipeline that:
|
||||
|
||||
1. **Primary Platform**: Gitea (self-hosted Git service)
|
||||
2. **Mirror Support**: GitHub and GitLab mirrors for visibility and backup
|
||||
@@ -69,7 +69,7 @@ graph TD
|
||||
|
||||
```yaml
|
||||
# .github/workflows/main.yml
|
||||
name: DanceLessonsCoach CI/CD
|
||||
name: dance-lessons-coach CI/CD
|
||||
|
||||
on:
|
||||
push:
|
||||
@@ -140,10 +140,10 @@ jobs:
|
||||
# README.md
|
||||
|
||||
[](https://ci.dancelessonscoach.org)
|
||||
[](https://github.com/yourorg/DanceLessonsCoach/actions)
|
||||
[](https://gitlab.com/yourorg/DanceLessonsCoach/-/pipelines)
|
||||
[](https://goreportcard.com/report/github.com/yourorg/DanceLessonsCoach)
|
||||
[](https://codecov.io/gh/yourorg/DanceLessonsCoach)
|
||||
[](https://github.com/yourorg/dance-lessons-coach/actions)
|
||||
[](https://gitlab.com/yourorg/dance-lessons-coach/-/pipelines)
|
||||
[](https://goreportcard.com/report/github.com/yourorg/dance-lessons-coach)
|
||||
[](https://codecov.io/gh/yourorg/dance-lessons-coach)
|
||||
```
|
||||
|
||||
### 5. Mirror Synchronization Strategy
|
||||
@@ -170,7 +170,7 @@ mkdir -p .gitea/workflows
|
||||
|
||||
# 2. Create main workflow file with Arcodange-specific configuration
|
||||
cat > .gitea/workflows/ci-cd.yaml << 'EOF'
|
||||
name: DanceLessonsCoach CI/CD
|
||||
name: dance-lessons-coach CI/CD
|
||||
|
||||
on:
|
||||
push:
|
||||
@@ -200,41 +200,41 @@ jobs:
|
||||
- name: Notify internal systems
|
||||
if: always()
|
||||
run: |
|
||||
curl -X POST "$GITEA_INTERNAL/api/v1/repos/yourorg/DanceLessonsCoach/statuses/$(git rev-parse HEAD)" \
|
||||
curl -X POST "$GITEA_INTERNAL/api/v1/repos/yourorg/dance-lessons-coach/statuses/$(git rev-parse HEAD)" \
|
||||
-H "Authorization: token $GITEA_TOKEN" \
|
||||
-H "Content-Type: application/json" \
|
||||
-d "{\"state\": \"$([ $? -eq 0 ] && echo 'success' || echo 'failure')\", \"context\": \"ci/build-test\"}"
|
||||
EOF
|
||||
|
||||
# 3. Enable Gitea CI/CD in repo settings (Arcodange instance)
|
||||
# - Go to: https://gitea.arcodange.lab/arcodange/DanceLessonsCoach/settings/actions
|
||||
# - Go to: https://gitea.arcodange.lab/arcodange/dance-lessons-coach/settings/actions
|
||||
# - Enable Gitea Actions
|
||||
# - Configure runner to use internal network (192.168.1.202)
|
||||
# - Set up GITEA_TOKEN for API access
|
||||
# - SSH URL: ssh://git@192.168.1.202:2222/arcodange/DanceLessonsCoach.git
|
||||
# - SSH URL: ssh://git@192.168.1.202:2222/arcodange/dance-lessons-coach.git
|
||||
|
||||
# 4. Add STATUS_BADGES.md with Arcodange-specific URLs
|
||||
cat > STATUS_BADGES.md << 'EOF'
|
||||
## Arcodange Gitea Badges
|
||||
|
||||
```markdown
|
||||
[](https://gitea.arcodange.fr/arcodange/DanceLessonsCoach)
|
||||
[](https://gitea.arcodange.fr/arcodange/DanceLessonsCoach/-/pipelines)
|
||||
[](https://gitea.arcodange.fr/arcodange/dance-lessons-coach)
|
||||
[](https://gitea.arcodange.fr/arcodange/dance-lessons-coach/-/pipelines)
|
||||
```
|
||||
|
||||
**Configuration Details:**
|
||||
- Organization: arcodange
|
||||
- Repository: DanceLessonsCoach
|
||||
- Repository: dance-lessons-coach
|
||||
- Internal URL: https://gitea.arcodange.lab/
|
||||
- External URL: https://gitea.arcodange.fr/
|
||||
- SSH URL: ssh://git@192.168.1.202:2222/arcodange/DanceLessonsCoach.git
|
||||
- SSH URL: ssh://git@192.168.1.202:2222/arcodange/dance-lessons-coach.git
|
||||
- Badges use external URL with full org/repo path
|
||||
- CI/CD uses internal URL for faster network access
|
||||
EOF
|
||||
|
||||
# 5. Configure CI/CD runners on internal network
|
||||
# - Set up runners to access: https://gitea.arcodange.lab/
|
||||
# - Configure SSH access: ssh://git@192.168.1.202:2222/arcodange/DanceLessonsCoach.git
|
||||
# - Configure SSH access: ssh://git@192.168.1.202:2222/arcodange/dance-lessons-coach.git
|
||||
# - Ensure runners have network access to internal services (192.168.1.202:2222)
|
||||
# - Configure runners with proper GITEA_TOKEN
|
||||
# - Test connection: curl https://gitea.arcodange.lab/api/v1/version
|
||||
@@ -332,18 +332,18 @@ cat > STATUS_BADGES.md << 'EOF'
|
||||
|
||||
## GitHub Mirror
|
||||
```markdown
|
||||
[](https://github.com/yourorg/DanceLessonsCoach/actions)
|
||||
[](https://github.com/yourorg/dance-lessons-coach/actions)
|
||||
```
|
||||
|
||||
## GitLab Mirror
|
||||
```markdown
|
||||
[](https://gitlab.com/yourorg/DanceLessonsCoach/-/pipelines)
|
||||
[](https://gitlab.com/yourorg/dance-lessons-coach/-/pipelines)
|
||||
```
|
||||
|
||||
## Code Quality
|
||||
```markdown
|
||||
[](https://goreportcard.com/report/github.com/yourorg/DanceLessonsCoach)
|
||||
[](https://codecov.io/gh/yourorg/DanceLessonsCoach)
|
||||
[](https://goreportcard.com/report/github.com/yourorg/dance-lessons-coach)
|
||||
[](https://codecov.io/gh/yourorg/dance-lessons-coach)
|
||||
```
|
||||
EOF
|
||||
|
||||
@@ -452,7 +452,7 @@ docker run --rm \
|
||||
-e GITEA_INTERNAL="https://gitea.arcodange.lab/" \
|
||||
-e GITEA_EXTERNAL="https://gitea.arcodange.fr/" \
|
||||
-e GITEA_ORG="arcodange" \
|
||||
-e GITEA_REPO="DanceLessonsCoach" \
|
||||
-e GITEA_REPO="dance-lessons-coach" \
|
||||
gitea/act_runner:latest \
|
||||
act -W .gitea/workflows/ci-cd.yaml --rm
|
||||
```
|
||||
@@ -472,7 +472,7 @@ act -W .gitea/workflows/ci-cd.yaml \
|
||||
# 3. With specific event simulation
|
||||
act push -W .gitea/workflows/ci-cd.yaml \
|
||||
--env GITEA_ORG=arcodange \
|
||||
--env GITEA_REPO=DanceLessonsCoach
|
||||
--env GITEA_REPO=dance-lessons-coach
|
||||
```
|
||||
|
||||
### Pipeline Status Checking Scripts
|
||||
@@ -489,10 +489,10 @@ echo "🔍 Checking CI/CD Pipeline Status"
|
||||
echo "================================"
|
||||
|
||||
# 1. Gitea (Primary) - Internal URL
|
||||
if curl -s -o /dev/null -w "%{http_code}" "https://gitea.arcodange.lab/api/v1/repos/arcodange/DanceLessonsCoach/actions/workflows" | grep -q "200"; then
|
||||
if curl -s -o /dev/null -w "%{http_code}" "https://gitea.arcodange.lab/api/v1/repos/arcodange/dance-lessons-coach/actions/workflows" | grep -q "200"; then
|
||||
echo "✅ Gitea Internal API: Accessible"
|
||||
# Get workflow list
|
||||
WORKFLOWS=$(curl -s "https://gitea.arcodange.lab/api/v1/repos/arcodange/DanceLessonsCoach/actions/workflows" | jq -r '.[] | .name + " (" + .file_name + ")"')
|
||||
WORKFLOWS=$(curl -s "https://gitea.arcodange.lab/api/v1/repos/arcodange/dance-lessons-coach/actions/workflows" | jq -r '.[] | .name + " (" + .file_name + ")"')
|
||||
echo "📋 Gitea Workflows:"
|
||||
echo "$WORKFLOWS" | sed 's/^/ - /'
|
||||
else
|
||||
@@ -502,9 +502,9 @@ fi
|
||||
# 2. Gitea (External) - Public URL
|
||||
echo ""
|
||||
echo "🌐 Gitea External Status:"
|
||||
if curl -s -o /dev/null -w "%{http_code}" "https://gitea.arcodange.fr/arcodange/DanceLessonsCoach" | grep -q "200"; then
|
||||
if curl -s -o /dev/null -w "%{http_code}" "https://gitea.arcodange.fr/arcodange/dance-lessons-coach" | grep -q "200"; then
|
||||
echo "✅ Gitea External: Accessible"
|
||||
echo "🔗 Repository: https://gitea.arcodange.fr/arcodange/DanceLessonsCoach"
|
||||
echo "🔗 Repository: https://gitea.arcodange.fr/arcodange/dance-lessons-coach"
|
||||
else
|
||||
echo "❌ Gitea External: Not accessible"
|
||||
fi
|
||||
@@ -512,7 +512,7 @@ fi
|
||||
# 3. Check badge API
|
||||
echo ""
|
||||
echo "🏷️ Badge API Status:"
|
||||
BADGE_URL="https://gitea.arcodange.fr/api/badges/arcodange/DanceLessonsCoach/status"
|
||||
BADGE_URL="https://gitea.arcodange.fr/api/badges/arcodange/dance-lessons-coach/status"
|
||||
if curl -s -o /dev/null -w "%{http_code}" "$BADGE_URL" | grep -q "200"; then
|
||||
echo "✅ Badge API: Accessible"
|
||||
echo "🔗 Badge URL: $BADGE_URL"
|
||||
@@ -541,8 +541,8 @@ echo "✅ Arcodange conventions: Matches webapp workflow style"
|
||||
echo ""
|
||||
echo "💡 Next Steps:"
|
||||
echo " 1. Push to trigger workflow: git push origin main"
|
||||
echo " 2. Check Gitea Actions: https://gitea.arcodange.lab/arcodange/DanceLessonsCoach/actions"
|
||||
echo " 3. Monitor badges: https://gitea.arcodange.fr/arcodange/DanceLessonsCoach"
|
||||
echo " 2. Check Gitea Actions: https://gitea.arcodange.lab/arcodange/dance-lessons-coach/actions"
|
||||
echo " 3. Monitor badges: https://gitea.arcodange.fr/arcodange/dance-lessons-coach"
|
||||
```
|
||||
|
||||
### Workflow Validation Script
|
||||
@@ -659,7 +659,7 @@ services:
|
||||
- GITEA_INTERNAL=https://gitea.arcodange.lab/
|
||||
- GITEA_EXTERNAL=https://gitea.arcodange.fr/
|
||||
- GITEA_ORG=arcodange
|
||||
- GITEA_REPO=DanceLessonsCoach
|
||||
- GITEA_REPO=dance-lessons-coach
|
||||
command: act -W .gitea/workflows/ci-cd.yaml --rm
|
||||
|
||||
yamllint:
|
||||
@@ -758,7 +758,81 @@ graph TD
|
||||
|
||||
---
|
||||
|
||||
**Status:** Proposed
|
||||
**Next Review:** 2026-04-12
|
||||
## Implementation Status
|
||||
|
||||
### ✅ Completed - Container/Services Architecture
|
||||
|
||||
The CI/CD pipeline is implemented with the GitHub Actions-style container/services architecture, running on Gitea Actions:
|
||||
|
||||
**Key Implementation Details:**
|
||||
|
||||
1. **Container-based Execution**: All CI steps run inside a pre-built Docker cache image that contains Go tools, Node.js, and a PostgreSQL client
|
||||
2. **Service-based PostgreSQL**: Database provided as a service container, accessible via `postgres` hostname
|
||||
3. **Smart Caching**: Dependency hash calculated from `go.mod`, `go.sum`, and `Dockerfile.build` for accurate cache invalidation
|
||||
4. **Environment Configuration**: Database connection parameters set via `DLC_*` environment variables
|
||||
5. **Simplified Workflow**: Removed Docker Compose overhead and unnecessary setup steps
|
||||
|
||||
**Current Workflow Structure:**
|
||||
|
||||
```yaml
|
||||
jobs:
|
||||
build-cache:
|
||||
name: Build Docker Cache
|
||||
# Calculates dependency hash and builds cache image if needed
|
||||
|
||||
ci-pipeline:
|
||||
name: CI Pipeline
|
||||
needs: build-cache
|
||||
container:
|
||||
image: gitea.arcodange.lab/arcodange/dance-lessons-coach-build-cache:${{ needs.build-cache.outputs.deps_hash }}
|
||||
|
||||
services:
|
||||
postgres:
|
||||
image: postgres:15
|
||||
env:
|
||||
POSTGRES_USER: postgres
|
||||
POSTGRES_PASSWORD: postgres
|
||||
POSTGRES_DB: dance_lessons_coach_bdd_test
|
||||
|
||||
steps:
|
||||
- name: Checkout code
|
||||
uses: actions/checkout@v4
|
||||
|
||||
- name: Set database environment variables
|
||||
run: |
|
||||
echo "DLC_DATABASE_HOST=postgres" >> $GITHUB_ENV
|
||||
echo "DLC_DATABASE_PORT=5432" >> $GITHUB_ENV
|
||||
# ... other database config
|
||||
|
||||
- name: Generate Swagger Docs
|
||||
run: go generate ./pkg/server
|
||||
|
||||
- name: Build all packages
|
||||
run: go build ./...
|
||||
|
||||
- name: Wait for PostgreSQL to be ready
|
||||
run: pg_isready -h postgres -p 5432
|
||||
|
||||
- name: Run tests with coverage
|
||||
run: go test ./... -coverprofile=coverage.out
|
||||
|
||||
- name: Build binaries
|
||||
run: ./scripts/build.sh
|
||||
```
|
||||
|
||||
**Performance Improvements:**
|
||||
- ✅ **Faster execution**: Direct container execution without compose overhead
|
||||
- ✅ **Reliable caching**: Accurate dependency tracking with multi-file hash
|
||||
- ✅ **Simpler debugging**: Clear container boundaries and service networking
|
||||
- ✅ **Better portability**: Standard GitHub Actions patterns work across platforms
|
||||
|
||||
**Verification:**
|
||||
- ✅ **Workflow 465**: Both jobs completed successfully (2026-04-08)
|
||||
- ✅ **All tests passing**: Database connectivity working correctly
|
||||
- ✅ **Coverage reporting**: Badges updating automatically
|
||||
- ✅ **Binary builds**: Scripts executing properly in container environment
|
||||
|
||||
**Status:** ✅ Accepted
|
||||
**Implementation Date:** 2026-04-08
|
||||
**Implementation Owner:** Arcodange Team
|
||||
**Approvers Needed:** @gabrielradureau
|
||||
**Reviewers:** @gabrielradureau
|
||||
@@ -8,7 +8,7 @@
|
||||
|
||||
## Context
|
||||
|
||||
DanceLessonsCoach requires a safe workflow for making CI/CD changes to prevent breaking the main branch. The current workflow allows direct pushes to main, which poses risks for CI/CD configuration changes that could break the entire pipeline.
|
||||
dance-lessons-coach requires a safe workflow for making CI/CD changes to prevent breaking the main branch. The current workflow allows direct pushes to main, which poses risks for CI/CD configuration changes that could break the entire pipeline.
|
||||
|
||||
## Decision Drivers
|
||||
|
||||
@@ -220,13 +220,13 @@ echo 'm' | act -n -W .gitea/workflows/ci-cd.yaml
|
||||
|
||||
#### Sample Dry Run Output
|
||||
```
|
||||
*DRYRUN* [DanceLessonsCoach CI/CD/Build and Test ] ⭐ Run Set up job
|
||||
*DRYRUN* [DanceLessonsCoach CI/CD/Build and Test ] 🚀 Start image=node:16-buster-slim
|
||||
*DRYRUN* [DanceLessonsCoach CI/CD/Build and Test ] ✅ Success - Set up job
|
||||
*DRYRUN* [DanceLessonsCoach CI/CD/Build and Test ] ⭐ Run Main Checkout code
|
||||
*DRYRUN* [DanceLessonsCoach CI/CD/Build and Test ] ✅ Success - Main Checkout code [4.038875ms]
|
||||
*DRYRUN* [dance-lessons-coach CI/CD/Build and Test ] ⭐ Run Set up job
|
||||
*DRYRUN* [dance-lessons-coach CI/CD/Build and Test ] 🚀 Start image=node:16-buster-slim
|
||||
*DRYRUN* [dance-lessons-coach CI/CD/Build and Test ] ✅ Success - Set up job
|
||||
*DRYRUN* [dance-lessons-coach CI/CD/Build and Test ] ⭐ Run Main Checkout code
|
||||
*DRYRUN* [dance-lessons-coach CI/CD/Build and Test ] ✅ Success - Main Checkout code [4.038875ms]
|
||||
... (all steps succeeded)
|
||||
*DRYRUN* [DanceLessonsCoach CI/CD/Build and Test ] 🏁 Job succeeded
|
||||
*DRYRUN* [dance-lessons-coach CI/CD/Build and Test ] 🏁 Job succeeded
|
||||
```
|
||||
|
||||
### Recommended Local Development Workflow
|
||||
|
||||
@@ -1,13 +1,14 @@
|
||||
# 18. User Management and Authentication System
|
||||
|
||||
**Date:** 2024-04-06
|
||||
**Status:** Proposed
|
||||
**Date:** 2026-04-06
|
||||
**Status:** Accepted
|
||||
**Implementation Date:** 2026-04-08
|
||||
**Authors:** Product Owner
|
||||
**Decision Drivers:** Security, User Personalization, Admin Functionality
|
||||
|
||||
## Context
|
||||
|
||||
The DanceLessonsCoach application currently lacks user management and authentication capabilities. To provide personalized experiences and administrative functions, we need to implement a secure user authentication system with PostgreSQL persistence.
|
||||
The dance-lessons-coach application currently lacks user management and authentication capabilities. To provide personalized experiences and administrative functions, we need to implement a secure user authentication system with PostgreSQL persistence.
|
||||
|
||||
## Decision
|
||||
|
||||
@@ -69,7 +70,7 @@ CREATE TABLE users (
|
||||
|
||||
#### Architecture Alignment
|
||||
|
||||
The user management system follows the established DanceLessonsCoach patterns:
|
||||
The user management system follows the established dance-lessons-coach patterns:
|
||||
|
||||
1. **Interface-based Design:**
|
||||
```go
|
||||
@@ -120,6 +121,7 @@ The user management system follows the established DanceLessonsCoach patterns:
|
||||
- 30-minute expiration for access tokens
|
||||
- Secure random signing key
|
||||
- HTTPS-only cookies
|
||||
- **Secret Rotation:** Multiple valid secrets with retention policy (see Issue #8)
|
||||
3. **Admin Access:**
|
||||
- Master password from environment variable
|
||||
- Non-persisted admin user
|
||||
@@ -308,7 +310,7 @@ type Config struct {
|
||||
|
||||
## Implementation Plan
|
||||
|
||||
This implementation builds upon the completed phases and follows the established DanceLessonsCoach patterns.
|
||||
This implementation builds upon the completed phases and follows the established dance-lessons-coach patterns.
|
||||
|
||||
### Phase 10: User Management Foundation (Next Phase)
|
||||
|
||||
@@ -464,6 +466,7 @@ The implementation maintains full backward compatibility:
|
||||
3. **User Activity Logging:** For audit trails
|
||||
4. **Password Strength Meter:** For better user experience
|
||||
5. **Account Recovery:** Email/phone-based recovery options
|
||||
6. **JWT Secret Rotation:** Implement secret persistence and rotation mechanism (Issue #8)
|
||||
|
||||
## References
|
||||
|
||||
|
||||
702
adr/0019-postgresql-integration.md
Normal file
@@ -0,0 +1,702 @@
|
||||
# 19. PostgreSQL Database Integration
|
||||
|
||||
**Date:** 2026-04-07
|
||||
**Status:** Accepted (Partial)
|
||||
**Implementation Date:** 2026-04-08
|
||||
**Authors:** Product Owner
|
||||
**Decision Drivers:** Data Persistence, Scalability, Production Readiness
|
||||
|
||||
> **⚠️ Pending cleanup:** `pkg/user/sqlite_repository.go` and `gorm.io/driver/sqlite` are still present in the codebase. This ADR requires their removal, but no Gitea issue tracks that work yet. The PostgreSQL implementation (`pkg/user/postgres_repository.go`) is complete and in use.
|
||||
|
||||
## Context
|
||||
|
||||
The dance-lessons-coach application currently uses SQLite with GORM for the user management system (ADR 0018). Because there are no existing users and no production data, we can adopt PostgreSQL directly as the primary database without migration concerns.
|
||||
|
||||
### Current State
|
||||
- **Database:** SQLite (in-memory mode) - no persistent data
|
||||
- **ORM:** GORM v1.31.1
|
||||
- **Implementation:** `pkg/user/sqlite_repository.go`
|
||||
- **Usage:** User management system only
|
||||
- **Data:** No existing users or production data
|
||||
|
||||
### Implementation Drivers
|
||||
1. **Production Readiness:** PostgreSQL is enterprise-grade and production-ready
|
||||
2. **Data Persistence:** Proper persistent storage for user accounts
|
||||
3. **Concurrency:** PostgreSQL handles concurrent connections better than SQLite, which allows only one writer at a time
|
||||
4. **Scalability:** PostgreSQL can scale reads horizontally through replication and read replicas
|
||||
5. **Features:** Advanced PostgreSQL features (JSONB, full-text search)
|
||||
6. **Ecosystem:** Better tooling and monitoring for PostgreSQL
|
||||
|
||||
## Decision
|
||||
|
||||
We will implement PostgreSQL support directly, replacing the SQLite implementation, with the following characteristics:
|
||||
|
||||
### Core Features
|
||||
|
||||
1. **Database Setup**
|
||||
- PostgreSQL 15+ for production compatibility
|
||||
- Containerized development environment
|
||||
- Connection pooling for performance
|
||||
- SSL support for secure connections
|
||||
|
||||
2. **ORM Integration**
|
||||
- GORM as the primary ORM
|
||||
- Interface-based repository pattern
|
||||
- Database migrations for schema management
|
||||
- Transaction support for data integrity
|
||||
|
||||
3. **Configuration Management**
|
||||
- Viper integration for database settings
|
||||
- Environment variable support with DLC_ prefix
|
||||
- Multiple environment support (dev, staging, prod)
|
||||
- Connection health checking
|
||||
|
||||
4. **Integration Points**
|
||||
- User management system (ADR 0018)
|
||||
- Existing greet service (for future personalization)
|
||||
- OpenTelemetry tracing integration
|
||||
- Zerolog structured logging
|
||||
|
||||
### Technical Implementation
|
||||
|
||||
#### Database Schema Foundation
|
||||
```sql
|
||||
-- Users table (from ADR 0018)
|
||||
CREATE TABLE users (
|
||||
id SERIAL PRIMARY KEY,
|
||||
created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP,
|
||||
updated_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP,
|
||||
deleted_at TIMESTAMP WITH TIME ZONE,
|
||||
username VARCHAR(50) UNIQUE NOT NULL,
|
||||
password_hash VARCHAR(255) NOT NULL,
|
||||
description TEXT,
|
||||
current_goal TEXT,
|
||||
is_admin BOOLEAN DEFAULT FALSE,
|
||||
allow_password_reset BOOLEAN DEFAULT FALSE,
|
||||
last_login TIMESTAMP WITH TIME ZONE
|
||||
);
|
||||
|
||||
-- Greet history table (future extension)
|
||||
CREATE TABLE greet_history (
|
||||
id SERIAL PRIMARY KEY,
|
||||
created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP,
|
||||
user_id INTEGER REFERENCES users(id),
|
||||
message TEXT NOT NULL,
|
||||
context JSONB
|
||||
);
|
||||
```
|
||||
|
||||
#### Technology Stack
|
||||
- **Database:** PostgreSQL 15+ - production-ready relational database
|
||||
- **ORM:** GORM v1.25+ - aligns with interface-based design
|
||||
- **Migrations:** GORM AutoMigrate + custom SQL migrations
|
||||
- **Connection Pooling:** PgBouncer-compatible connection management
|
||||
- **Configuration:** Viper integration - consistent with existing patterns
|
||||
- **Logging:** Zerolog integration - structured database logging
|
||||
- **Telemetry:** OpenTelemetry database instrumentation
|
||||
|
||||
|
||||
|
||||
#### Architecture Alignment
|
||||
|
||||
The PostgreSQL integration follows established dance-lessons-coach patterns:
|
||||
|
||||
1. **Interface-based Design:**
|
||||
```go
|
||||
type DatabaseRepository interface {
|
||||
GetDB() *gorm.DB
|
||||
Close() error
|
||||
HealthCheck(ctx context.Context) error
|
||||
BeginTransaction(ctx context.Context) (*gorm.DB, error)
|
||||
}
|
||||
|
||||
type UserRepository interface {
|
||||
CreateUser(ctx context.Context, user *User) error
|
||||
GetUserByUsername(ctx context.Context, username string) (*User, error)
|
||||
// ... other methods
|
||||
}
|
||||
```
|
||||
|
||||
2. **Context-aware Services:**
|
||||
```go
|
||||
func (r *PostgresUserRepository) CreateUser(ctx context.Context, user *User) error {
|
||||
log.Trace().Ctx(ctx).Str("username", user.Username).Msg("Creating user")
|
||||
return r.db.WithContext(ctx).Create(user).Error
|
||||
}
|
||||
```
|
||||
|
||||
3. **Configuration Integration:**
|
||||
```go
|
||||
type DatabaseConfig struct {
|
||||
Type string `mapstructure:"type"` // sqlite, postgres, auto
|
||||
Host string `mapstructure:"host"`
|
||||
Port int `mapstructure:"port"`
|
||||
User string `mapstructure:"user"`
|
||||
Password string `mapstructure:"password"`
|
||||
Name string `mapstructure:"name"`
|
||||
SSLMode string `mapstructure:"ssl_mode"`
|
||||
MaxOpenConns int `mapstructure:"max_open_conns"`
|
||||
MaxIdleConns int `mapstructure:"max_idle_conns"`
|
||||
ConnMaxLifetime time.Duration `mapstructure:"conn_max_lifetime"`
|
||||
}
|
||||
```
|
||||
|
||||
4. **Graceful Shutdown Integration:**
|
||||
```go
|
||||
func (s *Server) Shutdown(ctx context.Context) error {
|
||||
// Close database connections gracefully
|
||||
if s.userRepo != nil {
|
||||
if err := s.userRepo.Close(); err != nil {
|
||||
log.Error().Err(err).Msg("User repository shutdown failed")
|
||||
// Continue shutdown even if database fails
|
||||
}
|
||||
}
|
||||
|
||||
// The readiness endpoint already handles shutdown detection via s.readyCtx
|
||||
// No need for atomic operations - the context-based approach is cleaner
|
||||
|
||||
// Continue with existing HTTP server shutdown
|
||||
return s.httpServer.Shutdown(ctx)
|
||||
}
|
||||
```
|
||||
|
||||
5. **Readiness Endpoint Integration:**
|
||||
```go
|
||||
func (s *Server) handleReadiness(w http.ResponseWriter, r *http.Request) {
|
||||
// Check database health if using persistent database
|
||||
if s.config.GetDatabaseType() != "sqlite" {
|
||||
if err := s.userRepo.CheckDatabaseHealth(r.Context()); err != nil {
|
||||
log.Warn().Err(err).Msg("Database health check failed")
|
||||
s.writeJSONResponse(w, http.StatusServiceUnavailable, map[string]interface{}{
|
||||
"ready": false,
|
||||
"reason": "database_unhealthy",
|
||||
"error": err.Error(),
|
||||
})
|
||||
return
|
||||
}
|
||||
}
|
||||
|
||||
// Existing readiness logic
|
||||
select {
|
||||
case <-s.readyCtx.Done():
|
||||
s.writeJSONResponse(w, http.StatusServiceUnavailable, map[string]interface{}{
|
||||
"ready": false,
|
||||
"reason": "shutting_down",
|
||||
})
|
||||
default:
|
||||
s.writeJSONResponse(w, http.StatusOK, map[string]interface{}{
|
||||
"ready": true,
|
||||
})
|
||||
}
|
||||
}
|
||||
```
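The `DatabaseConfig` fields above eventually have to become a connection string. A minimal sketch of that step, assuming the key=value DSN form shown in the GORM PostgreSQL guide; the `DSN` method is hypothetical and not part of the ADR, and the struct repeats only the connection fields:

```go
package main

import "fmt"

// DatabaseConfig repeats only the connection fields of the ADR's struct.
type DatabaseConfig struct {
	Host     string
	Port     int
	User     string
	Password string
	Name     string
	SSLMode  string
}

// DSN is a hypothetical helper (not defined in the ADR) that renders the
// key=value connection string. Values containing spaces or quotes would
// need escaping, which this sketch does not attempt.
func (c DatabaseConfig) DSN() string {
	return fmt.Sprintf("host=%s port=%d user=%s password=%s dbname=%s sslmode=%s",
		c.Host, c.Port, c.User, c.Password, c.Name, c.SSLMode)
}
```

The resulting string could then be handed to `gorm.Open(postgres.Open(dsn), ...)`; keeping the rendering in one place makes it easier to avoid logging the password by accident.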
|
||||
|
||||
### Implementation Strategy
|
||||
|
||||
#### Phase 1: PostgreSQL Repository Implementation
|
||||
|
||||
1. **Replace Dependencies:**
|
||||
```bash
|
||||
# Add PostgreSQL dependencies (go mod tidy below drops the unused SQLite driver)
|
||||
go get gorm.io/driver/postgres
|
||||
go get github.com/lib/pq # Optional: gorm.io/driver/postgres already pulls in pgx as its driver
|
||||
go mod tidy # Clean up unused dependencies
|
||||
```
|
||||
|
||||
2. **Create PostgreSQL Repository:**
|
||||
- `pkg/user/postgres_repository.go` - PostgreSQL implementation
|
||||
- Implement `UserRepository` interface directly
|
||||
- Add PostgreSQL-specific connection management
|
||||
|
||||
3. **Docker Setup:**
|
||||
- Create `docker-compose.yml` with PostgreSQL 16 service (current stable version)
|
||||
- Add initialization scripts for development
|
||||
- Configure health checks and monitoring
|
||||
- Use Alpine-based image for smaller footprint
|
||||
|
||||
4. **Configuration:**
|
||||
- Add `DatabaseConfig` to existing config structure
|
||||
- Environment variables with `DLC_` prefix
|
||||
- Connection validation and health checking
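The `DLC_` prefix convention can be illustrated with a small mapping from config keys to environment variable names. This is a sketch under assumptions: the function `envName` is hypothetical, and the exact variable names (for example `DLC_DATABASE_HOST`) are inferred from the prefix rule rather than stated in the ADR. With Viper, the same effect is usually obtained through `SetEnvPrefix` and `SetEnvKeyReplacer`.

```go
package main

import "strings"

// envName maps a dotted config key to the environment variable that would
// override it, e.g. "database.host" -> "DLC_DATABASE_HOST".
// Hypothetical helper: the project itself relies on Viper for this mapping.
func envName(prefix, key string) string {
	replacer := strings.NewReplacer(".", "_", "-", "_")
	return strings.ToUpper(prefix + "_" + replacer.Replace(key))
}
```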
|
||||
|
||||
#### Phase 2: Server Integration
|
||||
|
||||
1. **Update Server Initialization:**
|
||||
- Modify `initializeUserServices()` in `pkg/server/server.go`
|
||||
- Replace SQLite repository with PostgreSQL repository
|
||||
- Update error handling and logging
|
||||
|
||||
2. **Remove SQLite Code:**
|
||||
- Delete `pkg/user/sqlite_repository.go`
|
||||
- Clean up any SQLite-specific references
|
||||
- Update imports and dependencies
|
||||
|
||||
3. **Enhance Health Checks:**
|
||||
- Add database health check to readiness endpoint
|
||||
- Implement connection pooling monitoring
|
||||
- Add startup health validation
|
||||
|
||||
#### Phase 3: Testing & Validation
|
||||
|
||||
1. **BDD Test Integration:**
|
||||
- Updated test server configuration with PostgreSQL settings
|
||||
- Automatic PostgreSQL container startup in test script
|
||||
- Health checks for database readiness before tests
|
||||
- **Separate BDD test database** (`dance_lessons_coach_bdd_test`)
|
||||
- Complete isolation from development/production databases
|
||||
|
||||
2. **Test Script Enhancement:**
|
||||
- `scripts/run-bdd-tests.sh` now starts PostgreSQL if needed
|
||||
- **Automatic BDD database creation** using `createdb` command
|
||||
- Checks for existing BDD database before creating
|
||||
- Waits for database readiness before running tests
|
||||
- Proper error handling and timeout management
|
||||
- Reuses existing container if already running
|
||||
|
||||
3. **Database Isolation Strategy:**
|
||||
- **Development**: `dance_lessons_coach` (config.yaml)
|
||||
- **BDD Tests**: `dance_lessons_coach_bdd_test` (automatically created)
|
||||
- **Production**: Custom name per environment
|
||||
- **Manual Testing**: Developers can use development database
|
||||
|
||||
4. **Unit & Integration Tests:**
|
||||
- Repository method testing with PostgreSQL
|
||||
- Transaction and error case testing
|
||||
- Performance benchmarks
|
||||
- Connection failure scenarios
|
||||
|
||||
5. **Graceful Shutdown Testing:**
|
||||
- Database connection cleanup during shutdown
|
||||
- Readiness endpoint behavior during shutdown
|
||||
- Connection pool behavior under stress
|
||||
|
||||
#### Phase 4: Documentation & Finalization
|
||||
|
||||
1. **Documentation Updates:**
|
||||
- Update AGENTS.md with PostgreSQL setup instructions
|
||||
- Add database configuration guide
|
||||
- Create development setup documentation
|
||||
- Update BDD test documentation
|
||||
|
||||
2. **Cleanup:**
|
||||
- Remove all SQLite references from code
|
||||
- Update go.mod and go.sum
|
||||
- Verify no unused imports or dependencies
|
||||
|
||||
3. **Production Readiness:**
|
||||
- Add database health monitoring
|
||||
- Configure connection pooling for production
|
||||
- Add environment-specific configurations
|
||||
|
||||
1. **User Model & Repository:**
|
||||
- `pkg/user/models.go` - GORM user model
|
||||
- `pkg/user/repository.go` - GORM implementation
|
||||
- `pkg/user/repository_mock.go` - Mock for testing
|
||||
|
||||
2. **Database Integration:**
|
||||
- Implement `UserRepository` interface
|
||||
- Add transaction support
|
||||
- Implement health checks
|
||||
|
||||
3. **Testing Setup:**
|
||||
- Test container for PostgreSQL
|
||||
- Integration test suite
|
||||
- Mock-based unit tests
|
||||
|
||||
#### Phase 3: Service Integration
|
||||
|
||||
1. **Auth Service Integration:**
|
||||
- Update auth service to use user repository
|
||||
- Implement JWT token persistence
|
||||
- Add session management
|
||||
|
||||
2. **Greet Service Extension:**
|
||||
- Add greet history tracking
|
||||
- Implement user-specific greetings
|
||||
- Add database logging
|
||||
|
||||
3. **API Endpoints:**
|
||||
- Health check endpoint: `GET /api/health/db`
|
||||
- Database metrics endpoint: `GET /api/metrics/db`
|
||||
|
||||
#### Phase 4: Testing & Validation
|
||||
|
||||
1. **BDD Test Integration:**
|
||||
- Temporary test database setup
|
||||
- Test container for PostgreSQL
|
||||
- Clean database between scenarios
|
||||
- Test data isolation
|
||||
|
||||
2. **Unit & Integration Tests:**
|
||||
- Repository method testing
|
||||
- Transaction testing
|
||||
- Error case testing
|
||||
- Performance benchmarks
|
||||
|
||||
3. **Fallback Testing:**
|
||||
- SQLite fallback scenarios
|
||||
- Connection failure handling
|
||||
- Graceful degradation
|
||||
|
||||
## Consequences
|
||||
|
||||
### Positive
|
||||
|
||||
1. **Data Persistence:** User accounts and application data properly persisted
|
||||
2. **Production Ready:** PostgreSQL is an enterprise-grade database
|
||||
3. **Scalability:** Better concurrent connection handling
|
||||
4. **Simplified Architecture:** Direct PostgreSQL implementation without migration complexity
|
||||
5. **Clean Codebase:** No legacy SQLite code or dual implementation
|
||||
6. **Future-Proof:** Foundation for all future data-driven features
|
||||
|
||||
### Negative
|
||||
|
||||
1. **Dependency Changes:** Replacing SQLite with PostgreSQL dependencies
|
||||
2. **Operational Overhead:** Database container management
|
||||
3. **Learning Curve:** PostgreSQL-specific features and optimization
|
||||
4. **Testing Requirements:** Comprehensive testing needed for new implementation
|
||||
|
||||
### Neutral
|
||||
|
||||
1. **Code Changes:** Repository implementation replacement
|
||||
2. **Configuration Updates:** New database configuration structure
|
||||
3. **Development Workflow:** Docker-based database for local development
|
||||
|
||||
|
||||
|
||||
## Alternatives Considered
|
||||
|
||||
### Alternative 1: Keep SQLite with File Persistence
|
||||
- **Pros:** Simple, no new dependencies, works for small-scale
|
||||
- **Cons:** Not production-grade, limited concurrency, file-based limitations
|
||||
- **Rejected:** Doesn't meet long-term production requirements
|
||||
|
||||
### Alternative 2: Dual Implementation with Fallback
|
||||
- **Pros:** Smooth migration path, backward compatibility
|
||||
- **Cons:** Complex codebase, testing overhead, maintenance burden
|
||||
- **Rejected:** Unnecessary complexity, since there are no existing data or users to migrate
|
||||
|
||||
### Alternative 3: MySQL
|
||||
- **Pros:** Widely used, good community support
|
||||
- **Cons:** Different ecosystem, licensing concerns
|
||||
- **Rejected:** PostgreSQL better fits our needs
|
||||
|
||||
### Alternative 4: MongoDB
|
||||
- **Pros:** Flexible schema, document-oriented
|
||||
- **Cons:** NoSQL approach, different query patterns
|
||||
- **Rejected:** Relational data better suits our model
|
||||
|
||||
### Alternative 5: Pure SQL (no ORM)
|
||||
- **Pros:** No ORM overhead, direct control
|
||||
- **Cons:** More boilerplate, manual query building
|
||||
- **Rejected:** GORM provides good balance
|
||||
|
||||
## Graceful Shutdown & Readiness Integration
|
||||
|
||||
### Database Connection Lifecycle
|
||||
|
||||
The PostgreSQL integration must properly handle the server lifecycle:
|
||||
|
||||
1. **Startup Sequence:**
|
||||
- Initialize database connections
|
||||
- Run health check
|
||||
- Set readiness to true only if database is healthy
|
||||
- Log connection details at trace level
|
||||
|
||||
2. **Runtime Operation:**
|
||||
- Monitor database connection health
|
||||
- Handle connection failures gracefully
|
||||
- Implement connection retry logic
|
||||
- Log connection issues appropriately
|
||||
|
||||
3. **Shutdown Sequence:**
|
||||
- Set readiness to false immediately
|
||||
- Close all database connections
|
||||
- Wait for in-flight queries to complete
|
||||
- Handle shutdown timeouts gracefully
|
||||
- Log shutdown progress
|
||||
|
||||
### Readiness Endpoint Enhancement
|
||||
|
||||
The existing `/api/ready` endpoint already has the correct nested structure for service health checks. We'll enhance it to include PostgreSQL database health:
|
||||
|
||||
**Current Structure:**
|
||||
```json
|
||||
{
|
||||
"ready": true,
|
||||
"connections": {
|
||||
"database": {
|
||||
"status": "healthy"
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
**Health Check Logic:**
|
||||
```go
|
||||
func (r *PostgresUserRepository) CheckDatabaseHealth(ctx context.Context) error {
|
||||
// Simple query to test connectivity
|
||||
var count int64
|
||||
result := r.db.WithContext(ctx).Model(&User{}).Count(&count)
|
||||
if result.Error != nil {
|
||||
return fmt.Errorf("database health check failed: %w", result.Error)
|
||||
}
|
||||
return nil
|
||||
}
|
||||
```
|
||||
|
||||
**Readiness Response States:**
|
||||
- **Healthy:** `{"ready": true, "connections": {"database": {"status": "healthy"}}}`
|
||||
- **Database Unhealthy:** `{"ready": false, "reason": "database_unhealthy", "connections": {"database": {"status": "unhealthy", "error": "connection refused"}}}`
|
||||
- **Shutting Down:** `{"ready": false, "reason": "server_shutting_down", "connections": {"database": {"status": "not_checked"}}}`
|
||||
- **Not Configured:** `{"ready": true, "connections": {"database": {"status": "not_configured"}}}` (for SQLite mode)
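The response states above can be produced from a pair of small types, which keeps the nested `connections` shape uniform across states. This is a sketch only: the type and function names are hypothetical, and `errExample` exists purely to illustrate the unhealthy case.

```go
package main

import (
	"encoding/json"
	"errors"
)

// connectionStatus is the per-dependency entry under "connections".
type connectionStatus struct {
	Status string `json:"status"`
	Error  string `json:"error,omitempty"`
}

// readinessResponse mirrors the JSON shape listed in the ADR.
type readinessResponse struct {
	Ready       bool                        `json:"ready"`
	Reason      string                      `json:"reason,omitempty"`
	Connections map[string]connectionStatus `json:"connections"`
}

// errExample is an illustrative error, not one defined by the project.
var errExample = errors.New("connection refused")

// databaseReadiness builds the body for a database check result
// (nil means the check passed).
func databaseReadiness(err error) readinessResponse {
	if err != nil {
		return readinessResponse{
			Ready:  false,
			Reason: "database_unhealthy",
			Connections: map[string]connectionStatus{
				"database": {Status: "unhealthy", Error: err.Error()},
			},
		}
	}
	return readinessResponse{
		Ready:       true,
		Connections: map[string]connectionStatus{"database": {Status: "healthy"}},
	}
}

// encode renders the response as compact JSON, or "" on failure.
func encode(r readinessResponse) string {
	b, err := json.Marshal(r)
	if err != nil {
		return ""
	}
	return string(b)
}
```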
|
||||
|
||||
### Connection Pool Management
|
||||
|
||||
Proper connection pool configuration for graceful shutdown:
|
||||
|
||||
```go
|
||||
// In database initialization
|
||||
sqlDB, err := db.DB()
|
||||
if err != nil {
|
||||
return nil, fmt.Errorf("failed to get SQL DB: %w", err)
|
||||
}
|
||||
|
||||
// Configure connection pool
|
||||
sqlDB.SetMaxOpenConns(cfg.MaxOpenConns)
|
||||
sqlDB.SetMaxIdleConns(cfg.MaxIdleConns)
|
||||
sqlDB.SetConnMaxLifetime(cfg.ConnMaxLifetime)
|
||||
|
||||
// Recycle idle connections; ConnMaxLifetime is already set from cfg above
|
||||
sqlDB.SetConnMaxIdleTime(5 * time.Minute)
|
||||
```
|
||||
|
||||
### Shutdown Timeout Handling
|
||||
|
||||
```go
|
||||
func (s *Server) Shutdown(ctx context.Context) error {
|
||||
// Create shutdown context with timeout
|
||||
shutdownCtx, cancel := context.WithTimeout(ctx, s.config.GetShutdownTimeout())
|
||||
defer cancel()
|
||||
|
||||
// Close database connections with timeout
|
||||
done := make(chan struct{})
|
||||
go func() {
|
||||
if s.userRepo != nil {
|
||||
if err := s.userRepo.Close(); err != nil {
|
||||
log.Error().Err(err).Msg("Database shutdown error")
|
||||
}
|
||||
}
|
||||
close(done)
|
||||
}()
|
||||
|
||||
select {
|
||||
case <-done:
|
||||
log.Trace().Msg("Database shutdown completed")
|
||||
case <-shutdownCtx.Done():
|
||||
log.Warn().Msg("Database shutdown timed out, forcing closure")
|
||||
}
|
||||
|
||||
return s.httpServer.Shutdown(shutdownCtx)
|
||||
}
|
||||
```
|
||||
|
||||
## Alignment with Existing Architecture
|
||||
|
||||
This implementation builds upon completed phases:
|
||||
|
||||
- **Phase 1-3:** Uses Go 1.26.1, Chi router, Zerolog, interface-based design
|
||||
- **Phase 5:** Extends Viper configuration management
|
||||
- **Phase 6:** Integrates with graceful shutdown patterns and readiness endpoints
|
||||
- **Phase 7:** Maintains OpenTelemetry compatibility
|
||||
- **Phase 8:** Follows existing build system patterns
|
||||
- **Phase 9:** Preserves trace-level logging approach
|
||||
- **Phase 18:** Supports user management system
|
||||
|
||||
## Backward Compatibility
|
||||
|
||||
The implementation maintains full backward compatibility:
|
||||
|
||||
1. **API Endpoints:** Existing endpoints unchanged
|
||||
2. **Configuration:** All existing config options preserved
|
||||
3. **Logging:** Maintains existing Zerolog integration
|
||||
4. **Telemetry:** OpenTelemetry continues to work
|
||||
5. **Error Handling:** Consistent error patterns
|
||||
|
||||
## Success Metrics
|
||||
|
||||
1. **Reliability:** 99.9% database uptime
|
||||
2. **Performance:** <100ms average query time
|
||||
3. **Scalability:** Support 1000+ concurrent connections
|
||||
4. **Data Integrity:** Zero data corruption incidents
|
||||
5. **Adoption:** All new features use database storage
|
||||
|
||||
## Open Questions
|
||||
|
||||
1. What should be the connection pool size for production?
|
||||
2. Should we implement read replicas for scaling?
|
||||
3. What backup strategy should we implement?
|
||||
4. Should we add database connection health metrics?
|
||||
5. What query timeout should we set for production?
|
||||
|
||||
## Database Cleanup Strategy
|
||||
|
||||
### Decision: Raw SQL Cleanup Between Scenarios
|
||||
|
||||
**Approach:** Use raw SQL DELETE statements with `SET CONSTRAINTS ALL DEFERRED` to clean up the database between test scenarios
|
||||
|
||||
**Rationale:**
|
||||
- **Black Box Principle:** BDD tests should not depend on implementation details
|
||||
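- **Foreign Key Safety:** `SET CONSTRAINTS ALL DEFERRED` postpones checks of `DEFERRABLE` constraints until commit, so rows can be deleted in any order inside the cleanup transaction; constraints not declared `DEFERRABLE` are unaffected (PostgreSQL docs: https://www.postgresql.org/docs/current/sql-set-constraints.html)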
|
||||
- **Migration Compatibility:** Works regardless of schema changes
|
||||
- **Transaction Safety:** Uses explicit transactions with proper rollback handling
|
||||
|
||||
**Alternatives Considered:**
|
||||
1. **Repository-based cleanup** - Rejected: Violates black box principle
|
||||
2. **Transaction rollback** - Rejected: Complex with nested transactions
|
||||
3. **Recreate database** - Rejected: Too slow for frequent test runs
|
||||
4. **Separate test database** - Chosen: Combined with SQL cleanup
|
||||
|
||||
### Implementation Details
|
||||
|
||||
**Cleanup Process:**
|
||||
1. **Defer constraint checks until commit:** `SET CONSTRAINTS ALL DEFERRED` (inside a transaction; only affects constraints declared `DEFERRABLE`)
|
||||
2. **Query all tables:** From `information_schema.tables`
|
||||
3. **Delete in reverse order:** Handle foreign key dependencies
|
||||
4. **Reset sequences:** `ALTER SEQUENCE ... RESTART WITH 1`
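The cleanup steps above can be sketched as a function that produces the SQL to run inside one transaction. Assumptions, all hypothetical: the helper name `cleanupStatements`, table names that are simple identifiers, and `SERIAL` `id` columns whose sequences follow PostgreSQL's default `<table>_id_seq` naming.

```go
package main

import "fmt"

// cleanupStatements returns SQL that empties the given tables and resets
// their id sequences. tables must be listed parents first (for example
// users before greet_history); rows are deleted in reverse order so that
// referencing tables are emptied before the tables they point to.
func cleanupStatements(tables []string) []string {
	stmts := make([]string, 0, 2*len(tables))
	for i := len(tables) - 1; i >= 0; i-- {
		stmts = append(stmts, fmt.Sprintf("DELETE FROM %q", tables[i]))
	}
	for _, t := range tables {
		stmts = append(stmts, fmt.Sprintf("ALTER SEQUENCE %q RESTART WITH 1", t+"_id_seq"))
	}
	return stmts
}
```

PostgreSQL's `TRUNCATE ... RESTART IDENTITY CASCADE` achieves a similar result in a single statement; that is a different technique from the DELETE-based approach this ADR chose.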
|
||||
|
||||
**Execution Timing:**
|
||||
- **AfterSuite:** Full cleanup after all scenarios
|
||||
- **Between Scenarios:** Individual scenario cleanup (future enhancement)
|
||||
|
||||
**Benefits:**
|
||||
- ✅ **Fast execution:** Milliseconds vs seconds for recreation
|
||||
- ✅ **Reliable:** Handles schema changes automatically
|
||||
- ✅ **Isolated:** Each test gets clean state
|
||||
- ✅ **Maintainable:** No dependency on ORM or repositories
|
||||
|
||||
### Temporary Database Approach
|
||||
|
||||
For BDD testing, we'll use temporary PostgreSQL databases to ensure:
|
||||
- **Isolation:** Each test run gets a clean database
|
||||
- **Reproducibility:** Consistent starting state
|
||||
- **Performance:** No interference between tests
|
||||
- **CI/CD Compatibility:** Works in containerized environments
|
||||
|
||||
### Implementation Plan
|
||||
|
||||
1. **Test Container Setup:**
|
||||
```bash
|
||||
# Use testcontainers-go for PostgreSQL
|
||||
go get github.com/testcontainers/testcontainers-go
|
||||
go get github.com/testcontainers/testcontainers-go/modules/postgres
|
||||
```
|
||||
|
||||
2. **BDD Test Configuration:**
|
||||
- Create `features/support/database.go`
|
||||
- Implement `BeforeScenario` and `AfterScenario` hooks
|
||||
- Automatic database cleanup
|
||||
- Integrate with existing test suite structure
|
||||
|
||||
3. **Test Data Management:**
|
||||
- Schema migration before each scenario
|
||||
- Transaction rollback for data isolation
|
||||
- Seed data for specific scenarios
|
||||
- Match existing BDD test patterns
|
||||
|
||||
4. **Configuration:**
|
||||
```yaml
|
||||
# config.test.yaml
|
||||
database:
|
||||
host: "localhost"
|
||||
port: 5433 # Different from dev port
|
||||
name: "dance_lessons_coach_test"
|
||||
user: "test_user"
|
||||
password: "test_password"
|
||||
```
|
||||
|
||||
### Example Test Setup
|
||||
|
||||
```go
|
||||
// features/support/database.go
|
||||
func BeforeScenario(ctx context.Context, sc *godog.Scenario) (context.Context, error) {
|
||||
// Start PostgreSQL container
|
||||
postgresContainer, err := postgres.RunContainer(ctx,
|
||||
testcontainers.WithImage("postgres:15-alpine"),
|
||||
postgres.WithDatabase("test_db"),
|
||||
postgres.WithUsername("test_user"),
|
||||
postgres.WithPassword("test_password"),
|
||||
)
|
||||
if err != nil {
|
||||
return ctx, err
|
||||
}
|
||||
|
||||
// Get connection string
|
||||
connStr, err := postgresContainer.ConnectionString(ctx, "sslmode=disable")
|
||||
if err != nil {
|
||||
return ctx, err
|
||||
}
|
||||
|
||||
// Store in context for test
|
||||
ctx = context.WithValue(ctx, "postgres_container", postgresContainer)
|
||||
ctx = context.WithValue(ctx, "postgres_conn_str", connStr)
|
||||
|
||||
// Initialize user repository with test database
|
||||
cfg := config.GetTestConfig() // named cfg to avoid shadowing the config package
|
||||
cfg.Database.DSN = connStr
|
||||
|
||||
repo, err := user.NewPostgresRepository(cfg)
|
||||
if err != nil {
|
||||
return ctx, err
|
||||
}
|
||||
|
||||
// Store repository in context for scenario steps
|
||||
ctx = context.WithValue(ctx, "user_repository", repo)
|
||||
|
||||
return ctx, nil
|
||||
}
|
||||
|
||||
func AfterScenario(ctx context.Context, sc *godog.Scenario, err error) (context.Context, error) {
|
||||
// Clean up repository
|
||||
if repo, ok := ctx.Value("user_repository").(user.UserRepository); ok {
|
||||
repo.Close()
|
||||
}
|
||||
|
||||
// Terminate PostgreSQL container
|
||||
if container, ok := ctx.Value("postgres_container").(testcontainers.Container); ok {
|
||||
if terminateErr := container.Terminate(ctx); terminateErr != nil {
|
||||
log.Error().Err(terminateErr).Msg("Failed to terminate PostgreSQL container")
|
||||
}
|
||||
}
|
||||
return ctx, err
|
||||
}
|
||||
```
|
||||
|
||||
## Future Considerations
|
||||
|
||||
### Immediate Next Steps (Post-Migration)
|
||||
1. **CI/CD Integration:** Add PostgreSQL to CI pipeline
|
||||
2. **Performance Tuning:** Query optimization
|
||||
3. **Monitoring:** Database health metrics
|
||||
4. **Backup Strategy:** Regular database backups
|
||||
|
||||
### Long-Term Enhancements
|
||||
1. **Database Sharding:** For horizontal scaling
|
||||
2. **Read Replicas:** For read-heavy workloads
|
||||
3. **Advanced Caching:** Redis integration
|
||||
4. **Database Monitoring:** Prometheus exporter
|
||||
5. **Backup Automation:** Regular backup scheduling
|
||||
6. **Query Optimization:** Performance tuning
|
||||
|
||||
## References
|
||||
|
||||
- [GORM Documentation](https://gorm.io/)
|
||||
- [PostgreSQL 16 Documentation](https://www.postgresql.org/docs/16/)
|
||||
- [PostgreSQL Latest Version](https://www.postgresql.org/)
|
||||
- [GORM + PostgreSQL Guide](https://gorm.io/docs/connecting_to_the_database.html#PostgreSQL)
|
||||
- [Database Connection Pooling](https://www.alexedwards.net/blog/configuring-sqldb)
|
||||
|
||||
**Approved by:** [Product Owner]
|
||||
**Approval Date:** [To be determined]
|
||||
**Implementation Target:** Q2 2024
|
||||
494
adr/0020-docker-build-strategy.md
Normal file
@@ -0,0 +1,494 @@
|
||||
# ADR 0020: Docker Build Strategy - Traditional vs Buildx
|
||||
|
||||
## Status
|
||||
**Accepted** ✅
|
||||
|
||||
## Context
|
||||
|
||||
The dance-lessons-coach CI/CD pipeline initially used Docker Buildx (`docker buildx build --push`) for building and pushing Docker cache images. However, this approach encountered several issues:
|
||||
|
||||
### Issues with Buildx Approach
|
||||
|
||||
1. **TLS Certificate Problems**: Buildx had difficulty with self-signed certificates, requiring complex workaround steps
|
||||
2. **Performance Concerns**: Buildx setup and execution were significantly slower than expected
|
||||
3. **Complexity**: Buildx introduced additional complexity without providing immediate benefits
|
||||
4. **Reliability Issues**: Buildx builds were less reliable in the Gitea Actions (GitHub Actions-compatible) environment
|
||||
|
||||
### Working Solution Analysis
|
||||
|
||||
The working webapp CI/CD pipeline uses traditional `docker build` + `docker push` approach:
|
||||
|
||||
```yaml
|
||||
# Working approach from webapp
|
||||
- name: Build and push image to Gitea Container Registry
|
||||
run: |-
|
||||
docker build -t app .
|
||||
docker tag app gitea.arcodange.lab/${{ github.repository }}:$TAG
|
||||
docker push gitea.arcodange.lab/${{ github.repository }}:$TAG
|
||||
```
|
||||
|
||||
This approach is simpler, more reliable, and works consistently with self-signed certificates.
|
||||
|
||||
## Decision
|
||||
|
||||
**Replace Docker Buildx with traditional docker build + push** for the CI/CD pipeline and implement a two-stage Docker build strategy.
|
||||
|
||||
### Implementation
|
||||
|
||||
#### 1. Build Cache Strategy
|
||||
|
||||
```yaml
|
||||
# Build cache using traditional docker build
|
||||
- name: Build and push Docker cache image
|
||||
if: steps.check_cache.outputs.cache_hit == 'false'
|
||||
run: |
|
||||
IMAGE_NAME="${{ env.CI_REGISTRY }}/${{ env.GITEA_ORG }}/${{ env.GITEA_REPO }}-build-cache:${{ steps.calculate_hash.outputs.deps_hash }}"
|
||||
echo "Building cache image: $IMAGE_NAME"
|
||||
|
||||
# Build the image using traditional docker build
|
||||
docker build \
|
||||
--file Dockerfile.build \
|
||||
--tag "$IMAGE_NAME" \
|
||||
.
|
||||
|
||||
# Push the image
|
||||
docker push "$IMAGE_NAME"
|
||||
|
||||
echo "✅ Build cache image pushed successfully"
|
||||
```
|
||||
|
||||
#### 2. Production Build Strategy
|
||||
|
||||
```yaml
|
||||
# Production build using Dockerfile.prod
|
||||
- name: Build and push Docker image
|
||||
if: github.ref == 'refs/heads/main'
|
||||
run: |
|
||||
source VERSION
|
||||
IMAGE_VERSION="$MAJOR.$MINOR.$PATCH${PRERELEASE:+-$PRERELEASE}"
|
||||
|
||||
TAGS="$IMAGE_VERSION latest ${{ github.sha }}"
|
||||
echo "Building Docker image with tags: $TAGS"
|
||||
|
||||
# Use the production Dockerfile that leverages the build cache
|
||||
docker build -t dance-lessons-coach -f Dockerfile.prod .
|
||||
|
||||
for TAG in $TAGS; do
|
||||
IMAGE_NAME="${{ env.CI_REGISTRY }}/${{ env.GITEA_ORG }}/${{ env.GITEA_REPO }}:$TAG"
|
||||
echo "Tagging and pushing: $IMAGE_NAME"
|
||||
docker tag dance-lessons-coach "$IMAGE_NAME"
|
||||
docker push "$IMAGE_NAME"
|
||||
done
|
||||
```
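The tag derivation in the step above relies on the shell expansion `${PRERELEASE:+-$PRERELEASE}`, which appends `-<prerelease>` only when the variable is non-empty. For readers less familiar with that syntax, here is the same logic as a Go sketch; `imageVersion` and `imageTags` are hypothetical names used for illustration, and the workflow itself stays in shell.

```go
package main

import "fmt"

// imageVersion joins the VERSION file components and appends the
// prerelease suffix only when one is set.
func imageVersion(major, minor, patch, prerelease string) string {
	v := fmt.Sprintf("%s.%s.%s", major, minor, patch)
	if prerelease != "" {
		v += "-" + prerelease
	}
	return v
}

// imageTags lists the tags pushed for a main-branch build, in the same
// order as the workflow: version, latest, commit SHA.
func imageTags(version, commitSHA string) []string {
	return []string{version, "latest", commitSHA}
}
```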
|
||||
|
||||
#### 3. Dockerfile Structure
|
||||
|
||||
**Dockerfile.build** - Build environment with all dependencies:
|
||||
```dockerfile
|
||||
FROM golang:1.26.1-alpine AS builder
|
||||
|
||||
# Install build dependencies
|
||||
RUN apk add --no-cache git bash curl make gcc musl-dev bc grep sed jq ca-certificates
|
||||
|
||||
# Install Go tools
|
||||
RUN go install github.com/swaggo/swag/cmd/swag@latest
|
||||
|
||||
# Set the working directory first; otherwise go.mod lands in /go (GOPATH), which Go rejects
|
||||
WORKDIR /workspace
|
||||
|
||||
# Copy and verify dependencies
|
||||
COPY go.mod go.sum ./
|
||||
RUN go mod download && go mod verify
|
||||
```
|
||||
|
||||
**Dockerfile.prod** - Minimal production image:
|
||||
```dockerfile
|
||||
# Use the build cache image as base
|
||||
FROM gitea.arcodange.lab/arcodange/dance-lessons-coach-build-cache:latest AS builder
|
||||
|
||||
# Final minimal image
|
||||
FROM alpine:3.18
|
||||
|
||||
WORKDIR /app
|
||||
|
||||
# Install minimal dependencies
|
||||
RUN apk add --no-cache ca-certificates tzdata
|
||||
|
||||
# Copy binary from builder
|
||||
COPY --from=builder /workspace/dance-lessons-coach /app/dance-lessons-coach
|
||||
|
||||
# Copy configuration
|
||||
COPY config.yaml /app/config.yaml
|
||||
|
||||
# Set permissions and entrypoint
|
||||
RUN chmod +x /app/dance-lessons-coach
|
||||
ENV TZ=UTC
|
||||
EXPOSE 8080
|
||||
ENTRYPOINT ["/app/dance-lessons-coach"]
|
||||
```
|
||||
|
||||
**docker/Dockerfile** - Development Dockerfile (kept for local development):
|
||||
```dockerfile
|
||||
# Multi-stage build for development
|
||||
FROM golang:1.26.1-alpine AS builder
|
||||
WORKDIR /app
|
||||
COPY go.mod go.sum ./
|
||||
RUN go mod download
|
||||
COPY . ./
|
||||
RUN go build -o /dance-lessons-coach ./cmd/server
|
||||
|
||||
FROM alpine:3.18
|
||||
WORKDIR /app
|
||||
RUN apk add --no-cache ca-certificates tzdata
|
||||
COPY --from=builder /dance-lessons-coach /app/dance-lessons-coach
|
||||
COPY config.yaml /app/config.yaml
|
||||
RUN chmod +x /app/dance-lessons-coach
|
||||
ENV TZ=UTC
|
||||
EXPOSE 8080
|
||||
ENTRYPOINT ["/app/dance-lessons-coach"]
|
||||
```
|
||||
|
||||
### File Organization
|
||||
|
||||
All Dockerfiles are now organized in the `docker/` directory:
|
||||
- `docker/Dockerfile` - Development Dockerfile
|
||||
- `docker/Dockerfile.build` - Build cache Dockerfile
|
||||
- `docker/Dockerfile.prod` - Production Dockerfile (development only, uses latest)
|
||||
- `docker/Dockerfile.prod.template` - Template for reference
|
||||
|
||||
This organization keeps the root directory clean and makes it clear which files are for development vs production.
|
||||
|
||||
## Benefits
|
||||
|
||||
### CI/CD Pipeline Benefits
|
||||
|
||||
1. **Simplicity**: Traditional approach is easier to understand and debug
|
||||
2. **Reliability**: Consistent behavior across different environments
|
||||
3. **Certificate Handling**: Works seamlessly with self-signed certificates
|
||||
4. **Performance**: Faster execution without Buildx overhead
|
||||
5. **Compatibility**: Better compatibility with the Gitea Actions (GitHub Actions-compatible) environment
|
||||
|
||||
### Two-Stage Build Benefits
|
||||
|
||||
1. **Separation of Concerns**: Clear separation between build environment and production runtime
|
||||
2. **Optimized Production Image**: Minimal Alpine-based image with only necessary dependencies
|
||||
3. **Reusable Build Cache**: Build environment can be reused across multiple CI runs
|
||||
4. **Faster CI Execution**: Pre-built build cache reduces CI execution time
|
||||
5. **Consistent Builds**: All builds use the same build environment
|
||||
|
||||
### Development vs Production Clarity
|
||||
|
||||
1. **Development Dockerfile**: Full build environment for local development
|
||||
2. **Production Dockerfile**: Minimal runtime environment for deployment
|
||||
3. **Build Cache Dockerfile**: Optimized build environment for CI/CD
|
||||
4. **Clear Documentation**: Each Dockerfile has a specific purpose
|
||||
|
||||
## Trade-offs
|
||||
|
||||
### What We Lose
|
||||
|
||||
1. **Multi-platform builds**: Cannot build for multiple architectures simultaneously
|
||||
2. **BuildKit caching**: Less sophisticated caching mechanism
|
||||
3. **Advanced features**: No secret mounting, SSH agents, etc.
|
||||
4. **Parallel processing**: Slower builds without Buildx optimizations
|
||||
|
||||
### What We Gain
|
||||
|
||||
1. **Stability**: More reliable CI/CD pipeline
|
||||
2. **Simplicity**: Easier to maintain and troubleshoot
|
||||
3. **Consistency**: Matches proven patterns from working projects
|
||||
4. **Faster feedback**: Quicker build times in practice
|
||||
5. **Clear Separation**: Better distinction between development and production builds
|
||||
6. **Optimized Production**: Smaller, more secure production images
|
||||
|
||||
## Rationale
|
||||
|
||||
1. **Current Needs**: We don't need multi-platform builds or advanced BuildKit features
|
||||
2. **Simple Dockerfile**: Our `Dockerfile.build` doesn't require Buildx-specific features
|
||||
3. **Proven Pattern**: Traditional approach works reliably in production (webapp project)
|
||||
4. **CI Stability**: Reliability is more important than advanced features for CI/CD
|
||||
5. **Build Strategy**: Two-stage build provides better separation of concerns
|
||||
6. **Maintenance**: Simpler approach is easier to maintain and debug
|
||||
|
||||
## Critical Bug Fix: Dependency Hash Usage
|
||||
|
||||
### Issue Identified
|
||||
|
||||
The initial implementation had a critical bug where `Dockerfile.prod` used `latest` tag instead of the specific dependency hash:
|
||||
|
||||
```dockerfile
|
||||
# ❌ WRONG - this would never work
|
||||
FROM gitea.arcodange.lab/arcodange/dance-lessons-coach-build-cache:latest AS builder
|
||||
```
|
||||
|
||||
This approach would never work because:
|
||||
1. The build cache images are tagged with specific dependency hashes
|
||||
2. No image is ever tagged as `latest`
|
||||
3. The CI/CD workflow would fail to find the cache image
|
||||
|
||||
### Solution Implemented
|
||||
|
||||
1. **Dynamic Dockerfile Generation**: The CI/CD workflow now generates `Dockerfile.prod` dynamically with the correct dependency hash
|
||||
2. **Dependency Hash Calculation**: Added `scripts/calculate-deps-hash.sh` for consistent hash calculation
|
||||
3. **Template Approach**: Created `Dockerfile.prod.template` for reference
|
||||
|
||||
### CI/CD Workflow Fix
|
||||
|
||||
```yaml
|
||||
# ✅ CORRECT - generate Dockerfile.prod with proper hash
|
||||
- name: Build and push Docker image
  if: github.ref == 'refs/heads/main'
  run: |
    # Generate Dockerfile.prod with correct dependency hash
    DEPS_HASH="${{ needs.build-cache.outputs.deps_hash }}"

    # Create Dockerfile.prod with the correct cache image tag
    cat > Dockerfile.prod << EOF
    FROM gitea.arcodange.lab/arcodange/dance-lessons-coach-build-cache:$DEPS_HASH AS builder
    # ... rest of Dockerfile
    EOF

    # Build using the generated Dockerfile
    docker build -t dance-lessons-coach -f Dockerfile.prod .
|
||||
```
|
||||
|
||||
## CI/CD Pipeline Optimization
|
||||
|
||||
### Changes Made
|
||||
|
||||
1. **Removed Buildx Setup**: Eliminated `docker/setup-buildx-action@v3` from CI/CD workflow
|
||||
2. **Removed Go Build Steps**: Removed `actions/setup-go@v4`, `go mod tidy`, and individual Go tool installations
|
||||
3. **Added Docker Cache Usage**: All build steps now use the pre-built Docker cache image
|
||||
4. **Updated Production Build**: Production Docker build now generates `Dockerfile.prod` dynamically with correct dependency hash
|
||||
|
||||
### CI/CD Workflow Structure
|
||||
|
||||
```yaml
|
||||
# CI Pipeline Job Structure
|
||||
jobs:
  build-cache:
    # Builds Docker cache image if needed
    # Note: No certificate configuration needed with traditional docker

  ci-pipeline:
    needs: build-cache
    steps:
      - name: Set up build environment
        # Sets CACHE_IMAGE variable with proper tag
        # No Buildx setup, no Go installation, no certificate configuration

      - name: Generate Swagger Docs using Docker cache
        # Uses: docker run ${{ env.CACHE_IMAGE }} sh -c "cd pkg/server && go generate"

      - name: Build all packages using Docker cache
        # Uses: docker run ${{ env.CACHE_IMAGE }} sh -c "go build ./..."

      - name: Run tests with coverage using Docker cache
        # Uses: docker run ${{ env.CACHE_IMAGE }} sh -c "go test ./..."

      - name: Build and push Docker image
        # Uses: docker build -t dance-lessons-coach -f Dockerfile.prod .
        # No Buildx, no certificate issues
|
||||
```
|
||||
|
||||
### Key Improvements
|
||||
|
||||
1. **Faster Execution**: No need to set up Go environment for each job
|
||||
2. **Consistent Environment**: All builds use the same Docker cache image
|
||||
3. **Reduced Complexity**: Simpler workflow with fewer steps
|
||||
4. **Better Error Handling**: Docker cache handles dependency management
|
||||
5. **No Certificate Configuration**: Traditional docker works seamlessly with self-signed certificates
|
||||
6. **Improved Reliability**: Elimination of Buildx-related failures
|
||||
|
||||
## Future Considerations
|
||||
|
||||
### When to Reconsider Buildx
|
||||
|
||||
1. **Multi-platform needs**: If we need ARM/AMD64 builds simultaneously
|
||||
2. **Complex builds**: If Dockerfile requires BuildKit-specific features
|
||||
3. **Performance optimization**: If build times become unacceptable
|
||||
4. **Certificate issues resolved**: If Docker Buildx improves self-signed certificate handling
|
||||
|
||||
### Migration Path
|
||||
|
||||
If we need to reintroduce Buildx in the future:
|
||||
|
||||
1. **Fix certificate issues properly** at the Docker daemon level
|
||||
2. **Test thoroughly** in staging environment
|
||||
3. **Monitor performance** impact
|
||||
4. **Document benefits** clearly for the specific use case
|
||||
|
||||
## Alternatives Considered
|
||||
|
||||
### Option 1: Keep Buildx with Certificate Workaround
|
||||
- ❌ Complex setup with questionable reliability
|
||||
- ❌ Slow performance in Gitea Actions
|
||||
- ❌ Ongoing maintenance burden
|
||||
|
||||
### Option 2: Use Insecure Registry Flag
|
||||
```bash
|
||||
docker buildx build --allow security.insecure --push .
|
||||
```
|
||||
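- ❌ Security concerns; note that `--allow security.insecure` grants a BuildKit entitlement for privileged build steps and does not by itself relax registry TLS verification
- ❌ Not recommended for production
- ❌ Temporary workaround, not a solution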
|
||||
|
||||
### Option 3: Traditional Docker Build + Push ✅ **CHOSEN**
|
||||
- ✅ Simple and reliable
|
||||
- ✅ Proven in production
|
||||
- ✅ Better performance in practice
|
||||
- ✅ Easy to maintain
|
||||
|
||||
## Decision Outcome
|
||||
|
||||
**Chosen Option**: Traditional docker build + push (Option 3)
|
||||
|
||||
This decision prioritizes CI/CD reliability and simplicity over advanced features we don't currently need. The traditional approach has been proven to work consistently in our environment and matches the successful pattern from the webapp project.
|
||||
|
||||
## Success Metrics
|
||||
|
||||
### CI/CD Pipeline Metrics
|
||||
|
||||
1. **CI/CD reliability**: No TLS certificate failures
|
||||
2. **Build consistency**: Predictable build times
|
||||
3. **Maintenance**: Reduced complexity and debugging time
|
||||
4. **Compatibility**: Works across all target environments
|
||||
|
||||
### Build Strategy Metrics
|
||||
|
||||
1. **Cache hit rate**: Percentage of CI runs using existing cache
|
||||
2. **Build time reduction**: Comparison of build times with vs without cache
|
||||
3. **Image size**: Production image size vs development image size
|
||||
4. **CI execution time**: Total CI pipeline duration
|
||||
|
||||
### Quality Metrics
|
||||
|
||||
1. **Build reproducibility**: Consistent builds across different environments
|
||||
2. **Error rate**: Reduction in CI/CD failures
|
||||
3. **Recovery time**: Time to recover from cache misses
|
||||
4. **Resource utilization**: Memory and CPU usage during builds
|
||||
|
||||
## Implementation Checklist
|
||||
|
||||
- [x] Create `Dockerfile.prod` for production builds
|
||||
- [x] Update `Dockerfile.build` for build cache
|
||||
- [x] Keep `Dockerfile` for development use
|
||||
- [x] Remove Docker Buildx from CI/CD workflow
|
||||
- [x] Remove Go build steps from CI/CD workflow
|
||||
- [x] Remove certificate configuration step (no longer needed)
|
||||
- [x] Add Docker cache usage to all build steps
|
||||
- [x] Fix Dockerfile.prod to use proper dependency hash (not latest)
|
||||
- [x] Create dependency hash calculation script
|
||||
- [x] Create build cache environment test script
|
||||
- [x] Update CI/CD workflow to generate Dockerfile.prod dynamically
|
||||
- [x] Update ADR 0020 with comprehensive documentation
|
||||
- [x] Test changes locally
|
||||
- [x] Push changes to trigger CI/CD workflow
|
||||
- [ ] Monitor workflow execution
|
||||
- [ ] Verify successful completion
|
||||
- [ ] Document results and metrics
|
||||
|
||||
## Testing and Validation
|
||||
|
||||
### Build Cache Environment Testing
|
||||
|
||||
A comprehensive test script is provided to validate the build cache environment:
|
||||
|
||||
```bash
|
||||
# Test the build cache environment (simulates Gitea act runner)
|
||||
./scripts/test-build-cache-environment.sh
|
||||
```
|
||||
|
||||
This script tests:
|
||||
1. Dependency hash calculation
|
||||
2. Build cache image creation
|
||||
3. Go environment inside container
|
||||
4. Swagger generation
|
||||
5. Go build and test
|
||||
6. Binary build
|
||||
7. Production Dockerfile with cache
|
||||
8. Production container runtime
|
||||
|
||||
### Dependency Hash Calculation
|
||||
|
||||
```bash
|
||||
# Calculate dependency hash (used for cache image tagging)
|
||||
./scripts/calculate-deps-hash.sh
|
||||
|
||||
# Export to file for use in scripts
|
||||
./scripts/calculate-deps-hash.sh deps_hash.env
|
||||
source deps_hash.env
|
||||
echo "Hash: $DEPS_HASH"
|
||||
```
|
||||
|
||||
### Workflow Monitoring
|
||||
|
||||
```bash
|
||||
# Monitor the workflow
|
||||
./scripts/gitea-client.sh monitor-workflow arcodange dance-lessons-coach 420 30
|
||||
|
||||
# Check job status
|
||||
./scripts/gitea-client.sh job-status arcodange dance-lessons-coach 420
|
||||
|
||||
# List workflow jobs
|
||||
./scripts/gitea-client.sh list-workflow-jobs arcodange dance-lessons-coach 420
|
||||
```
|
||||
|
||||
### Validation Commands
|
||||
|
||||
```bash
|
||||
# Verify CI/CD changes
|
||||
./scripts/verify-cicd-changes.sh
|
||||
|
||||
# Test new CI/CD workflow
|
||||
./scripts/test-new-cicd.sh
|
||||
|
||||
# Check Dockerfile syntax
|
||||
docker run --rm -i hadolint/hadolint < Dockerfile.prod
|
||||
```
|
||||
|
||||
## Cleanup and Organization
|
||||
|
||||
### Files Removed
|
||||
|
||||
1. **docker-compose.cicd-test.yml**: Unused Docker Compose file
|
||||
2. **scripts/cicd/**: Old CI/CD test scripts (replaced by main test scripts)
|
||||
|
||||
### Files Organized
|
||||
|
||||
All Dockerfiles moved to `docker/` directory:
|
||||
- `docker/Dockerfile` - Development
|
||||
- `docker/Dockerfile.build` - Build cache
|
||||
- `docker/Dockerfile.prod` - Production (checked-in copy for local development only; CI generates its own)
|
||||
- `docker/Dockerfile.prod.template` - Template
|
||||
|
||||
### Utility Scripts
|
||||
|
||||
- `scripts/calculate-deps-hash.sh` - Consistent hash calculation
|
||||
- `scripts/test-local-ci-cd.sh` - Main local testing
|
||||
- `scripts/test-build-cache-environment.sh` - Build cache testing
|
||||
|
||||
## Expected Outcomes
|
||||
|
||||
1. **Successful workflow execution**: Workflow completes without errors
|
||||
2. **Cache image created**: Build cache image pushed to registry
|
||||
3. **Production image built**: Final Docker image built using generated `docker/Dockerfile.prod`
|
||||
4. **Faster CI execution**: Reduced build times compared to previous approach
|
||||
5. **No certificate errors**: No TLS certificate verification failures
|
||||
6. **Clean organization**: No clutter in root directory
|
||||
|
||||
## References
|
||||
|
||||
- [Docker Buildx Documentation](https://docs.docker.com/buildx/working-with-buildx/)
|
||||
- [Docker Build Documentation](https://docs.docker.com/engine/reference/commandline/build/)
|
||||
- [GitHub Actions Docker Examples](https://github.com/actions/starter-workflows/tree/main/ci-and-cd)
|
||||
- [webapp CI/CD Pipeline](https://gitea.arcodange.fr/arcodange-org/webapp/src/branch/main/.gitea/workflows/dockerimage.yaml)
|
||||
- [Docker Multi-stage Builds](https://docs.docker.com/build/building/multi-stage/)
|
||||
- [Alpine Linux Docker Images](https://hub.docker.com/_/alpine)
|
||||
|
||||
---
|
||||
|
||||
**Approved by**: @arcodange
|
||||
**Date**: 2026-04-07
|
||||
**Updated**: 2026-04-07
|
||||
**Supersedes**: None
|
||||
**Superseded by**: None
|
||||
471
adr/0021-jwt-secret-retention-policy.md
Normal file
@@ -0,0 +1,471 @@
|
||||
# 21. JWT Secret Retention Policy
|
||||
|
||||
## Status
|
||||
**Proposed** 🟡
|
||||
|
||||
> **Note:** Basic JWT multi-secret support and graceful rotation are implemented in `pkg/jwt/jwt_secret_manager.go`. The retention cleanup policy (background job, configurable TTL factor) proposed in this ADR is **not yet implemented**.
|
||||
|
||||
## Context
|
||||
|
||||
The dance-lessons-coach application requires a robust JWT secret management system that balances security and user experience. The system supports multiple JWT secrets for graceful rotation. However, the current implementation lacks a clear policy for secret retention and cleanup.
|
||||
|
||||
### Current State
|
||||
|
||||
- ✅ Multiple JWT secrets supported
|
||||
- ✅ Graceful rotation implemented
|
||||
- ✅ Backward compatibility maintained
|
||||
- ❌ No automatic cleanup of old secrets
|
||||
- ❌ No configurable retention periods
|
||||
- ❌ No expiration-based secret management
|
||||
|
||||
### Problem Statement
|
||||
|
||||
Without a retention policy:
|
||||
1. **Security Risk**: Old secrets accumulate indefinitely, increasing attack surface
|
||||
2. **Memory Bloat**: Unbounded growth of secret storage
|
||||
3. **Operational Overhead**: Manual cleanup required
|
||||
4. **Compliance Issues**: May violate security policies requiring regular key rotation
|
||||
|
||||
### Requirements
|
||||
|
||||
1. **Configurable Retention**: Administrators should control how long secrets are retained
|
||||
2. **Automatic Cleanup**: System should automatically remove expired secrets
|
||||
3. **Backward Compatibility**: Existing tokens should continue working during retention period
|
||||
4. **Sensible Defaults**: Should work out-of-the-box with secure defaults
|
||||
5. **Performance**: Cleanup should not impact runtime performance
|
||||
|
||||
## Decision
|
||||
|
||||
### JWT Secret Retention Policy
|
||||
|
||||
Implement a configurable retention policy based on JWT TTL (Time-To-Live) with the following components:
|
||||
|
||||
#### 1. Configuration Structure
|
||||
|
||||
```yaml
|
||||
jwt:
  # Token time-to-live (default: 24h)
  ttl: 24h

  # Secret retention configuration
  secret_retention:
    # Retention factor multiplier (default: 2.0)
    # Retention period = JWT TTL × retention_factor
    retention_factor: 2.0

    # Maximum retention period (safety limit, default: 72h)
    max_retention: 72h

    # Cleanup frequency for expired secrets (default: 1h)
    cleanup_interval: 1h
|
||||
```
|
||||
|
||||
#### 2. Retention Period Calculation
|
||||
|
||||
```
|
||||
retention_period = min(JWT_TTL × retention_factor, max_retention)
|
||||
```
|
||||
|
||||
**Examples:**
|
||||
- Default (24h TTL, 2.0 factor): `min(48h, 72h) = 48h`
|
||||
- Short-lived tokens (1h TTL, 3.0 factor): `min(3h, 72h) = 3h`
|
||||
- Long-lived tokens (72h TTL, 2.0 factor): `min(144h, 72h) = 72h`
|
||||
|
||||
#### 3. Secret Lifecycle
|
||||
|
||||
```mermaid
|
||||
graph LR
|
||||
A[Secret Created] --> B[Active Period]
|
||||
B --> C{Retention Period}
|
||||
C -->|Expired| D[Marked for Cleanup]
|
||||
C -->|Valid| B
|
||||
D --> E[Automatic Removal]
|
||||
```
|
||||
|
||||
#### 4. Cleanup Process
|
||||
|
||||
- **Frequency**: Configurable interval (default: 1 hour)
|
||||
- **Scope**: Remove secrets older than retention period
|
||||
- **Safety**: Never remove current primary secret
|
||||
- **Logging**: Audit trail of cleanup operations
|
||||
|
||||
### Implementation Strategy
|
||||
|
||||
#### Phase 1: Configuration Framework
|
||||
|
||||
1. **Extend Config Package** (`pkg/config/config.go`)
|
||||
- Add JWT TTL configuration
|
||||
- Add secret retention parameters
|
||||
- Implement validation
|
||||
|
||||
2. **Environment Variables**
|
||||
```bash
|
||||
# JWT Token TTL
|
||||
DLC_JWT_TTL=24h
|
||||
|
||||
# Secret Retention
|
||||
DLC_JWT_SECRET_RETENTION_FACTOR=2.0
|
||||
DLC_JWT_SECRET_MAX_RETENTION=72h
|
||||
DLC_JWT_SECRET_CLEANUP_INTERVAL=1h
|
||||
```
|
||||
|
||||
#### Phase 2: Secret Manager Enhancement
|
||||
|
||||
1. **Enhance JWTSecret Struct**
|
||||
```go
|
||||
type JWTSecret struct {
|
||||
Secret string
|
||||
IsPrimary bool
|
||||
CreatedAt time.Time
|
||||
ExpiresAt *time.Time // Now properly calculated
|
||||
RetentionPeriod time.Duration
|
||||
}
|
||||
```
|
||||
|
||||
2. **Add Expiration Logic**
|
||||
```go
|
||||
func (m *JWTSecretManager) AddSecret(secret string, isPrimary bool, expiresIn time.Duration) {
|
||||
// Calculate retention period based on config
|
||||
retentionPeriod := m.calculateRetentionPeriod()
|
||||
expiresAt := time.Now().Add(expiresIn)
|
||||
|
||||
m.secrets = append(m.secrets, JWTSecret{
|
||||
Secret: secret,
|
||||
IsPrimary: isPrimary,
|
||||
CreatedAt: time.Now(),
|
||||
ExpiresAt: &expiresAt,
|
||||
RetentionPeriod: retentionPeriod,
|
||||
})
|
||||
}
|
||||
```
|
||||
|
||||
#### Phase 3: Automatic Cleanup
|
||||
|
||||
1. **Background Cleanup Job**
|
||||
```go
|
||||
func (m *JWTSecretManager) StartCleanupJob(ctx context.Context, interval time.Duration) {
|
||||
ticker := time.NewTicker(interval)
|
||||
go func() {
|
||||
for {
|
||||
select {
|
||||
case <-ticker.C:
|
||||
m.CleanupExpiredSecrets()
|
||||
case <-ctx.Done():
|
||||
ticker.Stop()
|
||||
return
|
||||
}
|
||||
}
|
||||
}()
|
||||
}
|
||||
```
|
||||
|
||||
2. **Cleanup Implementation**
|
||||
```go
|
||||
func (m *JWTSecretManager) CleanupExpiredSecrets() {
|
||||
now := time.Now()
|
||||
var activeSecrets []JWTSecret
|
||||
|
||||
for _, secret := range m.secrets {
|
||||
if secret.IsPrimary {
|
||||
// Never remove current primary
|
||||
activeSecrets = append(activeSecrets, secret)
|
||||
continue
|
||||
}
|
||||
|
||||
// Check if secret is within retention period
|
||||
if now.Sub(secret.CreatedAt) <= secret.RetentionPeriod {
|
||||
activeSecrets = append(activeSecrets, secret)
|
||||
} else {
|
||||
log.Info().
|
||||
Str("secret", maskSecret(secret.Secret)). // never log the raw secret
|
||||
Msg("Removed expired JWT secret")
|
||||
}
|
||||
}
|
||||
|
||||
m.secrets = activeSecrets
|
||||
}
|
||||
```
|
||||
|
||||
#### Phase 4: Integration
|
||||
|
||||
1. **Server Initialization**
|
||||
```go
|
||||
func (s *Server) InitializeJWT() error {
|
||||
// Load config
|
||||
jwtConfig := s.config.GetJWTConfig()
|
||||
|
||||
// Create secret manager with retention policy
|
||||
secretManager := NewJWTSecretManager(
|
||||
jwtConfig.Secret,
|
||||
WithRetentionFactor(jwtConfig.RetentionFactor),
|
||||
WithMaxRetention(jwtConfig.MaxRetention),
|
||||
)
|
||||
|
||||
// Start cleanup job
|
||||
secretManager.StartCleanupJob(s.ctx, jwtConfig.CleanupInterval)
|
||||
|
||||
return nil
|
||||
}
|
||||
```
|
||||
|
||||
### Validation
|
||||
|
||||
#### 1. Configuration Validation
|
||||
|
||||
```go
|
||||
func (c *Config) ValidateJWTConfig() error {
|
||||
if c.JWT.TTL <= 0 {
|
||||
return fmt.Errorf("jwt.ttl must be positive")
|
||||
}
|
||||
|
||||
if c.JWT.SecretRetention.RetentionFactor < 1.0 {
|
||||
return fmt.Errorf("jwt.secret_retention.retention_factor must be ≥ 1.0")
|
||||
}
|
||||
|
||||
if c.JWT.SecretRetention.MaxRetention <= 0 {
|
||||
return fmt.Errorf("jwt.secret_retention.max_retention must be positive")
|
||||
}
|
||||
|
||||
if c.JWT.SecretRetention.CleanupInterval <= 0 {
|
||||
return fmt.Errorf("jwt.secret_retention.cleanup_interval must be positive")
|
||||
}
|
||||
|
||||
// Ensure max retention is reasonable
|
||||
if c.JWT.SecretRetention.MaxRetention > 720*time.Hour { // 30 days
|
||||
return fmt.Errorf("jwt.secret_retention.max_retention exceeds maximum of 720h")
|
||||
}
|
||||
|
||||
return nil
|
||||
}
|
||||
```
|
||||
|
||||
#### 2. Runtime Validation
|
||||
|
||||
```go
|
||||
func (m *JWTSecretManager) ValidateSecret(secret string) error {
|
||||
// Check minimum length
|
||||
if len(secret) < 16 {
|
||||
return fmt.Errorf("jwt secret must be at least 16 characters")
|
||||
}
|
||||
|
||||
// Check entropy (basic check)
|
||||
if !hasSufficientEntropy(secret) {
|
||||
return fmt.Errorf("jwt secret must have sufficient entropy")
|
||||
}
|
||||
|
||||
return nil
|
||||
}
|
||||
```
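`hasSufficientEntropy` is not defined in this ADR. One possible basic check is sketched below, assuming a Shannon entropy estimate over the bytes of the secret and an arbitrary threshold of 3.0 bits per byte; both the approach and the threshold are hypothetical. Such a heuristic only catches obviously repetitive values and says nothing about how the secret was generated.

```go
package main

import (
	"fmt"
	"math"
)

// shannonBitsPerByte estimates the entropy of s from its byte frequencies.
func shannonBitsPerByte(s string) float64 {
	if len(s) == 0 {
		return 0
	}
	counts := make(map[byte]int)
	for i := 0; i < len(s); i++ {
		counts[s[i]]++
	}
	n := float64(len(s))
	var h float64
	for _, c := range counts {
		p := float64(c) / n
		h -= p * math.Log2(p)
	}
	return h
}

// hasSufficientEntropy applies an assumed threshold of 3.0 bits per byte.
func hasSufficientEntropy(secret string) bool {
	return shannonBitsPerByte(secret) >= 3.0
}

func main() {
	fmt.Println(hasSufficientEntropy("aaaaaaaaaaaaaaaa")) // false
	fmt.Println(hasSufficientEntropy("k3Xp9QzL0vRw7TbY")) // true
}
```

Generating secrets from `crypto/rand` remains the more reliable safeguard than validating them after the fact.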
|
||||
|
||||
### Monitoring and Observability
|
||||
|
||||
#### 1. Metrics
|
||||
|
||||
```go
|
||||
// Prometheus metrics
|
||||
var (
|
||||
jwtSecretsActive = prometheus.NewGauge(prometheus.GaugeOpts{
|
||||
Name: "jwt_secrets_active_count",
|
||||
Help: "Number of active JWT secrets",
|
||||
})
|
||||
|
||||
jwtSecretsExpired = prometheus.NewCounter(prometheus.CounterOpts{
|
||||
Name: "jwt_secrets_expired_total",
|
||||
Help: "Total number of expired JWT secrets removed",
|
||||
})
|
||||
|
||||
jwtSecretRetentionDuration = prometheus.NewHistogram(prometheus.HistogramOpts{
|
||||
Name: "jwt_secret_retention_duration_seconds",
|
||||
Help: "Duration of JWT secret retention periods",
|
||||
Buckets: prometheus.ExponentialBuckets(3600, 2, 6), // 1h to 32h
|
||||
})
|
||||
)
|
||||
```
|
||||
|
||||
#### 2. Logging
|
||||
|
||||
```go
|
||||
func (m *JWTSecretManager) logSecretEvent(secret string, event string, details ...interface{}) {
|
||||
log.Info().
|
||||
Str("secret", maskSecret(secret)).
|
||||
Str("event", event).
|
||||
Interface("details", details).
|
||||
Msg("JWT secret event")
|
||||
}
|
||||
|
||||
func maskSecret(secret string) string {
|
||||
if len(secret) <= 8 { // too short to show both ends without revealing the whole value
|
||||
return "****"
|
||||
}
|
||||
return secret[:4] + "****" + secret[len(secret)-4:]
|
||||
}
|
||||
```
|
||||
|
||||
## Consequences
|
||||
|
||||
### Positive
|
||||
|
||||
1. **Enhanced Security**: Automatic cleanup reduces attack surface
|
||||
2. **Reduced Memory Usage**: Prevents unbounded growth of secret storage
|
||||
3. **Operational Efficiency**: No manual cleanup required
|
||||
4. **Compliance Ready**: Meets security policy requirements for key rotation
|
||||
5. **Flexibility**: Configurable to meet different security requirements
|
||||
|
||||
### Negative
|
||||
|
||||
1. **Complexity**: Adds configuration and cleanup logic
|
||||
2. **Performance Overhead**: Background cleanup job (minimal impact)
|
||||
3. **Migration**: Existing deployments need configuration updates
|
||||
4. **Debugging**: More moving parts to troubleshoot
|
||||
|
||||
### Neutral
|
||||
|
||||
1. **Backward Compatibility**: Existing tokens continue to work
|
||||
2. **Learning Curve**: New configuration options to understand
|
||||
3. **Monitoring**: Additional metrics to track
|
||||
|
||||
## Alternatives Considered
|
||||
|
||||
### Alternative 1: Fixed Retention Period
|
||||
|
||||
**Proposal**: Use fixed retention period (e.g., 48 hours) instead of TTL-based calculation
|
||||
|
||||
**Rejected Because**:
|
||||
- Less flexible for different use cases
|
||||
- Doesn't scale with JWT TTL changes
|
||||
- May be too short for long-lived tokens or too long for short-lived ones
|
||||
|
||||
### Alternative 2: Manual Cleanup Only
|
||||
|
||||
**Proposal**: Require administrators to manually clean up old secrets
|
||||
|
||||
**Rejected Because**:
|
||||
- Operational overhead
|
||||
- Security risk if cleanup is forgotten
|
||||
- Doesn't scale for frequent rotations
|
||||
|
||||
### Alternative 3: No Retention (Current State)
|
||||
|
||||
**Proposal**: Keep current behavior with no automatic cleanup
|
||||
|
||||
**Rejected Because**:
|
||||
- Security concerns with accumulating secrets
|
||||
- Memory management issues
|
||||
- Compliance violations
|
||||
|
||||
## Success Metrics
|
||||
|
||||
1. **Security**: No old secrets remain beyond retention period
|
||||
2. **Reliability**: 99.9% of valid tokens continue to work during rotation
|
||||
3. **Performance**: Cleanup job completes in <100ms with <1000 secrets
|
||||
4. **Adoption**: Configuration used in 100% of deployments within 3 months
|
||||
|
||||
## Migration Plan
|
||||
|
||||
### Phase 1: Preparation (1 week)
|
||||
- ✅ Create this ADR
|
||||
- ✅ Update documentation
|
||||
- ✅ Add configuration to config package
|
||||
- ✅ Implement basic retention logic
|
||||
|
||||
### Phase 2: Testing (2 weeks)
|
||||
- ✅ Write BDD scenarios for retention
|
||||
- ✅ Add unit tests for secret manager
|
||||
- ✅ Test with various TTL/factor combinations
|
||||
- ✅ Performance testing with large secret counts
|
||||
|
||||
### Phase 3: Rollout (1 week)
|
||||
- ✅ Update default configuration
|
||||
- ✅ Add feature flag for gradual rollout
|
||||
- ✅ Monitor metrics in staging
|
||||
- ✅ Gradual production rollout
|
||||
|
||||
### Phase 4: Optimization (Ongoing)
|
||||
- ✅ Monitor cleanup performance
|
||||
- ✅ Adjust defaults based on real-world usage
|
||||
- ✅ Add alerts for cleanup failures
|
||||
- ✅ Document troubleshooting guide
|
||||
|
||||
## References
|
||||
|
||||
- [ADR-0008: BDD Testing](0008-bdd-testing.md)
|
||||
- [ADR-0018: User Management and Auth System](0018-user-management-auth-system.md)
|
||||
- [RFC 7519: JSON Web Tokens](https://tools.ietf.org/html/rfc7519)
|
||||
- [OWASP Key Management Cheat Sheet](https://cheatsheetseries.owasp.org/cheatsheets/Key_Management_Cheat_Sheet.html)
|
||||
|
||||
## Appendix
|
||||
|
||||
### Configuration Examples
|
||||
|
||||
**Development Environment** (short retention for testing):
|
||||
```yaml
|
||||
jwt:
  ttl: 1h
  secret_retention:
    retention_factor: 1.5
    max_retention: 3h
    cleanup_interval: 30m
|
||||
```
|
||||
|
||||
**Production Environment** (secure defaults):
|
||||
```yaml
|
||||
jwt:
  ttl: 24h
  secret_retention:
    retention_factor: 2.0
    max_retention: 72h
    cleanup_interval: 1h
|
||||
```
|
||||
|
||||
**High-Security Environment** (aggressive rotation):
|
||||
```yaml
|
||||
jwt:
  ttl: 8h
  secret_retention:
    retention_factor: 1.5
    max_retention: 24h
    cleanup_interval: 30m
|
||||
```
|
||||
|
||||
### Troubleshooting
|
||||
|
||||
**Issue**: Secrets being removed too quickly
|
||||
- **Check**: Retention factor and JWT TTL settings
|
||||
- **Fix**: Increase retention_factor or JWT TTL
|
||||
|
||||
**Issue**: Too many old secrets accumulating
|
||||
- **Check**: Cleanup job logs and interval
|
||||
- **Fix**: Decrease cleanup_interval or retention_factor
|
||||
|
||||
**Issue**: Performance degradation during cleanup
|
||||
- **Check**: Number of secrets and cleanup frequency
|
||||
- **Fix**: Optimize cleanup algorithm or increase interval
|
||||
|
||||
### FAQ
|
||||
|
||||
**Q: What happens to tokens signed with expired secrets?**
|
||||
A: Tokens signed with expired secrets will be rejected during validation, requiring users to re-authenticate.
|
||||
|
||||
**Q: Can I disable automatic cleanup?**
|
||||
A: Yes, set `cleanup_interval` to a very high value (e.g., `8760h` for 1 year).
|
||||
|
||||
**Q: How does this affect existing deployments?**
|
||||
A: Existing deployments will use sensible defaults. The feature is backward compatible.
|
||||
|
||||
**Q: What's the recommended retention factor?**
|
||||
A: Start with 2.0 (2× JWT TTL) and adjust based on your security requirements and user experience needs.
|
||||
|
||||
**Q: How often should cleanup run?**
|
||||
A: For most deployments, every 1 hour is sufficient. High-volume systems may need more frequent cleanup.
|
||||
|
||||
## Decision Record
|
||||
|
||||
**Approved By**:
|
||||
**Approved Date**:
|
||||
**Implemented By**:
|
||||
**Implementation Date**:
|
||||
|
||||
---
|
||||
|
||||
*Generated by Mistral Vibe*
|
||||
*Co-Authored-By: Mistral Vibe <vibe@mistral.ai>*
|
||||
538
adr/0022-rate-limiting-cache-strategy.md
Normal file
@@ -0,0 +1,538 @@
|
||||
# ADR 0022: Rate Limiting and Cache Strategy
|
||||
|
||||
## Status
|
||||
**Proposed** 🟡
|
||||
|
||||
> **⚠️ Not yet implemented.** Gitea issue #13 ("feat: Implement Rate Limiting and Caching Strategy") is open and tracks this work. `go-cache`, `redis`, and `ulule/limiter` are absent from `go.mod`. The phase checkboxes below are corrected to reflect actual status.
|
||||
|
||||
## Context
|
||||
|
||||
As the dance-lessons-coach application grows and potentially serves multiple users simultaneously, we need to implement rate limiting to:
|
||||
|
||||
1. **Prevent abuse** of API endpoints
|
||||
2. **Protect against DDoS attacks**
|
||||
3. **Ensure fair usage** across all users
|
||||
4. **Maintain system stability** under load
|
||||
5. **Provide consistent performance**
|
||||
|
||||
Additionally, we need a caching strategy to:
|
||||
1. **Reduce database load** for frequently accessed data
|
||||
2. **Improve response times** for common requests
|
||||
3. **Support horizontal scaling** with shared cache
|
||||
4. **Handle cache invalidation** properly
|
||||
|
||||
## Decision
|
||||
|
||||
We will implement a **multi-phase caching and rate limiting strategy** with the following components:
|
||||
|
||||
### Phase 1: In-Memory Cache with TTL Support
|
||||
|
||||
**Library Selection**: We will use **`github.com/patrickmn/go-cache`** for in-memory caching because:
|
||||
|
||||
✅ **Pros:**
|
||||
- Simple, lightweight, and widely used (stable, though it sees little active development)
|
||||
- Built-in TTL (Time-To-Live) support
|
||||
- Thread-safe by default
|
||||
- No external dependencies
|
||||
- Good performance for single-instance applications
|
||||
- Supports automatic expiration
|
||||
|
||||
❌ **Cons:**
|
||||
- Not shared between multiple instances
|
||||
- Memory-bound (not persistent)
|
||||
- Limited advanced features
|
||||
|
||||
**Implementation Plan:**
|
||||
```go
|
||||
type CacheService interface {
|
||||
Set(key string, value interface{}, expiration time.Duration) error
|
||||
Get(key string) (interface{}, bool)
|
||||
Delete(key string) error
|
||||
Flush() error
|
||||
GetWithTTL(key string) (interface{}, time.Duration, bool)
|
||||
}
|
||||
|
||||
type InMemoryCacheService struct {
|
||||
cache *cache.Cache
|
||||
defaultTTL time.Duration
|
||||
cleanupInterval time.Duration
|
||||
}
|
||||
```
|
||||
|
||||
**Use Cases:**
|
||||
- JWT token validation results
|
||||
- User session data
|
||||
- Frequently accessed greet messages
|
||||
- API response caching for idempotent endpoints
|
||||
|
||||
### Phase 2: Redis-Compatible Shared Cache
|
||||
|
||||
**Library Selection**: We will use **`github.com/redis/go-redis/v9`** with a **Redis-compatible alternative server**:
|
||||
|
||||
**Primary Choice**: **Dragonfly** (https://www.dragonflydb.io/)
|
||||
- Redis-compatible
|
||||
- Source-available (Business Source License 1.1, not an OSI-approved open-source license); review the terms before adopting
|
||||
- Written in C++ with multi-threaded architecture
|
||||
- Up to 25x the throughput of Redis according to vendor benchmarks (not verified for our workload)
|
||||
- Lower latency
|
||||
- Drop-in Redis replacement
|
||||
|
||||
**Fallback Choice**: **KeyDB** (https://keydb.dev/)
|
||||
- Multi-threaded Redis fork
|
||||
- Open-source (BSD 3-Clause license)
|
||||
- Better performance than Redis
|
||||
- Full Redis API compatibility
|
||||
|
||||
**Implementation Plan:**
|
||||
```go
|
||||
type RedisCacheService struct {
|
||||
client *redis.Client
|
||||
defaultTTL time.Duration
|
||||
prefix string
|
||||
}
|
||||
|
||||
func NewRedisCacheService(config *config.CacheConfig) (*RedisCacheService, error) {
|
||||
client := redis.NewClient(&redis.Options{
|
||||
Addr: config.Host + ":" + strconv.Itoa(config.Port),
|
||||
Password: config.Password,
|
||||
DB: config.Database,
|
||||
PoolSize: config.PoolSize,
|
||||
})
|
||||
|
||||
// Test connection
|
||||
_, err := client.Ping(context.Background()).Result()
|
||||
if err != nil {
|
||||
return nil, fmt.Errorf("failed to connect to Redis: %w", err)
|
||||
}
|
||||
|
||||
return &RedisCacheService{
|
||||
client: client,
|
||||
defaultTTL: config.DefaultTTL,
|
||||
prefix: config.Prefix,
|
||||
}, nil
|
||||
}
|
||||
```
|
||||
|
||||
**Configuration:**
|
||||
```yaml
|
||||
cache:
  # In-memory cache configuration
  in_memory:
    enabled: true
    default_ttl: 5m
    cleanup_interval: 10m
    max_items: 10000

  # Redis-compatible cache configuration
  redis:
    enabled: false
    host: "localhost"
    port: 6379
    password: ""
    database: 0
    pool_size: 10
    default_ttl: 5m
    prefix: "dlc:"
    use_dragonfly: true # Set to false to use KeyDB
|
||||
```
|
||||
|
||||
### Phase 3: Rate Limiting Implementation
|
||||
|
||||
**Library Selection**: We will use **`github.com/ulule/limiter/v3`** because:
|
||||
|
||||
✅ **Pros:**
|
||||
- Multiple storage backends (in-memory, Redis, etc.)
|
||||
- Sliding window algorithm
|
||||
- Distributed rate limiting support
|
||||
- Configurable rate limits
|
||||
- Standard `net/http` middleware, which can be used with the Chi router
|
||||
- Good performance
|
||||
|
||||
**Implementation Plan:**
|
||||
```go
|
||||
// Rate limit configuration
|
||||
type RateLimitConfig struct {
|
||||
Enabled bool `mapstructure:"enabled"`
|
||||
RequestsPerHour int `mapstructure:"requests_per_hour"`
|
||||
BurstLimit int `mapstructure:"burst_limit"`
|
||||
IPWhitelist []string `mapstructure:"ip_whitelist"`
|
||||
EndpointSpecific map[string]struct {
|
||||
RequestsPerHour int `mapstructure:"requests_per_hour"`
|
||||
BurstLimit int `mapstructure:"burst_limit"`
|
||||
} `mapstructure:"endpoint_specific"`
|
||||
}
|
||||
|
||||
// Rate limiter service
|
||||
type RateLimiterService struct {
|
||||
limiter *limiter.Limiter
|
||||
store limiter.Store
|
||||
config *RateLimitConfig
|
||||
}
|
||||
|
||||
// NewRateLimiterService builds the limiter. A non-nil redisClient selects the
// Redis store (counters shared across instances); nil selects the in-memory store.
// Store constructors live in sub-packages of ulule/limiter:
//   sredis "github.com/ulule/limiter/v3/drivers/store/redis"
//   memory "github.com/ulule/limiter/v3/drivers/store/memory"
func NewRateLimiterService(config *RateLimitConfig, redisClient *redis.Client, redisPrefix string) (*RateLimiterService, error) {
	var (
		store limiter.Store
		err   error
	)

	// Use Redis if available, otherwise use in-memory
	if redisClient != nil {
		// Initialize Redis store
		store, err = sredis.NewStoreWithOptions(redisClient, limiter.StoreOptions{
			Prefix: redisPrefix,
			// ... other Redis options
		})
		if err != nil {
			return nil, fmt.Errorf("failed to create rate limiter store: %w", err)
		}
	} else {
		// Use in-memory store
		store = memory.NewStore()
	}

	// Create rate limiter
	rate := limiter.Rate{
		Period: time.Hour,
		Limit:  int64(config.RequestsPerHour),
	}

	return &RateLimiterService{
		limiter: limiter.New(store, rate),
		store:   store,
		config:  config,
	}, nil
}
|
||||
```
|
||||
|
||||
**Chi Middleware:**
|
||||
```go
|
||||
func RateLimitMiddleware(rl *RateLimiterService) func(http.Handler) http.Handler {
	return func(next http.Handler) http.Handler {
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			// Resolve the client IP. X-Real-IP is only trustworthy when it is
			// set by a trusted reverse proxy in front of the application.
			clientIP := r.Header.Get("X-Real-IP")
			if clientIP == "" {
				// RemoteAddr has the form "host:port"; keep only the host.
				host, _, err := net.SplitHostPort(r.RemoteAddr)
				if err != nil {
					host = r.RemoteAddr
				}
				clientIP = host
			}

			// Skip rate limiting for whitelisted IPs
			for _, allowedIP := range rl.config.IPWhitelist {
				if clientIP == allowedIP {
					next.ServeHTTP(w, r)
					return
				}
			}

			// Get rate limit context (this also counts the current request)
			limitCtx, err := rl.limiter.Get(r.Context(), clientIP)
			if err != nil {
				log.Error().Err(err).Str("ip", clientIP).Msg("Rate limit error")
				http.Error(w, "Internal server error", http.StatusInternalServerError)
				return
			}

			// Set rate limit headers. Limit, Remaining and Reset are int64 values;
			// Reset is a Unix timestamp in seconds.
			w.Header().Set("X-RateLimit-Limit", strconv.FormatInt(limitCtx.Limit, 10))
			w.Header().Set("X-RateLimit-Remaining", strconv.FormatInt(limitCtx.Remaining, 10))
			w.Header().Set("X-RateLimit-Reset", strconv.FormatInt(limitCtx.Reset, 10))

			// Check if rate limit is exceeded (Reached is a bool)
			if limitCtx.Reached {
				http.Error(w, "Too many requests", http.StatusTooManyRequests)
				return
			}

			next.ServeHTTP(w, r)
		})
	}
}
|
||||
```
|
||||
|
||||
### Phase 4: Cache Invalidation Strategy
|
||||
|
||||
**Approach**: Hybrid cache invalidation with multiple strategies:
|
||||
|
||||
1. **Time-Based Expiration (TTL)**
|
||||
- All cache entries have a TTL
|
||||
- Automatic expiration prevents stale data
|
||||
- Default TTL: 5 minutes for most data
|
||||
|
||||
2. **Event-Based Invalidation**
|
||||
- Cache keys are invalidated on specific events
|
||||
- Example: User data cache invalidated on user update
|
||||
- Uses pub/sub pattern for distributed invalidation
|
||||
|
||||
3. **Versioned Cache Keys**
|
||||
- Cache keys include data version
|
||||
- When data changes, version increments
|
||||
- Old cache entries naturally expire
|
||||
|
||||
4. **Write-Through Caching**
|
||||
- Data written to database and cache simultaneously
|
||||
- Ensures cache is always up-to-date
|
||||
- Used for critical data that must be consistent
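
The write-through strategy can be illustrated with a small sketch. Everything here is hypothetical and not part of the ADR: the `Store` interface, the `mapStore` stand-in for the database, and the `WriteThrough` type exist only to show the ordering of operations (persist first, update the cache only on success).

```go
package main

import (
	"errors"
	"fmt"
)

// Store is a hypothetical stand-in for the database layer.
type Store interface {
	Save(key, value string) error
}

// mapStore is an in-memory Store used only for this illustration.
type mapStore struct {
	data map[string]string
	fail bool // when true, Save simulates a database outage
}

func (m *mapStore) Save(key, value string) error {
	if m.fail {
		return errors.New("store unavailable")
	}
	m.data[key] = value
	return nil
}

// WriteThrough persists to the store first and updates the cache only on success.
// It is not safe for concurrent use; a real service would guard the cache with a mutex.
type WriteThrough struct {
	store Store
	cache map[string]string
}

// Set writes the value to the store, then mirrors it into the cache.
func (w *WriteThrough) Set(key, value string) error {
	if err := w.store.Save(key, value); err != nil {
		// Drop any cached copy so readers fall back to the source of truth.
		delete(w.cache, key)
		return fmt.Errorf("write-through failed for %q: %w", key, err)
	}
	w.cache[key] = value
	return nil
}

func main() {
	db := &mapStore{data: map[string]string{}}
	wt := &WriteThrough{store: db, cache: map[string]string{}}

	fmt.Println(wt.Set("dlc:user:123", "alice"), wt.cache["dlc:user:123"])

	db.fail = true
	fmt.Println(wt.Set("dlc:user:123", "bob") != nil, len(wt.cache))
}
```

Deleting the cached entry on a failed write is one possible choice; it trades an extra cache miss for the guarantee that the cache never holds a value the database rejected.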
|
||||
|
||||
**Cache Key Strategy:**
|
||||
```go
|
||||
func GetCacheKey(prefix, entityType, entityID string) string {
|
||||
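	// The configured prefix ("dlc:") already ends with a colon; trim it to avoid "dlc::user:123".
	return fmt.Sprintf("%s:%s:%s", strings.TrimSuffix(prefix, ":"), entityType, entityID)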
|
||||
}
|
||||
|
||||
// Example: "dlc:user:123"
|
||||
// Example: "dlc:jwt:validation:token_hash"
|
||||
```
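
The versioned-key strategy from the list above could be combined with this key format roughly as follows. This is a sketch under assumptions: `versionStore` and `VersionedKey` are hypothetical names, and the version counter is kept in process memory here, whereas a distributed deployment would need to keep it in the shared cache.

```go
package main

import (
	"fmt"
	"sync"
)

// versionStore tracks a monotonically increasing version per entity.
// Hypothetical helper for illustration only.
type versionStore struct {
	mu       sync.Mutex
	versions map[string]uint64
}

func newVersionStore() *versionStore {
	return &versionStore{versions: make(map[string]uint64)}
}

// Bump increments the version, so keys built from older versions are no longer read.
func (s *versionStore) Bump(entity string) uint64 {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.versions[entity]++
	return s.versions[entity]
}

// Current returns the latest version (0 if the entity was never bumped).
func (s *versionStore) Current(entity string) uint64 {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.versions[entity]
}

// VersionedKey builds "prefix:entityType:entityID:vN".
func VersionedKey(prefix, entityType, entityID string, version uint64) string {
	return fmt.Sprintf("%s:%s:%s:v%d", prefix, entityType, entityID, version)
}

func main() {
	vs := newVersionStore()
	fmt.Println(VersionedKey("dlc", "user", "123", vs.Current("user:123")))

	// After a user update, bump the version; the old entry simply ages out via its TTL.
	vs.Bump("user:123")
	fmt.Println(VersionedKey("dlc", "user", "123", vs.Current("user:123")))
}
```

No explicit delete is needed with this approach, which is why it pairs naturally with TTL-based expiration.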
|
||||
|
||||
## Implementation Phases
|
||||
|
||||
### Phase 1: In-Memory Cache (Current Sprint)
|
||||
- ❌ Research and select in-memory cache library
|
||||
- ❌ Implement cache interface and in-memory service
|
||||
- ❌ Add cache configuration to config package
|
||||
- ❌ Implement basic cache operations (set, get, delete)
|
||||
- ❌ Add TTL support and automatic cleanup
|
||||
- ❌ Cache JWT validation results
|
||||
- ❌ Add cache metrics and monitoring
|
||||
|
||||
### Phase 2: Redis-Compatible Cache (Next Sprint)
|
||||
- ❌ Set up Dragonfly/KeyDB in development environment
|
||||
- ❌ Implement Redis cache service
|
||||
- ❌ Add configuration for Redis connection
|
||||
- ❌ Implement cache fallback strategy (Redis → in-memory)
|
||||
- ❌ Add health checks for Redis connection
|
||||
- ❌ Implement distributed cache invalidation
|
||||
|
||||
### Phase 3: Rate Limiting (Following Sprint)
|
||||
- ❌ Research and select rate limiting library
|
||||
- ❌ Implement rate limiter service
|
||||
- ❌ Add rate limit configuration
|
||||
- ❌ Implement Chi middleware for rate limiting
|
||||
- ❌ Add rate limit headers to responses
|
||||
- ❌ Implement IP whitelisting
|
||||
- ❌ Add endpoint-specific rate limits
|
||||
|
||||
### Phase 4: Advanced Features (Future)
|
||||
- ❌ Cache warming for critical data
|
||||
- ❌ Two-level caching (Redis + in-memory)
|
||||
- ❌ Cache compression for large objects
|
||||
- ❌ Rate limit exemptions for admin users
|
||||
- ❌ Dynamic rate limit adjustment
|
||||
- ❌ Cache analytics and usage patterns
|
||||
|
||||
## Configuration
|
||||
|
||||
```yaml
|
||||
# Cache configuration
|
||||
cache:
|
||||
in_memory:
|
||||
enabled: true
|
||||
default_ttl: "5m"
|
||||
cleanup_interval: "10m"
|
||||
max_items: 10000
|
||||
|
||||
redis:
|
||||
enabled: false
|
||||
host: "localhost"
|
||||
port: 6379
|
||||
password: ""
|
||||
database: 0
|
||||
pool_size: 10
|
||||
default_ttl: "5m"
|
||||
prefix: "dlc:"
|
||||
use_dragonfly: true
|
||||
|
||||
# Rate limiting configuration
|
||||
rate_limiting:
|
||||
enabled: true
|
||||
requests_per_hour: 1000
|
||||
burst_limit: 100
|
||||
ip_whitelist:
|
||||
- "127.0.0.1"
|
||||
- "::1"
|
||||
endpoint_specific:
|
||||
"/api/v1/auth/login":
|
||||
requests_per_hour: 100
|
||||
burst_limit: 10
|
||||
"/api/v1/auth/register":
|
||||
requests_per_hour: 50
|
||||
burst_limit: 5
|
||||
```
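
As an illustration of how the `endpoint_specific` overrides in this configuration might be resolved at request time, here is a minimal sketch. The type and method names (`EndpointLimit`, `RateLimitSettings`, `LimitFor`) are hypothetical, and exact path matching is an assumption; the ADR does not say whether prefixes or route patterns should match.

```go
package main

import "fmt"

// EndpointLimit mirrors the requests_per_hour / burst_limit pair from the YAML above.
type EndpointLimit struct {
	RequestsPerHour int
	BurstLimit      int
}

// RateLimitSettings holds the global default and the per-endpoint overrides.
type RateLimitSettings struct {
	Default          EndpointLimit
	EndpointSpecific map[string]EndpointLimit
}

// LimitFor returns the override for an exact path match, or the default otherwise.
func (s RateLimitSettings) LimitFor(path string) EndpointLimit {
	if l, ok := s.EndpointSpecific[path]; ok {
		return l
	}
	return s.Default
}

func main() {
	settings := RateLimitSettings{
		Default: EndpointLimit{RequestsPerHour: 1000, BurstLimit: 100},
		EndpointSpecific: map[string]EndpointLimit{
			"/api/v1/auth/login":    {RequestsPerHour: 100, BurstLimit: 10},
			"/api/v1/auth/register": {RequestsPerHour: 50, BurstLimit: 5},
		},
	}

	fmt.Println(settings.LimitFor("/api/v1/auth/login"))
	fmt.Println(settings.LimitFor("/api/v1/greet"))
}
```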
|
||||
|
||||
## Monitoring and Metrics
|
||||
|
||||
**Cache Metrics:**
|
||||
- Cache hit/miss ratio
|
||||
- Average cache latency
|
||||
- Cache size and memory usage
|
||||
- Eviction rate
|
||||
- TTL distribution
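
The hit/miss ratio in this list can be derived from two counters. The sketch below uses only the standard library to show the arithmetic; `cacheStats` is a hypothetical type, and in the real service the counts would more likely come from the Prometheus counters described later in this section.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// cacheStats counts lookups; safe for concurrent use.
type cacheStats struct {
	hits   atomic.Int64
	misses atomic.Int64
}

func (s *cacheStats) RecordHit()  { s.hits.Add(1) }
func (s *cacheStats) RecordMiss() { s.misses.Add(1) }

// HitRatio returns hits / (hits + misses), or 0 when nothing has been recorded.
func (s *cacheStats) HitRatio() float64 {
	h, m := s.hits.Load(), s.misses.Load()
	if h+m == 0 {
		return 0
	}
	return float64(h) / float64(h+m)
}

func main() {
	var stats cacheStats
	for i := 0; i < 3; i++ {
		stats.RecordHit()
	}
	stats.RecordMiss()
	fmt.Printf("hit ratio: %.2f\n", stats.HitRatio()) // prints "hit ratio: 0.75"
}
```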
|
||||
|
||||
**Rate Limit Metrics:**
|
||||
- Requests allowed vs rejected
|
||||
- Rate limit exceeded events
|
||||
- Top limited IPs
|
||||
- Endpoint-specific rate limit usage
|
||||
|
||||
**Prometheus Metrics:**
|
||||
```go
|
||||
var (
|
||||
cacheHits = prometheus.NewCounterVec(prometheus.CounterOpts{
|
||||
Name: "cache_hits_total",
|
||||
Help: "Number of cache hits",
|
||||
}, []string{"cache_type", "entity_type"})
|
||||
|
||||
cacheMisses = prometheus.NewCounterVec(prometheus.CounterOpts{
|
||||
Name: "cache_misses_total",
|
||||
Help: "Number of cache misses",
|
||||
}, []string{"cache_type", "entity_type"})
|
||||
|
||||
rateLimitExceeded = prometheus.NewCounterVec(prometheus.CounterOpts{
|
||||
Name: "rate_limit_exceeded_total",
|
||||
Help: "Number of rate limit exceeded events",
|
||||
}, []string{"endpoint", "ip"})
|
||||
)
|
||||
```
|
||||
|
||||
## Security Considerations
|
||||
|
||||
1. **Cache Security:**
|
||||
- Never cache sensitive user data (passwords, tokens)
|
||||
- Use separate cache prefixes for different data types
|
||||
- Implement cache key hashing for sensitive data
|
||||
- Set appropriate TTLs to limit exposure
|
||||
|
||||
2. **Rate Limit Security:**
|
||||
- Prevent rate limit bypass attacks
|
||||
- Use the `X-Real-IP` header for IP detection only when it is set by a trusted reverse proxy; clients can forge it otherwise
|
||||
- Implement rate limit for authentication endpoints
|
||||
- Log rate limit violations for security monitoring
|
||||
|
||||
3. **Redis Security:**
|
||||
- Use authentication if enabled
|
||||
- Implement TLS for Redis connections
|
||||
- Use separate database numbers for different environments
|
||||
- Limit Redis commands to prevent abuse
|
||||
|
||||
## Performance Considerations
|
||||
|
||||
1. **Cache Performance:**
|
||||
- Benchmark cache operations
|
||||
- Monitor cache latency
|
||||
- Optimize cache key size
|
||||
- Use appropriate data structures
|
||||
|
||||
2. **Rate Limit Performance:**
|
||||
- Use efficient rate limiting algorithm
|
||||
- Minimize middleware overhead
|
||||
- Cache rate limit decisions
|
||||
- Batch rate limit checks where possible
|
||||
|
||||
3. **Memory Management:**
|
||||
- Set reasonable cache size limits
|
||||
- Monitor memory usage
|
||||
- Implement cache eviction policies
|
||||
- Use memory-efficient data structures
|
||||
|
||||
## Migration Strategy
|
||||
|
||||
### From No Cache to In-Memory Cache
|
||||
1. Implement cache interface and in-memory service
|
||||
2. Add cache configuration with sensible defaults
|
||||
3. Gradually add caching to critical endpoints
|
||||
4. Monitor cache performance and hit ratios
|
||||
5. Adjust TTLs based on usage patterns
|
||||
|
||||
### From In-Memory to Redis Cache
|
||||
1. Set up Dragonfly/KeyDB in development
|
||||
2. Implement Redis cache service
|
||||
3. Add fallback logic (Redis → in-memory)
|
||||
4. Test with both caches enabled
|
||||
5. Gradually migrate to Redis-only
|
||||
6. Monitor distributed cache performance
|
||||
|
||||
### From No Rate Limiting to Rate Limiting
|
||||
1. Implement rate limiter with generous limits
|
||||
2. Add monitoring for rate limit events
|
||||
3. Gradually tighten limits based on usage
|
||||
4. Add IP whitelist for critical services
|
||||
5. Implement endpoint-specific limits
|
||||
6. Monitor and adjust as needed
|
||||
|
||||
## Alternatives Considered
|
||||
|
||||
### Cache Libraries
|
||||
1. **`github.com/bluele/gcache`** - More features but more complex
|
||||
2. **`github.com/allegro/bigcache`** - High performance but no per-entry TTL (only a single global eviction window)
|
||||
3. **`github.com/coocood/freecache`** - Very fast but limited API
|
||||
|
||||
### Redis Alternatives
|
||||
1. **Redis Enterprise** - Commercial, not open-source
|
||||
2. **Memcached** - No persistence, simpler protocol
|
||||
3. **Couchbase** - More complex, document-oriented
|
||||
|
||||
### Rate Limiting Libraries
|
||||
1. **`golang.org/x/time/rate`** - Simple but no distributed support
|
||||
2. **`github.com/juju/ratelimit`** - Good but limited features
|
||||
3. **Custom implementation** - Too much development effort
|
||||
|
||||
## Success Metrics
|
||||
|
||||
1. **Cache Effectiveness:**
|
||||
- Cache hit ratio > 80%
|
||||
- Average cache latency < 1ms
|
||||
- Memory usage within limits
|
||||
|
||||
2. **Rate Limiting Effectiveness:**
|
||||
- < 1% of legitimate requests blocked
|
||||
- Effective protection against abuse
|
||||
- No impact on normal usage patterns
|
||||
|
||||
3. **System Stability:**
|
||||
- Reduced database load by 50%
|
||||
- Consistent response times under load
|
||||
- No cache-related outages
|
||||
|
||||
## Risks and Mitigations
|
||||
|
||||
| Risk | Mitigation |
|
||||
|------|------------|
|
||||
| Cache stampede | Implement cache warming and fallback logic |
|
||||
| Memory exhaustion | Set reasonable cache size limits and monitor usage |
|
||||
| Redis failure | Implement fallback to in-memory cache |
|
||||
| Rate limit false positives | Start with generous limits and monitor |
|
||||
| Performance degradation | Benchmark before and after implementation |
|
||||
| Cache inconsistency | Use appropriate invalidation strategies |
|
||||
|
||||
## Future Enhancements
|
||||
|
||||
1. **Cache Pre-warming** - Load frequently used data at startup
|
||||
2. **Two-Level Caching** - Local cache + distributed cache
|
||||
3. **Cache Compression** - For large cache objects
|
||||
4. **Dynamic Rate Limits** - Adjust based on system load
|
||||
5. **User-Specific Rate Limits** - Different limits for different user tiers
|
||||
6. **Cache Analytics** - Detailed usage patterns and optimization
|
||||
|
||||
## References
|
||||
|
||||
- [go-cache documentation](https://github.com/patrickmn/go-cache)
|
||||
- [Dragonfly documentation](https://www.dragonflydb.io/docs)
|
||||
- [KeyDB documentation](https://keydb.dev/)
|
||||
- [limiter/v3 documentation](https://github.com/ulule/limiter)
|
||||
- [Chi middleware documentation](https://github.com/go-chi/chi)
|
||||
|
||||
## Decision Drivers
|
||||
|
||||
1. **Simplicity** - Easy to implement and maintain
|
||||
2. **Performance** - Minimal impact on response times
|
||||
3. **Scalability** - Support for horizontal scaling
|
||||
4. **Reliability** - Graceful degradation on failures
|
||||
5. **Open Source** - Preference for open-source solutions
|
||||
6. **Community** - Active development and support
|
||||
|
||||
## Conclusion
|
||||
|
||||
This ADR proposes a comprehensive caching and rate limiting strategy that will significantly improve the performance, scalability, and reliability of the dance-lessons-coach application. The phased approach allows for gradual implementation and testing, minimizing risk while delivering value at each stage.
|
||||
|
||||
The combination of in-memory caching for single-instance deployments and Redis-compatible caching for distributed environments provides flexibility for different deployment scenarios. The rate limiting implementation will protect the application from abuse while maintaining a good user experience.
|
||||
|
||||
This strategy aligns with our architectural principles of simplicity, performance, and scalability while using well-established open-source technologies with strong community support.
|
||||
266
adr/0023-config-hot-reloading.md
Normal file
@@ -0,0 +1,266 @@
|
||||
# Config Hot Reloading Strategy
|
||||
|
||||
* Status: Proposed
|
||||
* Deciders: Gabriel Radureau, AI Agent
|
||||
* Date: 2026-04-05
|
||||
|
||||
> **⚠️ Not yet implemented.** No `ConfigManager` exists in `pkg/config/` and Viper's `WatchConfig()` is not wired up. However, `features/config/config_hot_reloading.feature` has been written — BDD scenarios exist for a feature that is not yet built. Those tests are expected to fail until implementation begins.
|
||||
|
||||
## Context and Problem Statement
|
||||
|
||||
The dance-lessons-coach application currently loads configuration once at startup using Viper, which supports file-based configuration, environment variables, and defaults. However, the current implementation does not support runtime configuration changes without restarting the application.
|
||||
|
||||
We need to determine whether and how to implement config hot reloading - the ability to detect changes to the optional `config.yaml` file and apply those changes without requiring a full application restart.
|
||||
|
||||
## Decision Drivers
|
||||
|
||||
* **Development convenience**: Hot reloading would allow developers to change configuration without restarting the server during development
|
||||
* **Production flexibility**: Ability to adjust certain configuration parameters without downtime
|
||||
* **Complexity**: Hot reloading adds significant complexity to the codebase
|
||||
* **Safety**: Some configuration changes require careful handling to avoid runtime errors
|
||||
* **Viper capabilities**: Viper already supports file watching through `viper.WatchConfig()`
|
||||
* **Configuration scope**: Not all configuration parameters can or should be hot-reloaded
|
||||
|
||||
## Considered Options
|
||||
|
||||
### Option 1: Full Hot Reloading with Viper WatchConfig
|
||||
|
||||
Implement comprehensive hot reloading using Viper's built-in `WatchConfig()` functionality to monitor the config file and automatically reload when changes are detected.
|
||||
|
||||
### Option 2: Selective Hot Reloading
|
||||
|
||||
Only allow hot reloading for specific configuration sections that are safe to change at runtime (e.g., logging level, feature flags) while requiring restart for others (e.g., server host/port, database credentials).
|
||||
|
||||
### Option 3: Manual Reload Endpoint
|
||||
|
||||
Add an admin endpoint (e.g., `POST /api/admin/reload-config`) that triggers configuration reload when called, giving explicit control over when reloading happens.
|
||||
|
||||
### Option 4: No Hot Reloading
|
||||
|
||||
Maintain the current approach of loading configuration only at startup, requiring application restart for any configuration changes.
|
||||
|
||||
## Decision Outcome
|
||||
|
||||
Chosen option: **"Selective Hot Reloading"** because it provides the benefits of runtime configuration changes while maintaining safety and control. This approach:
|
||||
|
||||
* Allows safe configuration changes without restart
|
||||
* Prevents dangerous runtime changes to critical parameters
|
||||
* Leverages Viper's existing capabilities
|
||||
* Provides a clear boundary between hot-reloadable and non-hot-reloadable settings
|
||||
|
||||
## Implementation Strategy
|
||||
|
||||
### Hot-Reloadable Configuration
|
||||
|
||||
The following configuration parameters will support hot reloading:
|
||||
|
||||
* **Logging level** (`logging.level`)
|
||||
* **Feature flags** (`api.v2_enabled`)
|
||||
* **Telemetry sampling** (`telemetry.sampler.type`, `telemetry.sampler.ratio`)
|
||||
* **JWT TTL** (`auth.jwt.ttl`)
|
||||
|
||||
### Non-Hot-Reloadable Configuration
|
||||
|
||||
These parameters will require application restart:
|
||||
|
||||
* **Server settings** (`server.host`, `server.port`)
|
||||
* **Database credentials** (`database.*`)
|
||||
* **JWT secret** (`auth.jwt_secret`)
|
||||
* **Admin credentials** (`auth.admin_master_password`)
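
One way to enforce the boundary between the two lists is an explicit allowlist of hot-reloadable keys. The sketch below is illustrative only: `hotReloadable` and `SplitChanges` are hypothetical names, and the key strings are copied from the lists above.

```go
package main

import (
	"fmt"
	"sort"
)

// hotReloadable lists the keys this ADR allows to change at runtime.
var hotReloadable = map[string]bool{
	"logging.level":           true,
	"api.v2_enabled":          true,
	"telemetry.sampler.type":  true,
	"telemetry.sampler.ratio": true,
	"auth.jwt.ttl":            true,
}

// SplitChanges partitions changed keys into those that can be applied
// immediately and those that require an application restart.
func SplitChanges(changed []string) (apply, restart []string) {
	for _, key := range changed {
		if hotReloadable[key] {
			apply = append(apply, key)
		} else {
			restart = append(restart, key)
		}
	}
	sort.Strings(apply)
	sort.Strings(restart)
	return apply, restart
}

func main() {
	apply, restart := SplitChanges([]string{"server.port", "logging.level", "api.v2_enabled"})
	fmt.Println("apply now:", apply)
	fmt.Println("needs restart:", restart)
}
```

Treating unknown keys as restart-only keeps the default safe: a new setting is never hot-reloaded unless someone adds it to the allowlist deliberately.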
|
||||
|
||||
### Implementation Plan
|
||||
|
||||
```go
|
||||
// Add to config package
|
||||
type ConfigManager struct {
|
||||
config *Config
|
||||
viper *viper.Viper
|
||||
changeChan chan struct{}
|
||||
stopChan chan struct{}
|
||||
}
|
||||
|
||||
func NewConfigManager() (*ConfigManager, error) {
|
||||
// Initialize Viper and load initial config
|
||||
// Start file watcher if config file exists
|
||||
}
|
||||
|
||||
func (cm *ConfigManager) StartWatching() {
|
||||
if cm.viper != nil {
|
||||
cm.viper.WatchConfig()
|
||||
cm.viper.OnConfigChange(func(e fsnotify.Event) {
|
||||
cm.handleConfigChange()
|
||||
})
|
||||
}
|
||||
}
|
||||
|
||||
func (cm *ConfigManager) handleConfigChange() {
|
||||
// Reload only safe configuration sections
|
||||
// Update logging level if changed
|
||||
// Update feature flags if changed
|
||||
// Notify other components of changes
|
||||
|
||||
log.Info().Msg("Configuration reloaded (partial)")
|
||||
}
|
||||
|
||||
// Safe getter methods that work with hot reloading
|
||||
func (cm *ConfigManager) GetLogLevel() string {
|
||||
// Return current value, potentially updated via hot reload
|
||||
}
|
||||
```
|
||||
|
||||
### Configuration File Monitoring
|
||||
|
||||
```go
|
||||
// In main application setup
|
||||
func main() {
|
||||
configManager, err := config.NewConfigManager()
|
||||
if err != nil {
|
||||
log.Fatal().Err(err).Msg("Failed to initialize config")
|
||||
}
|
||||
|
||||
// Start watching for config changes
|
||||
configManager.StartWatching()
|
||||
|
||||
// Use configManager throughout application instead of direct config access
|
||||
}
|
||||
```
|
||||
|
||||
## Pros and Cons of the Options
|
||||
|
||||
### Option 1: Full Hot Reloading with Viper WatchConfig
|
||||
|
||||
* **Good**: Maximum flexibility for configuration changes
|
||||
* **Good**: Leverages Viper's built-in capabilities
|
||||
* **Good**: Good for development workflow
|
||||
* **Bad**: High risk of runtime errors from unsafe changes
|
||||
* **Bad**: Complex to implement safely
|
||||
* **Bad**: Hard to debug configuration-related issues
|
||||
|
||||
### Option 2: Selective Hot Reloading (Chosen)
|
||||
|
||||
* **Good**: Safe approach with clear boundaries
|
||||
* **Good**: Balances flexibility and stability
|
||||
* **Good**: Easier to implement and maintain
|
||||
* **Good**: Clear documentation of what can be changed
|
||||
* **Bad**: More complex than no hot reloading
|
||||
* **Bad**: Requires careful design of config access patterns
|
||||
|
||||
### Option 3: Manual Reload Endpoint
|
||||
|
||||
* **Good**: Explicit control over when reloading happens
|
||||
* **Good**: Can be secured with authentication
|
||||
* **Good**: Good for production environments
|
||||
* **Bad**: Less convenient for development
|
||||
* **Bad**: Requires additional API endpoint management
|
||||
* **Bad**: Still needs same safety considerations as automatic reloading
|
||||
|
||||
### Option 4: No Hot Reloading
|
||||
|
||||
* **Good**: Simplest approach
|
||||
* **Good**: No risk of runtime configuration errors
|
||||
* **Good**: Easier to reason about application state
|
||||
* **Bad**: Requires restart for any configuration change
|
||||
* **Bad**: Less flexible for production adjustments
|
||||
* **Bad**: Slower development iteration
|
||||
|
||||
## Configuration Change Handling
|
||||
|
||||
### Safe Change Pattern
|
||||
|
||||
```go
|
||||
// Example: Logging level change
|
||||
func (cm *ConfigManager) handleConfigChange() {
|
||||
// Get new config values
|
||||
newConfig := &Config{}
|
||||
if err := cm.viper.Unmarshal(newConfig); err != nil {
|
||||
log.Error().Err(err).Msg("Failed to unmarshal new config")
|
||||
return
|
||||
}
|
||||
|
||||
// Apply safe changes
|
||||
if newConfig.Logging.Level != cm.config.Logging.Level {
|
||||
if err := cm.applyLogLevelChange(newConfig.Logging.Level); err != nil {
|
||||
log.Error().Err(err).Msg("Failed to apply log level change")
|
||||
}
|
||||
}
|
||||
|
||||
// Update other safe parameters...
|
||||
}
|
||||
|
||||
func (cm *ConfigManager) applyLogLevelChange(newLevel string) error {
|
||||
// Validate new level
|
||||
	level, err := zerolog.ParseLevel(newLevel)
	if err != nil {
		return fmt.Errorf("invalid log level %q: %w", newLevel, err)
	}
|
||||
|
||||
// Apply change
|
||||
zerolog.SetGlobalLevel(level)
|
||||
cm.config.Logging.Level = newLevel
|
||||
|
||||
log.Info().Str("new_level", newLevel).Msg("Log level updated")
|
||||
return nil
|
||||
}
|
||||
```
|
||||
|
||||
### Error Handling
|
||||
|
||||
* Invalid configuration changes are logged but don't crash the application
|
||||
* Failed changes revert to previous known-good values
|
||||
* Critical errors during reload trigger application shutdown
|
||||
* All changes are logged for audit purposes
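
The "revert to previous known-good values" rule can be realised by validating a candidate configuration before swapping it in. The following is a minimal sketch with hypothetical names (`settings`, `manager`, `Apply`) and a deliberately small set of fields; it is not the planned `ConfigManager`.

```go
package main

import "fmt"

// settings is a hypothetical subset of the hot-reloadable configuration.
type settings struct {
	LogLevel  string
	V2Enabled bool
}

var validLevels = map[string]bool{
	"trace": true, "debug": true, "info": true, "warn": true, "error": true,
}

func validate(s settings) error {
	if !validLevels[s.LogLevel] {
		return fmt.Errorf("invalid log level %q", s.LogLevel)
	}
	return nil
}

// manager holds the last known-good settings.
type manager struct {
	current settings
}

// Apply swaps in the candidate only if it validates; otherwise the
// previous known-good settings stay in effect and an error is returned.
func (m *manager) Apply(candidate settings) error {
	if err := validate(candidate); err != nil {
		return fmt.Errorf("config change rejected, keeping previous values: %w", err)
	}
	m.current = candidate
	return nil
}

func main() {
	m := &manager{current: settings{LogLevel: "info"}}

	fmt.Println(m.Apply(settings{LogLevel: "debug", V2Enabled: true}), m.current)
	fmt.Println(m.Apply(settings{LogLevel: "verbose"}), m.current)
}
```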
|
||||
|
||||
## Links
|
||||
|
||||
* [Viper WatchConfig Documentation](https://github.com/spf13/viper#watching-and-re-reading-config-files)
|
||||
* [Viper OnConfigChange](https://github.com/spf13/viper#example-of-watching-a-config-file)
|
||||
* [ADR-0006: Configuration Management](0006-configuration-management.md)
|
||||
|
||||
## Configuration File Example with Hot-Reloadable Settings
|
||||
|
||||
```yaml
|
||||
# config.yaml - only the settings marked below can be hot-reloaded; all others require a restart
|
||||
server:
|
||||
host: "0.0.0.0"
|
||||
port: 8080
|
||||
|
||||
logging:
|
||||
level: "info" # Can be changed without restart
|
||||
json: false
|
||||
output: ""
|
||||
|
||||
api:
|
||||
v2_enabled: false # Can be changed without restart
|
||||
|
||||
telemetry:
|
||||
enabled: false
|
||||
sampler:
|
||||
type: "parentbased_always_on" # Can be changed without restart
|
||||
ratio: 1.0
|
||||
```
|
||||
|
||||
## Migration Plan
|
||||
|
||||
1. **Phase 1**: Implement ConfigManager wrapper around existing config
|
||||
2. **Phase 2**: Add selective hot reloading for logging level
|
||||
3. **Phase 3**: Extend to feature flags and telemetry settings
|
||||
4. **Phase 4**: Add documentation and examples
|
||||
5. **Phase 5**: Update all components to use ConfigManager instead of direct config access
|
||||
|
||||
## Monitoring and Observability
|
||||
|
||||
* Log all configuration changes with timestamps
|
||||
* Include previous and new values in change logs
|
||||
* Add metrics for configuration reload events
|
||||
* Provide admin endpoint to view current configuration
|
||||
|
||||
## Security Considerations
|
||||
|
||||
* Config file permissions should be restrictive
|
||||
* Hot reloading should be disabled in production by default
|
||||
* Configuration changes should be audited
|
||||
* Sensitive parameters should never be hot-reloadable
|
||||
|
||||
## Future Enhancements
|
||||
|
||||
* Configuration change webhooks
|
||||
* Configuration versioning and rollback
|
||||
* Configuration validation before applying changes
|
||||
* Multi-file configuration support
|
||||
358
adr/0024-bdd-test-organization-and-isolation.md
Normal file
@@ -0,0 +1,358 @@
|
||||
# ADR 0024: BDD Test Organization and Isolation Strategy
|
||||
|
||||
## Status
|
||||
**Accepted** ✅
|
||||
|
||||
## Context
|
||||
|
||||
As the dance-lessons-coach project grows, our BDD test suite has encountered several challenges. While we initially followed basic Godog patterns, we need to evolve our organization to handle complex scenarios like config hot reloading while maintaining test reliability.
|
||||
|
||||
### Current Issues
|
||||
|
||||
1. **Test Interdependence**: Tests affect each other through shared state (config files, database)
|
||||
2. **Timing Issues**: Config reloading and server restarts cause race conditions
|
||||
3. **Cognitive Load**: Large test files with many scenarios are hard to maintain
|
||||
4. **Flaky Tests**: Tests pass individually but fail when run together
|
||||
5. **Edge Case Handling**: Special setup/teardown requirements for certain tests
|
||||
|
||||
### Godog Best Practices Alignment
|
||||
|
||||
According to [Godog documentation](https://github.com/cucumber/godog) and community best practices, our current organization partially follows recommendations but needs improvement in:
|
||||
|
||||
- **Feature Granularity**: Some files contain multiple unrelated features
|
||||
- **Step Organization**: Steps could be better grouped by domain
|
||||
- **Context Management**: Need better state isolation between scenarios
|
||||
- **Tagging Strategy**: Currently missing tag-based test selection
|
||||
|
||||
## Decision
|
||||
|
||||
Adopt a **modular, isolated test suite architecture** with the following principles:
|
||||
|
||||
### 1. Test Organization by Feature (Godog-Aligned)
|
||||
|
||||
Following [Godog best practices](https://github.com/cucumber/godog), we organize tests by business domain with proper feature granularity:
|
||||
|
||||
```
|
||||
features/
|
||||
├── auth/ # Business domain
|
||||
│ ├── authentication.feature # Single feature per file
|
||||
│ ├── password_reset.feature # Single feature per file
|
||||
│ └── user_management.feature # Single feature per file
|
||||
├── config/ # Business domain
|
||||
│ ├── hot_reloading.feature # Single feature per file
|
||||
│ └── validation.feature # Single feature per file
|
||||
├── greet/ # Business domain
|
||||
│ ├── v1_greeting.feature # Single feature per file
|
||||
│ └── v2_greeting.feature # Single feature per file
|
||||
├── health/ # Business domain
|
||||
│ └── health_check.feature # Single feature per file
|
||||
└── jwt/ # Business domain
|
||||
├── secret_rotation.feature # Single feature per file
|
||||
└── retention_policy.feature # Single feature per file
|
||||
```
|
||||
|
||||
**Key Improvements over current structure:**
|
||||
- ✅ **Single responsibility**: One feature per file
|
||||
- ✅ **Business alignment**: Grouped by domain, not technical concerns
|
||||
- ✅ **Scalability**: Easy to add new features without bloating files
|
||||
|
||||
### 2. Isolation Strategies
|
||||
|
||||
#### A. Config File Isolation
|
||||
- Each feature directory has its own config file pattern
|
||||
- Config files are cleaned up after each feature test run
|
||||
- Example: `features/auth/auth-test-config.yaml`
|
||||
|
||||
#### B. Database Isolation
|
||||
- Use separate database schemas or suffixes per feature
|
||||
- Example: `dance_lessons_coach_auth_test`, `dance_lessons_coach_greet_test`
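
A small helper could derive the per-feature database name from the feature directory name. This is a sketch: `featureDatabaseName` is a hypothetical function, and the restriction to lowercase letters, digits and underscores is an assumption added so the result is always a safe identifier.

```go
package main

import (
	"fmt"
	"regexp"
)

// featureNamePattern restricts feature names to safe identifier characters.
var featureNamePattern = regexp.MustCompile(`^[a-z][a-z0-9_]*$`)

// featureDatabaseName returns the database name following the
// dance_lessons_coach_<feature>_test convention shown above.
func featureDatabaseName(feature string) (string, error) {
	if !featureNamePattern.MatchString(feature) {
		return "", fmt.Errorf("invalid feature name %q", feature)
	}
	return "dance_lessons_coach_" + feature + "_test", nil
}

func main() {
	fmt.Println(featureDatabaseName("auth"))
	fmt.Println(featureDatabaseName("greet"))
}
```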
|
||||
|
||||
#### C. Server Port Isolation
|
||||
- Assign different ports to different test groups
|
||||
- Prevents port conflicts during parallel testing
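
Instead of maintaining a fixed port table, a test group could ask the operating system for an unused port. The sketch below shows the common "listen on port 0" technique; `freePort` is a hypothetical helper, and there is a small window between closing the listener and starting the server in which another process could take the port.

```go
package main

import (
	"fmt"
	"net"
)

// freePort asks the kernel for an unused TCP port on the loopback interface.
func freePort() (int, error) {
	l, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return 0, err
	}
	defer l.Close()
	return l.Addr().(*net.TCPAddr).Port, nil
}

func main() {
	port, err := freePort()
	if err != nil {
		fmt.Println("could not allocate port:", err)
		return
	}
	fmt.Printf("feature test server can bind to 127.0.0.1:%d\n", port)
}
```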
|
||||
|
||||
### 3. Test Execution Strategy
|
||||
|
||||
#### Option 1: Sequential Feature Testing (Recommended)
|
||||
```bash
|
||||
# Run tests by feature group
|
||||
./scripts/test-feature.sh auth
|
||||
./scripts/test-feature.sh config
|
||||
./scripts/test-feature.sh greet
|
||||
```
|
||||
|
||||
#### Option 2: Parallel Feature Testing (Advanced)
|
||||
```bash
|
||||
# Run features in parallel with isolation
|
||||
./scripts/test-all-features-parallel.sh
|
||||
```
|
||||
|
||||
### 4. Test Synchronization (Godog Best Practices)
|
||||
|
||||
#### A. Explicit Waits with Timeouts
|
||||
Following Godog's [arrange-act-assert pattern](https://alicegg.tech/2019/03/09/gobdd.html):
|
||||
|
||||
```go
|
||||
// Instead of fixed sleep times
|
||||
func waitForServerReady(maxAttempts int, delay time.Duration) error {
|
||||
for i := 0; i < maxAttempts; i++ {
|
||||
if serverIsReady() {
|
||||
return nil
|
||||
}
|
||||
time.Sleep(delay)
|
||||
}
|
||||
return fmt.Errorf("server not ready after %d attempts", maxAttempts)
|
||||
}
|
||||
```
|
||||
|
||||
#### B. Godog Context Management
|
||||
Implement proper context structs as recommended by Godog:
|
||||
|
||||
```go
|
||||
// Feature-specific context for isolation
|
||||
type AuthContext struct {
|
||||
client *testserver.Client
|
||||
db *sql.DB
|
||||
users map[string]UserData
|
||||
}
|
||||
|
||||
func InitializeAuthContext() *AuthContext {
|
||||
return &AuthContext{
|
||||
client: testserver.NewClient(),
|
||||
db: connectToFeatureDB("auth"),
|
||||
users: make(map[string]UserData),
|
||||
}
|
||||
}
|
||||
|
||||
func CleanupAuthContext(ctx *AuthContext) {
|
||||
// Cleanup resources
|
||||
ctx.db.Close()
|
||||
}
|
||||
```
|
||||
|
||||
#### C. Tag-Based Test Selection
|
||||
Add Godog tag support for selective test execution:
|
||||
|
||||
```gherkin
# In feature files
@smoke @auth
Scenario: Successful user authentication
  Given the server is running
  When I authenticate with valid credentials
  Then the authentication should be successful

# Run specific tags with the godog CLI:
#   godog run --tags=@auth features/
# Note: `go test -tags=...` selects Go build tags, not Godog scenario tags.
# Under `go test`, scenario tags are set through godog.Options.Tags, for example
# via a flag registered with godog.BindCommandLineFlags.
```
|
||||
|
||||
#### D. Event-Based Synchronization
|
||||
```go
|
||||
// Use server lifecycle events
|
||||
func waitForConfigReload() error {
|
||||
return waitForEvent("config_reloaded", 30*time.Second)
|
||||
}
|
||||
```
|
||||
|
||||
#### E. Test Hooks with Timeouts
|
||||
```go
|
||||
// In test setup
|
||||
ctx.Step("^I wait for v2 API to be enabled$", func() error {
|
||||
return waitForCondition(30*time.Second, func() bool {
|
||||
return v2EndpointAvailable()
|
||||
})
|
||||
})
|
||||
```
|
||||
|
||||
### 5. Test Lifecycle Management
|
||||
|
||||
#### Before Suite (Feature Level)
|
||||
```go
|
||||
func InitializeFeatureSuite(featureName string) {
|
||||
// Setup feature-specific resources
|
||||
initDatabaseForFeature(featureName)
|
||||
createFeatureConfigFile(featureName)
|
||||
startIsolatedServer(featureName)
|
||||
}
|
||||
```
|
||||
|
||||
#### After Suite (Feature Level)
|
||||
```go
|
||||
func CleanupFeatureSuite(featureName string) {
|
||||
// Cleanup feature-specific resources
|
||||
cleanupDatabaseForFeature(featureName)
|
||||
removeFeatureConfigFile(featureName)
|
||||
stopIsolatedServer(featureName)
|
||||
}
|
||||
```
|
||||
|
||||
### 6. Shell Script Integration
|
||||
|
||||
Create feature-specific test scripts:
|
||||
|
||||
```bash
|
||||
#!/bin/bash
|
||||
# scripts/test-feature.sh
|
||||
|
||||
FEATURE="${1:?usage: scripts/test-feature.sh <feature>}"
|
||||
DATABASE="dance_lessons_coach_${FEATURE}_test"
|
||||
CONFIG="features/${FEATURE}/${FEATURE}-test-config.yaml"
|
||||
|
||||
# Setup
|
||||
setup_feature_environment() {
|
||||
echo "🧪 Setting up ${FEATURE} feature tests..."
|
||||
create_database ${DATABASE}
|
||||
generate_config ${CONFIG}
|
||||
}
|
||||
|
||||
# Run tests
|
||||
run_feature_tests() {
|
||||
echo "🚀 Running ${FEATURE} feature tests..."
|
||||
DLC_DATABASE_NAME=${DATABASE} \
|
||||
DLC_CONFIG_FILE=${CONFIG} \
|
||||
go test ./features/${FEATURE}/... -v
|
||||
}
|
||||
|
||||
# Teardown
|
||||
cleanup_feature_environment() {
|
||||
echo "🧹 Cleaning up ${FEATURE} feature tests..."
|
||||
drop_database ${DATABASE}
|
||||
remove_config ${CONFIG}
|
||||
}
|
||||
|
||||
# Main execution
|
||||
setup_feature_environment
|
||||
trap cleanup_feature_environment EXIT  # teardown runs even when tests fail; the exit status stays that of the tests
|
||||
run_feature_tests
|
||||
```
|
||||
|
||||
### 7. Configuration Management
|
||||
|
||||
#### Feature-Specific Config Files
|
||||
```yaml
|
||||
# features/auth/auth-test-config.yaml
|
||||
server:
|
||||
host: "127.0.0.1"
|
||||
port: 9192 # Feature-specific port
|
||||
|
||||
database:
|
||||
name: "dance_lessons_coach_auth_test" # Feature-specific database
|
||||
|
||||
api:
|
||||
v2_enabled: true # Feature-specific settings
|
||||
|
||||
auth:
|
||||
jwt:
|
||||
ttl: 1h
|
||||
```
|
||||
|
||||
### 8. Test Data Management
|
||||
|
||||
#### A. Feature-Scoped Data
|
||||
- Each feature gets its own data namespace
|
||||
- Example: `auth_user_*`, `greet_message_*` prefixes
|
||||
|
||||
#### B. Automatic Cleanup
|
||||
```go
|
||||
func CleanupFeatureData(featureName string) {
|
||||
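// SQL has no table-name wildcards, so delete from each table explicitly.
|
||||
// featureTables is a hypothetical helper returning trusted, hard-coded table names.
|
||||
for _, table := range featureTables(featureName) {
|
||||
db.Exec(fmt.Sprintf("DELETE FROM %s WHERE feature = $1", table), featureName)
|
||||
}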
|
||||
}
|
||||
```
|
||||
|
||||
## Consequences
|
||||
|
||||
### Positive
|
||||
|
||||
1. **Improved Test Reliability**: Tests don't interfere with each other
|
||||
2. **Better Maintainability**: Smaller, focused test files
|
||||
3. **Faster Development**: Run only relevant tests during feature development
|
||||
4. **Easier Debugging**: Isolate issues to specific features
|
||||
5. **Parallel Testing**: Enable safe parallel execution
|
||||
6. **SOLID Compliance**: Single responsibility for test files
|
||||
|
||||
### Negative
|
||||
|
||||
1. **Increased Complexity**: More moving parts in test infrastructure
|
||||
2. **Resource Usage**: Multiple databases/servers consume more resources
|
||||
3. **Setup Time**: Initial test runs may be slower due to setup
|
||||
4. **Learning Curve**: Team needs to understand the isolation patterns
|
||||
|
||||
### Neutral
|
||||
|
||||
1. **Test Execution Time**: May increase or decrease depending on parallelization
|
||||
2. **CI/CD Changes**: Pipeline needs adaptation for new test organization
|
||||
|
||||
## Implementation Plan
|
||||
|
||||
### Phase 1: Refactor Current Tests (1-2 weeks)
|
||||
1. Split monolithic feature files into feature directories
|
||||
2. Create feature-specific test scripts
|
||||
3. Implement basic isolation (config files, database names)
|
||||
|
||||
### Phase 2: Enhance Test Infrastructure (2-3 weeks)
|
||||
1. Add synchronization helpers to test framework
|
||||
2. Implement server lifecycle management
|
||||
3. Create comprehensive cleanup routines
|
||||
|
||||
### Phase 3: Parallel Testing (Optional)
|
||||
1. Add parallel test execution capability
|
||||
2. Implement port management for parallel runs
|
||||
3. Add resource monitoring
|
||||
|
||||
## Alternatives Considered
|
||||
|
||||
### 1. Single Test Suite with Better Cleanup
|
||||
**Rejected because**: Doesn't solve fundamental interdependence issues
|
||||
|
||||
### 2. Docker-Based Isolation
|
||||
**Rejected because**: Too heavyweight for local development
|
||||
|
||||
### 3. Test Virtualization
|
||||
**Rejected because**: Overkill for current project size
|
||||
|
||||
## Success Metrics
|
||||
|
||||
1. **Test Reliability**: >95% pass rate in CI/CD
|
||||
2. **Test Isolation**: Ability to run any single feature test independently
|
||||
3. **Developer Experience**: Feature tests run in <30 seconds locally
|
||||
4. **Maintainability**: New team members can understand test structure in <1 hour
|
||||
|
||||
## References
|
||||
|
||||
### Godog Official Resources
|
||||
- [Godog GitHub Repository](https://github.com/cucumber/godog)
|
||||
- [Godog Documentation](https://pkg.go.dev/github.com/cucumber/godog)
|
||||
|
||||
### BDD Best Practices
|
||||
- [BDD Best Practices](references/BDD_BEST_PRACTICES.md)
|
||||
- [Alice GG • BDD in Golang](https://alicegg.tech/2019/03/09/gobdd.html)
|
||||
- [Scrap Your TDD for BDD: Part II](https://medium.com/the-godev-corner/scrap-your-tdd-for-bdd-part-ii-heres-how-to-start-d2468dd46dda)
|
||||
|
||||
### Test Organization Patterns
|
||||
- [Test Server Implementation](references/TEST_SERVER.md)
|
||||
- [Optimizing Godog Test Execution](https://www.reddit.com/r/golang/comments/1llnlp2/optimizing_godog_bdd_test_execution_in_go_how_to/)
|
||||
|
||||
## Revision History
|
||||
|
||||
- **2026-04-09**: Initial draft based on BDD test challenges
|
||||
- **2026-04-09**: Added implementation details and examples
|
||||
|
||||
## Decision Makers
|
||||
|
||||
- **Approved by**: Gabriel Radureau
|
||||
- **Consulted**: AI Agent (Mistral Vibe)
|
||||
- **Informed**: Development Team
|
||||
|
||||
## Future Considerations
|
||||
|
||||
1. **Test Impact Analysis**: Track which tests are affected by code changes
|
||||
2. **Flaky Test Detection**: Automatically identify and quarantine flaky tests
|
||||
3. **Performance Benchmarking**: Monitor test execution times over time
|
||||
4. **Test Coverage Visualization**: Feature-level coverage reports
|
||||
|
||||
---
|
||||
|
||||
**Status**: 🟡 Proposed → Ready for team review and implementation
|
||||
|
||||
**Note**: This ADR complements ADR 0023 (Config Hot Reloading) by addressing the test organization aspects of hot reloading functionality.
|
||||
344
adr/0025-bdd-scenario-isolation-strategies.md
Normal file
@@ -0,0 +1,344 @@
|
||||
# ADR 0025: BDD Scenario Isolation Strategies
|
||||
|
||||
## Status
|
||||
**Accepted (Partial)** 🟡
|
||||
|
||||
Phase 1 (schema-per-scenario DB isolation + `ScenarioState` manager in `pkg/bdd/steps/scenario_state.go`) is implemented.
|
||||
Phase 2 (cache key prefix strategy, in-memory store `Reset()` methods) is pending — blocked on ADR-0022 (rate limiting/cache) not yet implemented.
|
||||
|
||||
## Context
|
||||
|
||||
As our BDD test suite grows, we're encountering **test pollution** issues where scenarios interfere with each other through shared state. This is particularly problematic for:
|
||||
|
||||
1. **Database state**: Scenarios create users, JWT secrets, config entries that persist across scenarios
|
||||
2. **JWT secret rotation**: Multiple secrets accumulate, affecting subsequent scenario authentication
|
||||
3. **Config file modifications**: Feature flag changes persist between tests
|
||||
4. **Gherkin Background steps**: Data set up in Background is visible to all scenarios in the feature
|
||||
|
||||
Our current approach clears database tables after each scenario, but this has **race condition vulnerabilities** with concurrent scenario execution.
|
||||
|
||||
### Gherkin Background Consideration
|
||||
|
||||
Crucially, Gherkin's `Background` section runs **before each scenario** in a feature, not once before all scenarios. This means:
|
||||
|
||||
```gherkin
|
||||
Feature: User registration
|
||||
Background:
|
||||
Given the database is empty
|
||||
And a default admin user exists
|
||||
|
||||
Scenario: Register new user
|
||||
When I register user "alice"
|
||||
Then user "alice" should exist
|
||||
|
||||
Scenario: Register duplicate user
|
||||
When I register user "alice"
|
||||
Then I should see error "user already exists"
|
||||
```
|
||||
|
||||
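Because Background steps are re-executed before each scenario, the second scenario runs against whatever earlier runs left behind: the data created by Background persists, and the first scenario's "alice" row is not cleaned up. Its outcome therefore depends on leftover state and execution order rather than on the scenario itself.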
|
||||
|
||||
## Decision Drivers
|
||||
|
||||
* **Isolation**: Each scenario must start with a clean slate
|
||||
* **Performance**: Cleanup must be fast enough for CI/CD pipelines
|
||||
* **Concurrency**: Must work with parallel scenario execution
|
||||
* **Compatibility**: Must work with Gherkin Background steps
|
||||
* **Maintainability**: Solution should be simple to understand and debug
|
||||
|
||||
## Considered Options
|
||||
|
||||
### Option 1: Transaction Rollback (Rejected ❌)
|
||||
|
||||
Wrap each scenario in a database transaction, rollback at the end.
|
||||
|
||||
```go
|
||||
BeforeScenario: BEGIN;
|
||||
AfterScenario: ROLLBACK;
|
||||
```
|
||||
|
||||
**Pros:**
|
||||
- Simple implementation
|
||||
- Fast - transaction rollback is nearly instant
|
||||
- No data cleanup needed
|
||||
|
||||
**Cons:**
|
||||
- ❌ **Fails if scenario commits**: PostgreSQL has no true nested transactions, so a `COMMIT` issued inside the scenario ends the outer transaction and the final `ROLLBACK` has nothing left to undo
|
||||
- Cannot handle non-database state (JWT secrets in memory, config files)
|
||||
- Doesn't solve JWT secret pollution
|
||||
|
||||
**Verdict: Not viable** - Too many scenarios use database transactions internally.
|
||||
|
||||
---
|
||||
|
||||
### Option 2: Clear Tables in Public Schema (Current ✅/⚠️)
|
||||
|
||||
Delete all rows from all tables after each scenario.
|
||||
|
||||
```go
|
||||
AfterScenario: DELETE FROM table1; DELETE FROM table2; ...
|
||||
```
|
||||
|
||||
**Pros:**
|
||||
- Currently implemented
|
||||
- Works with any scenario code
|
||||
- Handles database state
|
||||
|
||||
**Cons:**
|
||||
- ⚠️ **Race conditions**: Concurrent scenarios can interleave - Scenario A deletes data while Scenario B is still using it
|
||||
- ⚠️ **Slow**: Must delete from all tables, reset sequences
|
||||
- ❌ **Misses in-memory state**: JWT secrets, config changes persist
|
||||
- ❌ **Doesn't handle Background**: Background data is shared across scenarios
|
||||
|
||||
**Verdict: Partially adequate** - Works for sequential execution but has parallel execution issues.
|
||||
|
||||
---
|
||||
|
||||
### Option 3: Schema-per-Scenario (Recommended ✅)
|
||||
|
||||
Create a unique PostgreSQL schema for each scenario, drop it after.
|
||||
|
||||
```go
|
||||
BeforeScenario:
|
||||
schema := "test_" + sha256(feature.Name + ":" + scenario.Name)[:8]
|
||||
CREATE SCHEMA schema;
|
||||
SET search_path = schema, public;
|
||||
|
||||
AfterScenario:
|
||||
DROP SCHEMA schema CASCADE;
|
||||
```
|
||||
|
||||
**Pros:**
|
||||
- ✅ **True isolation**: Each scenario has its own database namespace
|
||||
- ✅ **Works with transactions**: Scenario can commit freely - entire schema is dropped
|
||||
- ✅ **Works with Background**: Background runs in scenario's schema, data is isolated
|
||||
- ✅ **Fast**: Dropping a small test schema is a single statement and is much cheaper than deleting rows table by table
|
||||
- ✅ **Handles concurrent scenarios**: Different schemas = no conflicts
|
||||
|
||||
**Cons:**
|
||||
- Requires `CREATE/DROP SCHEMA` database privileges in test environment
|
||||
- Some ORMs may hardcode `public` schema - need to use `SET search_path` carefully
|
||||
- Test DB must allow many schemas (typically fine for PostgreSQL)
|
||||
- We need to handle `search_path` in connection pooling (each scenario needs its own connection)
|
||||
|
||||
**Implementation notes:**
|
||||
- Use a common schema name prefix so test schemas are easy to identify and drop: `test_{hash}`
|
||||
- Hash: `sha256(feature_name + ":" + scenario_name)[:8]`, deterministic so the same scenario maps to the same schema on every run
|
||||
- Execute Background steps in the scenario's schema context
|
||||
- Set `search_path` at the connection level, not globally
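As a minimal sketch of the naming and hook statements described in these notes, the Go code below derives the schema name from the feature and scenario names and builds the SQL a `BeforeScenario`/`AfterScenario` pair would run. The function names (`schemaHash`, `schemaName`, `setupSQL`, `teardownSQL`) and the `feature:scenario` separator are assumptions for illustration; the real implementation may differ.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// schemaHash returns the first 8 hex characters of sha256("feature:scenario").
func schemaHash(feature, scenario string) string {
	sum := sha256.Sum256([]byte(feature + ":" + scenario))
	return hex.EncodeToString(sum[:])[:8]
}

// schemaName builds the per-scenario schema name: "test_" plus the hash.
// The result contains only [a-z0-9_], so it is safe as an unquoted identifier.
func schemaName(feature, scenario string) string {
	return "test_" + schemaHash(feature, scenario)
}

// setupSQL returns the statements a BeforeScenario hook would execute.
// Both must run on the same connection for search_path to take effect.
func setupSQL(schema string) []string {
	return []string{
		fmt.Sprintf("CREATE SCHEMA IF NOT EXISTS %s", schema),
		fmt.Sprintf("SET search_path = %s, public", schema),
	}
}

// teardownSQL returns the statement an AfterScenario hook would execute.
func teardownSQL(schema string) string {
	return fmt.Sprintf("DROP SCHEMA IF EXISTS %s CASCADE", schema)
}

func main() {
	schema := schemaName("auth", "Successful user authentication")
	for _, stmt := range setupSQL(schema) {
		fmt.Println(stmt)
	}
	fmt.Println(teardownSQL(schema))
}
```

Because the name is derived only from the feature and scenario names, two scenarios with identical names in the same feature (for example rows of a Scenario Outline) would map to the same schema; adding a row index or run ID to the hash input is one way to avoid that.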
|
||||
|
||||
---
|
||||
|
||||
### Option 4: Database-per-Feature ⚠️
|
||||
|
||||
Create a separate database for each feature file.
|
||||
|
||||
```go
|
||||
BeforeFeature: CREATE DATABASE feature_auth;
|
||||
AfterFeature: DROP DATABASE feature_auth;
|
||||
```
|
||||
|
||||
**Pros:**
|
||||
- Strong isolation between features
|
||||
- Simple implementation
|
||||
|
||||
**Cons:**
|
||||
- ❌ **Doesn't isolate scenarios within a feature** - Background data shared across scenarios
|
||||
- Database creation is slower than schema creation
|
||||
- Harder to manage in CI (more databases to create/cleanup)
|
||||
- Still need table clearing between scenarios within a feature
|
||||
|
||||
**Verdict: Insufficient** - Doesn't solve intra-feature pollution.
|
||||
|
||||
---
|
||||
|
||||
### Option 5: Schema-per-Feature + Table Clearing per Scenario ⚠️
|
||||
|
||||
Create one schema per feature, clear tables between scenarios.
|
||||
|
||||
```go
|
||||
BeforeFeature: CREATE SCHEMA feature_auth;
|
||||
AfterFeature: DROP SCHEMA feature_auth;
|
||||
AfterScenario: DELETE FROM all_tables;
|
||||
```
|
||||
|
||||
**Pros:**
|
||||
- Isolates features from each other
|
||||
- Simpler than per-scenario schemas
|
||||
|
||||
**Cons:**
|
||||
- ❌ **Scenarios within a feature share state** - Background data persists
|
||||
- Still has race conditions with concurrent scenarios in same feature
|
||||
- Requires table clearing overhead
|
||||
|
||||
**Verdict: Better than current but still has issues**.
|
||||
|
||||
---
|
||||
|
||||
## Decision Outcome
|
||||
|
||||
**Chosen option: Schema-per-Scenario + In-Memory State Reset + Per-Scenario Step State (Option 3 Enhanced)**
|
||||
|
||||
We will implement schema-per-scenario because it:
|
||||
|
||||
1. Provides **true isolation** for all database state
|
||||
2. **Works with Gherkin Background** - Background runs in each scenario's schema
|
||||
3. **Handles concurrent execution** - No race conditions
|
||||
4. **Works with scenario transactions** - Scenarios can commit freely
|
||||
5. Is **fast** - Schema operations are cheap
|
||||
|
||||
**However, we discovered a critical limitation:** PostgreSQL schemas only isolate **database tables**. In-memory state (application-level caches, user stores, JWT secret managers) **persists across scenarios** because they're stored in the shared `sharedServer` Go instance. Schema isolation does NOT solve this.
|
||||
|
||||
### Enhanced Strategy: Multi-Layer Isolation
|
||||
|
||||
To achieve **complete scenario isolation**, we need a **3-layer approach:**
|
||||
|
||||
| Layer | Component | Strategy | Status |
|
||||
|-------|-----------|----------|--------|
|
||||
| DB | PostgreSQL tables | Schema-per-scenario | ✅ Implemented |
|
||||
| Memory | Server-level state (JWT secrets) | Reset to initial state | ✅ Implemented |
|
||||
| Memory | Step-level state (tokens, user IDs) | Per-scenario state map | ✅ Implemented |
|
||||
| Memory | User store | Reset/clear between scenarios | ⚠️ TODO |
|
||||
| Memory | Auth cache | Reset/clear between scenarios | ⚠️ TODO |
|
||||
| Cache | Redis/Memcached | Key prefix with schema hash | ⚠️ TODO |
|
||||
|
||||
### Layer 3: Per-Scenario Step State Isolation
|
||||
|
||||
**New insight from test failures:** Step definition structs (AuthSteps, GreetSteps, etc.) maintain state in their fields:
|
||||
- `lastToken`, `firstToken` in AuthSteps
|
||||
- `lastUserID` in AuthSteps
|
||||
|
||||
This state **spills across scenarios** even with schema isolation, because struct fields are shared across all scenarios in a test process.
|
||||
|
||||
**Solution:** Create a `ScenarioState` manager with per-scenario isolation:
|
||||
|
||||
```go
|
||||
type ScenarioState struct {
|
||||
LastToken string
|
||||
FirstToken string
|
||||
LastUserID uint
|
||||
}
|
||||
|
||||
type scenarioStateManager struct {
|
||||
mu sync.RWMutex
|
||||
states map[string]*ScenarioState // keyed by scenario hash
|
||||
}
|
||||
|
||||
// Usage in step definitions:
|
||||
func (s *AuthSteps) iShouldReceiveAValidJWTToken() error {
|
||||
state := steps.GetScenarioState(s.scenarioName)
|
||||
state.LastToken = extractedToken
|
||||
// ...
|
||||
}
|
||||
```
|
||||
|
||||
**Benefits:**
|
||||
- ✅ Minimal code changes to step definitions (state access goes through helper functions)
|
||||
- ✅ Thread-safe (sync.RWMutex)
|
||||
- ✅ Consistent state per scenario
|
||||
- ✅ Automatic cleanup via BeforeScenario/AfterScenario hooks
|
||||
- ✅ Works with random test order
|
||||
|
||||
**Status:** Implemented in `pkg/bdd/steps/scenario_state.go`
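For readers without access to `pkg/bdd/steps/scenario_state.go`, here is a minimal sketch of how such a manager could look. It reuses the `ScenarioState` and `scenarioStateManager` types shown above; the package-level `manager` variable and the `ClearScenarioState` function are hypothetical names, and the actual file may be organised differently.

```go
package main

import (
	"fmt"
	"sync"
)

type ScenarioState struct {
	LastToken  string
	FirstToken string
	LastUserID uint
}

type scenarioStateManager struct {
	mu     sync.RWMutex
	states map[string]*ScenarioState // keyed by scenario hash or name
}

var manager = &scenarioStateManager{states: make(map[string]*ScenarioState)}

// GetScenarioState returns the state for key, creating it on first use.
func GetScenarioState(key string) *ScenarioState {
	manager.mu.RLock()
	st, ok := manager.states[key]
	manager.mu.RUnlock()
	if ok {
		return st
	}
	manager.mu.Lock()
	defer manager.mu.Unlock()
	// Re-check: another goroutine may have created it between the two locks.
	if st, ok = manager.states[key]; ok {
		return st
	}
	st = &ScenarioState{}
	manager.states[key] = st
	return st
}

// ClearScenarioState (hypothetical name) drops the state for key.
// It is meant to be called from an AfterScenario hook.
func ClearScenarioState(key string) {
	manager.mu.Lock()
	defer manager.mu.Unlock()
	delete(manager.states, key)
}

func main() {
	GetScenarioState("auth:login").LastToken = "token-a"
	fmt.Printf("login=%q register=%q\n",
		GetScenarioState("auth:login").LastToken,
		GetScenarioState("auth:register").LastToken)
	ClearScenarioState("auth:login")
	fmt.Printf("after clear=%q\n", GetScenarioState("auth:login").LastToken)
}
```

The mutex guards the map only, not the fields of each `ScenarioState`; that is sufficient as long as the steps of a single scenario run sequentially.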
|
||||
|
||||
### Key Insight: Cache and In-Memory Store Isolation
|
||||
|
||||
**For caches (Redis, Memcached, in-process):**
|
||||
- Use **schema hash as key prefix/suffix**: `cache_key_{schema_hash}` or `{schema_hash}_cache_key`
|
||||
- This ensures each scenario gets isolated cache namespace
|
||||
- Works even with external cache services
|
||||
- Consistent with schema isolation philosophy
|
||||
|
||||
**For in-memory stores (user repository, etc.):**
|
||||
- Add `Reset()` methods that clear all state
|
||||
- Call in `AfterScenario` alongside schema teardown
|
||||
- Or use schema-prefix approach for shared stores
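Both ideas can be sketched in a few lines of Go. The `scenarioKey` helper and the `userStore` type below are hypothetical stand-ins for the project's cache wrapper and in-memory user repository; they only illustrate the prefixing and `Reset()` patterns.

```go
package main

import (
	"fmt"
	"sync"
)

// scenarioKey prefixes a cache key with the scenario's schema hash,
// e.g. "a3f7b2c1_user:newuser" instead of "user:newuser".
func scenarioKey(schemaHash, key string) string {
	return schemaHash + "_" + key
}

// userStore is a hypothetical in-memory store used only for this sketch.
type userStore struct {
	mu    sync.Mutex
	users map[string]string // username -> email
}

func newUserStore() *userStore {
	return &userStore{users: make(map[string]string)}
}

func (s *userStore) Put(name, email string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.users[name] = email
}

func (s *userStore) Len() int {
	s.mu.Lock()
	defer s.mu.Unlock()
	return len(s.users)
}

// Reset drops all entries; intended for an AfterScenario hook.
func (s *userStore) Reset() {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.users = make(map[string]string)
}

func main() {
	fmt.Println(scenarioKey("a3f7b2c1", "user:newuser"))

	store := newUserStore()
	store.Put("alice", "alice@example.com")
	store.Reset()
	fmt.Println("users after reset:", store.Len())
}
```

Key prefixing suits shared or external caches where a global flush would disturb concurrent scenarios, while `Reset()` is simpler but only safe when scenarios do not share the store at the same time.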
|
||||
|
||||
### Alternative Approach: Background Explicit State Setup
|
||||
|
||||
**Considered but rejected:** Adding explicit "Given no user X exists" steps or heavy Background sections.
|
||||
|
||||
**Pros:** More readable, explicit about state
|
||||
**Cons:**
|
||||
- Error-prone (must remember for every entity)
|
||||
- Verbose (many Given steps)
|
||||
- Doesn't scale with many entities
|
||||
- Still has race conditions with concurrent scenarios
|
||||
|
||||
**Verdict:** Automated cleanup (schema drop + memory reset) is more reliable than manual Background setup.
|
||||
|
||||
### Implementation Plan
|
||||
|
||||
**Phase 1: Foundation (✅ Complete)**
|
||||
- Add scenario-aware schema management to test server
|
||||
- Implement schema creation/drop in BeforeScenario/AfterScenario hooks
|
||||
- Handle `search_path` configuration for each scenario's database connection
|
||||
|
||||
**Phase 2: In-Memory State Reset (🟡 TODO)**
|
||||
- Add `ResetUsers()` method to clear in-memory user store
|
||||
- Add `ResetCache()` method for auth and rate-limiting caches
|
||||
- Call these in AfterScenario alongside JWT secret reset
|
||||
- **Cache key strategy**: `key_{schema_hash}` for all cache operations
|
||||
|
||||
**Phase 3: Connection Pooling**
|
||||
- Configure connection pool to respect per-scenario `search_path`
|
||||
- Each scenario gets isolated connections
|
||||
|
||||
**Phase 4: Validation**
|
||||
- Run full test suite to verify complete isolation
|
||||
- Fix any hardcoded `public` schema references
|
||||
|
||||
### Schema Naming Convention
|
||||
|
||||
```
|
||||
Schema name: test_{sha256(feature:scenario)[:8]}
|
||||
Cache key prefix: {sha256(feature:scenario)[:8]}_
|
||||
```
|
||||
|
||||
Example:
|
||||
- Feature: `auth`, Scenario: `Successful user authentication`
|
||||
- Hash: `sha256("auth:Successful user authentication")[:8]`, for example `a3f7b2c1` (illustrative value)
|
||||
- Schema: `test_a3f7b2c1`
|
||||
- Cache key: `a3f7b2c1_user:newuser` instead of just `user:newuser`
|
||||
|
||||
Benefits:
|
||||
- Unique per scenario
|
||||
- Consistent across test runs (same scenario = same hash)
|
||||
- Short (8 chars) - efficient for cache keys
|
||||
- Identifiable for debugging
|
||||
|
||||
## Pros and Cons Summary
|
||||
|
||||
| Aspect | Schema-per-Scenario | Current (Clear Tables) | Transaction Rollback |
|
||||
|--------|---------------------|----------------------|-------------------|
|
||||
| Isolation | ✅ Strong | ⚠️ Medium | ❌ Weak |
|
||||
| Works with Background | ✅ Yes | ⚠️ Partial | ❌ No |
|
||||
| Concurrency safe | ✅ Yes | ❌ No | ❌ No |
|
||||
| Works with TX | ✅ Yes | ✅ Yes | ❌ No |
|
||||
| Speed | ✅ Fast | ⚠️ Slow | ✅ Fast |
|
||||
| DB privileges | ⚠️ Needs CREATE | ✅ None | ✅ None |
|
||||
| Complexity | ⚠️ Medium | ✅ Low | ✅ Low |
|
||||
|
||||
## Links
|
||||
|
||||
* [ADR 0008: BDD Testing](0008-bdd-testing.md) - Original BDD adoption decision
|
||||
* [ADR 0024: BDD Test Organization and Isolation](0024-bdd-test-organization-and-isolation.md) - Feature isolation strategy
|
||||
* [Godog Documentation](https://github.com/cucumber/godog) - BDD framework specifics
|
||||
* [PostgreSQL Schemas](https://www.postgresql.org/docs/current/ddl-schemas.html) - Schema management
|
||||
@@ -1,6 +1,36 @@
|
||||
# Architecture Decision Records (ADRs)
|
||||
|
||||
This directory contains Architecture Decision Records (ADRs) for the DanceLessonsCoach project.
|
||||
This directory contains Architecture Decision Records (ADRs) for the dance-lessons-coach project.
|
||||
|
||||
## Index of ADRs
|
||||
|
||||
| Number | Title | Status |
|
||||
|--------|-------|--------|
|
||||
| 0001 | Go 1.26.1 Standard | ✅ Accepted |
|
||||
| 0002 | Chi Router | ✅ Accepted |
|
||||
| 0003 | Zerolog Logging | ✅ Accepted |
|
||||
| 0004 | Interface-Based Design | ✅ Accepted |
|
||||
| 0005 | Graceful Shutdown | ✅ Accepted |
|
||||
| 0006 | Configuration Management | ✅ Accepted |
|
||||
| 0007 | OpenTelemetry Integration | ✅ Accepted |
|
||||
| 0008 | BDD Testing with Godog | ✅ Accepted (structure superseded by 0024) |
|
||||
| 0009 | BDD Testing with OpenAPI Documentation | ✅ Accepted |
|
||||
| 0010 | API v2 Feature Flag | ✅ Accepted |
|
||||
| 0011 | Validation Library (go-playground/validator) | ✅ Accepted |
|
||||
| 0012 | Git Hooks: Staged-Only Formatting | ✅ Accepted |
|
||||
| 0013 | OpenAPI/Swagger Toolchain (swaggo/swag) | ✅ Accepted |
|
||||
| 0014 | gRPC Adoption Strategy | ❌ Rejected / Deferred |
|
||||
| 0015 | CLI Subcommands with Cobra | ✅ Accepted |
|
||||
| 0016 | CI/CD Pipeline Design | ✅ Accepted |
|
||||
| 0017 | Trunk-Based Development Workflow | ✅ Accepted |
|
||||
| 0018 | User Management and Auth System | ✅ Accepted |
|
||||
| 0019 | PostgreSQL Integration | ✅ Accepted (SQLite cleanup pending) |
|
||||
| 0020 | Docker Build Strategy | ✅ Accepted |
|
||||
| 0021 | JWT Secret Retention Policy | 🟡 Proposed (base JWT done; cleanup job not implemented) |
|
||||
| 0022 | Rate Limiting and Cache Strategy | 🟡 Proposed (not implemented — Gitea issue #13) |
|
||||
| 0023 | Config Hot Reloading | 🟡 Proposed (not implemented) |
|
||||
| 0024 | BDD Test Organization and Isolation | ✅ Accepted |
|
||||
| 0025 | BDD Scenario Isolation Strategies | ✅ Accepted (Partial — Phase 2 pending ADR-0022) |
|
||||
|
||||
## What is an ADR?
|
||||
|
||||
@@ -66,14 +96,24 @@ Chosen option: "[Option 1]" because [justification]
|
||||
* [0005-graceful-shutdown.md](0005-graceful-shutdown.md) - Implement graceful shutdown with readiness endpoints
|
||||
* [0006-configuration-management.md](0006-configuration-management.md) - Use Viper for configuration management
|
||||
* [0007-opentelemetry-integration.md](0007-opentelemetry-integration.md) - Integrate OpenTelemetry for distributed tracing
|
||||
* [0008-bdd-testing.md](0008-bdd-testing.md) - Adopt BDD with Godog for behavioral testing
|
||||
* [0009-hybrid-testing-approach.md](0009-hybrid-testing-approach.md) - Combine BDD and Swagger-based testing
|
||||
* [0008-bdd-testing.md](0008-bdd-testing.md) - Adopt BDD with Godog for behavioral testing (structure superseded by 0024)
|
||||
* [0009-hybrid-testing-approach.md](0009-hybrid-testing-approach.md) - BDD testing with OpenAPI documentation (SDK layer deferred)
|
||||
* [0010-api-v2-feature-flag.md](0010-api-v2-feature-flag.md) - API v2 implementation with feature flag control
|
||||
* [0011-validation-library-selection.md](0011-validation-library-selection.md) - Selection of go-playground/validator for input validation
|
||||
* [0012-git-hooks-staged-only-formatting.md](0012-git-hooks-staged-only-formatting.md) - Git hooks format only staged Go files
|
||||
* [0013-openapi-swagger-toolchain.md](0013-openapi-swagger-toolchain.md) - ✅ OpenAPI/Swagger documentation with swaggo/swag (Implemented)
|
||||
* [0014-grpc-adoption-strategy.md](0014-grpc-adoption-strategy.md) - Hybrid REST/gRPC adoption strategy
|
||||
* [0013-openapi-swagger-toolchain.md](0013-openapi-swagger-toolchain.md) - OpenAPI/Swagger documentation with swaggo/swag
|
||||
* [0014-grpc-adoption-strategy.md](0014-grpc-adoption-strategy.md) - gRPC adoption strategy (rejected/deferred)
|
||||
* [0015-cli-subcommands-cobra.md](0015-cli-subcommands-cobra.md) - Cobra CLI framework adoption
|
||||
* [0016-ci-cd-pipeline-design.md](0016-ci-cd-pipeline-design.md) - CI/CD pipeline architecture
|
||||
* [0017-trunk-based-development-workflow.md](0017-trunk-based-development-workflow.md) - Trunk-based development workflow
|
||||
* [0018-user-management-auth-system.md](0018-user-management-auth-system.md) - User management and authentication system
|
||||
* [0019-postgresql-integration.md](0019-postgresql-integration.md) - PostgreSQL database integration
|
||||
* [0020-docker-build-strategy.md](0020-docker-build-strategy.md) - Docker Build Strategy: Traditional vs Buildx
|
||||
* [0021-jwt-secret-retention-policy.md](0021-jwt-secret-retention-policy.md) - JWT Secret Retention Policy (base JWT done; cleanup job proposed)
|
||||
* [0022-rate-limiting-cache-strategy.md](0022-rate-limiting-cache-strategy.md) - Rate Limiting and Cache Strategy (not yet implemented — issue #13)
|
||||
* [0023-config-hot-reloading.md](0023-config-hot-reloading.md) - Config Hot Reloading Strategy (not yet implemented)
|
||||
* [0024-bdd-test-organization-and-isolation.md](0024-bdd-test-organization-and-isolation.md) - BDD test modular organisation by domain
|
||||
* [0025-bdd-scenario-isolation-strategies.md](0025-bdd-scenario-isolation-strategies.md) - Schema-per-scenario isolation for BDD tests (partial)
|
||||
|
||||
## How to Add a New ADR
|
||||
|
||||
|
||||
320
bdd_implementation_plan.md
Normal file
@@ -0,0 +1,320 @@
|
||||
# BDD Implementation Plan - Iterative Approach
|
||||
|
||||
Based on ADR 0024: BDD Test Organization and Isolation Strategy
|
||||
|
||||
## Phase 1: Refactor Current Tests (1-2 weeks)
|
||||
|
||||
### Objective: Split monolithic feature files into modular, isolated components
|
||||
|
||||
### Tasks:
|
||||
1. **Split feature files by business domain**
|
||||
- Create `features/auth/` directory
|
||||
- Create `features/config/` directory
|
||||
- Create `features/greet/` directory
|
||||
- Create `features/health/` directory
|
||||
- Create `features/jwt/` directory
|
||||
|
||||
2. **Implement feature-specific isolation**
|
||||
- Add config file patterns: `features/{domain}/{domain}-test-config.yaml`
|
||||
- Implement database naming: `dance_lessons_coach_{domain}_test`
|
||||
- Assign unique ports per feature group
|
||||
|
||||
3. **Create feature-specific test scripts**
|
||||
- Implement `scripts/test-feature.sh` with feature parameter
|
||||
- Add environment setup/teardown logic
|
||||
- Implement resource cleanup routines
|
||||
|
||||
### Deliverables:
|
||||
- ✅ Modular feature directory structure
|
||||
- ✅ Feature-specific configuration files
|
||||
- ✅ Basic isolation mechanisms
|
||||
- ✅ Feature-level test scripts
|
||||
|
||||
## Phase 2: Enhance Test Infrastructure (2-3 weeks)
|
||||
|
||||
### Objective: Add synchronization and lifecycle management
|
||||
|
||||
### Tasks:
|
||||
1. **Implement synchronization helpers**
|
||||
- Add `waitForServerReady()` with timeout
|
||||
- Add `waitForConfigReload()` with event-based detection
|
||||
- Add `waitForCondition()` helper function
|
||||
|
||||
2. **Add Godog context management**
|
||||
- Create feature-specific context structs
|
||||
- Implement `InitializeFeatureSuite()`
|
||||
- Implement `CleanupFeatureSuite()`
|
||||
|
||||
3. **Add tag-based test selection**
|
||||
- Implement `@smoke`, `@auth`, `@config` tags
|
||||
- Add tag filtering to test scripts
|
||||
- Document tag usage in README
|
||||
|
||||
### Deliverables:
|
||||
- ✅ Robust synchronization mechanisms
|
||||
- ✅ Proper context lifecycle management
|
||||
- ✅ Tag-based test execution
|
||||
- ✅ Improved test reliability
|
||||
|
||||
## Phase 3: Parallel Testing (Optional - 1 week)
|
||||
|
||||
### Objective: Enable safe parallel test execution
|
||||
|
||||
### Tasks:
|
||||
1. **Implement port management**
|
||||
- Add port allocation system
|
||||
- Implement port conflict detection
|
||||
- Add parallel execution flags
|
||||
|
||||
2. **Add resource monitoring**
|
||||
- Implement resource usage tracking
|
||||
- Add timeout detection
|
||||
- Implement cleanup on failure
|
||||
|
||||
3. **Update CI/CD pipeline**
|
||||
- Add parallel test execution
|
||||
- Implement resource limits
|
||||
- Add test isolation validation
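For the port allocation task, one common approach is to let the operating system pick a free port instead of maintaining a fixed per-feature table. The sketch below shows that technique; `freePort` is a hypothetical helper name, and this is only one option, not necessarily what the plan intends.

```go
package main

import (
	"fmt"
	"net"
)

// freePort asks the OS for an unused TCP port on the loopback interface
// by listening on port 0 and reading back the assigned port.
func freePort() (int, error) {
	l, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return 0, err
	}
	defer l.Close()
	return l.Addr().(*net.TCPAddr).Port, nil
}

func main() {
	port, err := freePort()
	if err != nil {
		fmt.Println("could not allocate a port:", err)
		return
	}
	fmt.Println("allocated port:", port)
}
```

The port is released before the test server binds to it, so another process could take it in between; passing the open listener to the server directly avoids that window when the server supports it.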
|
||||
|
||||
### Deliverables:
|
||||
- ✅ Parallel test execution capability
|
||||
- ✅ Resource monitoring and limits
|
||||
- ✅ Updated CI/CD configuration
|
||||
|
||||
## Implementation Timeline
|
||||
|
||||
### Week 1-2: Phase 1 - Test Refactoring
|
||||
- Day 1-2: Create feature directory structure
|
||||
- Day 3-4: Implement feature-specific configs
|
||||
- Day 5-7: Create test scripts and isolation
|
||||
- Day 8-10: Test and validate refactoring
|
||||
|
||||
### Week 3-5: Phase 2 - Infrastructure Enhancement
|
||||
- Day 11-12: Add synchronization helpers
|
||||
- Day 13-14: Implement context management
|
||||
- Day 15-17: Add tag-based selection
|
||||
- Day 18-21: Test and validate infrastructure
|
||||
|
||||
### Week 6: Phase 3 - Parallel Testing (Optional)
|
||||
- Day 22-24: Implement port management
|
||||
- Day 25-26: Add resource monitoring
|
||||
- Day 27-28: Update CI/CD pipeline
|
||||
- Day 29-30: Test and validate parallel execution
|
||||
|
||||
## Success Criteria
|
||||
|
||||
### Phase 1 Success:
|
||||
- ✅ All tests pass in new structure
|
||||
- ✅ Feature isolation working correctly
|
||||
- ✅ Test scripts functional
|
||||
- ✅ No regression in test coverage
|
||||
|
||||
### Phase 2 Success:
|
||||
- ✅ Synchronization working reliably
|
||||
- ✅ Context management implemented
|
||||
- ✅ Tag filtering operational
|
||||
- ✅ Test reliability >95%
|
||||
|
||||
### Phase 3 Success:
|
||||
- ✅ Parallel tests execute safely
|
||||
- ✅ Resource usage within limits
|
||||
- ✅ CI/CD pipeline updated
|
||||
- ✅ Test execution time reduced
|
||||
|
||||
## Risk Mitigation
|
||||
|
||||
### Phase 1 Risks:
|
||||
- **Test failures during refactoring**: Maintain old structure until new is validated
|
||||
- **Isolation issues**: Implement gradual rollout with validation
|
||||
|
||||
### Phase 2 Risks:
|
||||
- **Synchronization complexity**: Start with simple timeouts, enhance gradually
|
||||
- **Context management bugs**: Add comprehensive logging and debugging
|
||||
|
||||
### Phase 3 Risks:
|
||||
- **Resource conflicts**: Implement strict resource limits and monitoring
|
||||
- **CI/CD instability**: Test parallel execution locally before pipeline update
|
||||
|
||||
## Monitoring and Validation
|
||||
|
||||
### Phase 1 Validation:
|
||||
```bash
|
||||
# Test each feature independently
|
||||
./scripts/test-feature.sh auth
|
||||
./scripts/test-feature.sh config
|
||||
./scripts/test-feature.sh greet
|
||||
|
||||
# Verify isolation
|
||||
./scripts/validate-isolation.sh
|
||||
```
|
||||
|
||||
### Phase 2 Validation:
|
||||
```bash
|
||||
# Test synchronization
|
||||
./scripts/test-synchronization.sh
|
||||
|
||||
# Test tag filtering
|
||||
godog --tags=@smoke features/
|
||||
|
||||
# Test context management
|
||||
./scripts/test-context-lifecycle.sh
|
||||
```
|
||||
|
||||
### Phase 3 Validation:
|
||||
```bash
|
||||
# Test parallel execution
|
||||
./scripts/test-all-features-parallel.sh
|
||||
|
||||
# Monitor resource usage
|
||||
./scripts/monitor-test-resources.sh
|
||||
|
||||
# Validate CI/CD changes
|
||||
./scripts/validate-ci-cd.sh
|
||||
```
|
||||
|
||||
## Rollback Plan
|
||||
|
||||
### Phase 1 Rollback:
|
||||
```bash
|
||||
# Revert to original structure
|
||||
git checkout HEAD~1 -- features/
|
||||
|
||||
# Restore original test scripts
|
||||
git checkout HEAD~1 -- scripts/test-*.sh
|
||||
```
|
||||
|
||||
### Phase 2 Rollback:
|
||||
```bash
|
||||
# Remove synchronization helpers
|
||||
git checkout HEAD~1 -- pkg/bdd/helpers/
|
||||
|
||||
# Restore original context management
|
||||
git checkout HEAD~1 -- pkg/bdd/context/
|
||||
```
|
||||
|
||||
### Phase 3 Rollback:
|
||||
```bash
|
||||
# Disable parallel execution
|
||||
sed -i 's/parallel=true/parallel=false/' scripts/test-all-features-parallel.sh
|
||||
|
||||
# Revert CI/CD changes
|
||||
git checkout HEAD~1 -- .gitea/workflows/
|
||||
```
|
||||
|
||||
## Documentation Updates
|
||||
|
||||
### Phase 1 Documentation:
|
||||
- ✅ Update README with new test structure
|
||||
- ✅ Document feature organization conventions
|
||||
- ✅ Add test execution instructions
|
||||
|
||||
### Phase 2 Documentation:
|
||||
- ✅ Document synchronization patterns
|
||||
- ✅ Add context management guide
|
||||
- ✅ Document tag usage and filtering
|
||||
|
||||
### Phase 3 Documentation:
|
||||
- ✅ Add parallel testing guide
|
||||
- ✅ Document resource limits
|
||||
- ✅ Update CI/CD documentation
|
||||
|
||||
## Team Communication
|
||||
|
||||
### Phase 1:
|
||||
- Team meeting to explain new structure
|
||||
- Hands-on workshop for test refactoring
|
||||
- Daily standups to track progress
|
||||
|
||||
### Phase 2:
|
||||
- Technical deep dive on synchronization
|
||||
- Code review sessions for context management
|
||||
- Pair programming for complex scenarios
|
||||
|
||||
### Phase 3:
|
||||
- Performance testing workshop
|
||||
- CI/CD pipeline review
|
||||
- Resource monitoring training
|
||||
|
||||
## Continuous Improvement
|
||||
|
||||
### Post-Phase 1:
|
||||
- Gather feedback on new structure
|
||||
- Identify pain points in isolation
|
||||
- Optimize test execution times
|
||||
|
||||
### Post-Phase 2:
|
||||
- Monitor test reliability metrics
|
||||
- Identify flaky tests for fixing
|
||||
- Optimize synchronization patterns
|
||||
|
||||
### Post-Phase 3:
|
||||
- Monitor parallel execution performance
|
||||
- Identify resource bottlenecks
|
||||
- Optimize CI/CD pipeline timing
|
||||
|
||||
## Metrics Tracking
|
||||
|
||||
### Test Reliability:
|
||||
```bash
|
||||
# Track pass rate over time
|
||||
./scripts/track-test-reliability.sh
|
||||
```
|
||||
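The reliability figure behind the Phase 2 target (>95%) is simple arithmetic. The sketch below assumes reliability means the share of passing runs; `passRate` and `meetsTarget` are illustrative names and are not part of `track-test-reliability.sh`.

```go
package main

import "fmt"

// passRate returns the percentage of passing runs, or 0 when there are no runs.
func passRate(passed, total int) float64 {
	if total <= 0 {
		return 0
	}
	return 100 * float64(passed) / float64(total)
}

// meetsTarget reports whether the pass rate is strictly above targetPercent.
func meetsTarget(passed, total int, targetPercent float64) bool {
	return passRate(passed, total) > targetPercent
}

func main() {
	// Prints: 97.0% reliable, target met: true
	fmt.Printf("%.1f%% reliable, target met: %v\n", passRate(97, 100), meetsTarget(97, 100, 95))
}
```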
|
||||
### Test Execution Time:
|
||||
```bash
|
||||
# Monitor execution times
|
||||
./scripts/monitor-execution-time.sh
|
||||
```
|
||||
|
||||
### Resource Usage:
|
||||
```bash
|
||||
# Track resource consumption
|
||||
./scripts/monitor-resource-usage.sh
|
||||
```
|
||||
|
||||
## Future Enhancements
|
||||
|
||||
### Post-Phase 3:
|
||||
- Test impact analysis
|
||||
- Flaky test detection
|
||||
- Performance benchmarking
|
||||
- Test coverage visualization
|
||||
|
||||
### Long-term:
|
||||
- AI-assisted test generation
|
||||
- Automated test optimization
|
||||
- Predictive test failure analysis
|
||||
- Intelligent test prioritization
|
||||
|
||||
## Implementation Checklist
|
||||
|
||||
### Phase 1: Test Refactoring
|
||||
- [ ] Create feature directories
|
||||
- [ ] Split feature files
|
||||
- [ ] Implement config isolation
|
||||
- [ ] Add database isolation
|
||||
- [ ] Create test scripts
|
||||
- [ ] Test and validate
|
||||
|
||||
### Phase 2: Infrastructure Enhancement
|
||||
- [ ] Add synchronization helpers
|
||||
- [ ] Implement context management
|
||||
- [ ] Add tag filtering
|
||||
- [ ] Test and validate
|
||||
|
||||
### Phase 3: Parallel Testing
|
||||
- [ ] Implement port management
|
||||
- [ ] Add resource monitoring
|
||||
- [ ] Update CI/CD pipeline
|
||||
- [ ] Test and validate
|
||||
|
||||
## Notes
|
||||
|
||||
- Each phase builds on the previous one
|
||||
- Phase 3 is optional and can be deferred
|
||||
- Focus on reliability before performance
|
||||
- Maintain backward compatibility where possible
|
||||
- Document all changes thoroughly
|
||||
- Gather team feedback at each phase
|
||||
- Monitor metrics continuously
|
||||
- Celebrate milestones and successes
|
||||
@@ -1,7 +1,7 @@
|
||||
// Package main provides the dance-lessons-coach server entry point
|
||||
//
|
||||
// @title dance-lessons-coach API
|
||||
// @version 1.2.0
|
||||
// @version 1.4.0
|
||||
// @description API for dance-lessons-coach service providing greeting functionality
|
||||
// @termsOfService http://swagger.io/terms/
|
||||
|
||||
@@ -12,9 +12,14 @@
|
||||
// @license.name MIT
|
||||
// @license.url https://opensource.org/licenses/MIT
|
||||
|
||||
// @host localhost:8080
|
||||
// @BasePath /api
|
||||
// @schemes http https
|
||||
// @host localhost:8080
|
||||
// @BasePath /api
|
||||
// @schemes http https
|
||||
//
|
||||
// @securityDefinitions.apikey BearerAuth
|
||||
// @in header
|
||||
// @name Authorization
|
||||
// @description JWT authentication using Bearer token. Format: Bearer <token>
|
||||
|
||||
package main
|
||||
|
||||
@@ -43,8 +48,10 @@ func main() {
|
||||
log.Fatal().Err(err).Msg("Failed to load configuration")
|
||||
}
|
||||
|
||||
// Create readiness context to control readiness state
|
||||
readyCtx, readyCancel := context.WithCancel(context.Background())
|
||||
// Create readiness context to control readiness state.
|
||||
// CancelableContext exposes Cancel() so that Server.Run() can cancel
|
||||
// readiness at the start of graceful shutdown (before the propagation sleep).
|
||||
readyCtx, readyCancel := server.NewCancelableContext(context.Background())
|
||||
defer readyCancel()
|
||||
|
||||
// Create and run server
|
||||
@@ -52,4 +59,5 @@ func main() {
|
||||
if err := server.Run(); err != nil {
|
||||
log.Fatal().Err(err).Msg("Server failed")
|
||||
}
|
||||
log.Trace().Msg("Server exited")
|
||||
}
|
||||
|
||||
40
config.yaml
@@ -1,4 +1,4 @@
|
||||
# DanceLessonsCoach Configuration
|
||||
# dance-lessons-coach Configuration
|
||||
# This file serves as both the default configuration and documentation
|
||||
# All available options are shown with their default values
|
||||
|
||||
@@ -41,8 +41,8 @@ telemetry:
|
||||
# Format: host:port
|
||||
otlp_endpoint: "localhost:4317"
|
||||
|
||||
# Service name for tracing (default: "DanceLessonsCoach")
|
||||
service_name: "DanceLessonsCoach"
|
||||
# Service name for tracing (default: "dance-lessons-coach")
|
||||
service_name: "dance-lessons-coach"
|
||||
|
||||
# Use insecure connection (no TLS) (default: true)
|
||||
insecure: true
|
||||
@@ -55,4 +55,36 @@ telemetry:
|
||||
|
||||
# Sampling ratio (0.0 to 1.0, default: 1.0)
|
||||
# Only used with traceidratio and parentbased_traceidratio samplers
|
||||
ratio: 1.0
|
||||
ratio: 1.0
|
||||
|
||||
# Database configuration (PostgreSQL)
|
||||
database:
|
||||
# PostgreSQL host address (default: "localhost")
|
||||
host: "localhost"
|
||||
|
||||
# PostgreSQL port (default: 5432)
|
||||
port: 5432
|
||||
|
||||
# PostgreSQL username (default: "postgres")
|
||||
user: "postgres"
|
||||
|
||||
# PostgreSQL password (default: "postgres")
|
||||
# Change this for production!
|
||||
password: "postgres"
|
||||
|
||||
# Database name (default: "dance_lessons_coach")
|
||||
name: "dance_lessons_coach"
|
||||
|
||||
# SSL mode (default: "disable")
|
||||
# Options: "disable", "allow", "prefer", "require", "verify-ca", "verify-full"
|
||||
ssl_mode: "disable"
|
||||
|
||||
# Maximum number of open connections (default: 25)
|
||||
max_open_conns: 25
|
||||
|
||||
# Maximum number of idle connections (default: 5)
|
||||
max_idle_conns: 5
|
||||
|
||||
# Maximum lifetime of connections (default: "1h")
|
||||
# Format: number + unit (s, m, h)
|
||||
conn_max_lifetime: 1h
|
||||
@@ -1,29 +0,0 @@
|
||||
version: '3.8'
|
||||
|
||||
services:
|
||||
act-runner:
|
||||
image: gitea/act_runner:latest
|
||||
volumes:
|
||||
- .:/workspace
|
||||
- ./config/runner:/data/.runner
|
||||
working_dir: /workspace
|
||||
environment:
|
||||
- GITEA_INSTANCE_URL=${GITEA_INSTANCE_URL:-https://gitea.arcodange.lab/}
|
||||
- GITEA_RUNNER_REGISTRATION_TOKEN=${GITEA_RUNNER_REGISTRATION_TOKEN}
|
||||
- GITEA_RUNNER_NAME=${GITEA_RUNNER_NAME:-local-test-runner}
|
||||
- GITEA_RUNNER_LABELS=${GITEA_RUNNER_LABELS:-ubuntu-latest:docker://node:16-bullseye,ubuntu-22.04:docker://gitea/act_runner:latest}
|
||||
command: act -W .gitea/workflows/go-ci-cd.yaml --rm
|
||||
|
||||
yamllint:
|
||||
image: pipelinecomponents/yamllint:latest
|
||||
volumes:
|
||||
- .:/workspace
|
||||
working_dir: /workspace
|
||||
command: yamllint .gitea/workflows/
|
||||
|
||||
yq-validator:
|
||||
image: mikefarah/yq:latest
|
||||
volumes:
|
||||
- .:/workspace
|
||||
working_dir: /workspace
|
||||
command: yq eval '.' .gitea/workflows/ci-cd.yaml
|
||||
47
docker-compose.yml
Normal file
@@ -0,0 +1,47 @@
|
||||
services:
|
||||
postgres:
|
||||
image: postgres:16-alpine
|
||||
container_name: dance-lessons-coach-postgres
|
||||
environment:
|
||||
POSTGRES_USER: postgres
|
||||
POSTGRES_PASSWORD: postgres
|
||||
POSTGRES_DB: dance_lessons_coach
|
||||
ports:
|
||||
- "5432:5432"
|
||||
volumes:
|
||||
- postgres_data:/var/lib/postgresql/data
|
||||
healthcheck:
|
||||
test: ["CMD-SHELL", "pg_isready -U postgres"]
|
||||
interval: 5s
|
||||
timeout: 5s
|
||||
retries: 5
|
||||
networks:
|
||||
- dance-lessons-coach-network
|
||||
restart: unless-stopped
|
||||
|
||||
# Application service (for reference)
|
||||
# app:
|
||||
# build: .
|
||||
# container_name: dance-lessons-coach-app
|
||||
# ports:
|
||||
# - "8080:8080"
|
||||
# environment:
|
||||
# - DLC_DATABASE_HOST=postgres
|
||||
# - DLC_DATABASE_PORT=5432
|
||||
# - DLC_DATABASE_USER=postgres
|
||||
# - DLC_DATABASE_PASSWORD=postgres
|
||||
# - DLC_DATABASE_NAME=dance_lessons_coach
|
||||
# - DLC_DATABASE_SSL_MODE=disable
|
||||
# depends_on:
|
||||
# postgres:
|
||||
# condition: service_healthy
|
||||
# restart: unless-stopped
|
||||
|
||||
volumes:
|
||||
postgres_data:
|
||||
driver: local
|
||||
|
||||
networks:
|
||||
dance-lessons-coach-network:
|
||||
name: dance-lessons-coach-network
|
||||
driver: bridge
|
||||
@@ -1,4 +1,4 @@
|
||||
# DanceLessonsCoach Docker Image
|
||||
# dance-lessons-coach Docker Image
|
||||
# Multi-stage build for production deployment
|
||||
|
||||
# Stage 1: Build binary
|
||||
43
docker/Dockerfile.build
Normal file
@@ -0,0 +1,43 @@
|
||||
# Build environment Dockerfile with pre-installed Go tools and dependencies
|
||||
# Optimized for CI/CD pipeline speed
|
||||
# Updated to include Node.js for GitHub Actions compatibility
|
||||
|
||||
FROM golang:1.26.1-alpine AS builder
|
||||
|
||||
# Install build dependencies
|
||||
RUN apk add --no-cache \
|
||||
git \
|
||||
bash \
|
||||
curl \
|
||||
make \
|
||||
gcc \
|
||||
musl-dev \
|
||||
bc \
|
||||
grep \
|
||||
sed \
|
||||
jq \
|
||||
ca-certificates \
|
||||
nodejs \
|
||||
npm \
|
||||
postgresql-client \
|
||||
tar # Add GNU tar for cache compatibility
|
||||
|
||||
# Set up Go environment
|
||||
ENV GOPATH=/go
|
||||
ENV PATH=$GOPATH/bin:/usr/local/go/bin:/usr/local/bin:/usr/bin:/bin
|
||||
WORKDIR /go/src/dance-lessons-coach
|
||||
|
||||
# Install common Go tools
|
||||
RUN go install github.com/swaggo/swag/cmd/swag@latest && \
|
||||
go install golang.org/x/tools/cmd/goimports@latest && \
|
||||
go install honnef.co/go/tools/cmd/staticcheck@latest
|
||||
|
||||
# Copy only go.mod and go.sum first for dependency caching
|
||||
COPY go.mod go.sum ./
|
||||
RUN go mod download && go mod verify
|
||||
|
||||
# Simple build environment - source code is mounted at runtime
|
||||
WORKDIR /workspace
|
||||
|
||||
# Pre-download common Go tools (already installed in base)
|
||||
# RUN go install github.com/swaggo/swag/cmd/swag@latest
|
||||
37
docker/Dockerfile.prod
Normal file
@@ -0,0 +1,37 @@
|
||||
# dance-lessons-coach Production Docker Image
|
||||
# ⚠️ DEVELOPMENT ONLY - This file uses 'latest' tag for local testing
|
||||
# ⚠️ CI/CD generates the correct Dockerfile.prod with proper dependency hash
|
||||
# ⚠️ For production use, see the CI/CD workflow which generates the correct file
|
||||
|
||||
# Use the build cache image as base (latest for local dev only)
|
||||
FROM gitea.arcodange.lab/arcodange/dance-lessons-coach-build-cache:latest AS builder
|
||||
|
||||
# Final minimal image
|
||||
FROM alpine:3.18
|
||||
|
||||
WORKDIR /app
|
||||
|
||||
# Install minimal dependencies
|
||||
RUN apk add --no-cache ca-certificates tzdata
|
||||
|
||||
# Copy binary from builder
|
||||
COPY --from=builder /workspace/dance-lessons-coach /app/dance-lessons-coach
|
||||
|
||||
# Copy configuration
|
||||
COPY config.yaml /app/config.yaml
|
||||
|
||||
# Set permissions
|
||||
RUN chmod +x /app/dance-lessons-coach
|
||||
|
||||
# Set timezone
|
||||
ENV TZ=UTC
|
||||
|
||||
# Expose port
|
||||
EXPOSE 8080
|
||||
|
||||
# Health check
|
||||
HEALTHCHECK --interval=30s --timeout=3s \
|
||||
CMD wget -q --spider http://localhost:8080/api/health || exit 1
|
||||
|
||||
# Entry point
|
||||
ENTRYPOINT ["/app/dance-lessons-coach"]
|
||||
36
docker/Dockerfile.prod.template
Normal file
@@ -0,0 +1,36 @@
|
||||
# dance-lessons-coach Production Docker Image
|
||||
# Minimal image using pre-built binary from CI cache
|
||||
# Template: Replace {{DEPS_HASH}} with actual dependency hash
|
||||
|
||||
# Use the build cache image as base
|
||||
FROM gitea.arcodange.lab/arcodange/dance-lessons-coach-build-cache:{{DEPS_HASH}} AS builder
|
||||
|
||||
# Final minimal image
|
||||
FROM alpine:3.18
|
||||
|
||||
WORKDIR /app
|
||||
|
||||
# Install minimal dependencies
|
||||
RUN apk add --no-cache ca-certificates tzdata
|
||||
|
||||
# Copy binary from builder
|
||||
COPY --from=builder /workspace/dance-lessons-coach /app/dance-lessons-coach
|
||||
|
||||
# Copy configuration
|
||||
COPY config.yaml /app/config.yaml
|
||||
|
||||
# Set permissions
|
||||
RUN chmod +x /app/dance-lessons-coach
|
||||
|
||||
# Set timezone
|
||||
ENV TZ=UTC
|
||||
|
||||
# Expose port
|
||||
EXPOSE 8080
|
||||
|
||||
# Health check
|
||||
HEALTHCHECK --interval=30s --timeout=3s \
|
||||
CMD wget -q --spider http://localhost:8080/api/health || exit 1
|
||||
|
||||
# Entry point
|
||||
ENTRYPOINT ["/app/dance-lessons-coach"]
|
||||
@@ -1,16 +1,16 @@
|
||||
# DanceLessonsCoach Agent Usage Guide
|
||||
# dance-lessons-coach Agent Usage Guide
|
||||
|
||||
## 🚀 Quick Start
|
||||
|
||||
### Launch Programmer Agent
|
||||
```bash
|
||||
cd /Users/gabrielradureau/Work/Vibe/DanceLessonsCoach
|
||||
cd /Users/gabrielradureau/Work/Vibe/dance-lessons-coach
|
||||
vibe start --agent dancelessonscoachprogrammer
|
||||
```
|
||||
|
||||
### Launch Product Owner Agent
|
||||
```bash
|
||||
cd /Users/gabrielradureau/Work/Vibe/DanceLessonsCoach
|
||||
cd /Users/gabrielradureau/Work/Vibe/dance-lessons-coach
|
||||
vibe start --agent dancelessonscoach-product-owner
|
||||
```
|
||||
|
||||
@@ -141,7 +141,7 @@ skill changelog-manager add-entry \
|
||||
```toml
|
||||
# .mistral/dancelessonscoachprogrammer-agent.toml
|
||||
name: dancelessonscoachprogrammer
|
||||
role: DanceLessonsCoachProgrammer
|
||||
role: dance-lessons-coach-programmer
|
||||
goals: ["Follow BDD practices", "Use Gitmoji commits", "Respect ADR process"]
|
||||
```
|
||||
|
||||
@@ -149,7 +149,7 @@ goals: ["Follow BDD practices", "Use Gitmoji commits", "Respect ADR process"]
|
||||
```toml
|
||||
# .mistral/dancelessonscoach-product-owner-agent.toml
|
||||
name: dancelessonscoach-product-owner
|
||||
role: DanceLessonsCoachProductOwner
|
||||
role: dance-lessons-coach-product-owner
|
||||
goals: ["Facilitate stakeholder interviews", "Generate BDD tests", "Maintain documentation"]
|
||||
```
|
||||
|
||||
@@ -210,7 +210,7 @@ vibe validate --agent dancelessonscoach-product-owner
|
||||
```bash
|
||||
# List available skills
|
||||
ls /Users/gabrielradureau/Work/Vibe/.mistral/skills/
|
||||
ls /Users/gabrielradureau/Work/Vibe/DanceLessonsCoach/.vibe/skills/
|
||||
ls /Users/gabrielradureau/Work/Vibe/dance-lessons-coach/.vibe/skills/
|
||||
|
||||
# Validate skill
|
||||
skill skill-creator validate .vibe/skills/product-owner-assistant
|
||||
@@ -222,7 +222,7 @@ skill skill-creator validate .mistral/skills/interview-facilitator
|
||||
```bash
|
||||
# Check file permissions
|
||||
chmod +x /Users/gabrielradureau/Work/Vibe/.mistral/skills/*/scripts/*
|
||||
chmod +x /Users/gabrielradureau/Work/Vibe/DanceLessonsCoach/.vibe/skills/*/scripts/*
|
||||
chmod +x /Users/gabrielradureau/Work/Vibe/dance-lessons-coach/.vibe/skills/*/scripts/*
|
||||
```
|
||||
|
||||
## 📖 Related Documentation
|
||||
|
||||
158
documentation/API.md
Normal file
@@ -0,0 +1,158 @@
|
||||
# API Endpoints
|
||||
|
||||
REST API reference for `dance-lessons-coach`. Extracted from the original `AGENTS.md` (Task 6 restructure) for lazy-loading compatibility with Mistral Vibe.
|
||||
|
||||
## Base URL
|
||||
|
||||
```
|
||||
http://localhost:8080
|
||||
```
|
||||
|
||||
## OpenAPI Documentation
|
||||
|
||||
- **Swagger UI:** `http://localhost:8080/swagger/`
|
||||
- **OpenAPI Spec:** `http://localhost:8080/swagger/doc.json`
|
||||
|
||||
The API provides interactive documentation through Swagger UI, backed by a complete OpenAPI 2.0 specification. All endpoints, request/response models, and validation rules are documented using a **hierarchical tagging system**.
|
||||
|
||||
**Features:**
|
||||
|
||||
- Interactive API exploration with hierarchical organization
|
||||
- Try-it-out functionality for all endpoints
|
||||
- Model schemas with examples
|
||||
- Response examples with validation rules
|
||||
- Hierarchical tag structure for better navigation
|
||||
|
||||
**Generation:** Documentation is auto-generated from code annotations using [swaggo/swag](https://github.com/swaggo/swag) with the command:
|
||||
|
||||
```bash
|
||||
go generate ./pkg/server/
|
||||
```
|
||||
|
||||
**Tag Organization:**
|
||||
|
||||
- `API/v1/Greeting` — Version 1 greeting endpoints
|
||||
- `API/v2/Greeting` — Version 2 greeting endpoints
|
||||
- `System/Health` — Health and readiness endpoints
|
||||
|
||||
**Hierarchical Benefits:**
|
||||
|
||||
- Clear separation between API domains (API vs System)
|
||||
- Version organization within each domain
|
||||
- Natural hierarchy in Swagger UI
|
||||
- Scalable for future API growth
|
||||
|
||||
**Embedded Documentation:** The OpenAPI spec is embedded in the binary using Go's `//go:embed` directive for single-binary deployment.
|
||||
|
||||
---
|
||||
|
||||
## Health Check
|
||||
|
||||
```http
|
||||
GET /api/health
|
||||
```
|
||||
|
||||
**Response:**
|
||||
|
||||
```json
|
||||
{"status":"healthy"}
|
||||
```
|
||||
|
||||
## Version Info
|
||||
|
||||
```http
|
||||
GET /api/version
|
||||
GET /api/version?format=plain
|
||||
GET /api/version?format=full
|
||||
GET /api/version?format=json
|
||||
```
|
||||
|
||||
Returns the running binary version (injected at build time via `-ldflags`). The `format` query parameter controls the response shape:
|
||||
|
||||
- `format=plain` (or `format=short`): plain-text version (e.g. `1.0.0`)
|
||||
- `format=full`: detailed multi-line text (Version, Commit, Built date, Go version)
|
||||
- `format=json` (default): structured JSON `{"version": "1.0.0", "commit": "abc1234", "built": "...", "go_version": "go1.26.1"}`
|
||||
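How the three formats could be produced from build-time values is sketched below. The variable names, the `renderVersion` function, and the reduced field set are assumptions for illustration; only the `-X` linker flag and the `format` values come from the text above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Hypothetical variables, overridable at build time, for example:
//   go build -ldflags "-X main.version=1.0.0 -X main.commit=abc1234"
var (
	version = "dev"
	commit  = "unknown"
)

// renderVersion returns the response body for a given ?format= value.
func renderVersion(format string) string {
	switch format {
	case "plain", "short":
		return version
	case "full":
		return fmt.Sprintf("Version: %s\nCommit: %s", version, commit)
	default: // "json" or no format parameter
		b, _ := json.Marshal(map[string]string{"version": version, "commit": commit})
		return string(b)
	}
}

func main() {
	for _, f := range []string{"plain", "full", "json"} {
		fmt.Printf("format=%s\n%s\n\n", f, renderVersion(f))
	}
}
```

Without `-ldflags`, the placeholders `dev` and `unknown` are reported, which makes an unversioned local build easy to spot.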
|
||||
## Readiness Check
|
||||
|
||||
```http
|
||||
GET /api/ready
|
||||
```
|
||||
|
||||
**Responses:**
|
||||
|
||||
- Normal operation: `{"ready":true}` (HTTP 200)
|
||||
- During shutdown: `{"ready":false}` (HTTP 503 Service Unavailable)
|
||||
|
||||
**Purpose:** Indicates whether the server is ready to accept new requests. Returns false during graceful shutdown to allow existing requests to complete while preventing new ones.
|
||||
|
||||
## Greet Service v1
|
||||
|
||||
```http
|
||||
GET /api/v1/greet/
|
||||
GET /api/v1/greet/{name}
|
||||
```
|
||||
|
||||
**Examples:**
|
||||
|
||||
```bash
|
||||
# Default greeting
|
||||
curl http://localhost:8080/api/v1/greet/
|
||||
# Response: {"message":"Hello world!"}
|
||||
|
||||
# Personalized greeting
|
||||
curl http://localhost:8080/api/v1/greet/John
|
||||
# Response: {"message":"Hello John!"}
|
||||
|
||||
# Another example
|
||||
curl http://localhost:8080/api/v1/greet/Alice
|
||||
# Response: {"message":"Hello Alice!"}
|
||||
```
|
||||
|
||||
## Greet Service v2 (Feature-flagged)
|
||||
|
||||
```http
|
||||
POST /api/v2/greet
|
||||
```
|
||||
|
||||
**Request Body:**
|
||||
|
||||
```json
|
||||
{
|
||||
"name": "John"
|
||||
}
|
||||
```
|
||||
|
||||
**Examples:**
|
||||
|
||||
```bash
|
||||
# Valid request
|
||||
curl -X POST http://localhost:8080/api/v2/greet \
|
||||
-H "Content-Type: application/json" \
|
||||
-d '{"name":"John"}'
|
||||
# Response: {"message":"Hello my friend John!"}
|
||||
|
||||
# Empty name (valid, returns default)
|
||||
curl -X POST http://localhost:8080/api/v2/greet \
|
||||
-H "Content-Type: application/json" \
|
||||
-d '{"name":""}'
|
||||
# Response: {"message":"Hello my friend!"}
|
||||
|
||||
# Missing name field (valid, returns default)
|
||||
curl -X POST http://localhost:8080/api/v2/greet \
|
||||
-H "Content-Type: application/json" \
|
||||
-d '{}'
|
||||
# Response: {"message":"Hello my friend!"}
|
||||
|
||||
# Name too long (validation error)
|
||||
curl -X POST http://localhost:8080/api/v2/greet \
|
||||
-H "Content-Type: application/json" \
|
||||
-d '{"name":"ThisNameIsWayTooLongAndShouldFailValidationBecauseItExceedsTheMaximumAllowedLengthOf100Characters!!!!"}'
|
||||
# Response: {"error":"validation_failed","message":"Invalid request data","details":[{"message":"Name failed validation for 'max' (parameter: 100)"}]}
|
||||
```
|
||||
|
||||
**Validation Rules:**
|
||||
|
||||
- `name`: Maximum length 100 characters (optional field)
|
||||
|
||||
**Feature Flag:** Enable with `DLC_API_V2_ENABLED=true` or in config file with `api.v2_enabled: true`.
|
||||
@@ -1,6 +1,6 @@
|
||||
# BDD Testing Guide for DanceLessonsCoach
|
||||
# BDD Testing Guide for dance-lessons-coach
|
||||
|
||||
This guide explains how to work with BDD tests using Godog in the DanceLessonsCoach project.
|
||||
This guide explains how to work with BDD tests using Godog in the dance-lessons-coach project.
|
||||
|
||||
## Installation
|
||||
|
||||
@@ -33,7 +33,7 @@ The project already includes Godog as a dependency in `go.mod`. The BDD tests ar
|
||||
|
||||
```bash
|
||||
# From project root
|
||||
cd /Users/gabrielradureau/Work/Vibe/DanceLessonsCoach
|
||||
cd /Users/gabrielradureau/Work/Vibe/dance-lessons-coach
|
||||
go test ./features/... -v
|
||||
```
|
||||
|
||||
@@ -112,7 +112,7 @@ Create a corresponding step definition file in `pkg/bdd/steps/`:
|
||||
package steps
|
||||
|
||||
import (
|
||||
"DanceLessonsCoach/pkg/bdd/testserver"
|
||||
"dance-lessons-coach/pkg/bdd/testserver"
|
||||
"github.com/cucumber/godog"
|
||||
)
|
||||
|
||||
@@ -213,7 +213,7 @@ Add BDD tests to your CI pipeline:
|
||||
|
||||
## Modern Go Testing Practices
|
||||
|
||||
The DanceLessonsCoach project follows modern Go testing practices:
|
||||
The dance-lessons-coach project follows modern Go testing practices:
|
||||
|
||||
1. **Standard library integration**: BDD tests use `go test`
|
||||
2. **No global installation required**: Godog is a Go module dependency
|
||||
|
||||
251
documentation/CLI.md
Normal file
@@ -0,0 +1,251 @@
|
||||
# CLI Management Guide
|
||||
|
||||
Complete reference for the `dance-lessons-coach` CLI, server lifecycle, and configuration. Extracted from the original `AGENTS.md` (Task 6 restructure) for lazy-loading compatibility with Mistral Vibe.
|
||||
|
||||
## Cobra CLI (Recommended)
|
||||
|
||||
`dance-lessons-coach` includes a modern CLI built with Cobra:
|
||||
|
||||
```bash
|
||||
# Show help and available commands
|
||||
./bin/dance-lessons-coach --help
|
||||
|
||||
# Show version information
|
||||
./bin/dance-lessons-coach version
|
||||
|
||||
# Greet someone by name
|
||||
./bin/dance-lessons-coach greet John
|
||||
|
||||
# Start the server
|
||||
./bin/dance-lessons-coach server
|
||||
```
|
||||
|
||||
**Available Commands:**
|
||||
|
||||
- `version` — Print version information
|
||||
- `server` — Start the dance-lessons-coach server
|
||||
- `greet [name]` — Greet someone by name
|
||||
- `help` — Built-in help system
|
||||
- `completion` — Generate shell completion scripts
|
||||
|
||||
**Server Command Flags:**
|
||||
|
||||
- `--config` — Config file path
|
||||
- `--env` — Environment (`dev`, `staging`, `prod`)
|
||||
- `--debug` — Enable debug logging
|
||||
|
||||
## Version Information
|
||||
|
||||
The server provides runtime version information:
|
||||
|
||||
```bash
|
||||
# Check version using new CLI
|
||||
./bin/dance-lessons-coach version
|
||||
|
||||
# Check version using server binary
|
||||
./bin/server --version
|
||||
|
||||
# Output:
# dance-lessons-coach Version Information:
#   Version: 1.0.0
#   Commit: abc1234
#   Built: 2026-04-05T10:00:00+0000
#   Go: go1.26.1
|
||||
```
|
||||
|
||||
For full version management workflow (bump, release, build with version), see [`version-management-guide.md`](version-management-guide.md).
|
||||
|
||||
## Server Control Script
|
||||
|
||||
A shell script manages the server lifecycle:
|
||||
|
||||
```bash
|
||||
cd /Users/gabrielradureau/Work/Vibe/dance-lessons-coach
|
||||
|
||||
./scripts/start-server.sh start # Start the server
|
||||
./scripts/start-server.sh status # Check server status
|
||||
./scripts/start-server.sh test # Test API endpoints
|
||||
./scripts/start-server.sh logs # View server logs
|
||||
./scripts/start-server.sh stop # Stop the server
|
||||
./scripts/start-server.sh restart # Restart
|
||||
```
|
||||
|
||||
**Available subcommands:**
|
||||
|
||||
- `start` — Start the server in background with proper logging
|
||||
- `stop` — Stop the server gracefully
|
||||
- `restart` — Restart the server
|
||||
- `status` — Check if server is running
|
||||
- `logs` — Show recent server logs
|
||||
- `test` — Test all API endpoints
|
||||
|
||||
## Manual Server Management
|
||||
|
||||
For direct control:
|
||||
|
||||
```bash
|
||||
cd /Users/gabrielradureau/Work/Vibe/dance-lessons-coach
|
||||
./scripts/start-server.sh start
|
||||
```
|
||||
|
||||
**Expected output:**
|
||||
|
||||
```
|
||||
Server running on :8080
|
||||
[INF] Starting HTTP server on :8080
|
||||
[TRC] Registering greet routes
|
||||
[TRC] Greet routes registered
|
||||
```
|
||||
|
||||
**Features:**
|
||||
|
||||
- Context-aware server initialization
|
||||
- Graceful shutdown handling
|
||||
- Signal-based termination (`SIGINT`, `SIGTERM`)
|
||||
- 30-second shutdown timeout
|
||||
- Proper resource cleanup
|
||||
|
||||
## Configuration
|
||||
|
||||
Configuration via environment variables with `DLC_` prefix:
|
||||
|
||||
| Option | Environment Variable | Default | Description |
|---|---|---|---|
| Host | `DLC_SERVER_HOST` | `0.0.0.0` | Server bind address |
| Port | `DLC_SERVER_PORT` | `8080` | Server listening port |
| Shutdown Timeout | `DLC_SHUTDOWN_TIMEOUT` | `30s` | Graceful shutdown timeout |
| JSON Logging | `DLC_LOGGING_JSON` | `false` | Enable JSON format logging |
| Log Output | `DLC_LOGGING_OUTPUT` | `""` | Log output file path (empty for stderr) |
|
||||
|
||||
**Examples:**
|
||||
|
||||
```bash
|
||||
# Custom port
|
||||
export DLC_SERVER_PORT=9090
|
||||
./scripts/start-server.sh start
|
||||
|
||||
# Custom host and port
|
||||
export DLC_SERVER_HOST="127.0.0.1"
|
||||
export DLC_SERVER_PORT=8081
|
||||
./scripts/start-server.sh start
|
||||
|
||||
# Custom shutdown timeout
|
||||
export DLC_SHUTDOWN_TIMEOUT=45s
|
||||
|
||||
# Enable JSON logging
|
||||
export DLC_LOGGING_JSON=true
|
||||
|
||||
# Log to file
|
||||
export DLC_LOGGING_OUTPUT="server.log"
|
||||
|
||||
# Combined: JSON logging to file
|
||||
export DLC_LOGGING_JSON=true
|
||||
export DLC_LOGGING_OUTPUT="server.json.log"
|
||||
```
|
||||
|
||||
**Configuration File Support:**
|
||||
|
||||
A `config.example.yaml` file is provided as a template. By default, the application looks for `config.yaml` in the current working directory.
|
||||
|
||||
To specify a custom config file path, set the `DLC_CONFIG_FILE` environment variable:
|
||||
|
||||
```bash
|
||||
DLC_CONFIG_FILE="/path/to/config.yaml" go run ./cmd/server
|
||||
```
|
||||
|
||||
Example `config.yaml`:
|
||||
|
||||
```yaml
|
||||
server:
|
||||
host: "0.0.0.0"
|
||||
port: 8080
|
||||
|
||||
shutdown:
|
||||
timeout: 30s
|
||||
|
||||
logging:
|
||||
json: false
|
||||
```
|
||||
|
||||
**Configuration Loading Precedence:**
|
||||
|
||||
1. **File-based configuration** (highest precedence)
|
||||
2. **Environment variables** (override defaults, overridden by config file)
|
||||
3. **Default values** (fallback)
|
||||
|
||||
All configuration is validated on startup. Invalid configurations cause server startup failure. Configuration values and source are logged at startup.
|
||||
|
||||
**Verification:**
|
||||
|
||||
```bash
|
||||
DLC_SERVER_PORT=9090 DLC_SERVER_HOST="127.0.0.1" ./scripts/start-server.sh start
|
||||
|
||||
curl http://127.0.0.1:9090/api/health
|
||||
# Expected: {"status":"healthy"}
|
||||
```
|
||||
|
||||
## Server Status
|
||||
|
||||
```bash
|
||||
# Check health endpoint
|
||||
curl -s http://localhost:8080/api/health
|
||||
|
||||
# Check readiness endpoint
|
||||
curl -s http://localhost:8080/api/ready
|
||||
```
|
||||
|
||||
**Expected responses:**
|
||||
|
||||
- Health: `{"status":"healthy"}`
|
||||
- Readiness (normal): `{"ready":true}`
|
||||
- Readiness (during shutdown): `{"ready":false}` (HTTP 503)
|
||||
|
||||
**Endpoint Differences:**
|
||||
|
||||
- **Health endpoint** (`/api/health`): Indicates if the application is running and functional
|
||||
- **Readiness endpoint** (`/api/ready`): Indicates if the application is ready to accept traffic
|
||||
|
||||
**Use Cases:**
|
||||
|
||||
- **Health**: Used by load balancers to check if the app is alive
|
||||
- **Readiness**: Used by Kubernetes / service meshes to determine if the app can accept new requests
|
||||
|
||||
**During Graceful Shutdown:**
|
||||
|
||||
- Health endpoint continues to return `{"status":"healthy"}`
|
||||
- Readiness endpoint returns `{"ready":false}` with HTTP 503 Service Unavailable
|
||||
- This allows existing requests to complete while preventing new requests
|
||||
|
||||
## Stopping the Server
|
||||
|
||||
To stop the server gracefully:
|
||||
|
||||
```bash
|
||||
# Send SIGTERM for graceful shutdown
|
||||
kill -TERM $(lsof -ti :8080)
|
||||
|
||||
# Or send SIGINT (Ctrl+C equivalent)
|
||||
pkill -INT -f "go run"
|
||||
```
|
||||
|
||||
**Graceful shutdown process:**
|
||||
|
||||
1. Server receives termination signal
|
||||
2. Logs shutdown message
|
||||
3. Stops accepting new connections
|
||||
4. Waits up to 30 seconds for active requests to complete
|
||||
5. Closes all connections cleanly
|
||||
6. Exits with proper cleanup
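
The steps above can be sketched with the standard library alone. This is an illustration under assumptions, not the project's server code: `serveUntilDone` and `runDemo` are hypothetical names, the listener uses an ephemeral port, and `runDemo` adds an artificial deadline so the sketch terminates without a real signal.

```go
package main

import (
	"context"
	"errors"
	"log"
	"net"
	"net/http"
	"os/signal"
	"syscall"
	"time"
)

// serveUntilDone serves on ln until ctx is cancelled, then gives in-flight
// requests up to timeout to finish before returning.
func serveUntilDone(ctx context.Context, ln net.Listener, h http.Handler, timeout time.Duration) error {
	srv := &http.Server{Handler: h}
	errCh := make(chan error, 1)
	go func() { errCh <- srv.Serve(ln) }()

	select {
	case err := <-errCh:
		return err // the server failed before any signal arrived
	case <-ctx.Done():
		log.Println("shutdown signal received")
	}

	// Shutdown stops accepting new connections and waits for active requests.
	shutdownCtx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()
	if err := srv.Shutdown(shutdownCtx); err != nil {
		return err
	}
	// Serve returns http.ErrServerClosed after a clean Shutdown.
	if err := <-errCh; !errors.Is(err, http.ErrServerClosed) {
		return err
	}
	return nil
}

// runDemo wires the helper to SIGINT/SIGTERM. The extra deadline is demo-only:
// it stands in for a real signal after autoStopMs milliseconds.
func runDemo(autoStopMs int) error {
	ctx, stop := signal.NotifyContext(context.Background(), syscall.SIGINT, syscall.SIGTERM)
	defer stop()
	ctx, cancel := context.WithTimeout(ctx, time.Duration(autoStopMs)*time.Millisecond)
	defer cancel()

	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return err
	}
	return serveUntilDone(ctx, ln, http.NotFoundHandler(), 30*time.Second)
}

func main() {
	if err := runDemo(200); err != nil {
		log.Fatal(err)
	}
	log.Println("server stopped cleanly")
}
```

In a real entrypoint the demo deadline would be dropped, so only `SIGINT` or `SIGTERM` cancels the context.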
|
||||
|
||||
For force stop (if graceful shutdown hangs):
|
||||
|
||||
```bash
|
||||
kill -9 $(lsof -ti :8080)
|
||||
```
|
||||
|
||||
**Verification:**
|
||||
|
||||
```bash
|
||||
curl -s http://localhost:8080/api/health
|
||||
# Expected: connection refused (curl exits with code 7; -s hides the error message)
|
||||
```
|
||||
59
documentation/CODE_EXAMPLES.md
Normal file
@@ -0,0 +1,59 @@
|
||||
# Code Examples
|
||||
|
||||
Snippets and patterns used across the `dance-lessons-coach` codebase. Extracted from the original `AGENTS.md` (Task 6 restructure).
|
||||
|
||||
## Adding a New API Endpoint
|
||||
|
||||
```go
|
||||
// 1. Register the route
|
||||
func (h *apiV1GreetHandler) RegisterRoutes(router chi.Router) {
|
||||
router.Get("/", h.handleGreetQuery)
|
||||
router.Get("/{name}", h.handleGreetPath)
|
||||
router.Post("/custom", h.handleCustomGreet) // New endpoint
|
||||
}
|
||||
|
||||
// 2. Implement handler
|
||||
func (h *apiV1GreetHandler) handleCustomGreet(w http.ResponseWriter, r *http.Request) {
|
||||
// Parse request
|
||||
// Call service
|
||||
// Return JSON response
|
||||
}
|
||||
```
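
One way the three placeholder steps in the handler could be filled in is sketched below. It is written as a free function using only the standard library so it runs on its own; the request and response shapes, the `invalid_request` error code, the greeting format, and the `postCustom` helper are all assumptions made for illustration, and the service call is inlined rather than delegated to the real Greet service.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// customGreetRequest is a hypothetical payload for POST /custom.
type customGreetRequest struct {
	Name     string `json:"name"`
	Greeting string `json:"greeting"`
}

// writeJSON sets the content type, writes the status, and encodes v.
func writeJSON(w http.ResponseWriter, status int, v any) {
	w.Header().Set("Content-Type", "application/json")
	w.WriteHeader(status)
	json.NewEncoder(w).Encode(v)
}

func handleCustomGreet(w http.ResponseWriter, r *http.Request) {
	// Parse request: reject malformed JSON and a missing name.
	var req customGreetRequest
	if err := json.NewDecoder(r.Body).Decode(&req); err != nil || req.Name == "" {
		writeJSON(w, http.StatusBadRequest, map[string]string{"error": "invalid_request"})
		return
	}
	if req.Greeting == "" {
		req.Greeting = "Hello"
	}

	// Call service: inlined here; a real handler would delegate to the service.
	msg := fmt.Sprintf("%s %s!", req.Greeting, req.Name)

	// Return JSON response.
	writeJSON(w, http.StatusOK, map[string]string{"message": msg})
}

// postCustom exercises the handler in memory and returns status and body.
func postCustom(body string) (int, string) {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodPost, "/custom", strings.NewReader(body))
	handleCustomGreet(rec, req)
	return rec.Code, strings.TrimSpace(rec.Body.String())
}

func main() {
	fmt.Println(postCustom(`{"name":"John","greeting":"Welcome"}`))
	fmt.Println(postCustom(`not json`))
}
```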
|
||||
|
||||
## Logging with Zerolog
|
||||
|
||||
```go
|
||||
// Trace level logging
|
||||
log.Trace().Ctx(ctx).Str("key", "value").Msg("message")
|
||||
|
||||
// Info level
|
||||
log.Info().Msg("Important event")
|
||||
|
||||
// Error level
|
||||
log.Error().Err(err).Msg("Error occurred")
|
||||
```
|
||||
|
||||
For the full logging strategy (when to use Trace vs Info, performance considerations), see [ADR-0003 — Zerolog Logging](../adr/0003-zerolog-logging.md).
|
||||
|
||||
## Using `context.Context`
|
||||
|
||||
```go
|
||||
// Pass context through calls
|
||||
func handler(w http.ResponseWriter, r *http.Request) {
|
||||
result := service.Greet(r.Context(), "John")
|
||||
// ...
|
||||
}
|
||||
|
||||
// Create context with values
|
||||
ctx := context.WithValue(r.Context(), "key", "value")
|
||||
|
||||
// Create context with timeout
|
||||
ctx, cancel := context.WithTimeout(r.Context(), 5*time.Second)
|
||||
defer cancel()
|
||||
```
|
||||
|
||||
For the rationale behind context-aware services, see [ADR-0004 — Interface-Based Design](../adr/0004-interface-based-design.md).
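
The context snippets above can be combined into one runnable sketch. It makes two assumptions worth flagging: it uses an unexported key type instead of a plain string key (a common Go convention to avoid collisions between packages), and `slowGreet`, `greetWithin`, and the `req-1` identifier are hypothetical stand-ins for a real service call.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// ctxKey is an unexported type so keys from other packages cannot collide.
type ctxKey string

const requestIDKey ctxKey = "request_id"

// withRequestID stores a request ID in the context.
func withRequestID(ctx context.Context, id string) context.Context {
	return context.WithValue(ctx, requestIDKey, id)
}

// requestID reads the ID back; it returns "" when none was set.
func requestID(ctx context.Context) string {
	id, _ := ctx.Value(requestIDKey).(string)
	return id
}

// slowGreet simulates a service call that honours cancellation and deadlines.
func slowGreet(ctx context.Context, name string, work time.Duration) (string, error) {
	select {
	case <-time.After(work):
		return fmt.Sprintf("[%s] Hello %s!", requestID(ctx), name), nil
	case <-ctx.Done():
		return "", ctx.Err()
	}
}

// greetWithin runs slowGreet under a timeout; both durations are in milliseconds.
func greetWithin(timeoutMs, workMs int) (string, error) {
	ctx := withRequestID(context.Background(), "req-1")
	ctx, cancel := context.WithTimeout(ctx, time.Duration(timeoutMs)*time.Millisecond)
	defer cancel()
	return slowGreet(ctx, "John", time.Duration(workMs)*time.Millisecond)
}

func main() {
	fmt.Println(greetWithin(500, 1))  // finishes before the deadline
	fmt.Println(greetWithin(10, 500)) // deadline expires first
}
```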
|
||||
|
||||
## Best Practices Reminders
|
||||
|
||||
For higher-level guidance on code organization, error handling, performance, and testing, see [`AGENT_USAGE_GUIDE.md`](AGENT_USAGE_GUIDE.md#best-practices) section "Best Practices".
|
||||
83
documentation/HISTORY.md
Normal file
@@ -0,0 +1,83 @@
|
||||
# Development History
|
||||
|
||||
This document records the historical development phases of `dance-lessons-coach`. Extracted from the original `AGENTS.md` (Task 6 restructure) for lazy-loading compatibility with Mistral Vibe (128k context).
|
||||
|
||||
All phases below are **completed** ✅. They are kept here for traceability and onboarding context — refer to ADRs (`adr/`) for the technical decisions behind each phase.
|
||||
|
||||
## Phase 1: Foundation
|
||||
|
||||
- Go 1.26.1 environment setup
|
||||
- Project structure with `cmd/` and `pkg/` directories
|
||||
- Core Greet service implementation
|
||||
- CLI interface
|
||||
- Unit tests
|
||||
|
||||
## Phase 2: Web API
|
||||
|
||||
- Chi router integration
|
||||
- Versioned API endpoints (`/api/v1`)
|
||||
- Health endpoint (`/api/health`)
|
||||
- JSON responses with proper headers
|
||||
|
||||
## Phase 3: Logging & Architecture
|
||||
|
||||
- Zerolog integration with Trace level
|
||||
- Context-aware logging
|
||||
- Interface-based design patterns
|
||||
- Dependency injection
|
||||
|
||||
## Phase 4: Documentation & Testing
|
||||
|
||||
- Comprehensive `AGENTS.md`
|
||||
- `README.md` with usage instructions
|
||||
- Server management guide
|
||||
- API endpoint documentation
|
||||
|
||||
## Phase 5: Configuration Management
|
||||
|
||||
- Viper integration for configuration
|
||||
- Environment variable support with `DLC_` prefix
|
||||
- Customizable server host/port
|
||||
- Configurable shutdown timeout
|
||||
- Configuration validation and logging
|
||||
- Example configuration file
|
||||
|
||||
## Phase 6: Graceful Shutdown
|
||||
|
||||
- Context-aware server initialization
|
||||
- Signal-based termination (`SIGINT`, `SIGTERM`)
|
||||
- Configurable shutdown timeout
|
||||
- Readiness endpoint for Kubernetes/service mesh integration
|
||||
- Proper resource cleanup during shutdown
|
||||
- Health endpoint remains healthy during graceful shutdown
|
||||
|
||||
## Phase 7: OpenTelemetry Integration
|
||||
|
||||
- OpenTelemetry Go libraries integration
|
||||
- Jaeger compatibility for distributed tracing
|
||||
- Middleware-only approach using `otelhttp.NewHandler`
|
||||
- Configurable sampling strategies
|
||||
- Graceful shutdown of tracer provider
|
||||
- OTLP exporter with gRPC support
|
||||
|
||||
## Phase 8: Build System & Documentation
|
||||
|
||||
- Build script for binary compilation
|
||||
- Binary output to `bin/` directory
|
||||
- Comprehensive commit conventions with gitmoji reference
|
||||
- Updated documentation with Jaeger integration guide
|
||||
- Cleaned up configuration files
|
||||
- Enhanced logging configuration with file output support
|
||||
|
||||
## Phase 9: Final Refinements
|
||||
|
||||
- Removed unnecessary `time.Sleep` for log flushing
|
||||
- Changed server operational logs from Info to Trace level
|
||||
- Moved all logging setup logic to config package
|
||||
- Simplified server entrypoint to 27 lines
|
||||
- Verified all functionality with comprehensive testing
|
||||
- Updated documentation to reflect final architecture
|
||||
|
||||
## Beyond Phase 9
|
||||
|
||||
Subsequent work (CI/CD, BDD scenarios, ADR audit, JWT, config hot-reloading) is tracked in the [Changelog](../CHANGELOG.md) and the corresponding [ADRs](../adr/).
|
||||
94
documentation/OBSERVABILITY.md
Normal file
@@ -0,0 +1,94 @@
|
||||
# Observability — OpenTelemetry & Jaeger Integration
|
||||
|
||||
Tracing setup for `dance-lessons-coach`. Extracted from the original `AGENTS.md` (Task 6 restructure) for lazy-loading compatibility with Mistral Vibe.
|
||||
|
||||
The application supports OpenTelemetry for distributed tracing with Jaeger compatibility.
|
||||
|
||||
## Configuration
|
||||
|
||||
Enable OpenTelemetry in your `config.yaml`:
|
||||
|
||||
```yaml
|
||||
telemetry:
|
||||
enabled: true
|
||||
otlp_endpoint: "localhost:4317"
|
||||
service_name: "dance-lessons-coach"
|
||||
insecure: true
|
||||
sampler:
|
||||
type: "parentbased_always_on"
|
||||
ratio: 1.0
|
||||
```
|
||||
|
||||
Or via environment variables:
|
||||
|
||||
```bash
|
||||
export DLC_TELEMETRY_ENABLED=true
|
||||
export DLC_TELEMETRY_OTLP_ENDPOINT="localhost:4317"
|
||||
export DLC_TELEMETRY_SERVICE_NAME="dance-lessons-coach"
|
||||
export DLC_TELEMETRY_INSECURE=true
|
||||
export DLC_TELEMETRY_SAMPLER_TYPE="parentbased_always_on"
|
||||
export DLC_TELEMETRY_SAMPLER_RATIO=1.0
|
||||
```
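
Comparing the two listings, each nested YAML key appears to map to an environment variable by joining the path with underscores, upper-casing it, and adding the `DLC_` prefix. The sketch below only illustrates that naming convention; in the project the mapping is handled by Viper, and `envKey` and `lookup` are hypothetical helpers.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// envKey converts a dotted config key into its environment variable name,
// for example "telemetry.sampler.type" with prefix "DLC".
func envKey(prefix, key string) string {
	return strings.ToUpper(prefix + "_" + strings.ReplaceAll(key, ".", "_"))
}

// lookup returns the environment value for key, or fallback when it is unset
// or empty. getenv is injectable so the function is easy to test.
func lookup(getenv func(string) string, key, fallback string) string {
	if v := getenv(envKey("DLC", key)); v != "" {
		return v
	}
	return fallback
}

func main() {
	fmt.Println(envKey("DLC", "telemetry.sampler.type"))
	fmt.Println(lookup(os.Getenv, "telemetry.otlp_endpoint", "localhost:4317"))
}
```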
|
||||
|
||||
## Testing with Jaeger
|
||||
|
||||
**1. Start Jaeger in Docker:**
|
||||
|
||||
```bash
|
||||
docker run -d --name jaeger \
|
||||
-e COLLECTOR_OTLP_ENABLED=true \
|
||||
-p 16686:16686 \
|
||||
-p 4317:4317 \
|
||||
jaegertracing/all-in-one:latest
|
||||
```
|
||||
|
||||
**2. Start the server with OpenTelemetry enabled:**
|
||||
|
||||
```bash
|
||||
# Using config file
|
||||
./scripts/start-server.sh start
|
||||
|
||||
# Or with environment variables
|
||||
DLC_TELEMETRY_ENABLED=true ./scripts/start-server.sh start
|
||||
```
|
||||
|
||||
**3. Make API requests:**
|
||||
|
||||
```bash
|
||||
curl http://localhost:8080/api/v1/greet/John
|
||||
```
|
||||
|
||||
**4. View traces in Jaeger UI:**
|
||||
|
||||
Open http://localhost:16686 and select the `dance-lessons-coach` service.
|
||||
|
||||
## Sampler Types
|
||||
|
||||
| Sampler | Behavior |
|
||||
|---|---|
|
||||
| `always_on` | Sample all traces |
|
||||
| `always_off` | Sample no traces |
|
||||
| `traceidratio` | Sample a fixed fraction of traces, selected by trace ID (see `ratio`) |
|
||||
| `parentbased_always_on` | Follow the parent span's sampling decision; sample all root spans |
|
||||
| `parentbased_always_off` | Follow the parent span's sampling decision; sample no root spans |
|
||||
| `parentbased_traceidratio` | Follow the parent span's sampling decision; sample root spans by trace ID ratio |
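
A startup check for the sampler settings could look like the sketch below. It is an assumption about what validation might cover, not the project's actual validator: `validateSampler` is a hypothetical function, and the rule that `ratio` must lie between 0 and 1 is applied only to the two ratio-based samplers.

```go
package main

import "fmt"

// samplerUsesRatio lists the sampler names from the table and records
// whether each one reads the ratio setting.
var samplerUsesRatio = map[string]bool{
	"always_on":                false,
	"always_off":               false,
	"traceidratio":             true,
	"parentbased_always_on":    false,
	"parentbased_always_off":   false,
	"parentbased_traceidratio": true,
}

// validateSampler rejects unknown sampler names and out-of-range ratios.
func validateSampler(typ string, ratio float64) error {
	usesRatio, ok := samplerUsesRatio[typ]
	if !ok {
		return fmt.Errorf("unknown sampler type %q", typ)
	}
	if usesRatio && (ratio < 0 || ratio > 1) {
		return fmt.Errorf("sampler ratio %v is outside [0, 1]", ratio)
	}
	return nil
}

func main() {
	fmt.Println(validateSampler("parentbased_always_on", 1.0))
	fmt.Println(validateSampler("traceidratio", 1.5))
	fmt.Println(validateSampler("sometimes", 0.5))
}
```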
|
||||
|
||||
## Testing Script
|
||||
|
||||
A convenience script is provided:
|
||||
|
||||
```bash
|
||||
./scripts/test-opentelemetry.sh
|
||||
```
|
||||
|
||||
This script:
|
||||
|
||||
1. Starts Jaeger container
|
||||
2. Starts the server with OpenTelemetry
|
||||
3. Makes test API calls
|
||||
4. Shows Jaeger UI URL
|
||||
5. Cleans up on exit
|
||||
|
||||
## ADR Reference
|
||||
|
||||
See [ADR-0007 — OpenTelemetry Integration](../adr/0007-opentelemetry-integration.md) for the full architectural decision and rationale (middleware-only approach, sampling strategy, OTLP/gRPC choice).
|
||||
40
documentation/ROADMAP.md
Normal file
@@ -0,0 +1,40 @@
|
||||
# Roadmap & Future Enhancements
|
||||
|
||||
Tracking pending features and architectural improvements. Extracted from the original `AGENTS.md` (Task 6 restructure). Status is updated continuously: items move to the "Completed Features" section once shipped.
|
||||
|
||||
## Potential Features
|
||||
|
||||
- [ ] Database integration
|
||||
- [ ] Authentication / Authorization
|
||||
- [ ] Rate limiting
|
||||
- [ ] Metrics and monitoring
|
||||
- [ ] Docker containerization
|
||||
- ✅ CI/CD pipeline ([ADR-0016](../adr/0016-ci-cd-pipeline-design.md), [ADR-0017](../adr/0017-trunk-based-development-workflow.md))
|
||||
- [ ] Configuration hot reload
|
||||
- [ ] Circuit breakers
|
||||
|
||||
## Architectural Improvements
|
||||
|
||||
- [ ] Request validation middleware
|
||||
- ✅ OpenAPI / Swagger documentation with embedded spec
|
||||
- [ ] Enhanced OpenTelemetry instrumentation
|
||||
- [ ] Metrics collection and visualization
|
||||
- [ ] Health check improvements
|
||||
- [ ] Configuration validation enhancements
|
||||
|
||||
## Completed Features
|
||||
|
||||
- ✅ Graceful shutdown with readiness endpoint
|
||||
- ✅ OpenTelemetry integration with Jaeger support
|
||||
- ✅ Configuration management with Viper
|
||||
- ✅ Comprehensive logging with Zerolog
|
||||
- ✅ Build system with binary output
|
||||
- ✅ Complete documentation with commit conventions
|
||||
- ✅ Version management with runtime info
|
||||
|
||||
## How to Propose a New Feature
|
||||
|
||||
1. Open a Gitea issue describing the use case and acceptance criteria
|
||||
2. If the feature implies an architectural decision, draft an ADR (`adr/<NNNN>-<slug>.md`) following the template
|
||||
3. Reference the ADR + issue in any PR introducing the feature
|
||||
4. Update this roadmap (move from "Potential" to "Completed" when shipped)
|
||||
107
documentation/TROUBLESHOOTING.md
Normal file
@@ -0,0 +1,107 @@
|
||||
# Troubleshooting
|
||||
|
||||
Common issues and their resolution. Extracted from the original `AGENTS.md` and merged with relevant sections from `AGENT_USAGE_GUIDE.md` and `BDD_GUIDE.md`. Refer back to those guides for context-specific troubleshooting (agent workflows, BDD test failures).
|
||||
|
||||
## Port Already in Use
|
||||
|
||||
```bash
|
||||
# Find and kill process using port 8080
|
||||
kill -TERM $(lsof -ti :8080)
|
||||
|
||||
# Force kill if graceful does not work
|
||||
kill -9 $(lsof -ti :8080)
|
||||
```
|
||||
|
||||
## Server Not Responding
|
||||
|
||||
```bash
|
||||
# Check if running
|
||||
curl -s http://localhost:8080/api/health
|
||||
|
||||
# Restart server using control script
|
||||
./scripts/start-server.sh restart
|
||||
|
||||
# View recent logs
|
||||
./scripts/start-server.sh logs
|
||||
```
|
||||
|
||||
If the health endpoint returns connection refused, the server may have crashed. Check the output of `./scripts/start-server.sh logs` for stack traces.
|
||||
|
||||
## Dependency Issues
|
||||
|
||||
```bash
|
||||
# Clean and rebuild
|
||||
go mod tidy
|
||||
go build ./...
|
||||
|
||||
# If dependency version conflicts persist
|
||||
go mod download
|
||||
go mod verify
|
||||
```
|
||||
|
||||
## Tests Failing
|
||||
|
||||
### Unit tests
|
||||
|
||||
```bash
|
||||
# Run with verbose output
|
||||
go test -v ./...
|
||||
|
||||
# Check specific test
|
||||
go test ./pkg/greet/ -run TestName
|
||||
```
|
||||
|
||||
### BDD tests
|
||||
|
||||
See [`BDD_GUIDE.md`](BDD_GUIDE.md) for the full BDD troubleshooting workflow (Godog setup, scenario isolation, step matching). Common BDD issues:
|
||||
|
||||
- **Step not found** → check `pkg/bdd/steps/` for the step definition file
|
||||
- **Scenario state leaking** → review [ADR-0025](../adr/0025-bdd-scenario-isolation-strategies.md) for the isolation pattern
|
||||
- **Database not reset** → ensure the test fixtures cleanup runs (BDD scenario After hooks)
|
||||
|
||||
## Configuration Not Loading
|
||||
|
||||
The application logs the configuration source at startup. Check logs for:
|
||||
|
||||
```
|
||||
[INF] Configuration loaded from: file:config.yaml
|
||||
# or
|
||||
[INF] Configuration loaded from: env
|
||||
# or
|
||||
[INF] Configuration loaded from: defaults
|
||||
```
|
||||
|
||||
If config is not loading as expected:
|
||||
|
||||
1. Verify file exists and is readable: `ls -la config.yaml`
|
||||
2. Verify env vars are exported: `env | grep DLC_`
|
||||
3. Check for typos in keys (case-sensitive)
|
||||
4. Review [`AGENT_USAGE_GUIDE.md`](AGENT_USAGE_GUIDE.md) section "Configuration troubleshooting"
|
||||
|
||||
## OpenTelemetry Not Tracing
|
||||
|
||||
1. Verify Jaeger is running: `docker ps | grep jaeger`
|
||||
2. Check `DLC_TELEMETRY_ENABLED=true` in environment or `telemetry.enabled: true` in config
|
||||
3. Verify OTLP endpoint reachable: `nc -zv localhost 4317`
|
||||
4. Check sampler is not `always_off`
|
||||
5. See [`OBSERVABILITY.md`](OBSERVABILITY.md) for full setup
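
The reachability check in step 3 can also be done from Go, which is handy where `nc` is not installed. This is a minimal sketch: `reachable` and `demo` are hypothetical helpers, and a successful TCP connect only shows that something is listening, not that it speaks OTLP.

```go
package main

import (
	"fmt"
	"net"
	"time"
)

// reachable reports whether a TCP connection to addr succeeds within timeout.
func reachable(addr string, timeout time.Duration) bool {
	conn, err := net.DialTimeout("tcp", addr, timeout)
	if err != nil {
		return false
	}
	conn.Close()
	return true
}

// demo opens a throwaway listener, probes it, closes it, and probes again,
// returning the two results in that order.
func demo() (whileOpen, afterClose bool) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return false, false
	}
	addr := ln.Addr().String()
	whileOpen = reachable(addr, time.Second)
	ln.Close()
	afterClose = reachable(addr, time.Second)
	return whileOpen, afterClose
}

func main() {
	fmt.Println("OTLP endpoint reachable:", reachable("localhost:4317", 2*time.Second))
	fmt.Println(demo())
}
```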
|
||||
|
||||
## Build Failures
|
||||
|
||||
```bash
|
||||
# Clear caches
|
||||
go clean -cache -modcache
|
||||
go mod download
|
||||
|
||||
# Rebuild
|
||||
go build ./...
|
||||
```
|
||||
|
||||
If errors persist, see [`local-ci-cd-testing.md`](local-ci-cd-testing.md) for the CI/CD pipeline that mirrors the production build.
|
||||
|
||||
## Where to Look Next
|
||||
|
||||
- **Agent-specific issues** (vibe, mistral, programmer agent) → [`AGENT_USAGE_GUIDE.md`](AGENT_USAGE_GUIDE.md)
|
||||
- **BDD-specific issues** → [`BDD_GUIDE.md`](BDD_GUIDE.md)
|
||||
- **Version/release issues** → [`version-management-guide.md`](version-management-guide.md)
|
||||
- **CI/CD issues** → [`local-ci-cd-testing.md`](local-ci-cd-testing.md)
|
||||
@@ -69,7 +69,7 @@ This workflow can be triggered manually or on test/feature branches.
|
||||
### 1. Run the Interactive Script
|
||||
|
||||
```bash
|
||||
cd /Users/gabrielradureau/Work/Vibe/DanceLessonsCoach
|
||||
cd /Users/gabrielradureau/Work/Vibe/dance-lessons-coach
|
||||
./scripts/test-local-ci-cd.sh
|
||||
```
|
||||
|
||||
|
||||
@@ -8,7 +8,7 @@ This document clarifies the security-critical aspect of the password reset workf
|
||||
|
||||
## 🎯 Security Principle
|
||||
|
||||
The DanceLessonsCoach password reset system follows a **zero-trust, admin-controlled** security model:
|
||||
The dance-lessons-coach password reset system follows a **zero-trust, admin-controlled** security model:
|
||||
|
||||
```mermaid
|
||||
graph TD
|
||||
@@ -234,4 +234,4 @@ func (s *AuthService) ResetPasswordWithoutAuth(username, newPassword string) err
|
||||
|
||||
---
|
||||
|
||||
*DanceLessonsCoach - Secure by design, private by default 🔒*
|
||||
*dance-lessons-coach - Secure by design, private by default 🔒*
|
||||
@@ -2,7 +2,7 @@
|
||||
|
||||
## Overview
|
||||
|
||||
The DanceLessonsCoach user management and authentication system provides secure user authentication, personalized experiences, and administrative capabilities. This document describes the system architecture, API endpoints, and integration points.
|
||||
The dance-lessons-coach user management and authentication system provides secure user authentication, personalized experiences, and administrative capabilities. This document describes the system architecture, API endpoints, and integration points.
|
||||
|
||||
## Architecture
|
||||
|
||||
|
||||
@@ -1,6 +1,6 @@
|
||||
# Version Management Guide
|
||||
|
||||
This guide provides comprehensive instructions for managing versions in the DanceLessonsCoach project.
|
||||
This guide provides comprehensive instructions for managing versions in the dance-lessons-coach project.
|
||||
|
||||
## 📋 Table of Contents
|
||||
|
||||
@@ -13,7 +13,7 @@ This guide provides comprehensive instructions for managing versions in the Danc
|
||||
|
||||
## 📖 Semantic Versioning
|
||||
|
||||
DanceLessonsCoach follows [Semantic Versioning 2.0.0](https://semver.org/):
|
||||
dance-lessons-coach follows [Semantic Versioning 2.0.0](https://semver.org/):
|
||||
|
||||
### Version Format: `MAJOR.MINOR.PATCH-PRERELEASE`
|
||||
|
||||
@@ -360,6 +360,6 @@ git push origin v1.0.1
|
||||
|
||||
---
|
||||
|
||||
**Maintained by:** DanceLessonsCoach Team
|
||||
**Maintained by:** dance-lessons-coach Team
|
||||
**Last Updated:** 2026-04-05
|
||||
**Version:** 1.0
|
||||
346
features/BDD_TAGS.md
Normal file
@@ -0,0 +1,346 @@
|
||||
# BDD Test Tags Documentation
|
||||
|
||||
This document describes the tagging system used in the dance-lessons-coach BDD tests for selective test execution.
|
||||
|
||||
## Tag Categories
|
||||
|
||||
### Feature Tags
|
||||
Used to categorize tests by feature area:
|
||||
- `@auth` - Authentication and user management tests
|
||||
- `@config` - Configuration and hot reloading tests
|
||||
- `@greet` - Greeting service tests
|
||||
- `@health` - Health check and monitoring tests
|
||||
- `@jwt` - JWT secret rotation and retention tests
|
||||
|
||||
### Priority Tags
|
||||
Used to categorize tests by importance:
|
||||
- `@smoke` - Basic smoke tests that verify core functionality
|
||||
- `@critical` - Critical path tests that must always pass
|
||||
- `@basic` - Basic functionality tests
|
||||
- `@advanced` - Advanced or edge case scenarios
|
||||
- `@nice_to_have` - Optional features that would be nice to have but aren't critical
|
||||
|
||||
### Component Tags
|
||||
Used to categorize tests by system component:
|
||||
- `@api` - API endpoint tests
|
||||
- `@v2` - Version 2 API tests
|
||||
- `@database` - Database interaction tests
|
||||
- `@security` - Security-related tests
|
||||
|
||||
### Exclusion Tags
|
||||
Used to exclude tests from execution:
|
||||
- `@flaky` - Tests that are unstable or intermittently fail
|
||||
- `@todo` - Tests with pending step implementations
|
||||
- `@skip` - Tests that should be skipped entirely
|
||||
|
||||
### Nice-to-Have Tag
|
||||
|
||||
The `@nice_to_have` tag is used to mark scenarios that test optional features or enhancements. These are features that would be beneficial to have but aren't critical for the core functionality of the system.
|
||||
|
||||
**Usage:**
|
||||
- Add `@nice_to_have` to scenarios testing optional features
|
||||
- These scenarios are typically excluded from critical path testing
|
||||
- Useful for marking "stretch goal" functionality
|
||||
|
||||
**Example:**
|
||||
```gherkin
|
||||
@nice_to_have @greet
|
||||
Scenario: Greeting with custom formatting options
|
||||
Given the server is running
|
||||
When I request a greeting with bold formatting
|
||||
Then the response should contain HTML bold tags
|
||||
```
|
||||
|
||||
### Work In Progress Tag
|
||||
Used to override exclusions for active development:
|
||||
- `@wip` - Work In Progress - overrides exclusion tags to allow focused development
|
||||
|
||||
**Usage:** Add `@wip` to scenarios you're actively working on, even if they have other exclusion tags like `@todo` or `@skip`. The `@wip` tag takes precedence and allows the scenario to run.
|
||||
|
||||
**Example:**
|
||||
```gherkin
|
||||
@todo @wip
|
||||
Scenario: JWT authentication with multiple secrets
|
||||
Given the server is running with multiple JWT secrets
|
||||
When I authenticate with valid credentials
|
||||
Then I should receive a valid JWT token
|
||||
```
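
The precedence rule described above (exclusion tags hide a scenario unless `@wip` is also present) can be sketched as a small predicate. This illustrates the documented behaviour only; `shouldRun` is a hypothetical function, not the project's test setup code, which may implement the rule differently.

```go
package main

import "fmt"

// excluded lists the exclusion tags from the section above.
var excluded = []string{"@flaky", "@todo", "@skip"}

// shouldRun reports whether a scenario with the given tags runs by default.
// @wip takes precedence over every exclusion tag.
func shouldRun(tags []string) bool {
	has := func(want string) bool {
		for _, t := range tags {
			if t == want {
				return true
			}
		}
		return false
	}
	if has("@wip") {
		return true
	}
	for _, e := range excluded {
		if has(e) {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(shouldRun([]string{"@jwt", "@todo"}))
	fmt.Println(shouldRun([]string{"@jwt", "@todo", "@wip"}))
	fmt.Println(shouldRun([]string{"@greet", "@smoke"}))
}
```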
|
||||
|
||||
### Command-Line Tag Override
|
||||
You can override the default tag filtering by setting the `GODOG_TAGS` environment variable when running tests.
|
||||
|
||||
**Usage:**
|
||||
```bash
|
||||
# Run only @wip scenarios
|
||||
GODOG_TAGS="@wip" go test ./features/jwt/...
|
||||
|
||||
# Run smoke tests only
|
||||
GODOG_TAGS="@smoke" go test ./features/...
|
||||
|
||||
# Run specific combination
|
||||
GODOG_TAGS="@jwt && ~@todo" go test ./features/...
|
||||
|
||||
# Combine with other environment variables
|
||||
DLC_DATABASE_HOST=localhost GODOG_TAGS="@wip" go test ./features/jwt/...
|
||||
```
|
||||
|
||||
### Test Randomization Control
|
||||
You can control test execution order using the `GODOG_RANDOM_SEED` environment variable.
|
||||
|
||||
**Usage:**
|
||||
```bash
|
||||
# Use random test order (default)
|
||||
GODOG_RANDOM_SEED="" go test ./features/
|
||||
|
||||
# Use fixed seed for reproducible test runs
|
||||
GODOG_RANDOM_SEED=17925 go test ./features/
|
||||
|
||||
# Combine with tag filtering
|
||||
GODOG_RANDOM_SEED=17925 GODOG_TAGS="@wip" go test ./features/
|
||||
|
||||
# Debug specific test failures by reproducing exact execution order
|
||||
GODOG_RANDOM_SEED=17925 DLC_DATABASE_HOST=localhost go test ./features/jwt/
|
||||
```
|
||||
|
||||
**Benefits:**
|
||||
- **Reproducibility**: Same seed produces same test order
|
||||
- **Debugging**: Easily reproduce failed test runs
|
||||
- **CI/CD**: Set fixed seeds for consistent test execution
|
||||
- **Backward compatible**: Defaults to random order when not specified
|
||||
|
||||
**Example from test output:**
|
||||
```
|
||||
30 scenarios (11 passed, 19 failed)
|
||||
147 steps (104 passed, 19 failed, 24 skipped)
|
||||
4.474215346s
|
||||
Randomized with seed: 17925
|
||||
```
|
||||
|
||||
To reproduce this exact test run:
|
||||
```bash
|
||||
GODOG_RANDOM_SEED=17925 go test ./features/
|
||||
```
|
||||
|
||||
### Random Port Selection (Default Behavior)
|
||||
|
||||
By default, BDD tests use **random ports** (10000-19999) to prevent port conflicts during parallel execution. This ensures tests can run reliably in CI/CD pipelines and when executed multiple times.
|
||||
|
||||
**Benefits:**
|
||||
- ✅ No port conflicts in parallel test execution
|
||||
- ✅ Safe for repeated test runs
|
||||
- ✅ Better for CI/CD environments
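
Picking a port in the documented range could be sketched as follows. This is an assumption about the approach rather than the project's test setup: `freePort` is a hypothetical helper that retries until a port in 10000-19999 can be bound. Note the inherent race: the port is released before the caller uses it, so another process could still claim it.

```go
package main

import (
	"errors"
	"fmt"
	"math/rand"
	"net"
)

// freePort tries up to attempts random ports in 10000-19999 and returns the
// first one that can be bound on the loopback interface.
func freePort(attempts int) (int, error) {
	for i := 0; i < attempts; i++ {
		port := 10000 + rand.Intn(10000)
		ln, err := net.Listen("tcp", fmt.Sprintf("127.0.0.1:%d", port))
		if err != nil {
			continue // already in use, try another port
		}
		ln.Close()
		return port, nil
	}
	return 0, errors.New("no free port found in 10000-19999")
}

func main() {
	port, err := freePort(20)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("test server port:", port)
}
```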
|
||||
|
||||
**Disable random ports (not recommended):**
|
||||
```bash
|
||||
FIXED_TEST_PORT=true go test ./features/...
|
||||
```
|
||||
|
||||
**Force specific port (debugging only):**
|
||||
```bash
|
||||
# Create a test config file with fixed port
|
||||
echo "server:
|
||||
port: 9191" > test-config.yaml
|
||||
FEATURE=debug FIXED_TEST_PORT=true go test ./features/...
|
||||
```
|
||||
|
||||
### Test Validation Process
|
||||
|
||||
To ensure test suite stability, follow this validation process:
|
||||
|
||||
**Validation Command:**
|
||||
```bash
|
||||
# Clean cache and run all tests 20 times
|
||||
echo "🧪 Validating test suite stability..."
|
||||
for i in {1..20}; do
|
||||
echo "Run $i/20..."
|
||||
go clean -testcache
|
||||
if ! go test ./... > /dev/null 2>&1; then
|
||||
echo "❌ Test run $i failed"
|
||||
go test ./... -v
|
||||
exit 1
|
||||
fi
|
||||
done
|
||||
echo "✅ All 20 test runs passed successfully!"
|
||||
```
|
||||
|
||||
**Failure Handling:**
|
||||
- If any test fails during validation, mark it as `@wip` and investigate
|
||||
- Use `@flaky` tag for intermittently failing tests
|
||||
- Document the issue in the test scenario comments
|
||||
|
||||
**Success Criteria:**
|
||||
- ✅ 100% pass rate across 20 consecutive runs
|
||||
- ✅ No undefined/pending steps
|
||||
- ✅ No race conditions or port conflicts
|
||||
- ✅ Consistent execution time
|
||||
|
||||
**CI/CD Integration:**
|
||||
```yaml
|
||||
- name: Validate Test Suite
|
||||
run: |
|
||||
echo "🧪 Running 20 validation runs..."
|
||||
for i in {1..20}; do
|
||||
echo "Run $i/20"
|
||||
go clean -testcache
|
||||
go test ./... || exit 1
|
||||
done
|
||||
echo "✅ Test suite validated successfully"
|
||||
```
|
||||
|
||||
### Stop On Failure Control
|
||||
You can control whether tests stop on first failure using the `GODOG_STOP_ON_FAILURE` environment variable.
|
||||
|
||||
**Usage:**
|
||||
```bash
|
||||
# Stop on first failure (strict mode)
|
||||
GODOG_STOP_ON_FAILURE="true" go test ./features/jwt/...
|
||||
|
||||
# Continue after failures (lenient mode)
|
||||
GODOG_STOP_ON_FAILURE="false" go test ./features/jwt/...
|
||||
|
||||
# Combine with tag filtering
|
||||
GODOG_TAGS="@wip" GODOG_STOP_ON_FAILURE="true" go test ./features/jwt/...
|
||||
```
|
||||
|
||||
**Default Behavior:**
|
||||
- If `GODOG_TAGS` is not set, the test uses the default tag filter: `~@flaky && ~@todo && ~@skip`
|
||||
- If `GODOG_STOP_ON_FAILURE` is not set, each feature uses its default:
|
||||
- `jwt`, `greet`, `auth`, `health`: `true` (stop on failure)
|
||||
- `config`, `all features`: `false` (continue after failures)
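
The defaults above can be expressed as a small resolver. This is a sketch of the documented behaviour, not the project's `testsetup` package: `runOptions` and `resolveOptions` are hypothetical names, and treating an unparsable `GODOG_STOP_ON_FAILURE` value as "keep the default" is an assumption.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// defaultTags is the tag filter applied when GODOG_TAGS is not set.
const defaultTags = "~@flaky && ~@todo && ~@skip"

// runOptions holds the resolved settings for one test run.
type runOptions struct {
	Tags          string
	StopOnFailure bool
}

// defaultStopOnFailure returns the per-feature default listed above.
func defaultStopOnFailure(feature string) bool {
	switch feature {
	case "jwt", "greet", "auth", "health":
		return true
	default: // "config" and the all-features run
		return false
	}
}

// resolveOptions applies environment overrides on top of the defaults.
// getenv is injectable so the function can be tested without touching os.Environ.
func resolveOptions(feature string, getenv func(string) string) runOptions {
	opts := runOptions{Tags: defaultTags, StopOnFailure: defaultStopOnFailure(feature)}
	if v := getenv("GODOG_TAGS"); v != "" {
		opts.Tags = v
	}
	if v := getenv("GODOG_STOP_ON_FAILURE"); v != "" {
		if b, err := strconv.ParseBool(v); err == nil {
			opts.StopOnFailure = b
		}
	}
	return opts
}

func main() {
	fmt.Printf("%+v\n", resolveOptions("jwt", os.Getenv))
}
```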
|
||||
|
||||
## Usage Examples
|
||||
|
||||
### Running Smoke Tests
|
||||
```bash
|
||||
# Run all smoke tests
|
||||
godog --tags=@smoke features/
|
||||
|
||||
# Run smoke tests for specific feature
|
||||
godog --tags=@smoke features/auth/
|
||||
```
|
||||
|
||||
### Running Critical Tests
|
||||
```bash
|
||||
# Run all critical tests
|
||||
godog --tags=@critical features/
|
||||
|
||||
# Run critical health tests
|
||||
godog --tags="@critical && @health" features/
|
||||
```
|
||||
|
||||
### Running Feature-Specific Tests
|
||||
```bash
|
||||
# Run all auth tests
|
||||
godog --tags=@auth features/
|
||||
|
||||
# Run v2 API tests
|
||||
godog --tags=@v2 features/
|
||||
```
|
||||
|
||||
### Combining Tags
|
||||
```bash
|
||||
# Run smoke tests for auth and health features
|
||||
godog --tags="@smoke && @auth,@health" features/
|
||||
|
||||
# Run critical API tests
|
||||
godog --tags="@critical && @api" features/
|
||||
```
|
||||
|
||||
## Tagging Conventions
|
||||
|
||||
1. **Feature tags** should be applied at the feature level
|
||||
2. **Priority tags** should be applied at the scenario level
|
||||
3. **Component tags** should be applied at the scenario level
|
||||
4. **Multiple tags** can be applied to a single scenario
|
||||
|
||||
### Example Feature File
|
||||
```gherkin
|
||||
@health @smoke
|
||||
Feature: Health Endpoint
|
||||
The health endpoint should indicate server status
|
||||
|
||||
@basic @critical
|
||||
Scenario: Health check returns healthy status
|
||||
Given the server is running
|
||||
When I request the health endpoint
|
||||
Then the response should be "{\"status\":\"healthy\"}"
|
||||
|
||||
@advanced @api
|
||||
Scenario: Health check with authentication
|
||||
Given the server is running with auth enabled
|
||||
When I request the health endpoint with valid token
|
||||
Then the response should be "{\"status\":\"healthy\"}"
|
||||
```
|
||||
|
||||
## Test Execution Scripts
|
||||
|
||||
### Feature-Specific Testing
|
||||
```bash
|
||||
# Test specific feature
|
||||
./scripts/test-feature.sh greet
|
||||
|
||||
# Test with specific tags
|
||||
./scripts/test-by-tag.sh @smoke greet
|
||||
```
|
||||
|
||||
### Tag-Based Testing
|
||||
```bash
|
||||
# Run smoke tests for all features
|
||||
./scripts/test-by-tag.sh @smoke
|
||||
|
||||
# Run critical auth tests
|
||||
./scripts/test-by-tag.sh @critical auth
|
||||
```
|
||||
|
||||
## CI/CD Integration
|
||||
|
||||
### Smoke Test Pipeline
|
||||
```yaml
|
||||
- name: Run Smoke Tests
|
||||
run: godog --tags=@smoke features/
|
||||
```
|
||||
|
||||
### Critical Path Testing
|
||||
```yaml
|
||||
- name: Run Critical Tests
|
||||
run: godog --tags=@critical features/
|
||||
```
|
||||
|
||||
### Feature-Specific Testing
|
||||
```yaml
|
||||
- name: Test Auth Feature
|
||||
run: ./scripts/test-feature.sh auth
|
||||
```
|
||||
|
||||
## Best Practices
|
||||
|
||||
1. **Tag consistently** - Apply tags consistently across similar scenarios
|
||||
2. **Prioritize tests** - Use priority tags to identify critical tests
|
||||
3. **Document tags** - Keep this documentation updated with new tags
|
||||
4. **Review tags** - Regularly review tag usage to ensure relevance
|
||||
5. **CI/CD optimization** - Use tags to optimize CI/CD pipeline execution times
|
||||
|
||||
## Tag Reference
|
||||
|
||||
| Tag | Purpose | Example Usage |
|
||||
|-----|---------|--------------|
|
||||
| `@smoke` | Smoke tests | `@smoke` on critical features |
|
||||
| `@critical` | Critical path | `@critical` on essential scenarios |
|
||||
| `@basic` | Basic functionality | `@basic` on standard scenarios |
|
||||
| `@advanced` | Advanced scenarios | `@advanced` on edge cases |
|
||||
| `@nice_to_have` | Optional features | `@nice_to_have` on stretch goal scenarios |
|
||||
| `@auth` | Authentication | `@auth` on auth features |
|
||||
| `@config` | Configuration | `@config` on config scenarios |
|
||||
| `@api` | API endpoints | `@api` on endpoint tests |
|
||||
| `@v2` | V2 API | `@v2` on version 2 tests |
|
||||
| `@flaky` | Exclude flaky tests | `@flaky` on unstable scenarios |
|
||||
| `@todo` | Exclude pending tests | `@todo` on unimplemented scenarios |
|
||||
| `@skip` | Exclude tests entirely | `@skip` on disabled scenarios |
|
||||
| `@wip` | Work in progress | `@wip` on actively developed scenarios |
|
||||
|
||||
## Future Enhancements
|
||||
|
||||
- **Performance tags** - `@fast`, `@slow` for performance categorization
|
||||
- **Environment tags** - `@ci`, `@local` for environment-specific tests
|
||||
- **Risk tags** - `@high-risk`, `@low-risk` for risk-based testing
|
||||
- **Automated tag validation** - Script to validate tag usage consistency
|
||||
16
features/auth/auth_test.go
Normal file
@@ -0,0 +1,16 @@
|
||||
package auth
|
||||
|
||||
import (
|
||||
"testing"
|
||||
|
||||
"dance-lessons-coach/pkg/bdd/testsetup"
|
||||
)
|
||||
|
||||
func TestAuthBDD(t *testing.T) {
|
||||
config := testsetup.NewFeatureConfig("auth", "progress", false)
|
||||
suite := testsetup.CreateTestSuite(t, config, "dance-lessons-coach BDD Tests - Auth Feature")
|
||||
|
||||
if suite.Run() != 0 {
|
||||
t.Fatal("non-zero status returned, failed to run auth BDD tests")
|
||||
}
|
||||
}
|
||||
152
features/auth/user_authentication.feature
Normal file
@@ -0,0 +1,152 @@
|
||||
# features/auth/user_authentication.feature
|
||||
Feature: User Authentication
|
||||
As a user
|
||||
I want to authenticate with the system
|
||||
So I can access personalized features
|
||||
|
||||
Scenario: Successful user authentication
|
||||
Given the server is running
|
||||
And a user "testuser" exists with password "testpass123"
|
||||
When I authenticate with username "testuser" and password "testpass123"
|
||||
Then the authentication should be successful
|
||||
And I should receive a valid JWT token
|
||||
|
||||
Scenario: Failed authentication with wrong password
|
||||
Given the server is running
|
||||
And a user "testuser" exists with password "testpass123"
|
||||
When I authenticate with username "testuser" and password "wrongpassword"
|
||||
Then the authentication should fail
|
||||
And the response should contain error "invalid_credentials"
|
||||
|
||||
Scenario: Failed authentication with non-existent user
|
||||
Given the server is running
|
||||
When I authenticate with username "nonexistent" and password "somepassword"
|
||||
Then the authentication should fail
|
||||
And the response should contain error "invalid_credentials"
|
||||
|
||||
Scenario: Admin authentication with master password
|
||||
Given the server is running
|
||||
When I authenticate as admin with master password "admin123"
|
||||
Then the authentication should be successful
|
||||
And I should receive a valid JWT token
|
||||
And the token should contain admin claims
|
||||
|
||||
Scenario: User registration
|
||||
Given the server is running
|
||||
When I register a new user "newuser_" with password "newpass123"
|
||||
Then the registration should be successful
|
||||
And I should be able to authenticate with the new credentials
|
||||
|
||||
Scenario: Password reset request by admin
|
||||
Given the server is running
|
||||
And a user "resetuser" exists with password "oldpass123"
|
||||
And I am authenticated as admin
|
||||
When I request password reset for user "resetuser"
|
||||
Then the password reset should be allowed
|
||||
And the user should be flagged for password reset
|
||||
|
||||
Scenario: User completes password reset
|
||||
Given the server is running
|
||||
And a user "resetuser" exists and is flagged for password reset
|
||||
When I complete password reset for "resetuser" with new password "newpass123"
|
||||
Then the password reset should be successful
|
||||
And I should be able to authenticate with the new password
|
||||
|
||||
Scenario: Failed password reset for non-existent user
|
||||
Given the server is running
|
||||
When I request password reset for user "nonexistent"
|
||||
Then the password reset should fail
|
||||
And the response should contain error "server_error"
|
||||
|
||||
Scenario: Failed password reset completion for non-existent user
|
||||
Given the server is running
|
||||
When I complete password reset for "nonexistent" with new password "newpass123"
|
||||
Then the password reset should fail
|
||||
And the response should contain error "server_error"
|
||||
|
||||
Scenario: Failed password reset completion for user not flagged
|
||||
Given the server is running
|
||||
And a user "normaluser" exists with password "oldpass123"
|
||||
When I complete password reset for "normaluser" with new password "newpass123"
|
||||
Then the password reset should fail
|
||||
And the response should contain error "server_error"
|
||||
|
||||
Scenario: Failed registration with existing username
|
||||
Given the server is running
|
||||
And a user "existinguser" exists with password "testpass123"
|
||||
When I register a new user "existinguser" with password "newpass123"
|
||||
Then the registration should fail
|
||||
And the response should contain error "user_exists"
|
||||
And the status code should be 409
|
||||
|
||||
Scenario: Failed registration with invalid username
|
||||
Given the server is running
|
||||
When I register a new user "ab" with password "validpass123"
|
||||
Then the registration should fail
|
||||
And the status code should be 400
|
||||
|
||||
Scenario: Failed registration with invalid password
|
||||
Given the server is running
|
||||
When I register a new user "validuser" with password "short"
|
||||
Then the registration should fail
|
||||
And the status code should be 400
|
||||
|
||||
Scenario: Failed authentication with empty username
|
||||
Given the server is running
|
||||
When I authenticate with username "" and password "somepassword"
|
||||
Then the authentication should fail with validation error
|
||||
And the status code should be 400
|
||||
|
||||
Scenario: Failed authentication with empty password
|
||||
Given the server is running
|
||||
When I authenticate with username "someuser" and password ""
|
||||
Then the authentication should fail with validation error
|
||||
And the status code should be 400
|
||||
|
||||
Scenario: Failed admin authentication with wrong password
|
||||
Given the server is running
|
||||
When I authenticate as admin with master password "wrongadmin"
|
||||
Then the authentication should fail
|
||||
And the response should contain error "invalid_credentials"
|
||||
|
||||
Scenario: Multiple consecutive authentications
|
||||
Given the server is running
|
||||
And a user "multiuser" exists with password "testpass123"
|
||||
When I authenticate with username "multiuser" and password "testpass123"
|
||||
Then the authentication should be successful
|
||||
And I should receive a valid JWT token
|
||||
When I authenticate with username "multiuser" and password "testpass123" again
|
||||
Then the authentication should be successful
|
||||
And I should receive a different JWT token
|
||||
|
||||
Scenario: JWT token validation
|
||||
Given the server is running
|
||||
And a user "tokenuser" exists with password "testpass123"
|
||||
When I authenticate with username "tokenuser" and password "testpass123"
|
||||
Then the authentication should be successful
|
||||
And I should receive a valid JWT token
|
||||
When I validate the received JWT token
|
||||
Then the token should be valid
|
||||
And it should contain the correct user ID
|
||||
|
||||
Scenario: Authentication with expired JWT token
|
||||
Given the server is running
|
||||
And a user "expireduser" exists with password "testpass123"
|
||||
When I authenticate with username "expireduser" and password "testpass123"
|
||||
Then the authentication should be successful
|
||||
And I should receive a valid JWT token
|
||||
When I use an expired JWT token for authentication
|
||||
Then the authentication should fail
|
||||
And the response should contain error "invalid_token"
|
||||
|
||||
Scenario: Authentication with JWT token signed with wrong secret
|
||||
Given the server is running
|
||||
When I use a JWT token signed with wrong secret for authentication
|
||||
Then the authentication should fail
|
||||
And the response should contain error "invalid_token"
|
||||
|
||||
Scenario: Authentication with malformed JWT token
|
||||
Given the server is running
|
||||
When I use a malformed JWT token for authentication
|
||||
Then the authentication should fail
|
||||
And the response should contain error "invalid_token"
|
||||
@@ -3,22 +3,29 @@ package features
 import (
 	"testing"
 
-	"dance-lessons-coach/pkg/bdd"
-	"github.com/cucumber/godog"
+	"dance-lessons-coach/pkg/bdd/testsetup"
 )
 
 func TestBDD(t *testing.T) {
-	suite := godog.TestSuite{
-		Name: "dance-lessons-coach BDD Tests",
-		TestSuiteInitializer: bdd.InitializeTestSuite,
-		ScenarioInitializer: bdd.InitializeScenario,
-		Options: &godog.Options{
-			Format: "progress",
-			Paths: []string{"."},
-			TestingT: t,
-		},
+	// Get feature name from environment variable or default to all features
+	feature := testsetup.GetFeatureFromEnv()
+
+	var suiteName string
+	var paths []string
+
+	if feature == "" {
+		// Run all features
+		suiteName = "dance-lessons-coach BDD Tests - All Features"
+		paths = testsetup.GetAllFeaturePaths()
+	} else {
+		// Run specific feature
+		suiteName = "dance-lessons-coach BDD Tests - " + feature + " Feature"
+		paths = []string{feature}
 	}
 
+	config := testsetup.NewMultiFeatureConfig(paths, "progress", false)
+	suite := testsetup.CreateMultiFeatureTestSuite(t, config, suiteName)
+
 	if suite.Run() != 0 {
 		t.Fatal("non-zero status returned, failed to run BDD tests")
 	}
83
features/config/config_hot_reloading.feature
Normal file
@@ -0,0 +1,83 @@
# features/config_hot_reloading.feature
Feature: Config Hot Reloading
  The system should support selective hot reloading of configuration changes

  @flaky
  Scenario: Hot reloading logging level changes
    Given the server is running with config file monitoring enabled
    When I update the logging level to "debug" in the config file
    Then the logging level should be updated without restart
    And debug logs should appear in the output

  @flaky
  Scenario: Hot reloading feature flags
    Given the server is running with config file monitoring enabled
    And the v2 API is disabled
    When I enable the v2 API in the config file
    Then the v2 API should become available without restart
    And v2 API requests should succeed

  @flaky
  Scenario: Hot reloading telemetry sampling settings
    Given the server is running with config file monitoring enabled
    And telemetry is enabled
    When I update the sampler type to "parentbased_traceidratio" in the config file
    And I set the sampler ratio to "0.5" in the config file
    Then the telemetry sampling should be updated without restart
    And the new sampling settings should be applied

  @flaky
  Scenario: Hot reloading JWT TTL
    Given the server is running with config file monitoring enabled
    And JWT TTL is set to 1 hour
    When I update the JWT TTL to 2 hours in the config file
    Then the JWT TTL should be updated without restart
    And new JWT tokens should have the updated expiration

  @flaky
  Scenario: Attempting to hot reload non-reloadable settings should be ignored
    Given the server is running with config file monitoring enabled
    When I update the server port to 9090 in the config file
    Then the server port should remain unchanged
    And the server should continue running on the original port
    And a warning should be logged about ignored configuration change

  @flaky
  Scenario: Invalid configuration changes should be handled gracefully
    Given the server is running with config file monitoring enabled
    When I update the logging level to "invalid_level" in the config file
    Then the logging level should remain unchanged
    And an error should be logged about invalid configuration
    And the server should continue running normally

  @flaky
  Scenario: Config file monitoring should handle file deletion gracefully
    Given the server is running with config file monitoring enabled
    When I delete the config file
    Then the server should continue running with last known good configuration
    And a warning should be logged about missing config file

  @flaky
  Scenario: Config file monitoring should handle file recreation
    Given the server is running with config file monitoring enabled
    And I have deleted the config file
    When I recreate the config file with valid configuration
    Then the server should reload the configuration
    And the new configuration should be applied

  @flaky
  Scenario: Multiple rapid configuration changes should be handled
    Given the server is running with config file monitoring enabled
    When I rapidly update the logging level multiple times
    Then all changes should be processed in order
    And the final configuration should be applied
    And no configuration changes should be lost

  @flaky
  Scenario: Configuration changes should be audited
    Given the server is running with config file monitoring enabled
    And audit logging is enabled
    When I update the logging level to "info" in the config file
    Then an audit log entry should be created
    And the audit entry should contain the previous and new values
    And the audit entry should contain the timestamp of the change
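The scenarios in `config_hot_reloading.feature` distinguish three outcomes of a config change: a reloadable value is applied, a non-reloadable value is ignored with a warning, and an invalid value leaves the running configuration untouched. The following is a minimal sketch of that decision, not the project's implementation. The `Config` type, the `applyReload` function, the set of accepted levels, and the port `8080` are all assumptions made for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// Config holds the two settings exercised by the scenarios.
// Hypothetical type: the project's real config struct is not shown in this diff.
type Config struct {
	LogLevel string // treated as reloadable at runtime
	Port     int    // treated as fixed at startup
}

// validLevels is an assumed set of accepted logging levels.
var validLevels = map[string]bool{
	"trace": true, "debug": true, "info": true, "warn": true, "error": true,
}

var errInvalidLevel = errors.New("invalid logging level")

// applyReload merges a freshly read config into the running one.
// Reloadable fields are copied, non-reloadable changes are reported as
// warnings, and an invalid value rejects the whole update.
func applyReload(current, updated Config) (Config, []string, error) {
	if !validLevels[updated.LogLevel] {
		return current, nil, fmt.Errorf("%w: %q", errInvalidLevel, updated.LogLevel)
	}
	var warnings []string
	if updated.Port != current.Port {
		warnings = append(warnings, fmt.Sprintf(
			"ignoring port change %d -> %d (restart required)", current.Port, updated.Port))
	}
	next := current
	next.LogLevel = updated.LogLevel
	return next, warnings, nil
}

func main() {
	running := Config{LogLevel: "info", Port: 8080}

	// Reloadable level change combined with a non-reloadable port change.
	next, warnings, err := applyReload(running, Config{LogLevel: "debug", Port: 9090})
	fmt.Println(next, warnings, err)

	// Invalid level: the running configuration is returned unchanged.
	kept, _, err := applyReload(running, Config{LogLevel: "invalid_level", Port: 8080})
	fmt.Println(kept, err)
}
```

Returning the warnings instead of logging them keeps the function free of side effects, which makes each scenario outcome easy to assert in a unit test.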
16
features/config/config_test.go
Normal file
@@ -0,0 +1,16 @@
package config

import (
	"testing"

	"dance-lessons-coach/pkg/bdd/testsetup"
)

func TestConfigBDD(t *testing.T) {
	config := testsetup.NewFeatureConfig("config", "progress", false)
	suite := testsetup.CreateTestSuite(t, config, "dance-lessons-coach BDD Tests - Config Feature")

	if suite.Run() != 0 {
		t.Fatal("non-zero status returned, failed to run config BDD tests")
	}
}
@@ -1,17 +1,21 @@
 # features/greet.feature
+@greet @smoke
 Feature: Greet Service
   The greet service should return appropriate greetings
 
+  @basic
   Scenario: Default greeting
     Given the server is running
     When I request the default greeting
     Then the response should be "{\"message\":\"Hello world!\"}"
 
+  @basic
   Scenario: Personalized greeting
     Given the server is running
     When I request a greeting for "John"
     Then the response should be "{\"message\":\"Hello John!\"}"
 
+  @v2 @api
   Scenario: v2 greeting with JSON POST request
     Given the server is running with v2 enabled
     When I send a POST request to v2 greet with name "John"
30
features/greet/greet_test.go
Normal file
@@ -0,0 +1,30 @@
package greet

import (
	"os"
	"testing"

	"dance-lessons-coach/pkg/bdd/testsetup"
)

func TestGreetBDD(t *testing.T) {
	// Test suite with v2 disabled - run non-v2 scenarios only
	t.Run("v1", func(t *testing.T) {
		os.Setenv("GODOG_TAGS", "~@v2 && ~@skip")
		config := testsetup.NewFeatureConfig("greet", "progress", false)
		suite := testsetup.CreateTestSuite(t, config, "dance-lessons-coach BDD Tests - Greet Feature v1")
		if suite.Run() != 0 {
			t.Fatal("non-zero status returned, failed to run greet BDD tests with v2 disabled")
		}
	})

	// Test suite with v2 enabled - run v2 scenarios only
	t.Run("v2", func(t *testing.T) {
		os.Setenv("GODOG_TAGS", "@v2 && ~@skip")
		config := testsetup.NewFeatureConfig("greet", "progress", false)
		suite := testsetup.CreateTestSuite(t, config, "dance-lessons-coach BDD Tests - Greet Feature v2")
		if suite.Run() != 0 {
			t.Fatal("non-zero status returned, failed to run greet BDD tests with v2 enabled")
		}
	})
}
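`greet_test.go` splits one feature file into two runs by setting `GODOG_TAGS` to `~@v2 && ~@skip` and then to `@v2 && ~@skip`. The sketch below illustrates how such an expression partitions the tagged scenarios in `greet.feature`. It is a simplified stand-in that handles only `&&`-joined terms with an optional `~` negation. It is not godog's tag parser, and how `testsetup` consumes `GODOG_TAGS` is not shown in this diff. The function name `matchTags` is hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// matchTags reports whether a scenario's tags satisfy an expression made of
// "&&"-joined terms, where a leading "~" negates a term.
// Simplified illustration only; it does not handle "," (or) or parentheses.
func matchTags(expr string, tags []string) bool {
	have := make(map[string]bool, len(tags))
	for _, t := range tags {
		have[t] = true
	}
	for _, term := range strings.Split(expr, "&&") {
		term = strings.TrimSpace(term)
		if term == "" {
			continue
		}
		if strings.HasPrefix(term, "~") {
			// Negated term: the scenario must NOT carry this tag.
			if have[strings.TrimPrefix(term, "~")] {
				return false
			}
		} else if !have[term] {
			// Plain term: the scenario must carry this tag.
			return false
		}
	}
	return true
}

func main() {
	// Effective tags: feature-level tags plus scenario-level tags.
	basic := []string{"@greet", "@smoke", "@basic"}
	v2 := []string{"@greet", "@smoke", "@v2", "@api"}

	for _, expr := range []string{"~@v2 && ~@skip", "@v2 && ~@skip"} {
		fmt.Printf("%-16s basic=%v v2=%v\n", expr, matchTags(expr, basic), matchTags(expr, v2))
	}
}
```

With these inputs the first expression selects only the `@basic` scenarios and the second selects only the `@v2` scenario, which matches the intent stated in the test comments.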
@@ -1,7 +1,9 @@
 # features/health.feature
+@health @smoke @critical
 Feature: Health Endpoint
   The health endpoint should indicate server status
 
+  @basic @critical
   Scenario: Health check returns healthy status
     Given the server is running
     When I request the health endpoint
16
features/health/health_test.go
Normal file
@@ -0,0 +1,16 @@
package health

import (
	"testing"

	"dance-lessons-coach/pkg/bdd/testsetup"
)

func TestHealthBDD(t *testing.T) {
	config := testsetup.NewFeatureConfig("health", "progress", false)
	suite := testsetup.CreateTestSuite(t, config, "dance-lessons-coach BDD Tests - Health Feature")

	if suite.Run() != 0 {
		t.Fatal("non-zero status returned, failed to run health BDD tests")
	}
}
181
features/jwt/jwt_secret_retention.feature
Normal file
@@ -0,0 +1,181 @@
# features/jwt_secret_retention.feature
Feature: JWT Secret Retention Policy
  As a system administrator
  I want automatic cleanup of expired JWT secrets
  So that we can maintain security while ensuring system performance

  Background:
    Given the server is running with JWT secret retention configured
    And the default JWT TTL is 24 hours
    And the retention factor is 2.0
    And the maximum retention is 72 hours

  Scenario: Automatic cleanup of expired secrets
    Given a primary JWT secret exists
    And I add a secondary JWT secret with 1 hour expiration
    When I wait for the retention period to elapse
    Then the expired secondary secret should be automatically removed
    And the primary secret should remain active
    And I should see cleanup event in logs

  Scenario: Secret retention based on TTL factor
    Given the JWT TTL is set to 2 hours
    And the retention factor is 3.0
    When I add a new JWT secret
    Then the secret should expire after 6 hours
    And the retention period should be 6 hours

  Scenario: Maximum retention period enforcement
    Given the JWT TTL is set to 72 hours
    And the retention factor is 3.0
    And the maximum retention is 72 hours
    When I add a new JWT secret
    Then the retention period should be capped at 72 hours
    And not exceed the maximum retention limit

  Scenario: Cleanup preserves primary secret
    Given a primary JWT secret exists
    And the primary secret is older than retention period
    When the cleanup job runs
    Then the primary secret should not be removed
    And the primary secret should remain active

  @todo
  Scenario: Multiple secrets with different ages
    Given I have 3 JWT secrets of different ages
    And secret A is 1 hour old (within retention)
    And secret B is 50 hours old (expired)
    And secret C is the primary secret
    When the cleanup job runs
    Then secret A should be retained
    And secret B should be removed
    And secret C should be retained as primary

  @todo
  Scenario: Cleanup frequency configuration
    Given the cleanup interval is set to 30 minutes
    When I add an expired JWT secret
    Then it should be removed within 30 minutes
    And I should see cleanup events every 30 minutes

  @todo
  Scenario: Token validation with expired secret
    Given a user "retentionuser" exists with password "testpass123"
    And I authenticate with username "retentionuser" and password "testpass123"
    And I receive a valid JWT token signed with current secret
    When I wait for the secret to expire
    And I try to validate the expired token
    Then the token validation should fail
    And I should receive "invalid_token" error

  @todo
  Scenario: Graceful rotation during retention period
    Given a user "gracefuluser" exists with password "testpass123"
    And I authenticate with username "gracefuluser" and password "testpass123"
    And I receive a valid JWT token signed with primary secret
    When I add a new secondary secret and rotate to it
    And I authenticate again with username "gracefuluser" and password "testpass123"
    Then I should receive a new token signed with secondary secret
    And the old token should still be valid during retention period
    And both tokens should work until retention period expires

  Scenario: Configuration validation
    Given I set retention factor to 0.5
    When I try to start the server
    Then I should receive configuration validation error
    And the error should mention "retention_factor must be ≥ 1.0"

  @todo @nice_to_have
  Scenario: Metrics for secret retention
    Given I have enabled Prometheus metrics
    When the cleanup job removes expired secrets
    Then I should see "jwt_secrets_expired_total" metric increment
    And I should see "jwt_secrets_active_count" metric decrease
    And I should see "jwt_secret_retention_duration_seconds" histogram update

  @todo @nice_to_have
  Scenario: Log masking for security
    Given I add a new JWT secret "super-secret-key-123456"
    When the cleanup job runs
    Then the logs should show masked secret "supe****123456"
    And not expose the full secret in logs

  @todo
  Scenario: Cleanup with high volume of secrets
    Given I have 1000 JWT secrets
    And 300 of them are expired
    When the cleanup job runs
    Then it should complete within 100 milliseconds
    And remove all 300 expired secrets
    And not impact server performance

  @todo
  Scenario: Disabled cleanup via configuration
    Given I set cleanup interval to 8760 hours
    When I add expired JWT secrets
    Then they should not be automatically removed
    And manual cleanup should still be possible

  @todo
  Scenario: Retention period calculation edge cases
    Given the JWT TTL is 1 hour
    And the retention factor is 1.0
    When I add a new JWT secret
    Then the retention period should be 1 hour
    And the secret should expire after 1 hour

  @todo
  Scenario: Secret validation with retention policy
    Given I try to add an invalid JWT secret
    When the secret is less than 16 characters
    Then I should receive validation error
    And the error should mention "must be at least 16 characters"

  @todo
  Scenario: Cleanup job error handling
    Given the cleanup job encounters an error
    When it tries to remove a secret
    Then it should log the error
    And continue with remaining secrets
    And not crash the cleanup process

  @todo
  Scenario: Configuration reload without restart
    Given the server is running with default retention settings
    When I update the retention factor via configuration
    Then the new settings should take effect immediately
    And existing secrets should be reevaluated
    And cleanup should use new retention periods

  @todo @nice_to_have
  Scenario: Audit trail for secret operations
    Given I enable audit logging
    When I add a new JWT secret
    Then I should see audit log entry with event type "secret_added"
    And when the secret is removed by cleanup
    Then I should see audit log entry with event type "secret_removed"

  @todo
  Scenario: Retention policy with token refresh
    Given a user "refreshuser" exists with password "testpass123"
    And I authenticate and receive token A
    When I refresh my token during retention period
    Then I should receive new token B
    And token A should still be valid until retention expires
    And both tokens should work concurrently

  @todo
  Scenario: Emergency secret rotation
    Given a security incident requires immediate rotation
    When I rotate to a new primary secret
    Then old tokens should be invalidated immediately
    And new tokens should use the emergency secret
    And cleanup should remove compromised secrets

  @todo @nice_to_have
  Scenario: Monitoring and alerting
    Given I have monitoring configured
    When the cleanup job fails repeatedly
    Then I should receive alert notification
    And the alert should include error details
    And suggest remediation steps
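The retention scenarios in `jwt_secret_retention.feature` imply a small amount of arithmetic: the retention period is the TTL multiplied by the retention factor, capped at the maximum retention, and a factor below 1.0 is rejected. The "Log masking for security" scenario also fixes one concrete masked form. The sketch below reproduces both under stated assumptions. The function names `retentionPeriod`, `maskSecret` and `hours` are hypothetical, and the mask layout (first 4 and last 6 characters kept) is inferred from the single example in the scenario.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// Error text taken from the "Configuration validation" scenario.
var errRetentionFactor = errors.New("retention_factor must be ≥ 1.0")

// hours converts a number of hours to a time.Duration.
func hours(n float64) time.Duration {
	return time.Duration(n * float64(time.Hour))
}

// retentionPeriod returns ttl*factor, capped at maxRetention.
// A factor below 1.0 is treated as a configuration error.
func retentionPeriod(ttl time.Duration, factor float64, maxRetention time.Duration) (time.Duration, error) {
	if factor < 1.0 {
		return 0, errRetentionFactor
	}
	period := time.Duration(float64(ttl) * factor)
	if period > maxRetention {
		period = maxRetention
	}
	return period, nil
}

// maskSecret keeps the first 4 and last 6 bytes and hides the rest.
// Assumption: secrets are ASCII; short secrets are fully masked.
func maskSecret(secret string) string {
	const head, tail = 4, 6
	if len(secret) <= head+tail {
		return "****"
	}
	return secret[:head] + "****" + secret[len(secret)-tail:]
}

func main() {
	p, _ := retentionPeriod(hours(2), 3.0, hours(72))
	fmt.Println("TTL 2h, factor 3.0:", p)

	p, _ = retentionPeriod(hours(72), 3.0, hours(72))
	fmt.Println("TTL 72h, factor 3.0, max 72h:", p)

	_, err := retentionPeriod(hours(24), 0.5, hours(72))
	fmt.Println("factor 0.5:", err)

	fmt.Println(maskSecret("super-secret-key-123456"))
}
```

For the values used in the scenarios this yields 6 hours, 72 hours, a validation error, and the masked string `supe****123456`.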
54
features/jwt/jwt_secret_rotation.feature
Normal file
@@ -0,0 +1,54 @@
# features/jwt_secret_rotation.feature
Feature: JWT Secret Rotation
  As a system administrator
  I want to rotate JWT secrets without disrupting users
  So that we can maintain security while ensuring continuous service

  Scenario: Authentication with multiple valid JWT secrets
    Given the server is running with multiple JWT secrets
    And a user "multiuser" exists with password "testpass123"
    When I authenticate with username "multiuser" and password "testpass123"
    Then the authentication should be successful
    And I should receive a valid JWT token signed with the primary secret

  Scenario: Token validation with multiple valid secrets
    Given the server is running with multiple JWT secrets
    And a user "tokenuser" exists with password "testpass123"
    When I authenticate with username "tokenuser" and password "testpass123"
    Then the authentication should be successful
    And I should receive a valid JWT token
    When I validate a JWT token signed with the secondary secret
    Then the token should be valid
    And it should contain the correct user ID

  Scenario: Secret rotation - adding new secret while keeping old one valid
    Given the server is running with primary JWT secret
    And a user "rotateuser" exists with password "testpass123"
    When I authenticate with username "rotateuser" and password "testpass123"
    Then the authentication should be successful
    And I should receive a valid JWT token signed with the primary secret
    When I add a new secondary JWT secret to the server
    And I authenticate with username "rotateuser" and password "testpass123" again
    Then the authentication should be successful
    And I should receive a valid JWT token signed with the new secondary secret
    When I validate the old JWT token signed with primary secret
    Then the token should still be valid

  Scenario: Token rejection after secret expiration
    Given the server is running with primary and expired secondary JWT secrets
    When I use a JWT token signed with the expired secondary secret for authentication
    Then the authentication should fail
    And the response should contain error "invalid_token"

  Scenario: Graceful secret rotation with user continuity
    Given the server is running with primary JWT secret
    And a user "gracefuluser" exists with password "testpass123"
    When I authenticate with username "gracefuluser" and password "testpass123"
    Then the authentication should be successful
    And I should receive a valid JWT token signed with the primary secret
    When I add a new secondary JWT secret and rotate to it
    And I use the old JWT token signed with primary secret
    Then the token should still be valid during retention period
    When I authenticate with username "gracefuluser" and password "testpass123" after rotation
    Then the authentication should be successful
    And I should receive a valid JWT token signed with the new secondary secret
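The rotation scenarios in `jwt_secret_rotation.feature` rely on one idea: a token is accepted if any currently retained secret verifies its signature, and it is rejected once its signing secret has been dropped. The sketch below shows that check for HS256 tokens using only the standard library. It is an illustration under assumptions, not the project's code, which depends on `github.com/golang-jwt/jwt/v5` according to `go.mod`. The helpers `issue`, `sign` and `verify` are hypothetical, and the sketch deliberately skips header and claim validation such as `alg` and `exp`.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/base64"
	"errors"
	"fmt"
	"strings"
)

// Error name borrowed from the scenarios' expected "invalid_token" response.
var errInvalidToken = errors.New("invalid_token")

// sign returns the base64url HMAC-SHA256 signature of the signing input.
func sign(signingInput string, secret []byte) string {
	mac := hmac.New(sha256.New, secret)
	mac.Write([]byte(signingInput))
	return base64.RawURLEncoding.EncodeToString(mac.Sum(nil))
}

// issue builds a compact HS256 token for the given JSON payload.
func issue(payloadJSON string, secret []byte) string {
	enc := base64.RawURLEncoding
	input := enc.EncodeToString([]byte(`{"alg":"HS256","typ":"JWT"}`)) +
		"." + enc.EncodeToString([]byte(payloadJSON))
	return input + "." + sign(input, secret)
}

// verify tries each retained secret in order and returns the index of the
// first one whose signature matches. Claims are NOT validated here.
func verify(token string, secrets [][]byte) (int, error) {
	parts := strings.Split(token, ".")
	if len(parts) != 3 {
		return -1, errInvalidToken
	}
	input := parts[0] + "." + parts[1]
	for i, secret := range secrets {
		// hmac.Equal compares in constant time.
		if hmac.Equal([]byte(sign(input, secret)), []byte(parts[2])) {
			return i, nil
		}
	}
	return -1, errInvalidToken
}

func main() {
	primary := []byte("primary-secret-0123456789")
	secondary := []byte("secondary-secret-0123456789")

	oldToken := issue(`{"sub":"42"}`, primary)
	newToken := issue(`{"sub":"42"}`, secondary)

	// During the retention period both secrets are kept, so both tokens verify.
	retained := [][]byte{secondary, primary}
	fmt.Println(verify(oldToken, retained))
	fmt.Println(verify(newToken, retained))

	// After the old secret is removed, the old token is rejected.
	fmt.Println(verify(oldToken, [][]byte{secondary}))
}
```

Trying secrets in order keeps rotation simple: removing a secret from the retained list is enough to invalidate every token it signed.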
16
features/jwt/jwt_test.go
Normal file
@@ -0,0 +1,16 @@
package jwt

import (
	"testing"

	"dance-lessons-coach/pkg/bdd/testsetup"
)

func TestJWTBDD(t *testing.T) {
	config := testsetup.NewFeatureConfig("jwt", "pretty", false)
	suite := testsetup.CreateTestSuite(t, config, "dance-lessons-coach BDD Tests - JWT Feature")

	if suite.Run() != 0 {
		t.Fatal("non-zero status returned, failed to run jwt BDD tests")
	}
}
18
go.mod
@@ -8,9 +8,12 @@ require (
 	github.com/go-playground/locales v0.14.1
 	github.com/go-playground/universal-translator v0.18.1
 	github.com/go-playground/validator/v10 v10.30.2
+	github.com/golang-jwt/jwt/v5 v5.3.1
+	github.com/lib/pq v1.12.3
 	github.com/rs/zerolog v1.35.0
 	github.com/spf13/cobra v1.8.0
 	github.com/spf13/viper v1.21.0
+	github.com/stretchr/testify v1.11.1
 	github.com/swaggo/http-swagger v1.3.4
 	github.com/swaggo/swag v1.16.6
 	go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp v0.67.0
@@ -18,6 +21,10 @@ require (
 	go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc v1.43.0
 	go.opentelemetry.io/otel/sdk v1.43.0
 	go.opentelemetry.io/otel/trace v1.43.0
+	golang.org/x/crypto v0.49.0
+	gorm.io/driver/postgres v1.6.0
+	gorm.io/driver/sqlite v1.6.0
+	gorm.io/gorm v1.31.1
 )
 
 require (
@@ -26,6 +33,7 @@ require (
 	github.com/cespare/xxhash/v2 v2.3.0 // indirect
 	github.com/cucumber/gherkin/go/v26 v26.2.0 // indirect
 	github.com/cucumber/messages/go/v21 v21.0.1 // indirect
+	github.com/davecgh/go-spew v1.1.1 // indirect
 	github.com/felixge/httpsnoop v1.0.4 // indirect
 	github.com/fsnotify/fsnotify v1.9.0 // indirect
 	github.com/gabriel-vasile/mimetype v1.4.13 // indirect
@@ -43,12 +51,20 @@ require (
 	github.com/hashicorp/go-memdb v1.3.5 // indirect
 	github.com/hashicorp/golang-lru v1.0.2 // indirect
 	github.com/inconshreveable/mousetrap v1.1.0 // indirect
+	github.com/jackc/pgpassfile v1.0.0 // indirect
+	github.com/jackc/pgservicefile v0.0.0-20240606120523-5a60cdf6a761 // indirect
+	github.com/jackc/pgx/v5 v5.6.0 // indirect
+	github.com/jackc/puddle/v2 v2.2.2 // indirect
+	github.com/jinzhu/inflection v1.0.0 // indirect
+	github.com/jinzhu/now v1.1.5 // indirect
 	github.com/josharian/intern v1.0.0 // indirect
 	github.com/leodido/go-urn v1.4.0 // indirect
 	github.com/mailru/easyjson v0.7.6 // indirect
 	github.com/mattn/go-colorable v0.1.14 // indirect
 	github.com/mattn/go-isatty v0.0.20 // indirect
+	github.com/mattn/go-sqlite3 v1.14.22 // indirect
 	github.com/pelletier/go-toml/v2 v2.2.4 // indirect
+	github.com/pmezard/go-difflib v1.0.0 // indirect
 	github.com/sagikazarmark/locafero v0.11.0 // indirect
 	github.com/sourcegraph/conc v0.3.1-0.20240121214520-5f936abd7ae8 // indirect
 	github.com/spf13/afero v1.15.0 // indirect
@@ -61,7 +77,6 @@ require (
 	go.opentelemetry.io/otel/metric v1.43.0 // indirect
 	go.opentelemetry.io/proto/otlp v1.10.0 // indirect
 	go.yaml.in/yaml/v3 v3.0.4 // indirect
-	golang.org/x/crypto v0.49.0 // indirect
 	golang.org/x/mod v0.33.0 // indirect
 	golang.org/x/net v0.52.0 // indirect
 	golang.org/x/sync v0.20.0 // indirect
@@ -73,4 +88,5 @@ require (
 	google.golang.org/grpc v1.80.0 // indirect
 	google.golang.org/protobuf v1.36.11 // indirect
 	gopkg.in/yaml.v2 v2.4.0 // indirect
+	gopkg.in/yaml.v3 v3.0.1 // indirect
 )
25
go.sum
@@ -56,6 +56,8 @@ github.com/gofrs/uuid v4.2.0+incompatible/go.mod h1:b2aQJv3Z4Fp6yNu3cdSllBxTCLRx
|
||||
github.com/gofrs/uuid v4.3.1+incompatible/go.mod h1:b2aQJv3Z4Fp6yNu3cdSllBxTCLRxnplIgP/c0N/04lM=
|
||||
github.com/gofrs/uuid v4.4.0+incompatible h1:3qXRTX8/NbyulANqlc0lchS1gqAVxRgsuW1YrTJupqA=
|
||||
github.com/gofrs/uuid v4.4.0+incompatible/go.mod h1:b2aQJv3Z4Fp6yNu3cdSllBxTCLRxnplIgP/c0N/04lM=
|
||||
github.com/golang-jwt/jwt/v5 v5.3.1 h1:kYf81DTWFe7t+1VvL7eS+jKFVWaUnK9cB1qbwn63YCY=
|
||||
github.com/golang-jwt/jwt/v5 v5.3.1/go.mod h1:fxCRLWMO43lRc8nhHWY6LGqRcf+1gQWArsqaEUEa5bE=
|
||||
github.com/golang/protobuf v1.5.4 h1:i7eJL8qZTpSEXOPTxNKhASYpMn+8e5Q6AdndVa1dWek=
|
||||
github.com/golang/protobuf v1.5.4/go.mod h1:lnTiLA8Wa4RWRcIUkrtSVa5nRhsEGBg48fD6rSs7xps=
|
||||
github.com/google/go-cmp v0.7.0 h1:wk8382ETsv4JYUZwIsn6YpYiWiBsYLSJiTsyBybVuN8=
|
||||
@@ -79,6 +81,18 @@ github.com/hashicorp/golang-lru v1.0.2 h1:dV3g9Z/unq5DpblPpw+Oqcv4dU/1omnb4Ok8iP
|
||||
github.com/hashicorp/golang-lru v1.0.2/go.mod h1:iADmTwqILo4mZ8BN3D2Q6+9jd8WM5uGBxy+E8yxSoD4=
|
||||
github.com/inconshreveable/mousetrap v1.1.0 h1:wN+x4NVGpMsO7ErUn/mUI3vEoE6Jt13X2s0bqwp9tc8=
|
||||
github.com/inconshreveable/mousetrap v1.1.0/go.mod h1:vpF70FUmC8bwa3OWnCshd2FqLfsEA9PFc4w1p2J65bw=
|
||||
github.com/jackc/pgpassfile v1.0.0 h1:/6Hmqy13Ss2zCq62VdNG8tM1wchn8zjSGOBJ6icpsIM=
|
||||
github.com/jackc/pgpassfile v1.0.0/go.mod h1:CEx0iS5ambNFdcRtxPj5JhEz+xB6uRky5eyVu/W2HEg=
|
||||
github.com/jackc/pgservicefile v0.0.0-20240606120523-5a60cdf6a761 h1:iCEnooe7UlwOQYpKFhBabPMi4aNAfoODPEFNiAnClxo=
|
||||
github.com/jackc/pgservicefile v0.0.0-20240606120523-5a60cdf6a761/go.mod h1:5TJZWKEWniPve33vlWYSoGYefn3gLQRzjfDlhSJ9ZKM=
|
||||
github.com/jackc/pgx/v5 v5.6.0 h1:SWJzexBzPL5jb0GEsrPMLIsi/3jOo7RHlzTjcAeDrPY=
|
||||
github.com/jackc/pgx/v5 v5.6.0/go.mod h1:DNZ/vlrUnhWCoFGxHAG8U2ljioxukquj7utPDgtQdTw=
|
||||
github.com/jackc/puddle/v2 v2.2.2 h1:PR8nw+E/1w0GLuRFSmiioY6UooMp6KJv0/61nB7icHo=
|
||||
github.com/jackc/puddle/v2 v2.2.2/go.mod h1:vriiEXHvEE654aYKXXjOvZM39qJ0q+azkZFrfEOc3H4=
|
||||
github.com/jinzhu/inflection v1.0.0 h1:K317FqzuhWc8YvSVlFMCCUb36O/S9MCKRDI7QkRKD/E=
|
||||
github.com/jinzhu/inflection v1.0.0/go.mod h1:h+uFLlag+Qp1Va5pdKtLDYj+kHp5pxUVkryuEj+Srlc=
|
||||
github.com/jinzhu/now v1.1.5 h1:/o9tlHleP7gOFmsnYNz3RGnqzefHA47wQpKrrdTIwXQ=
|
||||
github.com/jinzhu/now v1.1.5/go.mod h1:d3SSVoowX0Lcu0IBviAWJpolVfI5UJVZZ7cO71lE/z8=
|
||||
github.com/josharian/intern v1.0.0 h1:vlS4z54oSdjm0bgjRigI+G1HpF+tI+9rE5LLzOg8HmY=
|
||||
github.com/josharian/intern v1.0.0/go.mod h1:5DoeVV0s6jJacbCEi61lwdGj/aVlrQvzHFFd8Hwg//Y=
|
||||
github.com/kr/pretty v0.1.0/go.mod h1:dAy3ld7l9f0ibDNOQOHHMYYIIbhfbHSm3C4ZsoJORNo=
|
||||
@@ -91,6 +105,8 @@ github.com/kr/text v0.2.0 h1:5Nx0Ya0ZqY2ygV366QzturHI13Jq95ApcVaJBhpS+AY=
|
||||
github.com/kr/text v0.2.0/go.mod h1:eLer722TekiGuMkidMxC/pM04lWEeraHUUmBw8l2grE=
|
||||
github.com/leodido/go-urn v1.4.0 h1:WT9HwE9SGECu3lg4d/dIA+jxlljEa1/ffXKmRjqdmIQ=
|
||||
github.com/leodido/go-urn v1.4.0/go.mod h1:bvxc+MVxLKB4z00jd1z+Dvzr47oO32F/QSNjSBOlFxI=
|
||||
github.com/lib/pq v1.12.3 h1:tTWxr2YLKwIvK90ZXEw8GP7UFHtcbTtty8zsI+YjrfQ=
|
||||
github.com/lib/pq v1.12.3/go.mod h1:/p+8NSbOcwzAEI7wiMXFlgydTwcgTr3OSKMsD2BitpA=
|
||||
github.com/mailru/easyjson v0.0.0-20190614124828-94de47d64c63/go.mod h1:C1wdFJiN94OJF2b5HbByQZoLdCWB1Yqtg26g4irojpc=
|
||||
github.com/mailru/easyjson v0.0.0-20190626092158-b2ccc519800e/go.mod h1:C1wdFJiN94OJF2b5HbByQZoLdCWB1Yqtg26g4irojpc=
|
||||
github.com/mailru/easyjson v0.7.6 h1:8yTIVnZgCoiM1TgqoeTl+LfU5Jg6/xL3QhGQnimLYnA=
|
||||
@@ -99,6 +115,8 @@ github.com/mattn/go-colorable v0.1.14 h1:9A9LHSqF/7dyVVX6g0U9cwm9pG3kP9gSzcuIPHP
|
||||
github.com/mattn/go-colorable v0.1.14/go.mod h1:6LmQG8QLFO4G5z1gPvYEzlUgJ2wF+stgPZH1UqBm1s8=
|
||||
github.com/mattn/go-isatty v0.0.20 h1:xfD0iDuEKnDkl03q4limB+vH+GxLEtL/jb4xVJSWWEY=
|
||||
github.com/mattn/go-isatty v0.0.20/go.mod h1:W+V8PltTTMOvKvAeJH7IuucS94S2C6jfK/D7dTCTo3Y=
|
||||
github.com/mattn/go-sqlite3 v1.14.22 h1:2gZY6PC6kBnID23Tichd1K+Z0oS6nE/XwU+Vz/5o4kU=
|
||||
github.com/mattn/go-sqlite3 v1.14.22/go.mod h1:Uh1q+B4BYcTPb+yiD3kU8Ct7aC0hY9fxUwlHK0RXw+Y=
|
||||
github.com/niemeyer/pretty v0.0.0-20200227124842-a10e7caefd8e/go.mod h1:zD1mROLANZcx1PVRCS0qkT7pwLkGfwJo4zjcN/Tysno=
|
||||
github.com/pelletier/go-toml/v2 v2.2.4 h1:mye9XuhQ6gvn5h28+VilKrrPoQVanw5PMw/TB0t5Ec4=
|
||||
github.com/pelletier/go-toml/v2 v2.2.4/go.mod h1:2gIqNv+qfxSVS7cM2xJQKtLSTLUE9V8t9Stt+h56mCY=
|
||||
@@ -131,6 +149,7 @@ github.com/stretchr/objx v0.4.0/go.mod h1:YvHI0jy2hoMjB+UWwv71VJQ9isScKT/TqJzVSS
|
||||
github.com/stretchr/objx v0.5.0/go.mod h1:Yh+to48EsGEfYuaHDzXPcE3xhTkx73EhmCGUpEOglKo=
|
||||
github.com/stretchr/testify v1.3.0/go.mod h1:M5WIy9Dh21IEIfnGCwXGc5bZfKNJtfHm1UVUgZn+9EI=
|
||||
github.com/stretchr/testify v1.6.1/go.mod h1:6Fq8oRcR53rry900zMqJjRRixrwX3KX962/h/Wwjteg=
|
||||
github.com/stretchr/testify v1.7.0/go.mod h1:6Fq8oRcR53rry900zMqJjRRixrwX3KX962/h/Wwjteg=
|
||||
github.com/stretchr/testify v1.7.1/go.mod h1:6Fq8oRcR53rry900zMqJjRRixrwX3KX962/h/Wwjteg=
|
||||
github.com/stretchr/testify v1.8.0/go.mod h1:yNjHg4UonilssWZ8iaSj1OCr/vHnekPRkoO+kdMU+MU=
|
||||
github.com/stretchr/testify v1.8.1/go.mod h1:w2LPCIKwWwSfY2zedu0+kehJoqGctiVI29o6fzry7u4=
|
||||
@@ -212,3 +231,9 @@ gopkg.in/yaml.v3 v3.0.0-20200313102051-9f266ea9e77c/go.mod h1:K4uyk7z7BCEPqu6E+C
|
||||
gopkg.in/yaml.v3 v3.0.0-20200615113413-eeeca48fe776/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM=
|
||||
gopkg.in/yaml.v3 v3.0.1 h1:fxVm/GzAzEWqLHuvctI91KS9hhNmmWOoWu0XTYJS7CA=
|
||||
gopkg.in/yaml.v3 v3.0.1/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM=
|
||||
gorm.io/driver/postgres v1.6.0 h1:2dxzU8xJ+ivvqTRph34QX+WrRaJlmfyPqXmoGVjMBa4=
|
||||
gorm.io/driver/postgres v1.6.0/go.mod h1:vUw0mrGgrTK+uPHEhAdV4sfFELrByKVGnaVRkXDhtWo=
|
||||
gorm.io/driver/sqlite v1.6.0 h1:WHRRrIiulaPiPFmDcod6prc4l2VGVWHz80KspNsxSfQ=
|
||||
gorm.io/driver/sqlite v1.6.0/go.mod h1:AO9V1qIQddBESngQUKWL9yoH93HIeA1X6V633rBwyT8=
|
||||
gorm.io/gorm v1.31.1 h1:7CA8FTFz/gRfgqgpeKIBcervUn3xSyPUmr6B2WXJ7kg=
|
||||
gorm.io/gorm v1.31.1/go.mod h1:XyQVbO2k6YkOis7C2437jSit3SsDK72s7n7rsSHd+Gs=
|
||||
|
||||
Some files were not shown because too many files have changed in this diff.