From b299469d00f1cbb4f23f79002a8bd7772ad96e87 Mon Sep 17 00:00:00 2001
From: Gabriel Radureau
Date: Wed, 8 Apr 2026 11:09:34 +0200
Subject: [PATCH] Consolidate ADRs into docs/adr/

This commit moves Architecture Decision Records (ADRs) from
../../../docs/adr/ to docs/adr/ in the arcodange/factory repository.
This centralizes all ADRs in one location for better maintainability
and discoverability.

Generated by Mistral Vibe.
Co-Authored-By: Mistral Vibe
---
 .../docs/adr/20260407-cicd-architecture.md    | 160 +++++++++
 .../20260407-docker-storage-gitea-runner.md   | 130 +++++++
 .../docs/adr/20260407-network-architecture.md | 334 ++++++++++++++++++
 3 files changed, 624 insertions(+)
 create mode 100644 ansible/arcodange/factory/docs/adr/20260407-cicd-architecture.md
 create mode 100644 ansible/arcodange/factory/docs/adr/20260407-docker-storage-gitea-runner.md
 create mode 100644 ansible/arcodange/factory/docs/adr/20260407-network-architecture.md

diff --git a/ansible/arcodange/factory/docs/adr/20260407-cicd-architecture.md b/ansible/arcodange/factory/docs/adr/20260407-cicd-architecture.md
new file mode 100644
index 0000000..842b4e6
--- /dev/null
+++ b/ansible/arcodange/factory/docs/adr/20260407-cicd-architecture.md
@@ -0,0 +1,160 @@
+# ADR 20260407: CI/CD Architecture with ArgoCD, Gitea, and Vault
+
+## Status
+Proposed
+
+## Context
+The home lab requires a secure and automated CI/CD pipeline to deploy applications to the k3s cluster. The pipeline must integrate with:
+- **Gitea**: For Git repository management and CI runners.
+- **ArgoCD**: For GitOps-based continuous deployment.
+- **Vault**: For secrets management and OIDC authentication.
+- **Gitea Act Runner**: For executing CI jobs.
+
+## Decision
+We will implement a **GitOps-driven CI/CD pipeline** with the following components:
+
+### 1. Gitea OIDC Authentication with Vault
+- Gitea is registered as an OIDC application in Vault.
+- Vault issues short-lived tokens for Gitea users.
+- The `gitea_oidc_auth.yml` playbook automates this setup using Playwright and OpenTofu.
+- **OIDC Workflow**:
+  1. The `oidc_jwt_token.sh` script (base64-encoded in `secrets.vault_oauth__sh_b64`) handles the OIDC flow.
+  2. Gitea Act Runner executes the script to obtain an ID token from Gitea.
+  3. The ID token is used to authenticate with Vault and retrieve secrets.
+
+### 2. Gitea Act Runner
+- Deployed on `pi1` and `pi3` (not on the Gitea host, which is `pi2`).
+- Uses Docker-in-Docker for job execution.
+- **Custom Runner Image (`ubuntu-latest-ca`)**: Required due to the self-signed `.lab` domain. The custom image includes the local CA certificate to trust the Gitea instance (`gitea.arcodange.lab`).
+- Managed via Docker Compose (`03_cicd.yml`).
+
+### 3. ArgoCD
+- Deployed on the k3s cluster (via HelmChart in `/var/lib/rancher/k3s/server/manifests/argocd.yaml`).
+- Uses Gitea as the source of truth for GitOps.
+- Synchronizes the `factory` repository to deploy applications.
+- Configured with Traefik for TLS termination.
+
+### 4. Vault Secrets Operator
+- Deployed in the `tools` namespace.
+- Manages secrets for applications deployed via ArgoCD.
+- Integrates with Gitea OIDC for authentication.
+- **Helm Chart Integration**:
+  - `VaultAuth`: Authenticates with Vault using Kubernetes service accounts.
+  - `VaultStaticSecret`: Retrieves static secrets (e.g., `kvv2/webapp/config`).
+  - `VaultDynamicSecret`: Generates dynamic secrets (e.g., PostgreSQL credentials).
+
+### 5. Security
+- **TLS**: Traefik terminates TLS using Let's Encrypt.
+- **OIDC**: Gitea authentication via Vault.
+- **Secrets**: Stored in Vault, injected via the Vault Secrets Operator.
+
+## Architecture Diagram
+
+```mermaid
+%%{init: {'theme': 'base', 'themeVariables': { 'primaryColor': '#333333', 'edgeLabelBackground':'#f0f0f0', 'tertiaryColor': '#e67e22'}}}%%
+graph TD
+    %% Styles
+    classDef gitea fill:#ffcc99,stroke:#cc9966,color:#333;
+    classDef argocd fill:#99ffcc,stroke:#66cc99,color:#333;
+    classDef vault fill:#ccccff,stroke:#6666cc,color:#333;
+    classDef k3s fill:#ff9999,stroke:#cc0000,color:#333;
+    classDef runner fill:#ffff99,stroke:#cccc00,color:#333;
+
+    %% Components
+    Gitea["Gitea (pi2)"]:::gitea
+    ArgoCD["ArgoCD (k3s)"]:::argocd
+    Vault["Vault (k3s/tools)"]:::vault
+    Runner1["Gitea Act Runner (pi1)"]:::runner
+    Runner2["Gitea Act Runner (pi3)"]:::runner
+    VaultOperator["Vault Secrets Operator (k3s/tools)"]:::vault
+    k3s["k3s Cluster"]:::k3s
+
+    %% Workflow
+    Gitea -->|OIDC Auth| Vault
+    Gitea -->|Trigger CI| Runner1
+    Gitea -->|Trigger CI| Runner2
+    Runner1 -->|Deploy to| k3s
+    Runner2 -->|Deploy to| k3s
+    ArgoCD -->|GitOps Sync| Gitea
+    ArgoCD -->|Deploy Apps| k3s
+    VaultOperator -->|Inject Secrets| k3s
+    Vault -->|Secrets| VaultOperator
+
+    %% Annotations
+    linkStyle 0,1,2,3,4,5,6,7,8 stroke:#999,stroke-width:1px;
+```
+
+## Consequences
+
+### Positive
+- **Automated Deployments**: ArgoCD ensures the cluster state matches Git.
+- **Secure Secrets**: Vault centralizes secret management.
+- **Scalable CI**: Gitea Act Runners can be added to any host.
+- **OIDC Integration**: Secure authentication via Vault.
+
+### Negative
+- **Complexity**: Multiple moving parts (Gitea, ArgoCD, Vault).
+- **Dependency on Vault**: If Vault fails, CI/CD may be disrupted.
+- **Learning Curve**: Requires familiarity with GitOps and Vault.
+
+## Alternatives Considered
+
+### Alternative 1: GitHub Actions
+- **Rejected**: Self-hosted Gitea aligns better with the home lab's privacy goals.
+
+### Alternative 2: Jenkins
+- **Rejected**: ArgoCD + Gitea Act Runner is lighter and more GitOps-native.
+
+### Alternative 3: No CI/CD
+- **Rejected**: Manual deployments are error-prone and unscalable.
+
+## Sequence Diagrams
+
+### 1. CI/CD Workflow for OpenTofu/Terraform
+
+```mermaid
+sequenceDiagram
+    participant Gitea
+    participant Runner as Gitea Act Runner (pi1/pi3)
+    participant Vault
+    participant WebApp as WebApp (k3s)
+
+    Gitea->>Runner: Trigger vault.yaml workflow
+    Runner->>Gitea: Execute vault_oauth__sh_b64 (OIDC)
+    Gitea-->>Runner: Return ID Token
+    Runner->>Vault: Authenticate with ID Token
+    Vault-->>Runner: Return Vault Token
+    Runner->>Runner: Run OpenTofu/Terraform
+    Runner->>Vault: Fetch Secrets (via Vault Action)
+    Vault-->>Runner: Return Secrets
+    Runner->>WebApp: Deploy Changes
+```
+
+### 2. Vault Secrets Operator Workflow
+
+```mermaid
+sequenceDiagram
+    participant ArgoCD
+    participant WebApp as WebApp (k3s)
+    participant VaultOperator as Vault Secrets Operator
+    participant Vault
+
+    ArgoCD->>WebApp: Deploy Helm Chart
+    WebApp->>VaultOperator: Create VaultAuth (K8s Auth)
+    VaultOperator->>Vault: Authenticate (K8s Service Account)
+    Vault-->>VaultOperator: Return Vault Token
+    WebApp->>VaultOperator: Create VaultStaticSecret (kvv2/webapp/config)
+    VaultOperator->>Vault: Fetch Static Secret
+    Vault-->>VaultOperator: Return Secret
+    VaultOperator->>WebApp: Inject Secret (secretkv)
+    WebApp->>VaultOperator: Create VaultDynamicSecret (postgres/creds/webapp)
+    VaultOperator->>Vault: Generate Dynamic Secret
+    Vault-->>VaultOperator: Return Credentials
+    VaultOperator->>WebApp: Inject Credentials (vso-db-credentials)
+    WebApp->>WebApp: Restart Pods (Rollout)
+```
+
+## Success Metrics
+- Gitea Act Runners successfully execute CI jobs.
+- ArgoCD synchronizes the `factory` repository without errors.
+- Vault Secrets Operator injects secrets into deployed applications.
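
The token exchange in the CI/CD sequence diagram of this ADR (the runner obtains an ID token from Gitea, then trades it for a Vault token) could look roughly like the Python sketch below. It is an illustration only and does not reproduce `oidc_jwt_token.sh`. The Vault address `https://vault.arcodange.lab`, the auth mount name `jwt`, and the role name `gitea-runner` are assumptions that do not appear in the ADR. The request is built but never sent, so the sketch runs without a Vault server.

```python
import json
import urllib.request


def build_vault_jwt_login(vault_addr: str, id_token: str, role: str,
                          mount: str = "jwt") -> urllib.request.Request:
    """Build (but do not send) a login request for Vault's JWT auth method.

    `mount` is the path the auth method is enabled at; "jwt" is an
    assumption here, the ADR does not name it.
    """
    url = f"{vault_addr.rstrip('/')}/v1/auth/{mount}/login"
    body = json.dumps({"role": role, "jwt": id_token}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


def extract_client_token(response_body: bytes) -> str:
    """Pull the short-lived Vault token out of a login response body."""
    return json.loads(response_body)["auth"]["client_token"]


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    req = build_vault_jwt_login("https://vault.arcodange.lab",
                                "<id-token-from-gitea>", "gitea-runner")
    print(req.get_method(), req.full_url)
```

In a real job the request would be sent with `urllib.request.urlopen(req)` and the returned token passed to later steps. Because the `.lab` domain uses a private CA, that call would presumably need the same CA bundle that the `ubuntu-latest-ca` image ships.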
diff --git a/ansible/arcodange/factory/docs/adr/20260407-docker-storage-gitea-runner.md b/ansible/arcodange/factory/docs/adr/20260407-docker-storage-gitea-runner.md
new file mode 100644
index 0000000..88116ba
--- /dev/null
+++ b/ansible/arcodange/factory/docs/adr/20260407-docker-storage-gitea-runner.md
@@ -0,0 +1,130 @@
+# ADR 20260407: Docker Storage Optimization for Gitea Act Runner
+
+## Status
+Proposed
+
+## Context
+The `pi3` machine (Raspberry Pi) is running both Docker and k3s, with the following storage constraints:
+- Root filesystem (`/dev/mmcblk0p2`): 58G total, 89% used (6.4G free)
+- External disk (`/dev/sda1`): 458G total, 22G used (413G free)
+
+Gitea Act Runner images (`ubuntu-latest` and `ubuntu-latest-ca`) are frequently deleted, likely by automatic image garbage collection triggered by low disk space. Docker does not prune images on its own, so the probable source is the kubelet's image GC, since k3s runs on Docker (see below). This disrupts CI/CD pipelines.
+
+### Current Setup
+- Docker is configured via Ansible (`system_docker.yml`) using the `geerlingguy.docker` role.
+- k3s is configured to use Docker as the container runtime (`--docker` flag).
+- Longhorn is used for persistent storage in k3s, and we want to preserve its performance.
+
+## Decision
+We will implement a **hybrid storage strategy** to prevent Gitea Act Runner image deletion while maintaining Longhorn performance:
+
+### 1. Pin Critical Images
+Use a dummy container to pin the Gitea Act Runner images:
+```yaml
+# Add to system_docker.yml or a new playbook
+- name: Pin Gitea Act Runner images
+  community.docker.docker_container:
+    name: pin-gitea-runner-ubuntu-latest-ca
+    image: gitea.arcodange.lab/arcodange-org/runner-images:ubuntu-latest-ca
+    state: present
+    command: ["sh", "-c", "sleep infinity"]
+    auto_remove: false
+    restart_policy: unless-stopped
+```
+
+### 2. Configure Docker Storage with Overlay on External Disk
+Modify `/etc/docker/daemon.json` to move Docker's data root (images, containers, volumes) to the external disk; only the daemon configuration in `/etc/docker` stays on the root filesystem:
+```json
+{
+  "data-root": "/mnt/arcodange/docker",
+  "storage-driver": "overlay2",
+  "storage-opts": ["overlay2.override_kernel_check=true"]
+}
+```
+
+### 3. Ansible Implementation
+Update `system_docker.yml` to:
+1. Create `/mnt/arcodange/docker` if it doesn't exist.
+2. Configure Docker to use the external disk.
+3. Pin critical images post-installation.
+
+```yaml
+# Add to system_docker.yml tasks
+- name: Ensure Docker storage directory exists on external disk
+  ansible.builtin.file:
+    path: /mnt/arcodange/docker
+    state: directory
+    mode: '0755'
+    owner: root
+    group: docker
+
+- name: Configure Docker to use external storage
+  ansible.builtin.copy:
+    dest: /etc/docker/daemon.json
+    content: |
+      {
+        "data-root": "/mnt/arcodange/docker",
+        "storage-driver": "overlay2",
+        "storage-opts": ["overlay2.override_kernel_check=true"],
+        "log-driver": "json-file",
+        "log-opts": {
+          "max-size": "10m",
+          "max-file": "5"
+        }
+      }
+    mode: '0644'
+  notify: Redémarrer Docker
+
+- name: Pin Gitea Act Runner images
+  community.docker.docker_container:
+    name: "{{ item.name }}"
+    image: "{{ item.image }}"
+    state: present
+    command: ["sh", "-c", "sleep infinity"]
+    auto_remove: false
+    restart_policy: unless-stopped
+  loop:
+    - { name: "pin-gitea-runner-ubuntu-latest", image: "gitea/runner-images:ubuntu-latest" }
+    - { name: "pin-gitea-runner-ubuntu-latest-ca", image: "gitea.arcodange.lab/arcodange-org/runner-images:ubuntu-latest-ca" }
+```
+
+## Consequences
+
+### Positive
+- **Prevents Image Deletion**: Critical images are pinned and won't be garbage-collected.
+- **Preserves Longhorn Performance**: Longhorn continues to use the root filesystem for its operations, maintaining performance.
+- **Scalable Storage**: Docker images are stored on the external disk (413G free), preventing root filesystem exhaustion.
+- **No k3s Changes Required**: k3s continues to use Docker as the runtime without modification.
+
+### Negative
+- **Migration Effort**: Existing Docker data must be migrated to the external disk (one-time operation).
+- **Dependency on External Disk**: If `/dev/sda1` fails, Docker will not function until the disk is remounted or the configuration is reverted.
+- **Slight Performance Overhead**: Accessing images from the external disk may be slightly slower than the root filesystem (mitigated by SSD/HDD performance).
+
+## Alternatives Considered
+
+### Alternative 1: Increase Root Filesystem Size
+- **Rejected**: The SD card is already at capacity, and expanding it is not feasible.
+
+### Alternative 2: Disable Docker Garbage Collection
+- **Rejected**: This would risk filling the root filesystem completely, causing system instability.
+
+### Alternative 3: Use k3s Image Garbage Collection
+- **Rejected**: k3s does not provide fine-grained control over image retention for non-k8s workloads (e.g., Gitea Act Runner).
+
+### Alternative 4: Save/Load Images Manually
+- **Rejected**: Manual intervention is not scalable and does not address the root cause.
+
+## Migration Plan
+1. **Backup**: Save critical images to `/mnt/arcodange`:
+   ```bash
+   docker save gitea.arcodange.lab/arcodange-org/runner-images:ubuntu-latest-ca -o /mnt/arcodange/gitea-runner-backup.tar
+   ```
+2. **Update Ansible**: Apply the changes to `system_docker.yml`.
+3. **Run Playbook**: Execute the playbook to reconfigure Docker.
+4. **Verify**: Ensure Gitea Act Runner functions correctly post-migration.
+
+## Success Metrics
+- Gitea Act Runner images are no longer deleted between runs.
+- Root filesystem usage drops below 80%.
+- CI/CD pipelines complete without image pull errors.
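
The "Verify" step and the 80% success metric of this ADR could be checked with a small script such as the Python sketch below. It is a hedged illustration, not part of the playbook: the function names are hypothetical, and the usage figure is computed as `(total - free) / total`, which can differ slightly from what `df` reports because of reserved blocks.

```python
import json
import shutil


def used_percent(total: float, free: float) -> float:
    """Share of a filesystem in use, as a percentage of its total size."""
    return 100.0 * (total - free) / total


def data_root(daemon_json_text: str):
    """Return the Docker data root from daemon.json text, or None if unset."""
    return json.loads(daemon_json_text).get("data-root")


def root_below_threshold(path: str = "/", threshold: float = 80.0) -> bool:
    """Check the live filesystem at `path` against the usage threshold."""
    usage = shutil.disk_usage(path)
    return used_percent(usage.total, usage.free) < threshold


if __name__ == "__main__":
    # Figures from the Context section: 58G total, 6.4G free on the root filesystem.
    print(f"before migration: {used_percent(58, 6.4):.1f}% used")
    print("root below 80% now:", root_below_threshold("/"))
```

With the numbers quoted in the Context section, `used_percent(58, 6.4)` evaluates to about 89%, which matches the reported usage and is the value the migration is meant to bring under 80%.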
diff --git a/ansible/arcodange/factory/docs/adr/20260407-network-architecture.md b/ansible/arcodange/factory/docs/adr/20260407-network-architecture.md
new file mode 100644
index 0000000..b40d097
--- /dev/null
+++ b/ansible/arcodange/factory/docs/adr/20260407-network-architecture.md
@@ -0,0 +1,334 @@
+# ADR 20260407: Network Architecture
+
+## Status
+Proposed
+
+## Context
+The home lab requires a secure and resilient network architecture to support:
+- Internal services (`.lab` domain).
+- External services (`.arcodange.fr` domain).
+- DNS resolution and ad-blocking (Pi-hole).
+- TLS certificate management (Step CA).
+- Ingress routing (Traefik).
+- CDN and DDoS protection (Cloudflare).
+
+## Decision
+We will implement a **multi-layered network architecture** with the following components:
+
+### 1. External Layer (Internet)
+- **Cloudflare**: CDN, DDoS protection, and DNS for `.arcodange.fr`.
+- **DuckDNS**: Dynamic DNS for external access.
+- **Livebox**: ISP-provided gateway (NAT, DHCP, firewall).
+
+### 2. Internal Layer (Home Lab)
+- **Pi-hole (pi1, pi3)**: DNS sinkhole for ad-blocking and internal DNS resolution.
+- **Step CA (pi1)**: Internal certificate authority for `.lab` domain.
+- **Traefik (k3s)**: Ingress controller with TLS termination.
+- **k3s Cluster**: Hosts internal services with Longhorn storage.
+
+### 3. DNS Architecture
+- **Pi-hole**: Primary DNS for internal clients.
+  - Forwards `.lab` queries to Step CA.
+  - Forwards external queries to Cloudflare (1.1.1.1).
+- **Step CA**: Issues certificates for `.lab` services.
+- **Cloudflare**: Manages `.arcodange.fr` DNS records.
+
+### 4. Ingress and TLS
+- **Traefik**: Terminates TLS for both `.lab` and `.arcodange.fr` domains.
+  - Uses Let's Encrypt for `.arcodange.fr`.
+  - Uses Step CA for `.lab`.
+- **Helm Chart Annotations**:
+  - `traefik.ingress.kubernetes.io/router.entrypoints: websecure`
+  - `traefik.ingress.kubernetes.io/router.tls.certresolver: letsencrypt`
+  - `traefik.ingress.kubernetes.io/router.middlewares: localIp@file`
+
+### 5. Security
+- **Cloudflare Tunnel**: Securely exposes internal services without port forwarding.
+- **CrowdSec**: Intrusion detection and banning.
+- **Traefik Middlewares**: IP filtering, rate limiting, and authentication.
+- **Cloudflare Turnstile**: CAPTCHA protection for public-facing services.
+
+## Architecture Diagrams
+
+### 0. High-Level Network Architecture (Architecture Beta)
+
+```mermaid
+%%{init: {'theme': 'neutral', 'themeVariables': {
+    'primaryColor': '#f0f0f0',
+    'primaryBorderColor': '#333333',
+    'primaryTextColor': '#333333',
+    'lineColor': '#333333',
+    'tertiaryColor': '#e67e22'
+}}}%%
+architectureBeta
+    %% External Layer
+    box "Internet" #f9f9f9
+        component Cloudflare["Cloudflare\n(CDN/DNS)"] #f9f9f9
+        component DuckDNS["DuckDNS\n(DDNS)"] #f9f9f9
+    end
+
+    %% External Gateway
+    box "External Gateway" #e6e6e6
+        component Livebox["Livebox\n(NAT/Firewall)"] #e6e6e6
+    end
+
+    %% Internal Layer
+    box "Internal Network\n(192.168.1.0/24)" #d4d4d4
+        %% DNS Layer
+        box "DNS" #ffff99
+            component PiHole1["Pi-hole\n(pi1)"] #ffff99
+            component PiHole3["Pi-hole\n(pi3)"] #ffff99
+            component StepCA["Step CA\n(pi1)"] #ccccff
+        end
+
+        %% k3s Layer
+        box "k3s Cluster" #ff9999
+            component Traefik["Traefik\n(Ingress)"] #ff9999
+            component CrowdSec["CrowdSec\n(Security)"] #ff9999
+            component Gitea["Gitea\n(pi2)"] #ffcc99
+            component Vault["Vault\n(Secrets)"] #ccccff
+        end
+    end
+
+    %% Connections
+    Cloudflare --> Livebox : "DNS"
+    DuckDNS --> Livebox : "DDNS"
+    Livebox --> PiHole1 : "NAT"
+    Livebox --> PiHole3 : "NAT"
+    Livebox --> Traefik : "NAT"
+    PiHole1 --> StepCA : "Forward .lab"
+    PiHole1 --> Cloudflare : "Forward External"
+    PiHole3 --> StepCA : "Forward .lab"
+    PiHole3 --> Cloudflare : "Forward External"
+    Traefik --> Cloudflare : "TLS (Let's Encrypt)"
+    Traefik --> StepCA : "TLS (Step CA)"
+    CrowdSec --> Traefik : "Ban IPs"
+    Traefik --> Gitea : "Route"
+    Traefik --> Vault : "Route"
+```
+
+### 1. High-Level Network Architecture
+
+```mermaid
+%%{init: {'theme': 'base', 'themeVariables': { 'primaryColor': '#333333', 'edgeLabelBackground':'#f0f0f0', 'tertiaryColor': '#f89136'}}}%%
+graph TD
+    %% Styles
+    classDef internet fill:#f9f9f9,stroke:#999,color:#333;
+    classDef external fill:#e6e6e6,stroke:#555,color:#333;
+    classDef internal fill:#d4d4d4,stroke:#777,color:#333;
+    classDef security fill:#ff9999,stroke:#cc0000,color:#333;
+    classDef dns fill:#ffff99,stroke:#cccc00,color:#333;
+    classDef ca fill:#ccccff,stroke:#6666cc,color:#333;
+
+    %% Internet
+    subgraph "Internet"
+        Cloudflare["Cloudflare (CDN/DNS)"]:::internet
+        DuckDNS["DuckDNS (DDNS)"]:::internet
+    end
+
+    %% External Gateway
+    subgraph "External Gateway"
+        Livebox["Livebox (NAT/Firewall)"]:::external
+    end
+
+    %% Internal Network
+    subgraph "Internal Network (192.168.1.0/24)"
+        %% Pi-hole DNS
+        PiHole1["Pi-hole (pi1)"]:::dns
+        PiHole3["Pi-hole (pi3)"]:::dns
+
+        %% Step CA
+        StepCA["Step CA (pi1)"]:::ca
+
+        %% k3s Cluster
+        k3s["k3s Cluster"]:::internal
+        Traefik["Traefik (k3s)"]:::internal
+        CrowdSec["CrowdSec (k3s)"]:::security
+
+        %% Services
+        Gitea["Gitea (pi2)"]:::internal
+        Vault["Vault (k3s)"]:::internal
+    end
+
+    %% Connections
+    Cloudflare -->|DNS| Livebox
+    DuckDNS -->|DDNS| Livebox
+    Livebox -->|NAT| PiHole1
+    Livebox -->|NAT| PiHole3
+    Livebox -->|NAT| k3s
+
+    %% Internal DNS
+    PiHole1 -->|Forward .lab| StepCA
+    PiHole1 -->|Forward External| Cloudflare
+    PiHole3 -->|Forward .lab| StepCA
+    PiHole3 -->|Forward External| Cloudflare
+
+    %% Ingress
+    Traefik -->|"TLS (Let's Encrypt)"| Cloudflare
+    Traefik -->|"TLS (Step CA)"| StepCA
+    CrowdSec -->|Ban IPs| Traefik
+
+    %% Service Access
+    Traefik -->|Route| Gitea
+    Traefik -->|Route| Vault
+```
+
+### 2. DNS Resolution Flow
+
+```mermaid
+sequenceDiagram
+    participant Client
+    participant PiHole
+    participant StepCA
+    participant Cloudflare
+    participant ExternalDNS
+
+    Client->>PiHole: Query example.lab
+    PiHole->>StepCA: Forward .lab query
+    StepCA-->>PiHole: Return A record
+    PiHole-->>Client: Return response
+
+    Client->>PiHole: Query example.com
+    PiHole->>Cloudflare: Forward to 1.1.1.1
+    Cloudflare->>ExternalDNS: Resolve externally
+    ExternalDNS-->>Cloudflare: Return response
+    Cloudflare-->>PiHole: Return response
+    PiHole-->>Client: Return response
+```
+
+### 3. Ingress and TLS Flow
+
+```mermaid
+sequenceDiagram
+    participant User
+    participant Cloudflare
+    participant Traefik
+    participant StepCA
+    participant Service
+
+    User->>Cloudflare: HTTPS Request (webapp.arcodange.fr)
+    Cloudflare->>Traefik: Forward to internal IP
+    Traefik->>Let's Encrypt: Request Certificate
+    Let's Encrypt-->>Traefik: Issue Certificate
+    Traefik->>Service: Route request
+    Service-->>Traefik: Return response
+    Traefik-->>Cloudflare: Return HTTPS response
+    Cloudflare-->>User: Return response
+
+    User->>Traefik: HTTPS Request (webapp.arcodange.lab)
+    Traefik->>StepCA: Request Certificate
+    StepCA-->>Traefik: Issue Certificate
+    Traefik->>Service: Route request
+    Service-->>Traefik: Return response
+    Traefik-->>User: Return HTTPS response
+```
+
+### 4. Security Flow (CrowdSec + Traefik)
+
+```mermaid
+sequenceDiagram
+    participant Attacker
+    participant Traefik
+    participant CrowdSec
+    participant BannedIPs
+
+    Attacker->>Traefik: Malicious Request
+    Traefik->>CrowdSec: Log suspicious activity
+    CrowdSec->>BannedIPs: Add IP to ban list
+    BannedIPs-->>Traefik: Update middleware
+    Traefik-->>Attacker: Block request (403)
+```
+
+## Playbook and Role Analysis
+
+### 1. Pi-hole Deployment
+- **Playbook**: `playbooks/system/pihole.yml`
+- **Role**: `arcodange.factory.pihole`
+- **Configuration**:
+  - Upstream DNS: Cloudflare (1.1.1.1) and Step CA for `.lab`.
+  - Blocklists: Ad-blocking and malware domains.
+
+### 2. Step CA Deployment
+- **Playbook**: `playbooks/ssl/ssl.yml`
+- **Role**: `step_ca`
+- **Configuration**:
+  - Internal CA for `.lab` domain.
+  - Short-lived certificates (default: 24h).
+
+### 3. Traefik Deployment
+- **Playbook**: `playbooks/system/system_k3s.yml` (via k3s)
+- **Helm Chart**: `traefik` (installed via k3s)
+- **Key Annotations**:
+  ```yaml
+  traefik.ingress.kubernetes.io/router.entrypoints: websecure
+  traefik.ingress.kubernetes.io/router.tls.certresolver: letsencrypt
+  traefik.ingress.kubernetes.io/router.middlewares: localIp@file
+  ```
+
+### 4. CrowdSec Deployment
+- **Playbook**: `playbooks/tools/crowdsec.yml`
+- **Role**: `arcodange.factory.crowdsec`
+- **Configuration**:
+  - Bouncer integration with Traefik.
+  - Custom scenarios for brute-force and bot detection.
+
+## Consequences
+
+### Positive
+- **Resilient DNS**: Pi-hole provides ad-blocking and internal DNS resolution.
+- **Secure TLS**: Step CA for internal services, Let's Encrypt for external.
+- **DDoS Protection**: Cloudflare absorbs external attacks.
+- **Intrusion Detection**: CrowdSec bans malicious IPs automatically.
+
+### Negative
+- **Complexity**: Multiple layers require careful configuration.
+- **Single Point of Failure**: Pi-hole is critical for internal DNS.
+- **Certificate Management**: Step CA requires maintenance for `.lab` domain.
+
+## Alternatives Considered
+
+### Alternative 1: Public DNS for `.lab`
+- **Rejected**: Exposing internal domains is a security risk.
+
+### Alternative 2: No Ad-Blocking
+- **Rejected**: Pi-hole provides essential security and privacy.
+
+### Alternative 3: Self-Signed Certificates
+- **Rejected**: Step CA provides better usability with short-lived certs.
+
+### 5. Cloudflare Turnstile + CrowdSec Flow
+
+```mermaid
+sequenceDiagram
+    participant User
+    participant Cloudflare
+    participant Turnstile
+    participant Traefik
+    participant CrowdSec
+    participant BannedIPs
+
+    User->>Cloudflare: Request protected endpoint
+    Cloudflare->>Turnstile: Challenge (CAPTCHA)
+    Turnstile-->>Cloudflare: Return token
+    Cloudflare->>Traefik: Forward request with token
+
+    alt Valid Token
+        Traefik->>Service: Route request
+        Service-->>Traefik: Return response
+        Traefik-->>Cloudflare: Return response
+        Cloudflare-->>User: Return success
+    else Invalid Token
+        Traefik->>CrowdSec: Log suspicious activity
+        CrowdSec->>BannedIPs: Add IP to ban list
+        BannedIPs-->>Traefik: Update middleware
+        Traefik-->>Cloudflare: Block request (403)
+        Cloudflare-->>User: Return "Access Denied"
+    end
+```
+
+## Success Metrics
+- Pi-hole blocks >50% of ads and trackers.
+- Step CA issues certificates without downtime.
+- Traefik routes 100% of external traffic via Cloudflare.
+- CrowdSec bans >10 malicious IPs per day.
+- Cloudflare Turnstile blocks >90% of bot traffic.
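
The forwarding rule from the DNS Architecture section of this ADR (`.lab` names go to the internal resolver, everything else to Cloudflare at 1.1.1.1) reduces to a suffix check, sketched below in Python. It illustrates the decision only and is not Pi-hole configuration. The function name and the `INTERNAL_UPSTREAM` label are hypothetical, and the ADR gives no address for the internal upstream, so a placeholder string stands in for it.

```python
# Placeholder label: the ADR names Step CA (pi1) as the `.lab` upstream
# but does not give its address, so none is invented here.
INTERNAL_UPSTREAM = "internal-upstream (pi1)"
EXTERNAL_UPSTREAM = "1.1.1.1"


def pick_upstream(qname: str) -> str:
    """Choose the upstream for a DNS query name using the `.lab` split."""
    # Normalize: DNS names are case-insensitive and may end with a root dot.
    name = qname.rstrip(".").lower()
    if name == "lab" or name.endswith(".lab"):
        return INTERNAL_UPSTREAM
    return EXTERNAL_UPSTREAM


if __name__ == "__main__":
    for query in ("gitea.arcodange.lab", "webapp.arcodange.fr", "example.com"):
        print(query, "->", pick_upstream(query))
```

The match is on the whole label, so a name such as `collab.example.com` is still forwarded externally, while `GITEA.ARCODANGE.LAB.` is treated as internal.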