fix(dns): harden DNS resilience after power-cut incident
During the 2026-04-13 power cut recovery, DNS resolution failures blocked
the Longhorn reinstall.

Root causes:
- CoreDNS forwarded to a single hardcoded Pi-hole IP instead of both HA instances
- The main CoreDNS Corefile forwarded to /etc/resolv.conf, which pointed to itself on pi3
- Pi-hole lacked explicit upstream DNS, relying on DHCP-provided config
- The dnsmasq system service conflicted with pihole-FTL on port 53

Changes:
- k3s_dns: forward CoreDNS to both Pi-hole HA instances (pi1 + pi3) dynamically
- k3s_dns: update the main CoreDNS Corefile to forward to the Pi-holes instead of resolv.conf
- pihole defaults: add explicit upstream DNS servers (8.8.8.8, 1.1.1.1, 8.8.4.4)
- pihole ha_setup: write /etc/dnsmasq.d/99-upstream.conf with explicit upstreams
- rpi: add the dnsmasq user to the dip group and disable the conflicting dnsmasq service on Pi-hole nodes

See docs/adr/20260414-internal-dns-architecture.md for the full rationale.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@@ -0,0 +1,126 @@
# ADR 20260414: Internal DNS Architecture

## Status

Accepted

## Context

During the 2026-04-13 power cut incident, cluster recovery was blocked by DNS resolution failures. The investigation revealed:

1. **CoreDNS forwarding loop**: CoreDNS was configured to forward queries to `/etc/resolv.conf`, which on the node (pi3) pointed to itself (`192.168.1.203`) - a host without a running DNS service
2. **Pi-hole HA misconfiguration**: Both pi1 and pi3 run Pi-hole (pihole-FTL) but:
   - pi1's `dnsmasq` service was in a **failed state** due to missing `dip` group membership
   - pi3's Pi-hole was running, but CoreDNS couldn't reach it because of the forwarding configuration
3. **No explicit upstream DNS**: Pi-hole instances lacked explicitly configured upstream DNS servers

The cluster's HelmChart controller requires external DNS resolution to fetch charts from `charts.longhorn.io`, making DNS a critical dependency for storage provisioning and thus for the entire cluster recovery process.

## Decision

### 1. DNS Service Hierarchy

```
┌─────────────────┐     ┌─────────────────┐
│  CoreDNS Pod    │────▶│  Pi-hole (pi1)  │──┐
│  (kube-system)  │     │  Pi-hole (pi3)  │  │
└─────────────────┘     └─────────────────┘  │
                                             ▼
                                     ┌──────────────┐
                                     │   8.8.8.8    │
                                     │   1.1.1.1    │
                                     │   8.8.4.4    │
                                     └──────────────┘
```
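The HA property this hierarchy buys can be sketched as plain failover logic. This is an illustrative sketch only, not CoreDNS internals; the resolver callables and the returned address are hypothetical stand-ins:

```python
# Illustrative sketch of the HA semantics: if one Pi-hole upstream is
# unreachable, the query is still answered by the other. The resolver
# callables below are stand-ins for real DNS queries; the IP is made up.
def resolve_ha(name, upstreams):
    for upstream in upstreams:
        try:
            return upstream(name)
        except OSError:
            continue  # this Pi-hole is down - try the next one
    raise OSError(f"all upstreams failed for {name}")

def pi1(name):
    raise OSError("192.168.1.201 unreachable")  # e.g. pi1 lost after the power cut

def pi3(name):
    return "203.0.113.10"  # placeholder answer from the healthy Pi-hole

print(resolve_ha("charts.longhorn.io", [pi1, pi3]))
```

In practice CoreDNS's `forward` plugin health-checks the listed upstreams and load-balances across them rather than trying them strictly in order; the sketch only shows why two upstreams survive a single node failure.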
### 2. CoreDNS Configuration

CoreDNS will forward **all non-cluster DNS queries** to **both Pi-hole instances** in an HA configuration:

```coredns
.:53 {
    errors
    health
    ready
    kubernetes cluster.local in-addr.arpa ip6.arpa {
        pods insecure
        fallthrough in-addr.arpa ip6.arpa
    }
    hosts /etc/coredns/NodeHosts {
        ttl 60
        reload 15s
        fallthrough
    }
    prometheus :9153
    cache 30
    loop
    reload
    import /etc/coredns/custom/*.override
    import /etc/coredns/custom/*.server
    forward . 192.168.1.201:53 192.168.1.203:53
}
```

### 3. Pi-hole HA Configuration

- **Primary**: pi1 (192.168.1.201)
- **Secondary**: pi3 (192.168.1.203)
- **Synchronization**: Gravity Sync for configuration consistency
- **Upstream DNS**: Explicitly configured to Cloudflare (1.1.1.1) and Google (8.8.8.8, 8.8.4.4)

### 4. Pi-hole DNS Service Fix

The `dnsmasq` user must be a member of the `dip` group to bind to privileged port 53:

```bash
usermod -aG dip dnsmasq
```

This is managed via Ansible in `playbooks/system/rpi.yml`.

## Consequences

### Positive

- **Resilience**: DNS resolution continues if one Pi-hole node fails
- **Consistency**: Both Pi-hole instances maintain synchronized configuration via Gravity Sync
- **Recovery**: The cluster can recover from power failures without manual DNS intervention
- **Explicit configuration**: Upstream DNS servers are explicitly defined, avoiding reliance on DHCP-provided config

### Negative

- **Complexity**: Additional Ansible tasks are required to maintain the DNS infrastructure
- **Dependency**: Cluster recovery depends on Pi-hole availability (mitigated by HA)

## Implementation

See the related changes in:
- `playbooks/system/rpi.yml` - dnsmasq group membership fix
- `playbooks/dns/k3s_dns.yml` - CoreDNS forwarding to the HA Pi-hole instances
- `playbooks/dns/roles/pihole/defaults/main.yml` - Explicit upstream DNS configuration

## Post-Implementation Notes

### Issue Encountered: dnsmasq vs pihole-FTL Port Conflict

During execution, we discovered that **dnsmasq** and **pihole-FTL** both attempt to bind to port 53. On pi1:
- pihole-FTL was running and handling DNS on port 53
- the dnsmasq service was failing because port 53 was already in use

**Resolution**: The dnsmasq service on Pi-hole nodes is **not needed** when pihole-FTL is running, because pihole-FTL embeds its own dnsmasq-based DNS server. The system dnsmasq service should remain **disabled** on Pi-hole nodes to avoid conflicts.

### Verification Commands

Check DNS resolution from the cluster:
```bash
kubectl run dns-test --image=busybox:1.28 -it --rm --restart=Never -- \
  nslookup charts.longhorn.io 192.168.1.201

# Check that CoreDNS forwards to both Pi-holes
kubectl get cm -n kube-system coredns -o yaml

# Check the Pi-hole instances
ssh pi1 "dig @127.0.0.1 google.com +short"
ssh pi3 "dig @127.0.0.1 google.com +short"
```

## Related Incidents

- [2026-04-13-power-cut](../incidents/2026-04-13-power-cut/README.md) - Power cut caused a DNS resolution failure, blocking the Longhorn reinstall and Traefik recovery
@@ -5,3 +5,4 @@ pihole_dns_domain: lab
pihole_ports: '8081o,443os,[::]:8081o,[::]:443os' # web interface
pihole_gravity_conf: /etc/gravity-sync/gravity-sync.conf # should not be changed
pihole_custom_dns: {}
pihole_upstream_dns: ["8.8.8.8", "1.1.1.1", "8.8.4.4"] # Explicit upstream DNS servers

@@ -98,3 +98,17 @@
      address=/{{ host }}.home/{{ hostvars[host].preferred_ip }}
      {% endfor %}
  notify: Restart Pi-hole

- name: Configure explicit upstream DNS servers for Pi-hole
  copy:
    dest: /etc/dnsmasq.d/99-upstream.conf
    owner: root
    group: root
    mode: '0644'
    content: |
      # Generated by Ansible – Explicit upstream DNS servers
      # Fixes issue where Pi-hole relies on DHCP-provided DNS which may be unavailable
      {% for dns_server in pihole_upstream_dns %}
      server={{ dns_server }}
      {% endfor %}
  notify: Restart Pi-hole
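For reference, with the `pihole_upstream_dns` list from the role defaults, the Jinja2 loop in this task renders `/etc/dnsmasq.d/99-upstream.conf` to three `server=` lines. A quick sketch of the rendering, with plain Python standing in for the template engine:

```python
# Sketch of what the Jinja2 loop above produces, using the role default
# pihole_upstream_dns value; plain Python stands in for the template engine.
pihole_upstream_dns = ["8.8.8.8", "1.1.1.1", "8.8.4.4"]

rendered = "\n".join(f"server={s}" for s in pihole_upstream_dns)
print(rendered)
# server=8.8.8.8
# server=1.1.1.1
# server=8.8.4.4
```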

@@ -5,7 +5,7 @@
  gather_facts: false

  vars:
    pihole_ip: "192.168.1.201"
    pihole_ips: "{{ groups['pihole'] | map('extract', hostvars) | map(attribute='preferred_ip') | list }}"
    coredns_namespace: "kube-system"

  tasks:

@@ -23,5 +23,38 @@
          arcodange.lab:53 {
            errors
            cache 30
            forward . {{ pihole_ip }}:53
            forward . {{ pihole_ips | map('regex_replace', '^(.*)$', '\1:53') | join(' ') }}
          }

    - name: "Update the main CoreDNS ConfigMap to use the HA Pi-holes"
      kubernetes.core.k8s:
        state: present
        definition:
          apiVersion: v1
          kind: ConfigMap
          metadata:
            name: coredns
            namespace: "{{ coredns_namespace }}"
          data:
            Corefile: |
              .:53 {
                  errors
                  health
                  ready
                  kubernetes cluster.local in-addr.arpa ip6.arpa {
                      pods insecure
                      fallthrough in-addr.arpa ip6.arpa
                  }
                  hosts /etc/coredns/NodeHosts {
                      ttl 60
                      reload 15s
                      fallthrough
                  }
                  prometheus :9153
                  cache 30
                  loop
                  reload
                  import /etc/coredns/custom/*.override
                  import /etc/coredns/custom/*.server
                  forward . {{ pihole_ips | map('regex_replace', '^(.*)$', '\1:53') | join(' ') }}
              }
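The `map('regex_replace', ...)` expression used in both `forward` lines turns the inventory-derived list of Pi-hole IPs into space-separated `ip:53` targets. Its effect can be checked with the equivalent Python, assuming the two addresses pi1 and pi3 resolve to in the inventory:

```python
import re

# Inventory-derived list, as built by pihole_ips in the play vars above
pihole_ips = ["192.168.1.201", "192.168.1.203"]

# Equivalent of: pihole_ips | map('regex_replace', '^(.*)$', '\1:53') | join(' ')
targets = " ".join(re.sub(r"^(.*)$", r"\1:53", ip) for ip in pihole_ips)
print(f"forward . {targets}")
# forward . 192.168.1.201:53 192.168.1.203:53
```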

@@ -11,3 +11,17 @@
    name: "{{ inventory_hostname }}"
  become: yes
  when: inventory_hostname != ansible_hostname

- name: Ensure dnsmasq user is in dip group for Pi-hole DNS
  ansible.builtin.user:
    name: dnsmasq
    groups: dip
    append: yes
  when: "'pihole' in group_names"

- name: Disable dnsmasq service on Pi-hole nodes to avoid port 53 conflict with pihole-FTL
  ansible.builtin.systemd:
    name: dnsmasq
    state: stopped
    enabled: no
  when: "'pihole' in group_names"