30. BDD email assertions with parallel test execution
Date: 2026-05-05 Status: Proposed Authors: Gabriel Radureau, AI Agent
Context and Problem Statement
ADR-0028 introduces magic-link auth, which requires the application to send emails. ADR-0029 chose Mailpit as the local SMTP receiver for dev and BDD tests. The remaining decision : how do BDD scenarios assert on the email content while running in parallel ?
Today (since PR #35), the BDD suite runs in parallel via per-package PostgreSQL schema isolation (cf. ADR-0025). Each Go test package has its own schema ; tests inside a package run serially within that schema. This works because Postgres has named schemas with strong isolation. Mailpit has no equivalent — there is one inbox per Mailpit instance, shared across all senders.
A naive integration would have parallel scenarios fight over each other's emails :
- Scenario A : "request magic link for test@example.com" → email arrives
- Scenario B (in parallel) : "request magic link for test@example.com" → email arrives
- Both scenarios query Mailpit for test@example.com — they see each other's messages, assertions become flaky.
We need a way to scope each scenario's emails so it only sees its own messages.
Decision Drivers
- No regression on parallelism — BDD-isolation Phase 3 (PR #35) achieved a 2.85x speedup ; the email-assertion solution must not undo that
- No new container per test — running one Mailpit per scenario would defeat the simplicity that made us choose Mailpit
- Determinism — a scenario's email assertions must succeed regardless of how many other scenarios are running
- Realistic SMTP path — we still want the full SMTP wire format exercised (cf. ADR-0029) ; we don't want to bypass Mailpit
- Cleanup hygiene — old messages from previous test runs must not leak into a new run
Considered Options
Option 1 (Chosen): Per-test recipient scoping with unique per-scenario addresses
Each BDD scenario generates a unique email address for its test user, derived from the scenario key + a random suffix. Examples :
- Scenario magic-link-happy-path → magic-link-happy-path-<8hex>@bdd.local
- Scenario magic-link-expired-token → magic-link-expired-token-<8hex>@bdd.local
The application code accepts any email format. The BDD scenario asserts on Mailpit's HTTP API filtering by the to address. Two parallel scenarios with different addresses can NEVER see each other's emails.
Cleanup : at the start of each scenario, the BDD framework calls DELETE /api/v1/search?query=to:<scenario-address> on Mailpit to purge any leftover messages from prior runs.
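A minimal sketch of the address generation described above, assuming the suffix comes from 4 bytes of crypto/rand output (which encode to 8 hex characters). The function name ScenarioAddress and the constant bddDomain are hypothetical and not part of this ADR.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// bddDomain is the test-only domain used in the examples of this ADR.
// The constant name is illustrative.
const bddDomain = "bdd.local"

// ScenarioAddress returns "<scenario-key>-<8hex>@bdd.local".
// Hypothetical helper: the name and signature are assumptions.
func ScenarioAddress(scenarioKey string) (string, error) {
	suffix := make([]byte, 4) // 4 random bytes encode to 8 hex characters
	if _, err := rand.Read(suffix); err != nil {
		return "", fmt.Errorf("generate address suffix: %w", err)
	}
	return fmt.Sprintf("%s-%s@%s", scenarioKey, hex.EncodeToString(suffix), bddDomain), nil
}
```

Two calls with the same scenario key yield different addresses (barring a 1-in-2^32 suffix collision), which is what keeps parallel runs of the same scenario apart.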
Option 2: One Mailpit instance per Go test package
Spawn a fresh Mailpit container in TestMain of each features/<area>/ package. Each gets its own port range.
- Good — strong isolation
- Bad — heavyweight (one container per package = 5+ containers running)
- Bad — port allocation complexity (similar to existing pkg/bdd/parallel/port_manager.go, but applied to Mailpit)
- Bad — slow startup (Mailpit boot is ~200ms but adds up)
Option 3: One Mailpit instance, scenario-scoped via custom SMTP header
Add a custom header X-BDD-Scenario-ID: <key> to outgoing emails. Tests query Mailpit filtered on that header.
- Good — same single Mailpit
- Bad — requires the application code to know the scenario ID at email-send time, which means a test-only path in production code
- Bad — header propagation is fragile : some SMTP relays strip custom headers (Mailpit does not, but real production providers might) ; we don't want a different code path between dev and prod
Option 4: Sequence parallel scenarios via per-scenario Mailpit lock
Use a mutex / queue so no two scenarios that send email run concurrently.
- Good — minimal code change
- Bad — gives up the parallel speedup for any feature that involves email — that's most auth-related features going forward
Decision Outcome
Chosen option : Option 1 — per-test recipient scoping.
Rationale :
- Recipient scoping is the simplest abstraction : the address IS the identity ; Mailpit's HTTP API natively supports filtering by recipient
- Application code stays clean : it just sends to whatever address it's given. No test-mode branching.
- Parallel-safe by construction : two scenarios cannot collide if they don't share an address
- Cheap to implement : a few helper functions in pkg/bdd/steps/email_steps.go and a mailpit.Client package wrapping the HTTP API
- Cleanup is per-scenario, not global — no "delete all messages" race between scenarios
Implementation Plan
Helper package : pkg/bdd/mailpit/client.go
type Client struct {
	BaseURL string // default: http://localhost:8025
	HTTP    *http.Client
}

// AwaitMessageTo polls Mailpit's HTTP API for a message addressed
// to the given recipient, with a deadline. Returns the most recent
// matching message or an error on timeout.
func (c *Client) AwaitMessageTo(ctx context.Context, to string, timeout time.Duration) (*Message, error)

// PurgeMessagesTo removes all messages addressed to the given
// recipient. Idempotent and parallel-safe.
func (c *Client) PurgeMessagesTo(ctx context.Context, to string) error

type Message struct {
	ID      string
	From    string
	To      []string
	Subject string
	Text    string
	HTML    string
	Headers map[string][]string
}
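One way the polling behind AwaitMessageTo could be written is sketched here. It rests on two assumptions: that GET /api/v1/search?query=to:<addr> answers with a JSON object whose messages array carries an ID per entry, and that the newest message is listed first. To stay short it returns only the message ID instead of a full Message. The names AwaitMessageID, searchURL, searchOnce and firstMessageID, and the 100 ms polling interval, are hypothetical.

```go
package main

import (
	"context"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"net/url"
	"time"
)

// Client mirrors the struct declared in this ADR.
type Client struct {
	BaseURL string
	HTTP    *http.Client
}

// httpClient falls back to http.DefaultClient when none is configured.
func (c *Client) httpClient() *http.Client {
	if c.HTTP != nil {
		return c.HTTP
	}
	return http.DefaultClient
}

// searchURL builds the recipient-filtered search URL.
func (c *Client) searchURL(to string) string {
	return c.BaseURL + "/api/v1/search?query=" + url.QueryEscape("to:"+to)
}

// firstMessageID returns the ID of the first listed message, or "" if
// the response lists none. ASSUMPTION: the response shape described above.
func firstMessageID(body []byte) (string, error) {
	var resp struct {
		Messages []struct {
			ID string `json:"ID"`
		} `json:"messages"`
	}
	if err := json.Unmarshal(body, &resp); err != nil {
		return "", fmt.Errorf("decode search response: %w", err)
	}
	if len(resp.Messages) == 0 {
		return "", nil
	}
	return resp.Messages[0].ID, nil
}

// searchOnce runs a single search for messages addressed to `to`.
func (c *Client) searchOnce(ctx context.Context, to string) (string, error) {
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, c.searchURL(to), nil)
	if err != nil {
		return "", err
	}
	res, err := c.httpClient().Do(req)
	if err != nil {
		return "", err
	}
	defer res.Body.Close()
	if res.StatusCode != http.StatusOK {
		return "", fmt.Errorf("mailpit search: unexpected status %s", res.Status)
	}
	body, err := io.ReadAll(res.Body)
	if err != nil {
		return "", err
	}
	return firstMessageID(body)
}

// AwaitMessageID polls until a message for `to` exists or timeout elapses.
func (c *Client) AwaitMessageID(ctx context.Context, to string, timeout time.Duration) (string, error) {
	ctx, cancel := context.WithTimeout(ctx, timeout)
	defer cancel()
	ticker := time.NewTicker(100 * time.Millisecond)
	defer ticker.Stop()
	var lastErr error
	for {
		id, err := c.searchOnce(ctx, to)
		if err == nil && id != "" {
			return id, nil
		}
		if err != nil {
			lastErr = err
		}
		select {
		case <-ctx.Done():
			return "", fmt.Errorf("no message for %s within %s (last error: %v)", to, timeout, lastErr)
		case <-ticker.C: // wait for the next polling tick
		}
	}
}
```

Polling rather than a single fetch matters because the application may send the email asynchronously, so the message can arrive after the step starts. A full implementation would presumably follow up with a per-message fetch to fill in Text, HTML and Headers.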
Helper steps : pkg/bdd/steps/email_steps.go
func (s *EmailSteps) iHaveAnEmailAddressForThisScenario() error
// Generates `<scenario-key>-<8hex>@bdd.local`, stores it in the scenario state.
func (s *EmailSteps) iShouldReceiveAnEmailWithSubject(subject string) error
// Polls AwaitMessageTo on the scenario's address, asserts subject equality.
func (s *EmailSteps) theEmailShouldContain(snippet string) error
// Re-fetches the most recent message and checks for substring in body.
func (s *EmailSteps) theEmailContainsAMagicLinkToken() (string, error)
// Extracts the token from the magic-link URL via regex, returns it.
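The regex-based token extraction mentioned for theEmailContainsAMagicLinkToken might look like the sketch below. This ADR does not specify the magic-link URL format, so the pattern assumes a link of the form https://<host>/<path>?token=<value> with a URL-safe token. The names ExtractMagicLinkToken, ErrNoToken and magicLinkToken, and the example host, are hypothetical.

```go
package main

import (
	"errors"
	"regexp"
)

// magicLinkToken captures the value of a "token" query parameter in a URL.
// ASSUMPTION: the link format is not defined in this ADR; adjust the
// pattern to the real magic-link URL.
var magicLinkToken = regexp.MustCompile(`https?://\S+[?&]token=([A-Za-z0-9._~-]+)`)

// ErrNoToken is returned when the body contains no matching link.
var ErrNoToken = errors.New("no magic-link token found in email body")

// ExtractMagicLinkToken returns the token of the first magic link in body.
func ExtractMagicLinkToken(body string) (string, error) {
	m := magicLinkToken.FindStringSubmatch(body)
	if m == nil {
		return "", ErrNoToken
	}
	return m[1], nil // m[0] is the whole link, m[1] the captured token
}
```

Keeping the extraction in a pure function lets the step definition stay thin: it fetches the text body for the scenario's address, calls the function, and stores the token in the scenario state for the following step.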
Scenario lifecycle
- Before each scenario : iHaveAnEmailAddressForThisScenario is called (either explicitly via Background, or implicitly via a hook). The unique address is stored in the scenario's state. PurgeMessagesTo is called to clear any leftovers from prior runs of the same address (defensive — should be impossible since the suffix is random, but cheap).
- During the scenario : the application sends to that address. Tests query for it.
- After each scenario : no global cleanup needed — addresses are per-scenario unique, so they don't accumulate beyond Mailpit's MP_MAX_MESSAGES=5000 cap.
Race-free deletion
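Mailpit's DELETE /api/v1/search?query=to:<addr> removes only the messages that match the recipient query, so a purge issued by one scenario leaves every other address untouched. Two concurrent scenarios with different addresses therefore cannot interfere.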
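The recipient-scoped purge can be sketched as a single DELETE request. This is a simplified, free-function variant of PurgeMessagesTo rather than the method declared above, and it assumes Mailpit answers a successful delete with a 2xx status. The helper purgeURL and the explicit hc and baseURL parameters are hypothetical.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
	"net/url"
)

// purgeURL builds the delete-by-search URL for one recipient.
func purgeURL(baseURL, to string) string {
	return baseURL + "/api/v1/search?query=" + url.QueryEscape("to:"+to)
}

// PurgeMessagesTo deletes every message addressed to `to`.
// ASSUMPTION: any 2xx status means the delete was accepted.
func PurgeMessagesTo(ctx context.Context, hc *http.Client, baseURL, to string) error {
	if hc == nil {
		hc = http.DefaultClient
	}
	req, err := http.NewRequestWithContext(ctx, http.MethodDelete, purgeURL(baseURL, to), nil)
	if err != nil {
		return err
	}
	res, err := hc.Do(req)
	if err != nil {
		return err
	}
	defer res.Body.Close()
	if res.StatusCode < 200 || res.StatusCode > 299 {
		return fmt.Errorf("mailpit purge for %s: unexpected status %s", to, res.Status)
	}
	return nil
}
```

Because the query names exactly one address, repeating the call is harmless: a second purge simply matches nothing.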
Sample scenario (auth-magic-link.feature)
@critical @magic-link
Scenario: User receives a magic link by email
  Given I have an email address for this scenario
  When I request a magic link for my email address
  Then I should receive an email with subject "Your magic link"
  And the email contains a magic link token
  When I consume the magic link token
  Then I should receive a JWT
Pros and Cons of the Options
Option 1 (Chosen)
- Good — parallel-safe by construction
- Good — application code unchanged ; test-only logic stays in the BDD layer
- Good — Mailpit API supports the filter natively
- Good — cleanup is fine-grained, no race
- Bad — requires cooperative scenarios (each must request a unique address)
- Mitigation : Background steps in feature files make it automatic
Option 2 (Mailpit per package)
- Bad — operational complexity not justified for the test-only concern
Option 3 (Custom header scoping)
- Bad — production code dirtied by test concerns
Option 4 (Lock-and-sequence)
- Bad — gives up parallelism (the whole point of PR #35 + ADR-0025)
Consequences
- pkg/bdd/mailpit/ package is created with HTTP client + helper types
- pkg/bdd/steps/email_steps.go file is created and registered in steps.go
- features/auth/ and any other email-using features have new BDD steps available
- The local development docker-compose must run Mailpit before BDD tests run ; this is to be added to the BDD test runner script scripts/run-bdd-tests.sh
- Mailpit message retention is governed by MP_MAX_MESSAGES (5000), a cap on message count rather than a TTL ; at parallel BDD volumes, that's enough headroom for ~50 scenarios × 100 messages each before any pruning kicks in
Out of scope
- Visual regression on email rendering — text body assertions only ; HTML rendering checks belong in a separate Storybook-style harness
- Attachment handling — magic-link emails are text-only ; ADRs for attachments will come if/when needed
- Email volume / rate-limit testing — that's a load-test concern, not a BDD concern
Links
- Auth migration depending on this : ADR-0028
- Email infrastructure choice : ADR-0029
- BDD parallelism foundation : ADR-0025, PR #35
- Mailpit API : https://mailpit.axllent.org/docs/api-v1/