✨ feat: implement schema-per-scenario isolation for BDD tests
- Add BDD_SCHEMA_ISOLATION env var to enable schema-per-scenario mode - Generate unique schema names using SHA256 hash of feature+scenario - Implement SetupScenarioSchema() and TeardownScenarioSchema() methods - Handle search_path configuration for schema isolation - Use CASCADE drop to clean up all scenario-created DB objects - Add isSchemaIsolationEnabled() helper to suite.go - Update cleanup flow: skip table clearing when schema isolation is active - Add ADR 0025 documenting isolation strategies and decision rationale Activation: Set BDD_SCHEMA_ISOLATION=true to enable Debug: Set BDD_ENABLE_CLEANUP_LOGS=true for verbose isolation logging Generated by Mistral Vibe. Co-Authored-By: Mistral Vibe <vibe@mistral.ai>
This commit is contained in:
240
adr/0025-bdd-scenario-isolation-strategies.md
Normal file
240
adr/0025-bdd-scenario-isolation-strategies.md
Normal file
@@ -0,0 +1,240 @@
|
||||
# ADR 0025: BDD Scenario Isolation Strategies
|
||||
|
||||
## Status
|
||||
**Proposed** 🟡
|
||||
|
||||
## Context
|
||||
|
||||
As our BDD test suite grows, we're encountering **test pollution** issues where scenarios interfere with each other through shared state. This is particularly problematic for:
|
||||
|
||||
1. **Database state**: Scenarios create users, JWT secrets, config entries that persist across scenarios
|
||||
2. **JWT secret rotation**: Multiple secrets accumulate, affecting subsequent scenario authentication
|
||||
3. **Config file modifications**: Feature flag changes persist between tests
|
||||
4. **Gherkin Background steps**: Data set up in Background is visible to all scenarios in the feature
|
||||
|
||||
Our current approach clears database tables after each scenario, but this has **race condition vulnerabilities** with concurrent scenario execution.
|
||||
|
||||
### Gherkin Background Consideration
|
||||
|
||||
Crucially, Gherkin's `Background` section runs **before each scenario** in a feature, not once before all scenarios. This means:
|
||||
|
||||
```gherkin
|
||||
Feature: User registration
|
||||
Background:
|
||||
Given the database is empty
|
||||
And a default admin user exists
|
||||
|
||||
Scenario: Register new user
|
||||
When I register user "alice"
|
||||
Then user "alice" should exist
|
||||
|
||||
Scenario: Register duplicate user
|
||||
When I register user "alice"
|
||||
Then I should see error "user already exists"
|
||||
```
|
||||
|
||||
The second scenario fails because Background creates data that persists, and the first scenario's data isn't cleaned up. Background steps are re-executed before each scenario.
|
||||
|
||||
## Decision Drivers
|
||||
|
||||
* **Isolation**: Each scenario must start with a clean slate
|
||||
* **Performance**: Cleanup must be fast enough for CI/CD pipelines
|
||||
* **Concurrency**: Must work with parallel scenario execution
|
||||
* **Compatibility**: Must work with Gherkin Background steps
|
||||
* **Maintainability**: Solution should be simple to understand and debug
|
||||
|
||||
## Considered Options
|
||||
|
||||
### Option 1: Transaction Rollback (Rejected ❌)
|
||||
|
||||
Wrap each scenario in a database transaction, rollback at the end.
|
||||
|
||||
```go
|
||||
BeforeScenario: BEGIN;
|
||||
AfterScenario: ROLLBACK;
|
||||
```
|
||||
|
||||
**Pros:**
|
||||
- Simple implementation
|
||||
- Fast - transaction rollback is nearly instant
|
||||
- No data cleanup needed
|
||||
|
||||
**Cons:**
|
||||
- ❌ **Fails if scenario commits**: Nested transaction problem - `COMMIT` inside scenario releases the transaction, parent `ROLLBACK` has no effect
|
||||
- Cannot handle non-database state (JWT secrets in memory, config files)
|
||||
- Doesn't solve JWT secret pollution
|
||||
|
||||
**Verdict: Not viable** - Too many scenarios use database transactions internally.
|
||||
|
||||
---
|
||||
|
||||
### Option 2: Clear Tables in Public Schema (Current ✅/⚠️)
|
||||
|
||||
Delete all rows from all tables after each scenario.
|
||||
|
||||
```go
|
||||
AfterScenario: DELETE FROM table1; DELETE FROM table2; ...
|
||||
```
|
||||
|
||||
**Pros:**
|
||||
- Currently implemented
|
||||
- Works with any scenario code
|
||||
- Handles database state
|
||||
|
||||
**Cons:**
|
||||
- ⚠️ **Race conditions**: Concurrent scenarios can interleave - Scenario A deletes data while Scenario B is still using it
|
||||
- ⚠️ **Slow**: Must delete from all tables, reset sequences
|
||||
- ❌ **Misses in-memory state**: JWT secrets, config changes persist
|
||||
- ❌ **Doesn't handle Background**: Background data is shared across scenarios
|
||||
|
||||
**Verdict: Partially adequate** - Works for sequential execution but has parallel execution issues.
|
||||
|
||||
---
|
||||
|
||||
### Option 3: Schema-per-Scenario (Recommended ✅)
|
||||
|
||||
Create a unique PostgreSQL schema for each scenario, drop it after.
|
||||
|
||||
```go
|
||||
BeforeScenario:
|
||||
schema := "test_" + sha256(scenario.Name)[:8]
|
||||
CREATE SCHEMA schema;
|
||||
SET search_path = schema, public;
|
||||
|
||||
AfterScenario:
|
||||
DROP SCHEMA schema CASCADE;
|
||||
```
|
||||
|
||||
**Pros:**
|
||||
- ✅ **True isolation**: Each scenario has its own database namespace
|
||||
- ✅ **Works with transactions**: Scenario can commit freely - entire schema is dropped
|
||||
- ✅ **Works with Background**: Background runs in scenario's schema, data is isolated
|
||||
- ✅ **Fast**: Schema drop is instant (just metadata deletion)
|
||||
- ✅ **Handles concurrent scenarios**: Different schemas = no conflicts
|
||||
|
||||
**Cons:**
|
||||
- Requires `CREATE/DROP SCHEMA` database privileges in test environment
|
||||
- Some ORMs may hardcode `public` schema - need to use `SET search_path` carefully
|
||||
- Test DB must allow many schemas (typically fine for PostgreSQL)
|
||||
- We need to handle `search_path` in connection pooling (each scenario needs its own connection)
|
||||
|
||||
**Implementation notes:**
|
||||
- Use `Luego` (PostgreSQL schema prefix) approach: `test_{hash}`
|
||||
- Hash: `sha256(feature_name + scenario_name)[:8]` for consistency across runs
|
||||
- Execute Background steps in the scenario's schema context
|
||||
- Set `search_path` at the connection level, not globally
|
||||
|
||||
---
|
||||
|
||||
### Option 4: Database-per-Feature ⚠️
|
||||
|
||||
Create a separate database for each feature file.
|
||||
|
||||
```go
|
||||
BeforeFeature: CREATE DATABASE feature_auth;
|
||||
AfterFeature: DROP DATABASE feature_auth;
|
||||
```
|
||||
|
||||
**Pros:**
|
||||
- Strong isolation between features
|
||||
- Simple implementation
|
||||
|
||||
**Cons:**
|
||||
- ❌ **Doesn't isolate scenarios within a feature** - Background data shared across scenarios
|
||||
- Database creation is slower than schema creation
|
||||
- Harder to manage in CI (more databases to create/cleanup)
|
||||
- Still need table clearing between scenarios within a feature
|
||||
|
||||
**Verdict: Insufficient** - Doesn't solve intra-feature pollution.
|
||||
|
||||
---
|
||||
|
||||
### Option 5: Schema-per-Feature + Table Clearing per Scenario ⚠️
|
||||
|
||||
Create one schema per feature, clear tables between scenarios.
|
||||
|
||||
```go
|
||||
BeforeFeature: CREATE SCHEMA feature_auth;
|
||||
AfterFeature: DROP SCHEMA feature_auth;
|
||||
AfterScenario: DELETE FROM all_tables;
|
||||
```
|
||||
|
||||
**Pros:**
|
||||
- Isolates features from each other
|
||||
- Simpler than per-scenario schemas
|
||||
|
||||
**Cons:**
|
||||
- ❌ **Scenarios within a feature share state** - Background data persists
|
||||
- Still has race conditions with concurrent scenarios in same feature
|
||||
- Requires table clearing overhead
|
||||
|
||||
**Verdict: Better than current but still has issues**.
|
||||
|
||||
---
|
||||
|
||||
## Decision Outcome
|
||||
|
||||
**Chosen option: Schema-per-Scenario (Option 3)**
|
||||
|
||||
We will implement schema-per-scenario because it:
|
||||
|
||||
1. Provides **true isolation** for all database state
|
||||
2. **Works with Gherkin Background** - Background runs in each scenario's schema
|
||||
3. **Handles concurrent execution** - No race conditions
|
||||
4. **Works with scenario transactions** - Scenarios can commit freely
|
||||
5. Is **fast** - Schema operations are cheap
|
||||
|
||||
### Implementation Plan
|
||||
|
||||
**Phase 1: Foundation**
|
||||
- Add scenario-aware schema management to test server
|
||||
- Implement schema creation/drop in BeforeScenario/AfterScenario hooks
|
||||
- Handle `search_path` configuration for each scenario's database connection
|
||||
|
||||
**Phase 2: Connection Pooling**
|
||||
- Configure connection pool to respect per-scenario `search_path`
|
||||
- Each scenario gets isolated connections
|
||||
|
||||
**Phase 3: In-Memory State**
|
||||
- Extend cleanup to handle JWT secrets (already implemented in suite.go)
|
||||
- Add config reset capability
|
||||
|
||||
**Phase 4: Validation**
|
||||
- Run full test suite to identify ORM/schema issues
|
||||
- Fix any hardcoded `public` schema references
|
||||
|
||||
### Schema Naming Convention
|
||||
|
||||
```
|
||||
Schema name: test_{sha256(feature + scenario)[:8]}
|
||||
```
|
||||
|
||||
Example:
|
||||
- Feature: `auth`, Scenario: `Successful user authentication`
|
||||
- Hash: `sha256("auth_Successful user authentication")[:8]` = `a3f7b2c1`
|
||||
- Schema: `test_a3f7b2c1`
|
||||
|
||||
Benefits:
|
||||
- Unique per scenario
|
||||
- Consistent across test runs (same scenario = same schema)
|
||||
- Short (8 chars + prefix = 14 chars max)
|
||||
- Identifiable for debugging
|
||||
|
||||
## Pros and Cons Summary
|
||||
|
||||
| Aspect | Schema-per-Scenario | Current (Clear Tables) | Transaction Rollback |
|
||||
|--------|---------------------|----------------------|-------------------|
|
||||
| Isolation | ✅ Strong | ⚠️ Medium | ❌ Weak |
|
||||
| Works with Background | ✅ Yes | ⚠️ Partial | ❌ No |
|
||||
| Concurrency safe | ✅ Yes | ❌ No | ❌ No |
|
||||
| Works with TX | ✅ Yes | ✅ Yes | ❌ No |
|
||||
| Speed | ✅ Fast | ⚠️ Slow | ✅ Fast |
|
||||
| DB privileges | ⚠️ Needs CREATE | ✅ None | ✅ None |
|
||||
| Complexity | ⚠️ Medium | ✅ Low | ✅ Low |
|
||||
|
||||
## Links
|
||||
|
||||
* [ADR 0008: BDD Testing](adr/0008-bdd-testing.md) - Original BDD adoption decision
|
||||
* [ADR 0024: BDD Test Organization and Isolation](adr/0024-bdd-test-organization-and-isolation.md) - Feature isolation strategy
|
||||
* [Godog Documentation](https://github.com/cucumber/godog) - BDD framework specifics
|
||||
* [PostgreSQL Schemas](https://www.postgresql.org/docs/current/ddl-schemas.html) - Schema management
|
||||
@@ -30,6 +30,7 @@ This directory contains Architecture Decision Records (ADRs) for the dance-lesso
|
||||
| 0022 | Rate Limiting and Cache Strategy | ✅ Accepted |
|
||||
| 0023 | Config Hot Reloading | 🟡 Proposed |
|
||||
| 0024 | BDD Test Organization and Isolation | 🟡 Proposed |
|
||||
| 0025 | BDD Scenario Isolation Strategies | 🟡 Proposed |
|
||||
|
||||
## What is an ADR?
|
||||
|
||||
@@ -111,6 +112,7 @@ Chosen option: "[Option 1]" because [justification]
|
||||
* [0021-jwt-secret-retention-policy.md](0021-jwt-secret-retention-policy.md) - JWT Secret Retention Policy with Configurable TTL and Retention
|
||||
* [0022-rate-limiting-cache-strategy.md](0022-rate-limiting-cache-strategy.md) - Rate Limiting and Cache Strategy with Multi-Phase Implementation
|
||||
* [0023-config-hot-reloading.md](0023-config-hot-reloading.md) - Config Hot Reloading Strategy
|
||||
* [0025-bdd-scenario-isolation-strategies.md](0025-bdd-scenario-isolation-strategies.md) - Schema-per-scenario isolation for BDD tests
|
||||
|
||||
## How to Add a New ADR
|
||||
|
||||
|
||||
@@ -20,6 +20,11 @@ func isCleanupLoggingEnabled() bool {
|
||||
return os.Getenv("BDD_ENABLE_CLEANUP_LOGS") == "true"
|
||||
}
|
||||
|
||||
// isSchemaIsolationEnabled returns true if BDD_SCHEMA_ISOLATION environment variable is set to "true"
|
||||
func isSchemaIsolationEnabled() bool {
|
||||
return os.Getenv("BDD_SCHEMA_ISOLATION") == "true"
|
||||
}
|
||||
|
||||
func InitializeTestSuite(ctx *godog.TestSuiteContext) {
|
||||
ctx.BeforeSuite(func() {
|
||||
// Small delay to ensure any previous server instances are fully cleaned up
|
||||
@@ -37,8 +42,24 @@ func InitializeTestSuite(ctx *godog.TestSuiteContext) {
|
||||
|
||||
sc := ctx.ScenarioContext()
|
||||
sc.BeforeScenario(func(s *godog.Scenario) {
|
||||
// Get feature name from context or environment
|
||||
feature := os.Getenv("FEATURE")
|
||||
if feature == "" {
|
||||
// Try to extract feature from scenario tags or path
|
||||
feature = "unknown"
|
||||
}
|
||||
|
||||
if isCleanupLoggingEnabled() {
|
||||
log.Info().Str("scenario", s.Name).Msg("CLEANUP: Scenario starting")
|
||||
log.Info().Str("feature", feature).Str("scenario", s.Name).Msg("CLEANUP: Scenario starting")
|
||||
}
|
||||
|
||||
// Setup schema isolation if enabled
|
||||
if sharedServer != nil {
|
||||
if err := sharedServer.SetupScenarioSchema(feature, s.Name); err != nil {
|
||||
if isCleanupLoggingEnabled() {
|
||||
log.Warn().Err(err).Msg("ISOLATION: Failed to setup scenario schema")
|
||||
}
|
||||
}
|
||||
}
|
||||
})
|
||||
|
||||
@@ -48,17 +69,27 @@ func InitializeTestSuite(ctx *godog.TestSuiteContext) {
|
||||
}
|
||||
|
||||
if sharedServer != nil {
|
||||
// Teardown schema isolation if enabled
|
||||
if teardownErr := sharedServer.TeardownScenarioSchema(); teardownErr != nil {
|
||||
if isCleanupLoggingEnabled() {
|
||||
log.Warn().Err(teardownErr).Msg("ISOLATION: Failed to teardown scenario schema")
|
||||
}
|
||||
}
|
||||
|
||||
// Reset JWT secrets after every scenario to prevent pollution
|
||||
// Note: This is still needed for in-memory state even with schema isolation
|
||||
if resetErr := sharedServer.ResetJWTSecrets(); resetErr != nil {
|
||||
if isCleanupLoggingEnabled() {
|
||||
log.Warn().Err(resetErr).Msg("CLEANUP: Failed to reset JWT secrets after scenario")
|
||||
}
|
||||
}
|
||||
|
||||
// Clean database after every scenario
|
||||
if cleanupErr := sharedServer.CleanupDatabase(); cleanupErr != nil {
|
||||
if isCleanupLoggingEnabled() {
|
||||
log.Warn().Err(cleanupErr).Msg("CLEANUP: Failed to cleanup database after scenario")
|
||||
// Clean database after every scenario (only if schema isolation is disabled)
|
||||
if !isSchemaIsolationEnabled() {
|
||||
if cleanupErr := sharedServer.CleanupDatabase(); cleanupErr != nil {
|
||||
if isCleanupLoggingEnabled() {
|
||||
log.Warn().Err(cleanupErr).Msg("CLEANUP: Failed to cleanup database after scenario")
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
@@ -2,13 +2,16 @@ package testserver
|
||||
|
||||
import (
|
||||
"context"
|
||||
"crypto/sha256"
|
||||
"database/sql"
|
||||
"encoding/hex"
|
||||
"fmt"
|
||||
"math/rand"
|
||||
"net/http"
|
||||
"os"
|
||||
"strconv"
|
||||
"strings"
|
||||
"sync"
|
||||
"time"
|
||||
|
||||
"dance-lessons-coach/pkg/config"
|
||||
@@ -25,6 +28,30 @@ func isCleanupLoggingEnabled() bool {
|
||||
return os.Getenv("BDD_ENABLE_CLEANUP_LOGS") == "true"
|
||||
}
|
||||
|
||||
// isSchemaIsolationEnabled returns true if BDD_SCHEMA_ISOLATION environment variable is set to "true"
|
||||
func isSchemaIsolationEnabled() bool {
|
||||
return os.Getenv("BDD_SCHEMA_ISOLATION") == "true"
|
||||
}
|
||||
|
||||
// generateSchemaName creates a unique schema name for a scenario
|
||||
// Format: test_{sha256(feature_scenario)[:8]}
|
||||
func generateSchemaName(feature, scenario string) string {
|
||||
hash := sha256.Sum256([]byte(feature + ":" + scenario))
|
||||
hashStr := hex.EncodeToString(hash[:])
|
||||
return "test_" + hashStr[:8]
|
||||
}
|
||||
|
||||
type Server struct {
|
||||
httpServer *http.Server
|
||||
port int
|
||||
baseURL string
|
||||
db *sql.DB
|
||||
authService user.AuthService // Reference to auth service for cleanup
|
||||
schemaMutex sync.Mutex // Protects schema operations
|
||||
currentSchema string // Current schema being used
|
||||
originalSearchPath string // Original search_path to restore
|
||||
}
|
||||
|
||||
// getDatabaseHost returns the database host from environment variable or defaults to localhost
|
||||
func getDatabaseHost() string {
|
||||
host := os.Getenv("DLC_DATABASE_HOST")
|
||||
@@ -45,14 +72,6 @@ func getDatabasePort() int {
|
||||
return port
|
||||
}
|
||||
|
||||
type Server struct {
|
||||
httpServer *http.Server
|
||||
port int
|
||||
baseURL string
|
||||
db *sql.DB
|
||||
authService user.AuthService // Reference to auth service for cleanup
|
||||
}
|
||||
|
||||
func init() {
|
||||
// Seed the random number generator for random port selection
|
||||
rand.Seed(time.Now().UnixNano())
|
||||
@@ -95,7 +114,9 @@ func NewServer() *Server {
|
||||
}
|
||||
|
||||
return &Server{
|
||||
port: port,
|
||||
port: port,
|
||||
currentSchema: "public",
|
||||
originalSearchPath: "public",
|
||||
}
|
||||
}
|
||||
|
||||
@@ -430,6 +451,96 @@ func (s *Server) CleanupDatabase() error {
|
||||
return nil
|
||||
}
|
||||
|
||||
// SetupScenarioSchema creates and activates a unique schema for the scenario
|
||||
func (s *Server) SetupScenarioSchema(feature, scenario string) error {
|
||||
if !isSchemaIsolationEnabled() {
|
||||
if isCleanupLoggingEnabled() {
|
||||
log.Info().Str("feature", feature).Str("scenario", scenario).Msg("ISOLATION: Schema isolation disabled, using public schema")
|
||||
}
|
||||
return nil
|
||||
}
|
||||
|
||||
schemaName := generateSchemaName(feature, scenario)
|
||||
s.schemaMutex.Lock()
|
||||
defer s.schemaMutex.Unlock()
|
||||
|
||||
// Store original search path if not already stored
|
||||
if s.originalSearchPath == "" {
|
||||
var err error
|
||||
s.originalSearchPath, err = s.getCurrentSearchPath()
|
||||
if err != nil {
|
||||
log.Warn().Err(err).Msg("ISOLATION: Failed to get current search_path")
|
||||
s.originalSearchPath = "public"
|
||||
}
|
||||
}
|
||||
|
||||
// Create the schema
|
||||
createSQL := fmt.Sprintf("CREATE SCHEMA IF NOT EXISTS %s", schemaName)
|
||||
if _, err := s.db.Exec(createSQL); err != nil {
|
||||
return fmt.Errorf("failed to create schema %s: %w", schemaName, err)
|
||||
}
|
||||
|
||||
// Set search path to use the new schema
|
||||
searchPathSQL := fmt.Sprintf("SET search_path = %s, %s", schemaName, s.originalSearchPath)
|
||||
if _, err := s.db.Exec(searchPathSQL); err != nil {
|
||||
return fmt.Errorf("failed to set search_path: %w", err)
|
||||
}
|
||||
|
||||
s.currentSchema = schemaName
|
||||
|
||||
if isCleanupLoggingEnabled() {
|
||||
log.Info().Str("feature", feature).Str("scenario", scenario).Str("schema", schemaName).Msg("ISOLATION: Created and activated schema")
|
||||
}
|
||||
|
||||
return nil
|
||||
}
|
||||
|
||||
// TeardownScenarioSchema drops the scenario's schema and restores search path
|
||||
func (s *Server) TeardownScenarioSchema() error {
|
||||
if !isSchemaIsolationEnabled() {
|
||||
return nil
|
||||
}
|
||||
|
||||
s.schemaMutex.Lock()
|
||||
defer s.schemaMutex.Unlock()
|
||||
|
||||
if s.currentSchema == "" || s.currentSchema == "public" {
|
||||
if isCleanupLoggingEnabled() {
|
||||
log.Info().Msg("ISOLATION: No custom schema to teardown")
|
||||
}
|
||||
return nil
|
||||
}
|
||||
|
||||
schemaName := s.currentSchema
|
||||
|
||||
// Restore original search path
|
||||
restoreSQL := fmt.Sprintf("SET search_path = %s", s.originalSearchPath)
|
||||
if _, err := s.db.Exec(restoreSQL); err != nil {
|
||||
log.Warn().Err(err).Str("original", s.originalSearchPath).Msg("ISOLATION: Failed to restore search_path")
|
||||
}
|
||||
|
||||
// Drop the schema - CASCADE ensures dependent objects are also dropped
|
||||
dropSQL := fmt.Sprintf("DROP SCHEMA IF EXISTS %s CASCADE", schemaName)
|
||||
if _, err := s.db.Exec(dropSQL); err != nil {
|
||||
return fmt.Errorf("failed to drop schema %s: %w", schemaName, err)
|
||||
}
|
||||
|
||||
s.currentSchema = ""
|
||||
|
||||
if isCleanupLoggingEnabled() {
|
||||
log.Info().Str("schema", schemaName).Msg("ISOLATION: Dropped schema")
|
||||
}
|
||||
|
||||
return nil
|
||||
}
|
||||
|
||||
// getCurrentSearchPath retrieves the current search_path setting
|
||||
func (s *Server) getCurrentSearchPath() (string, error) {
|
||||
var searchPath string
|
||||
err := s.db.QueryRow("SHOW search_path").Scan(&searchPath)
|
||||
return searchPath, err
|
||||
}
|
||||
|
||||
// CloseDatabase closes the database connection
|
||||
func (s *Server) CloseDatabase() error {
|
||||
if s.db != nil {
|
||||
|
||||
Reference in New Issue
Block a user