- Add STATE_TRACER_README.md documenting purpose, usage, limitations, and future enhancements - Add state_tracer.go with per-process file-based tracing to $TMPDIR - Trace scenario start/end, database cleanup, JWT secret operations - Integrate tracing into suite.go BeforeScenario/AfterScenario hooks - Document findings: sequential execution per feature, shared database across processes - Identify root causes: in-memory JWT secrets not isolated by schema, config reload timing Generated by Mistral Vibe. Co-Authored-By: Mistral Vibe <vibe@mistral.ai>
242 lines
9.2 KiB
Markdown
242 lines
9.2 KiB
Markdown
# BDD State Tracer
|
|
|
|
## Overview
|
|
|
|
The BDD State Tracer is a debugging tool that logs scenario execution, database operations, and state modifications to a file in `$TMPDIR` for analysis of test execution order and state pollution issues.
|
|
|
|
## Purpose
|
|
|
|
### Why Tracing Was Added
|
|
|
|
During multi-iteration BDD test runs with `./scripts/validate-test-suite.sh`, intermittent failures occurred that were difficult to diagnose:
|
|
- Tests passed when run individually
|
|
- Tests failed when run together in the validation script
|
|
- Patterns suggested database state pollution between scenarios across different feature packages
|
|
|
|
The tracer was created to answer key questions:
|
|
1. **Execution Order**: Which scenarios run in which order?
|
|
2. **State Modifications**: What database writes/cleanups occur and when?
|
|
3. **Overlap Detection**: Are scenarios running in parallel (causing race conditions)?
|
|
4. **Isolation Verification**: Is schema isolation working as expected?
|
|
|
|
### Key Findings from Tracing
|
|
|
|
1. **Sequential Execution**: Each feature package runs in a separate process (separate PIDs), but scenarios within each feature run sequentially
|
|
2. **Shared Database**: All processes share the same PostgreSQL database connection
|
|
3. **Schema Isolation Status**: When `BDD_SCHEMA_ISOLATION=false` (default in validate script), all scenarios share the `public` schema
|
|
4. **Cleanup Operations**: Database cleanup (`CleanupDatabase`) runs after each scenario, deleting all test data from all tables
|
|
5. **In-Memory State**: JWT secrets are stored in-memory only, not in database - schema isolation doesn't prevent JWT secret pollution
|
|
|
|
### Example Trace Output
|
|
|
|
```
|
|
2026-04-11T10:10:53.032156 | auth | User registration | SCENARIO_START |
|
|
2026-04-11T10:10:53.146438 | auth | User registration | SCENARIO_END | PASSED
|
|
2026-04-11T10:10:53.152398 | auth | User registration | JWT_RESET | ok
|
|
2026-04-11T10:10:53.162357 | auth | Failed authentication | SCENARIO_START |
|
|
2026-04-11T10:10:53.268273 | auth | Failed authentication | SCENARIO_END | PASSED
|
|
```
|
|
|
|
## Usage
|
|
|
|
### Enable Tracing
|
|
|
|
Set the environment variable `BDD_TRACE_STATE=1` before running tests:
|
|
|
|
```bash
|
|
# Single run with tracing
|
|
BDD_TRACE_STATE=1 go test ./features/auth -v
|
|
|
|
# Validation script with tracing
|
|
BDD_TRACE_STATE=1 ./scripts/validate-test-suite.sh 1
|
|
|
|
# Multiple runs with tracing
|
|
BDD_TRACE_STATE=1 ./scripts/validate-test-suite.sh 5
|
|
```
|
|
|
|
### Trace File Location
|
|
|
|
Trace files are written to `$TMPDIR` (typically `/var/folders/.../T/` on macOS or `/tmp` on Linux):
|
|
|
|
```bash
|
|
# Find trace files
|
|
ls -la $TMPDIR/bdd-state-trace-*.log
|
|
|
|
# View a trace file
|
|
cat $TMPDIR/bdd-state-trace-20260411-101053-12345.log
|
|
```
|
|
|
|
### Trace File Format
|
|
|
|
```
|
|
TIMESTAMP | FEATURE | SCENARIO | ACTION | DETAILS
|
|
2026-04-11T10:10:53.032156 | auth | User registration | SCENARIO_START |
|
|
2026-04-11T10:10:53.146438 | auth | User registration | SCENARIO_END | PASSED
|
|
2026-04-11T10:10:53.152398 | auth | User registration | JWT_RESET | ok
|
|
2026-04-11T10:10:53.162357 | auth | User registration | DB_CLEANUP | all_tables
|
|
```
|
|
|
|
**Columns:**
|
|
- `TIMESTAMP`: ISO 8601 format with microseconds
|
|
- `FEATURE`: Feature name from `FEATURE` environment variable
|
|
- `SCENARIO`: Scenario name (includes URI for disambiguation)
|
|
- `ACTION`: Type of action (see below)
|
|
- `DETAILS`: Additional context
|
|
|
|
**Action Types:**
|
|
- `SCENARIO_START` - Scenario execution begins
|
|
- `SCENARIO_END` - Scenario execution completes (PASSED or FAILED)
|
|
- `DB_CLEANUP` - Database cleanup operation
|
|
- `DB_SELECT` - Database read operation
|
|
- `JWT_RESET` - JWT secrets reset to initial state
|
|
- `DB_INSERT/UPDATE/DELETE` - Database write operations (future)
|
|
- `SCHEMA_*` - Schema isolation operations (future)
|
|
- `TX_*` - Transaction boundary operations (future)
|
|
|
|
## Implementation
|
|
|
|
### Architecture
|
|
|
|
The state tracer uses a simple file-based approach:
|
|
|
|
1. **Per-Process Tracing**: Each `go test` process creates its own trace file with unique filename based on timestamp and PID
|
|
2. **Immediate Flush**: Each trace line is flushed immediately to disk using `Sync()` to prevent data loss
|
|
3. **No Dependencies**: Uses only standard library (`os`, `fmt`, `time`, `path/filepath`)
|
|
4. **Singleton Pattern**: Package-level functions for easy usage across the codebase
|
|
|
|
### Files
|
|
|
|
- `pkg/bdd/testserver/state_tracer.go` - Core tracing functions
|
|
- `pkg/bdd/suite.go` - Integration with godog Before/After scenario hooks
|
|
|
|
### Key Functions
|
|
|
|
```go
|
|
// Package-level functions (called from anywhere)
|
|
TraceStateScenarioStart(feature, scenario string)
|
|
TraceStateScenarioEnd(feature, scenario string, err error)
|
|
TraceStateDBCleanup(feature, scenario, table string)
|
|
TraceStateJWTSecretOperation(feature, scenario, operation, details string)
|
|
TraceStateSchemaIsolation(feature, scenario, operation, details string)
|
|
TraceStateTransaction(feature, scenario, action, details string)
|
|
TraceStateDBRead(feature, scenario, table, details string)
|
|
```
|
|
|
|
## Limitations
|
|
|
|
### Current Limitations
|
|
|
|
1. **Per-Process Files**: Each `go test` process creates its own file, making correlation across processes manual
|
|
2. **No Database Write Tracing**: Currently only traces cleanup, not individual INSERT/UPDATE/DELETE operations
|
|
3. **No API Call Tracing**: Doesn't trace HTTP requests made during scenarios
|
|
4. **No Timing Analysis**: Doesn't measure duration between operations automatically
|
|
5. **No Schema Name in Trace**: When schema isolation is enabled, doesn't show which schema is active
|
|
6. **File Rotation**: No automatic cleanup of old trace files
|
|
|
|
### Known Issues
|
|
|
|
1. **PID-based filenames**: If multiple runs happen in the same second, filenames could collide
|
|
2. **Large file sizes**: High-volume tracing could create large files (mitigated by per-run files)
|
|
3. **No header/footer**: Trace files start immediately with data, no metadata about the run
|
|
|
|
## Future Enhancements
|
|
|
|
### Priority 1: Process Correlation
|
|
- Add a unique run ID that can be passed across all processes
|
|
- Include process start/end markers to show process lifecycle
|
|
- Add parent PID tracking to show process hierarchy
|
|
|
|
### Priority 2: Database Operation Tracing
|
|
- Add tracing for all database writes (INSERT, UPDATE, DELETE)
|
|
- Include query text and affected rows
|
|
- Trace transaction boundaries with IDs
|
|
- Add schema name to all database operations when isolation is enabled
|
|
|
|
### Priority 3: API Call Tracing
|
|
- Trace all HTTP requests made during scenarios
|
|
- Include request method, path, status code, and duration
|
|
- Mark requests that modify state (POST, PUT, DELETE vs GET)
|
|
|
|
### Priority 4: Analysis Tools
|
|
- Create a `bdd-trace-analyzer` tool to:
|
|
- Merge trace files from all processes in correct order
|
|
- Detect overlapping scenarios (parallel execution)
|
|
- Identify database state pollution patterns
|
|
- Generate visualization of scenario execution timeline
|
|
- Flag potential race conditions
|
|
|
|
### Priority 5: Improved Output
|
|
- Add trace file header with metadata (run ID, start time, config, etc.)
|
|
- Color-coded output for different action types
|
|
- JSON output option for programmatic analysis
|
|
- Trace level filtering (DEBUG, INFO, WARN, ERROR)
|
|
|
|
### Priority 6: Performance Optimization
|
|
- Batch writes instead of per-line flush (with configurable flush interval)
|
|
- Compress old trace files
|
|
- Automatic cleanup of old files
|
|
|
|
## Analysis Use Cases
|
|
|
|
### Detecting State Pollution
|
|
|
|
Look for patterns like:
|
|
```
|
|
PID 1234 | auth | Scenario A | DB_CLEANUP | all_tables
|
|
PID 5678 | greet | Scenario B | SCENARIO_START |
|
|
# ^ Scenario B starts AFTER auth cleanup - potential issue
|
|
```
|
|
|
|
### Detecting Parallel Execution
|
|
|
|
Check if timestamps overlap:
|
|
```
|
|
PID 1234 | 10:10:53.032 | auth | Scenario A | SCENARIO_START
|
|
PID 5678 | 10:10:53.035 | greet | Scenario B | SCENARIO_START
|
|
# ^ Both started within 3ms - likely parallel
|
|
```
|
|
|
|
### Verifying Schema Isolation
|
|
|
|
Check that each scenario gets its own schema:
|
|
```
|
|
PID 1234 | auth | Scenario A | SCHEMA_CREATE | test_a1b2c3d4
|
|
PID 1234 | auth | Scenario B | SCHEMA_CREATE | test_e5f6g7h8
|
|
# ^ Different schemas for different scenarios - good
|
|
```
|
|
|
|
## Troubleshooting
|
|
|
|
### Tracing Not Working
|
|
|
|
1. Verify `BDD_TRACE_STATE=1` is set:
|
|
```bash
|
|
echo $BDD_TRACE_STATE
|
|
```
|
|
2. Check if trace files are being created:
|
|
```bash
|
|
ls -la $TMPDIR/bdd-state-trace-*.log
|
|
```
|
|
3. Verify the `testserver` package is being used (tracing is integrated there)
|
|
|
|
### No Trace Files Found
|
|
|
|
- Tracing only works when `BDD_TRACE_STATE=1` is set before the test process starts
|
|
- Each `go test` process creates its own file - if tests pass quickly, files may be short
|
|
- Files are created in `$TMPDIR` which defaults to `/tmp` on Linux and a temp folder on macOS
|
|
|
|
### Trace Files Too Large
|
|
|
|
- Tracing every operation can generate large files
|
|
- Consider filtering to specific scenarios:
|
|
```bash
|
|
# Run only failing scenarios with tracing
|
|
BDD_TRACE_STATE=1 go test ./features/auth -v -run "TestAuthBDD/Password_reset"
|
|
```
|
|
|
|
## Related Files
|
|
|
|
- `pkg/bdd/suite.go` - Godog test suite initialization with tracing hooks
|
|
- `pkg/bdd/testserver/server.go` - Test server with tracing integration
|
|
- `scripts/validate-test-suite.sh` - Test validation script
|