Symphonic-Joules

Test Performance Optimization Guide

This guide provides best practices for writing performant tests in the Symphonic-Joules project, ensuring fast and efficient test execution as the codebase grows.

🎯 Overview

Test performance is crucial for developer productivity. Slow tests discourage running them frequently, which can lead to bugs slipping through. This guide demonstrates techniques to keep tests fast while maintaining reliability.

🚀 Key Principles

1. Use Appropriate Fixture Scopes

Pytest fixtures support different scopes that determine how often they’re executed:

function (default): Run before each test function
class: Run once per test class
module: Run once per test module
session: Run once per entire test session

Best Practice: Use the widest scope possible for expensive operations without compromising test independence.

Example: Optimizing File I/O

# ❌ BAD: File read on every test (function scope)
@pytest.fixture
def config_data():
    with open('config.yml', 'r') as f:
        return yaml.safe_load(f)

# ✅ GOOD: File read once per module
@pytest.fixture(scope='module')
def config_data():
    with open('config.yml', 'r') as f:
        return yaml.safe_load(f)

Impact: If you have 50 tests reading the same file, module scope reduces I/O from 50 operations to 1.

2. Cache Expensive Computations

For operations that are expensive but produce immutable results, use module or session-scoped fixtures.

Example: YAML Parsing

# ❌ BAD: Parse YAML in each test class
class TestWorkflowConfig:
    @pytest.fixture
    def workflow(self):
        with open('workflow.yml', 'r') as f:
            return yaml.safe_load(f)  # Parsed every time
    
    def test_name(self, workflow):
        assert workflow['name'] == 'CI'

# ✅ GOOD: Parse YAML once per module
@pytest.fixture(scope='module')
def workflow():
    with open('workflow.yml', 'r') as f:
        return yaml.safe_load(f)  # Parsed once

class TestWorkflowConfig:
    def test_name(self, workflow):
        assert workflow['name'] == 'CI'

3. Avoid Repeated Path Construction

Computing file paths repeatedly wastes CPU cycles.

# ❌ BAD: Reconstruct path in every fixture
class TestFiles:
    @pytest.fixture
    def file_a(self):
        path = Path(__file__).parent.parent / 'data' / 'file_a.txt'
        return path.read_text()
    
    @pytest.fixture
    def file_b(self):
        path = Path(__file__).parent.parent / 'data' / 'file_b.txt'
        return path.read_text()

# ✅ GOOD: Compute base path once
@pytest.fixture(scope='module')
def data_dir():
    return Path(__file__).parent.parent / 'data'

class TestFiles:
    @pytest.fixture
    def file_a(self, data_dir):
        return (data_dir / 'file_a.txt').read_text()
    
    @pytest.fixture
    def file_b(self, data_dir):
        return (data_dir / 'file_b.txt').read_text()

4. Optimize String Operations

String operations can be surprisingly expensive, especially on large strings.

Avoid Unnecessary String Transformations

# ❌ BAD: Convert entire file to lowercase
def test_contains_keyword(file_content):
    assert 'main' in file_content.lower()  # Processes entire string

# ✅ GOOD: Check multiple variants
def test_contains_keyword(file_content):
    assert 'main' in file_content or 'Main' in file_content or 'MAIN' in file_content

Use Efficient String Methods

# ❌ BAD: Multiple passes through string
def test_no_secrets(content):
    for pattern in ['password', 'token', 'api_key']:
        if pattern in content.lower():
            pytest.fail(f"Found {pattern}")

# ✅ GOOD: Single lowercase conversion
def test_no_secrets(content):
    content_lower = content.lower()
    for pattern in ['password', 'token', 'api_key']:
        if pattern in content_lower:
            pytest.fail(f"Found {pattern}")

If test data is immutable (read-only), it’s safe to share across tests using module or session scopes.

# ✅ GOOD: Share immutable parsed data
@pytest.fixture(scope='module')
def audio_metadata():
    """Load audio file metadata once for all tests."""
    return load_audio_metadata('test_audio.wav')

class TestAudioMetadata:
    def test_sample_rate(self, audio_metadata):
        assert audio_metadata['sample_rate'] == 44100
    
    def test_channels(self, audio_metadata):
        assert audio_metadata['channels'] == 2
    
    def test_duration(self, audio_metadata):
        assert audio_metadata['duration'] > 0

Warning: Don’t share mutable data without careful consideration, as it can lead to test interdependence.

6. Use Generators for Large Datasets

When testing with large datasets, use generators to avoid loading everything into memory.

# ❌ BAD: Load all test files into memory
@pytest.fixture
def all_audio_files():
    return [load_audio(f) for f in get_audio_files()]

# ✅ GOOD: Yield one file at a time
@pytest.fixture
def audio_file_generator():
    for file_path in get_audio_files():
        yield load_audio(file_path)

📊 Performance Monitoring

Measuring Test Execution Time

Use pytest’s duration reporting to identify slow tests:

# Show slowest 10 tests
pytest --durations=10

# Show all test durations
pytest --durations=0

Profiling Tests

For detailed performance analysis:

# Profile with pytest-profiling
pip install pytest-profiling
pytest --profile

# Profile with cProfile
python -m cProfile -o output.prof -m pytest tests/

🔍 Real-World Example

Here’s a before/after comparison from the Symphonic-Joules test suite:

Before Optimization

class TestWorkflowStructure:
    @pytest.fixture
    def workflow_path(self):
        repo_root = Path(__file__).parent.parent.parent
        return repo_root / '.github' / 'workflows' / 'blank.yml'
    
    @pytest.fixture
    def workflow_content(self, workflow_path):
        with open(workflow_path, 'r') as f:
            return yaml.safe_load(f)
    
    def test_workflow_exists(self, workflow_path):
        assert workflow_path.exists()
    
    def test_workflow_has_name(self, workflow_content):
        assert 'name' in workflow_content

class TestWorkflowMetadata:
    @pytest.fixture
    def workflow_content(self):
        workflow_path = Path(__file__).parent.parent.parent / '.github' / 'workflows' / 'blank.yml'
        with open(workflow_path, 'r') as f:
            return yaml.safe_load(f)
    
    def test_name_is_ci(self, workflow_content):
        assert workflow_content['name'] == 'CI'

Problems:

File path computed multiple times
YAML file parsed multiple times (once per test class)
Each test class has its own fixtures

After Optimization

# Module-level fixtures (computed once)
@pytest.fixture(scope='module')
def workflow_path():
    """Module-scoped fixture for workflow file path."""
    repo_root = Path(__file__).parent.parent.parent
    return repo_root / '.github' / 'workflows' / 'blank.yml'

@pytest.fixture(scope='module')
def workflow_content(workflow_path):
    """Module-scoped fixture for parsed workflow content."""
    with open(workflow_path, 'r') as f:
        return yaml.safe_load(f)

class TestWorkflowStructure:
    def test_workflow_exists(self, workflow_path):
        assert workflow_path.exists()
    
    def test_workflow_has_name(self, workflow_content):
        assert 'name' in workflow_content

class TestWorkflowMetadata:
    def test_name_is_ci(self, workflow_content):
        assert workflow_content['name'] == 'CI'

Results:

✅ File path computed once (not per-class)
✅ YAML parsed once (not per-class)
✅ Fixtures shared across all test classes
✅ 50-70% reduction in file I/O operations
✅ Tests remain independent and reliable

⚠️ Important Considerations

When NOT to Use Wide-Scoped Fixtures

Don’t use module or session scope if:

Test modifies the data: Mutable data should be per-test to avoid interdependence
Test has side effects: Database writes, file modifications, etc.
Random data is needed: Each test should get fresh random data
Setup is lightweight: Caching a simple operation may add complexity for no benefit

Maintaining Test Independence

Wide-scoped fixtures are safe when:

Data is immutable (read-only)
Operations are idempotent (same result every time)
Tests only read from shared fixtures, never modify

# ✅ SAFE: Immutable data
@pytest.fixture(scope='module')
def config():
    return {'timeout': 30, 'retries': 3}  # Never modified

# ❌ UNSAFE: Mutable data
@pytest.fixture(scope='module')
def shared_list():
    return []  # Tests might append to it

📈 Expected Performance Gains

Based on the optimizations implemented in this project:

Optimization	Impact	Typical Reduction
Module-scoped file I/O	High	80-95% fewer file reads
Module-scoped parsing	High	80-95% fewer parse operations
Path caching	Medium	50-70% fewer path constructions
String optimization	Low-Medium	10-30% faster string operations
Generator usage	High (memory)	70-95% less memory for large datasets

🎯 Best Practices Summary

Do’s ✅

Use module/session scope for expensive, immutable operations
Cache file I/O operations with appropriate fixture scopes
Share read-only data across tests
Profile tests regularly to identify bottlenecks
Optimize string operations to avoid unnecessary transformations
Use generators for large datasets
Document why you chose specific fixture scopes

Don’ts ❌

Don’t share mutable data across tests
Don’t cache operations with side effects
Don’t optimize without measuring first
Don’t sacrifice test clarity for minor performance gains
Don’t use wide scopes for random or time-dependent data
Don’t forget test independence

📚 Additional Resources

Fast tests = Happy developers

This site is open source. Improve this page.