Developer Guide¶

Guide for contributing to and extending the Ethopy Analysis package.

Development Setup¶

Prerequisites¶

Python 3.8+
Git
Virtual environment tool

Installation¶

# Clone repository
git clone <repository-url>
cd ethopy_analysis

# Create virtual environment
python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate

# Install in development mode with all dependencies
pip install -e ".[dev,docs,export]"

Development Dependencies¶

The [dev] extra includes: - pytest: Testing framework - black: Code formatting - isort: Import sorting - flake8: Style checking - mypy: Type checking - jupyter: Notebook support

Code Quality¶

Linting with ruff¶

# Check code style
ruff check src/

# Fix automatically
ruff check --fix src/

# Check specific files
ruff check src/ethopy_analysis/cli.py

Additional linting with pylint¶

# Check code quality
pylint src/ethopy_analysis/

# Check specific module
pylint src/ethopy_analysis/data/loaders.py

# Generate report
pylint --output-format=text src/ethopy_analysis/ > pylint_report.txt

Code Formatting¶

# Format code with black
black src/ethopy_analysis/

# Check formatting without changes
black --check src/ethopy_analysis/

# Sort imports
isort src/ethopy_analysis/

Package Architecture¶

Directory Structure¶

src/ethopy_analysis/
├── __init__.py              # Package initialization
├── cli.py                   # Command-line interface
├── data/                    # Data handling
│   ├── __init__.py
│   ├── loaders.py          # Data loading functions
│   ├── analysis.py         # Analysis functions
│   └── utils.py            # Data utilities
├── plots/                   # Visualization
│   ├── __init__.py
│   ├── animal.py           # Animal-level plots
│   ├── session.py          # Session-level plots
│   ├── comparison.py       # Multi-animal comparisons
│   └── utils.py            # Plotting utilities
├── db/                      # Database access
│   ├── __init__.py
│   └── schemas.py          # Database schemas
└── config/                  # Configuration
    ├── __init__.py
    ├── settings.py         # Configuration management
    └── styles.py           # Plot styling

Design Principles¶

1. DataFrame-First - All functions work with pandas DataFrames - DataJoint-based data loading with DataFrame interfaces - Easy testing and debugging

2. Function-Based - Prefer functions over classes - Clear, descriptive function names - Composable and reusable

3. Consistent Signatures - Common parameter patterns - Standard return types - Predictable behavior

Adding New Features¶

Adding Data Loading Functions¶

# In src/ethopy_analysis/data/loaders.py
def get_new_data_type(animal_id, session, format="df"):
    """Load new data type for analysis.

    Parameters
    ----------
    animal_id : int
        Animal identifier
    session : int
        Session number
    format : str
        Return format ("df" or "dj")

    Returns
    -------
    pandas.DataFrame or datajoint.Table
        Data for analysis
    """
    # Implementation here
    pass

Adding Plotting Functions¶

# In src/ethopy_analysis/plots/animal.py or session.py
def plot_new_analysis(animal_id, session, save_path=None):
    """Create new analysis plot.

    Parameters
    ----------
    animal_id : int
        Animal identifier
    session : int
        Session number
    save_path : str, optional
        Path to save plot

    Returns
    -------
    matplotlib.figure.Figure
        Plot figure
    matplotlib.axes.Axes
        Plot axes
    """
    import matplotlib.pyplot as plt

    # Create plot
    fig, ax = plt.subplots()

    # Add plot logic here

    if save_path:
        plt.savefig(save_path, dpi=300, bbox_inches='tight')

    return fig, ax

Adding CLI Commands¶

# In src/ethopy_analysis/cli.py
@main.command()
@click.option('--animal-id', type=int, required=True, help='Animal ID')
@click.option('--output-dir', default='./output', help='Output directory')
def new_command(animal_id, output_dir):
    """New CLI command description."""
    try:
        # Implementation here
        click.echo(f"Processing animal {animal_id}")

    except Exception as e:
        click.echo(f"Error: {str(e)}", err=True)
        sys.exit(1)

Code Style Guidelines¶

Function Documentation¶

def example_function(param1, param2, optional_param=None):
    """Brief description of function.

    Longer description if needed. Explain what the function does,
    its purpose, and any important details.

    Parameters
    ----------
    param1 : type
        Description of param1
    param2 : type
        Description of param2
    optional_param : type, optional
        Description of optional parameter

    Returns
    -------
    type
        Description of return value

    Examples
    --------
    >>> result = example_function(1, 2)
    >>> print(result)
    3
    """
    # Implementation
    pass

Error Handling¶

def robust_function(animal_id, session):
    """Function with proper error handling."""
    if not isinstance(animal_id, int):
        raise ValueError("animal_id must be an integer")

    if animal_id < 0:
        raise ValueError("animal_id must be positive")

    try:
        # Main logic
        result = process_data(animal_id, session)
        return result
    except ConnectionError:
        raise ConnectionError("Database connection failed")
    except Exception as e:
        raise RuntimeError(f"Processing failed: {str(e)}")

Type Hints¶

from typing import Optional, List, Dict, Tuple
import pandas as pd

def typed_function(
    animal_id: int,
    sessions: List[int],
    config: Optional[Dict] = None
) -> Tuple[pd.DataFrame, Dict]:
    """Function with type hints."""
    # Implementation
    pass

Configuration Management¶

Adding Configuration Options¶

# In src/ethopy_analysis/config/settings.py
DEFAULT_CONFIG = {
    "database": {
        "host": "",
        "user": "",
        "password": "",
        "schemas": {
            "experiment": "lab_experiments",
            "stimulus": "lab_stimuli",
            "behavior": "lab_behavior"
        }
    },
    "analysis": {
        "min_trials_per_session": 2,
        "performance_threshold": 0.8,
        "new_option": "default_value"  # Add new option
    }
}

Documentation¶

Building Documentation¶

# Install documentation dependencies
pip install -e ".[docs]"

# Build documentation
mkdocs build

# Serve documentation locally
mkdocs serve

Adding Documentation¶

Update relevant .md files in docs/
Add examples to docstrings
Update API reference for new functions
Test documentation builds locally

CI/CD and Automation¶

GitHub Actions Workflows¶

We use GitHub Actions for automated testing, building, and deployment.

Release Workflow (`.github/workflows/release.yml`)¶

Triggers: Push tags matching v*.*.* Actions: - Builds package with setuptools_scm - Uploads to PyPI with API token - Creates GitHub release with changelog - Tests installation

Setup: Add PYPI_API_TOKEN to repository secrets

Documentation Workflow (`.github/workflows/docs.yml`)¶

Triggers: Push to main branch Actions: - Builds MkDocs documentation - Deploys to GitHub Pages

Setup: Enable GitHub Pages with "GitHub Actions" source

Release Process¶

Update CHANGELOG.md with new version section
Commit changes: git commit -m "docs: update changelog for v1.0.0"
Create tag: git tag v1.0.0 && git push origin v1.0.0
Monitor workflows: Check GitHub Actions for success

Version Management¶

Uses setuptools_scm for automatic versioning: - Version from Git tags (e.g., v1.0.0 → 1.0.0) - Development versions include commit hash - No manual version updates needed

Troubleshooting¶

Common Issues: - 403 GitHub release error: Add permissions: contents: write to workflow - PyPI upload fails: Check API token in repository secrets - Version conflicts: Ensure tag doesn't already exist

Recovery:

# Delete problematic tag and retry
git tag -d v1.0.0
git push origin --delete v1.0.0
git tag v1.0.0 && git push origin v1.0.0

Common Development Tasks¶

Adding a New Analysis Function¶

Add function to appropriate module
Write comprehensive docstring
Add unit tests
Update API documentation
Add example usage

Debugging Database Issues¶

# Test database connection
from ethopy_analysis.db.schemas import test_connection
print(test_connection())

# Check configuration
from ethopy_analysis.config.settings import get_database_config
print(get_database_config())

Performance Optimization¶

# Profile code
import cProfile
cProfile.run('your_function()', 'profile_output.prof')

# Time specific operations
import time
start = time.time()
result = your_function()
print(f"Execution time: {time.time() - start:.2f}s")

Contributing Guidelines¶

Code Review Process¶

Create feature branch
Implement changes with tests
Ensure code quality (ruff, pylint)
Update documentation
Submit pull request

Commit Messages¶

feat: add new analysis function for behavioral states
fix: resolve database connection timeout issue
docs: update API reference for new functions
test: add unit tests for data loaders

Pull Request Template¶

Description of changes
Testing performed
Documentation updates
Breaking changes (if any)

Best Practices¶

Performance¶

Use pandas vectorized operations
Minimize database queries
Cache expensive computations
Profile before optimizing

Maintainability¶

Write clear, self-documenting code
Use consistent naming conventions
Keep functions focused and small
Add comprehensive tests

User Experience¶

Provide helpful error messages
Include usage examples
Maintain backward compatibility
Document breaking changes