CI Integration for ML Tests
Running ML tests in continuous integration pipelines. Part of the Unit Testing for ML Pipelines course at AI School by Lilly Tech Systems.
Why CI Matters for ML Projects
Continuous Integration (CI) automatically runs your test suite whenever code is pushed, ensuring that new changes do not break existing functionality. For ML projects, CI is even more important than for traditional software because ML bugs are often silent. A broken feature engineering function does not throw an error; it just produces wrong features that lead to a degraded model. Only automated tests running in CI catch these issues before they reach production.
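The kind of silent bug described above can be made loud with a small unit test. A minimal sketch, where standardize is a hypothetical feature-engineering function (not from the course code):

```python
# Hypothetical feature-engineering function with a classic silent-failure mode:
# a constant column has zero standard deviation, and with NumPy arrays the
# resulting divide-by-zero would silently produce NaN features.
from statistics import mean, pstdev

def standardize(values):
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # Guard the degenerate case: return zeros instead of NaN/inf features
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

def test_standardize_has_zero_mean_unit_std():
    features = standardize([1.0, 2.0, 3.0, 4.0])
    assert abs(mean(features)) < 1e-9
    assert abs(pstdev(features) - 1.0) < 1e-9

def test_standardize_handles_constant_column():
    # Without the sigma guard, this input would corrupt every downstream feature
    assert standardize([5.0, 5.0, 5.0]) == [0.0, 0.0, 0.0]
```

Running tests like these on every push is exactly what the CI workflows below automate.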
GitHub Actions for ML Testing
GitHub Actions is a popular choice for ML CI: it is free for public repositories, integrates directly with pull requests, and has a large ecosystem of reusable actions. (GPU runners exist, but only as paid larger runners, not on the free tier.) Here is an example workflow configuration:
# .github/workflows/ml-tests.yml
name: ML Pipeline Tests

on:
  push:
    branches: [main, develop]
  pull_request:
    branches: [main]

jobs:
  unit-tests:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: ['3.10', '3.11']
    steps:
      - uses: actions/checkout@v4
      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: ${{ matrix.python-version }}
      - name: Cache pip packages
        uses: actions/cache@v4
        with:
          path: ~/.cache/pip
          key: ${{ runner.os }}-pip-${{ hashFiles('requirements.txt') }}
      - name: Install dependencies
        run: |
          pip install -r requirements.txt
          pip install pytest pytest-cov pytest-xdist
      - name: Run unit tests
        run: pytest tests/ -m "not slow and not integration" -n auto --cov=src --cov-report=xml
      - name: Upload coverage
        uses: codecov/codecov-action@v4
        with:
          files: coverage.xml

  data-validation:
    runs-on: ubuntu-latest
    needs: unit-tests
    steps:
      - uses: actions/checkout@v4
      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.11'
      - name: Install dependencies
        run: |
          pip install -r requirements.txt
          pip install pytest
      - name: Run data validation tests
        run: pytest tests/ -m "data" -v

  integration-tests:
    runs-on: ubuntu-latest
    needs: [unit-tests, data-validation]
    steps:
      - uses: actions/checkout@v4
      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.11'
      - name: Install dependencies
        run: |
          pip install -r requirements.txt
          # pytest-timeout provides the --timeout flag used below
          pip install pytest pytest-timeout
      - name: Run integration tests
        run: pytest tests/ -m "integration" -v --timeout=300
Handling Large Test Data in CI
ML test suites often need sample datasets that are too large for Git. Strategies for managing test data in CI:
- Small synthetic fixtures — Generate test data programmatically in conftest.py (preferred for unit tests)
- DVC (Data Version Control) — Track test data separately from code, pull it in CI
- Git LFS — Store large test files with Git Large File Storage
- Cloud storage — Download test data from S3 or GCS during CI setup
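The first strategy can be sketched as a small builder function that a conftest.py fixture wraps. The function name, column shapes, and seed here are illustrative assumptions, not course code:

```python
# Hypothetical helper for conftest.py: generate a small synthetic dataset
# programmatically so no large files need to live in Git.
import random

def make_synthetic_regression_data(n_rows=200, seed=42):
    """Deterministic (feature, target) pairs with a known linear relationship."""
    rng = random.Random(seed)  # fixed seed keeps the data identical across CI runs
    X = [rng.uniform(-1.0, 1.0) for _ in range(n_rows)]
    # target = 3*x + small noise, so tests can assert a learned slope near 3
    y = [3.0 * x + rng.gauss(0.0, 0.1) for x in X]
    return X, y

# In conftest.py, expose it as a fixture:
# import pytest
#
# @pytest.fixture
# def regression_data():
#     return make_synthetic_regression_data()
```

Because the relationship between X and y is known by construction, tests can make meaningful assertions about model behavior without any stored data files.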
Managing ML Dependencies in CI
ML dependencies (TensorFlow, PyTorch, scikit-learn) are large and slow to install. Speed up CI with these techniques:
- Use pip caching to avoid re-downloading packages
- Create a minimal requirements-test.txt with only what tests need
- Use Docker images with pre-installed ML libraries
- Pin exact versions to avoid surprise breaking changes
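A minimal requirements-test.txt might look like the following. The packages and version pins are illustrative; pin whatever versions your project actually tests against, and leave out heavy training-only dependencies that the tests never import:

```
# requirements-test.txt (illustrative pins)
scikit-learn==1.4.2
pandas==2.2.2
pytest==8.2.0
pytest-cov==5.0.0
pytest-xdist==3.6.1
```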
CI Quality Gates for ML
Configure your CI pipeline to enforce quality standards:
# Add to your CI workflow
- name: Check test coverage
  run: |
    # Note: pytest-cov's built-in --cov-fail-under=80 flag performs this
    # check natively; the shell version below is shown for illustration.
    coverage=$(pytest tests/ --cov=src --cov-report=term | grep TOTAL | awk '{print $4}' | tr -d '%')
    if [ "$coverage" -lt 80 ]; then
      echo "Test coverage $coverage% is below 80% threshold"
      exit 1
    fi
- name: Check for test warnings
  run: pytest tests/ -W error::UserWarning -W error::DeprecationWarning
Pre-commit Hooks for ML Code
Add pre-commit hooks that run fast checks before code is committed: linting with ruff or flake8, type checking with mypy, and running the fastest subset of tests. This catches issues before they even reach CI, saving pipeline time.
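A possible .pre-commit-config.yaml implementing those three checks is sketched below. The rev values are illustrative; pin the versions you actually use:

```yaml
# .pre-commit-config.yaml (revs are illustrative)
repos:
  - repo: https://github.com/astral-sh/ruff-pre-commit
    rev: v0.4.4
    hooks:
      - id: ruff
  - repo: https://github.com/pre-commit/mirrors-mypy
    rev: v1.10.0
    hooks:
      - id: mypy
  - repo: local
    hooks:
      - id: fast-tests
        name: fast unit tests
        entry: pytest tests/ -m "not slow and not integration" -x -q
        language: system
        pass_filenames: false
```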
Finally, set PYTHONHASHSEED=0 and fix random seeds (for the standard library, NumPy, and your ML framework) in both local and CI environments to ensure reproducibility across runs and machines.
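The seed-fixing advice above is often collected into a single helper that every test run calls first. A stdlib-only sketch (seed_everything is a hypothetical helper name):

```python
# Hypothetical seed_everything() helper for reproducible test runs.
import os
import random

def seed_everything(seed=0):
    """Fix the random seeds a typical ML test run depends on.

    Only stdlib seeding is shown; a real project would also call
    numpy.random.seed(seed) and torch.manual_seed(seed) where applicable.
    """
    # Note: PYTHONHASHSEED must also be exported before the interpreter
    # starts (e.g. in the CI workflow's env block) to affect str hashing.
    os.environ["PYTHONHASHSEED"] = str(seed)
    random.seed(seed)

seed_everything(0)
first_draw = random.random()
seed_everything(0)
assert random.random() == first_draw  # identical seeds give identical draws
```

Calling this from a session-scoped fixture or at the top of conftest.py keeps local and CI runs in sync.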