Automated Testing

Updated May 15, 2023

Overview

Software testing verifies that an application works as expected. In ML, testing must go beyond traditional methods, because the system's behavior depends on data and a trained model as well as on code.

Types of Software Tests

Different types of tests check the functionality of an application:

Unit Tests
- Test individual components
- Ensure each part works as intended
Integration Tests
- Check how components interact
- Identify issues in the communication between parts
End-to-End Tests
- Test overall system functionality
- Ensure the application works from start to finish
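
The difference between a unit test and an integration test can be sketched with a toy example. The helpers `clean`, `mean`, and `clean_mean` are hypothetical and exist only for illustration; the `test_` prefix follows the naming convention that runners such as pytest use for discovery.

```python
def clean(values):
    """Drop missing (None) entries from a list of values."""
    return [v for v in values if v is not None]


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def clean_mean(values):
    """Combine both components: clean first, then average."""
    return mean(clean(values))


# Unit tests: each component is checked in isolation.
def test_clean_unit():
    assert clean([1, None, 3]) == [1, 3]


def test_mean_unit():
    assert mean([1, 3]) == 2


# Integration test: checks that the components work together.
def test_clean_mean_integration():
    assert clean_mean([1, None, 3]) == 2
```

An end-to-end test would go one step further and exercise the whole application through its external interface, for example from raw input file to final output.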

For more information, please see Testing.

ML Systems Are Different

ML systems depend on data and models, which makes them unique:

Data Dependency
- Models rely on the quality of training data
- Data influences the model’s behavior
- Poor data can affect performance
Model-Driven Behavior
- Behavior is learned, not pre-programmed
- Models adapt over time based on data
- Stale models can degrade performance
- Models need regular updates

Testing ML systems is more complex than traditional systems, as shown in the figure below:

Types of Tests for ML Systems

In addition to traditional software tests, ML systems require their own types of tests to ensure quality.

Data Tests

Data tests are deterministic tests that check whether features meet certain expectations.

- Verify that each feature follows its known distribution
- Check for compliance with privacy standards
- Confirm that each feature provides enough value to justify its cost

For instance, if a feature is based on temperature measurements, we should verify that the values fall within the expected range.
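
The temperature check described above might look like the following minimal sketch. The helper `out_of_range`, the sample readings, and the bounds of -50 °C to 60 °C are all assumptions chosen for illustration; real bounds depend on the sensor and the domain.

```python
def out_of_range(values, low, high):
    """Return (index, value) pairs for readings outside [low, high].

    NaN readings are also reported, because every comparison
    involving NaN evaluates to False.
    """
    return [(i, v) for i, v in enumerate(values) if not (low <= v <= high)]


def test_temperature_within_expected_range():
    # Hypothetical sample of a temperature feature, in degrees Celsius.
    readings_celsius = [21.5, 19.0, 23.4, -3.2]
    # Assumed plausible bounds for this feature.
    assert out_of_range(readings_celsius, low=-50.0, high=60.0) == []
```

Returning the offending rows instead of a bare boolean makes a failing test easier to debug, since the assertion message shows which readings broke the expectation.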

Model Tests

ML models are trained by optimizing metrics such as log-loss, with the ultimate goal of improving the user experience and user satisfaction.

- Ensure all hyperparameters are tuned
- Monitor metrics to avoid overfitting
- Maintain accuracy with validation and monitoring

To assess the impact of predictions, we must measure how stale the model is and decide when to update it. Regularly testing against a simple baseline model shows how much value the more advanced techniques actually add.
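
A baseline comparison can be expressed as a test. The sketch below assumes a binary classifier evaluated with log-loss; the function names, the sample labels and probabilities, and the `min_gain` threshold are hypothetical. The log-loss is written out by hand so that the example has no dependencies.

```python
import math


def log_loss(y_true, y_prob, eps=1e-15):
    """Mean binary cross-entropy; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(y_true)


def beats_baseline(y_true, model_prob, baseline_prob, min_gain=0.0):
    """True if the model's log-loss is lower than the baseline's by more than min_gain."""
    gain = log_loss(y_true, baseline_prob) - log_loss(y_true, model_prob)
    return gain > min_gain


def test_model_beats_baseline():
    # Hypothetical held-out labels and predicted probabilities.
    y_true = [1, 0, 1, 0]
    model_prob = [0.9, 0.2, 0.8, 0.1]
    # Baseline that always predicts the class prior (0.5 here).
    baseline_prob = [0.5, 0.5, 0.5, 0.5]
    assert beats_baseline(y_true, model_prob, baseline_prob, min_gain=0.1)
```

Setting `min_gain` above zero turns the test into a check that the added complexity of the model is worth it, not just that the model is marginally better.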

Pipeline Tests

Testing ML pipelines involves complex workflows. To ensure smooth operations, we should:

- Ensure reproducibility of training results
- Verify end-to-end functionality of the pipeline
- Conduct integration tests that include data and model tests
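
Reproducibility is the easiest of these to illustrate. The sketch below assumes that all randomness in training flows through a single seeded generator; `train` is a hypothetical toy routine (stochastic gradient descent on a one-parameter linear model), standing in for a real training step.

```python
import random


def train(data, seed, epochs=20, lr=0.05):
    """Fit y ~ w * x with SGD; every random choice comes from the seeded rng."""
    rng = random.Random(seed)
    w = rng.uniform(-1.0, 1.0)  # random initial weight
    samples = list(data)
    for _ in range(epochs):
        rng.shuffle(samples)  # random sample order in each epoch
        for x, y in samples:
            w -= lr * 2 * (w * x - y) * x  # gradient of the squared error
    return w


def test_training_is_reproducible():
    # Hypothetical training data, roughly following y = 2x.
    data = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9)]
    # Same data and same seed must give exactly the same weight.
    assert train(data, seed=42) == train(data, seed=42)
```

Exact equality is reasonable here because the toy runs in a single process on one machine. Real pipelines that involve GPUs or parallel data loading may only be reproducible up to a small numerical tolerance.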
