Automated Testing
Overview
Software testing ensures that an application works as expected. In ML, testing goes beyond traditional methods and adds more complexity due to the nature of data and model behavior.
Types of Software Tests
Different types of tests check the functionality of an application:
-
Unit Tests
- Test individual components
- Ensure each part works as intended
-
Integration Tests
- Check how components interact
- Identify issues in the communication between parts
-
End-to-End Tests
- Test overall system functionality
- Ensure the application works from start to finish
For more information, please see Testing.
ML Systems Are Different
ML systems depend on data and models, which makes them unique:
-
Data Dependency
- Models rely on the quality of training data
- Data influences the model’s behavior
- Poor data can affect performance
-
Model-Driven Behavior
- Behavior is learned, not pre-programmed
- Models adapt over time based on data
- Stale models can degrade performance
- Models need regular updates
Testing ML systems is more complex than traditional systems, as shown in the figure below:
Types of Tests for ML Systems
In addition to traditional softwarer tests, ML systems also require unique types of tests to ensure quality.
Data Tests
Data Tests are deterministic tests which checks if features meet certain expectations.
- Verify if feature follows known distributions
- Check for compliance with privacy standards
- Features should provide enough value to justify costs
For instance, if a feature is based on temperature measurements, we should verify that the values fall within the expected range.
Model Tests
ML models enhance user experience by optimizing model performance using metrics like log-loss, ultimately improving user satisfaction.
- Ensure all hyperparameters are tuned
- Monitor metrics to avoid overfitting
- Maintain accuracy with validation and monitoring
To assess prediction impact, we must evaluate model staleness and decide when to update it. Regular testing against a baseline model helps measure the value of advanced techniques.
Pipeline Tests
Testing ML pipelines involves complex workflows. To ensure smooth operations, we should:
- Ensure reproducibility of training results
- Verify end-to-end functionality of the pipeline
- Conduct integrations tests that include data and model tests