
Automated Tracking

Updated May 15, 2023

ML is Experimental

Machine learning is experimental by nature and involves trial and error. Engineers test different approaches to find the best-performing model:

  • Modify data sources (e.g., applying transformations)
  • Train models with different algorithms
  • Choose evaluation metrics to compare performance

With so many possible combinations, manually tracking every experiment quickly becomes impractical.
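To get a feel for how quickly the combinations grow, here is a minimal sketch of such a trial-and-error loop. It assumes scikit-learn and one of its built-in toy datasets purely for illustration:

```python
# A minimal sketch of the trial-and-error loop, using scikit-learn
# and a toy dataset (both assumed here purely for illustration).
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, f1_score

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Each algorithm/metric pair is one experiment; real projects also vary
# data transformations and hyperparameters, so the space grows quickly.
models = {
    "logistic_regression": LogisticRegression(max_iter=5000),
    "random_forest": RandomForestClassifier(n_estimators=100),
}
metrics = {"accuracy": accuracy_score, "f1": f1_score}

for model_name, model in models.items():
    model.fit(X_train, y_train)
    preds = model.predict(X_test)
    for metric_name, metric in metrics.items():
        print(f"{model_name} / {metric_name}: {metric(y_test, preds):.3f}")
```

Even this tiny grid produces four results to record; add data transformations and hyperparameter values and the count multiplies fast.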

Problems Without Automation

Without automated tracking, ML workflows become chaotic.

  • Hard to track experiments and results
  • Difficult to reproduce and share findings
  • Wasted time and resources

Logging

Logging requirements depend on the project. Common elements include (see the sketch after this list):

  • Code - Version of scripts used
  • Configuration - Experiment settings and environment details
  • Data - Source, type, format, preprocessing steps
  • Model - Parameters, hyperparameters, and evaluation metrics
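As a concrete illustration, here is a hedged sketch of how these elements might be logged with MLflow (one of the tools listed later). Every name and value below is illustrative, not a prescription:

```python
# A sketch of logging the common elements with MLflow; all names,
# hashes, and values here are hypothetical placeholders.
import mlflow

with mlflow.start_run(run_name="baseline"):
    # Code: record the script version (e.g., a git commit hash).
    mlflow.set_tag("git_commit", "abc1234")  # hypothetical hash
    # Configuration: experiment settings and environment details.
    mlflow.log_params({"random_seed": 42, "python_version": "3.11"})
    # Data: source, format, and preprocessing steps.
    mlflow.log_params({"data_source": "train.csv", "scaling": "standard"})
    # Model: hyperparameters and evaluation metrics.
    mlflow.log_param("n_estimators", 100)
    mlflow.log_metric("accuracy", 0.93)  # illustrative value
```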

Logging ensures:

  • Reproducibility - Others can verify and rerun experiments
  • Performance tracking - Compare models and improve results
  • Transparency - Understand decisions and detect issues

Automated Tracking Systems

Automated tracking systems organize and display the logged data, letting you (see the sketch after this list):

  • View training metadata and compare runs
  • Track and reproduce experiments
  • Access via dashboards or programmatic interfaces
  • Promote top models to the model registry
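Here is a sketch of the programmatic side, again using MLflow. It assumes runs like the ones logged above already exist and that the backend supports a model registry (e.g., a database-backed tracking server); the experiment and model names are hypothetical:

```python
# A sketch of comparing runs and promoting the best model with MLflow;
# experiment and registered-model names are hypothetical.
import mlflow

# Compare runs: fetch all runs in an experiment, best accuracy first.
runs = mlflow.search_runs(
    experiment_names=["Default"],
    order_by=["metrics.accuracy DESC"],
)
print(runs[["run_id", "metrics.accuracy"]].head())

# Promote the top run's model to the model registry.
# Assumes each run logged its model under the "model" artifact path.
best_run_id = runs.loc[0, "run_id"]
mlflow.register_model(
    model_uri=f"runs:/{best_run_id}/model",
    name="churn-classifier",  # hypothetical registered-model name
)
```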

Available Tools

Building a tracking system from scratch is complex. Fortunately, several options already exist on the market:

  • Open-source: MLflow, Sacred
  • Commercial: Weights & Biases, Azure ML, Google Vertex AI, AWS SageMaker

Choosing the right tool depends on budget and requirements.