
Automated Tracking

Updated May 15, 2023

ML is Experimental

Machine learning is experimental by nature and involves trial and error. Engineers test different approaches to find the best model:

  • Modify data sources (e.g., applying transformations)
  • Train models with different algorithms
  • Choose evaluation metrics to compare performance

With so many possible combinations, manually tracking every experiment is impractical.
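To see why manual tracking breaks down, here is a small sketch that enumerates the combinations of the three choices above. The option lists are hypothetical; real projects have far more of each.

```python
from itertools import product

# Hypothetical choices for each dimension of an experiment.
transforms = ["none", "normalize", "log-scale"]
algorithms = ["logistic_regression", "random_forest", "gradient_boosting"]
metrics = ["accuracy", "f1", "roc_auc"]

# Every combination is a distinct experiment to run and record.
experiments = list(product(transforms, algorithms, metrics))
print(len(experiments))  # 3 * 3 * 3 = 27 runs to track by hand
```

Even this toy grid yields 27 runs; add a few more data sources or hyperparameters and the count grows multiplicatively.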

Problems Without Automation

Without automated tracking, ML workflows become chaotic.

  • Hard to track experiments and results
  • Difficult to reproduce and share findings
  • Wasted time and resources

Logging

Logging requirements depend on the project. Common elements include:

Component       Description
Code            Version of scripts used
Configuration   Experiment settings and environment details
Data            Source, type, format, preprocessing steps
Model           Parameters, hyperparameters, and evaluation metrics
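As a minimal sketch, one logged run covering the four components above could be a plain dictionary serialized to JSON. All field names and values here are illustrative, not a fixed schema.

```python
import json
import platform

# Illustrative run record; keys mirror the components listed above.
run_record = {
    "code": {"script": "train.py", "git_commit": "abc1234"},  # hypothetical values
    "configuration": {
        "random_seed": 42,
        "python_version": platform.python_version(),
    },
    "data": {
        "source": "data/train.csv",  # hypothetical path
        "format": "csv",
        "preprocessing": ["drop_nulls", "standardize"],
    },
    "model": {
        "algorithm": "random_forest",
        "hyperparameters": {"n_estimators": 100, "max_depth": 8},
        "metrics": {"accuracy": 0.91, "f1": 0.88},  # example numbers
    },
}

# Persisting the record as JSON keeps it shareable and easy to diff.
print(json.dumps(run_record, indent=2))
```

Storing every run in this shape is exactly what makes the benefits below possible.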

Benefits:

Benefit                Description
Reproducibility        Others can verify and rerun experiments
Performance tracking   Compare models and improve results
Transparency           Understand decisions and detect issues

Automated Tracking Systems

Automated Tracking Systems organize and display logged data.

  • View training metadata and compare runs
  • Track and reproduce experiments
  • Access via dashboards or programmatic interfaces
  • Promote top models to the model registry
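As a sketch of the idea (not any specific tool's API), a tracker stores run metadata, compares runs on a chosen metric, and promotes the best one to a registry:

```python
class ExperimentTracker:
    """Toy in-memory tracker; real systems persist runs and serve dashboards."""

    def __init__(self):
        self.runs = []      # logged training metadata
        self.registry = []  # promoted models

    def log_run(self, name, params, metrics):
        self.runs.append({"name": name, "params": params, "metrics": metrics})

    def compare(self, metric):
        # Sort runs so the best score on the given metric comes first.
        return sorted(self.runs, key=lambda r: r["metrics"][metric], reverse=True)

    def promote_best(self, metric):
        best = self.compare(metric)[0]
        self.registry.append(best)
        return best


tracker = ExperimentTracker()
tracker.log_run("run-1", {"max_depth": 4}, {"f1": 0.84})
tracker.log_run("run-2", {"max_depth": 8}, {"f1": 0.88})
best = tracker.promote_best("f1")
print(best["name"])  # run-2
```

The tools below provide the same workflow with persistent storage, dashboards, and multi-user access.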

Available Tools

Building a tracking system from scratch is complex, so most teams pick an existing tool:

  • Open-source: MLflow, Weights & Biases, Sacred
  • Paid solutions: Azure ML, Google Vertex AI, AWS SageMaker

Choosing the right tool depends on budget and requirements.