Skip to main content

Retraining the Model

Updated May 12, 2023 ·

Overview

Retraining is essential to keep the model updated and accurate as data changes over time.

  • Changes in data over time

    • Data naturally changes, affecting model accuracy.
    • Refreshing data helps adjust the model to new patterns.
  • Drift in data

    • Changes in data or relationships can affect model predictions.

Types of Drift

There are two main types of drifts:

  • Data drift

    • Change in the input data (e.g., new customer demographics).
    • May affect performance, but not always.
  • Concept drift

    • Change in how input data relates to the target variable.
    • Can reduce model accuracy if patterns change.

How Often to Retrain?

Retraining frequency depends on multiple factors.

  • Business environment

    • Fast-changing industries need more frequent retraining.
    • Market condition changess may need quicker adjustments.
  • Cost of retraining

    • Requires resources and money, especially for complex models.
    • High costs need to be weighed against model performance gains.
  • Business requirements

    • High accuracy standards may necessitate more frequent updates.
    • If performance drops below thresholds, retraining is needed.

Retraining Methods

When retraining, you can choose how to update the model.

  • Separate model for new data

    • Train a new model only on the latest data.
    • Good for handling shifting data but may miss older trends.
  • Combine new and old data

    • Merge both data sets to create a new model.
    • Maintain context with past data while adapting to changes.

Automatic Retraining

Retraining can be automated based on drift detection.

  • Set thresholds for data or concept drift to trigger retraining.
  • Reduces manual intervention by adjusting the model when needed.
  • Useful in systems with frequent updates and fast-changing environments.