In the past, developing traditional predictive models took so much time and effort that, once deployed, they were often used for years before being refreshed. Over that time, predictive accuracy would atrophy. As conditions changed, the gap would widen between the data the models were trained on and the data they were analyzing in the real world.
Today, with machine learning and embedded analytics platforms, models can be refreshed much more frequently. Retesting and updates happen in months or even weeks.
How do you make sure your predictive analytics features continue to perform as expected after launch? Follow these guidelines to maintain and enhance predictive analytics over time:
Know When It’s Time to Refresh
Application teams can use these three common methods for determining when to initiate model retesting and updating:
- Seasonal. In many industries, such as retail and hospitality, customer behavior changes seasonally. It makes sense to refresh predictive models just before these cyclic patterns start to shift.
- Measurement-based. By measuring model accuracy at frequent, random points in time, you’ll pick up early signs of a predictive falloff. If your model was performing with 80 percent accuracy at launch and it’s now at only 70 percent, that’s a strong signal that the patterns the model learned from its training data no longer match the new data coming in. Perhaps business conditions are changing, or a new customer behavioral trend is emerging.
- Activity-based. Events such as a new product launch or a marketing campaign can change customer behavior. You can get ahead of those changes by scheduling a proactive model refresh as part of new product go-to-market plans or campaign strategies.
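The measurement-based method can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed implementation: the function name `needs_refresh` and the 5-point tolerance `max_drop` are assumptions made for the example, and the sample data is invented to mirror the 80 percent versus 70 percent scenario above.

```python
def needs_refresh(baseline_accuracy, recent_outcomes, max_drop=0.05):
    """Return (accuracy, refresh_flag) for a sample of scored predictions.

    recent_outcomes: list of (predicted, actual) pairs.
    max_drop: hypothetical tolerance chosen for this sketch.
    """
    if not recent_outcomes:
        raise ValueError("need at least one scored prediction")
    correct = sum(1 for predicted, actual in recent_outcomes if predicted == actual)
    accuracy = correct / len(recent_outcomes)
    # Flag a refresh when accuracy has fallen further than the tolerance allows.
    return accuracy, (baseline_accuracy - accuracy) > max_drop


# Invented sample: 7 of 10 recent predictions were correct,
# checked against an 80 percent accuracy baseline from launch.
sample = [(1, 1)] * 4 + [(0, 0)] * 3 + [(1, 0)] * 2 + [(0, 1)]
accuracy, refresh = needs_refresh(0.80, sample)
print(f"accuracy={accuracy:.2f}, refresh={refresh}")
```

In practice the tolerance would depend on how costly a wrong prediction is for the application, and the sample would need to be large enough for the measured accuracy to be meaningful.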
Boost Predictive Performance Over Time
Updates not only prevent accuracy from backsliding, they can also boost performance going forward. The beauty of machine learning is that as algorithms analyze more and more of your customers’ data over time, they generate smarter and smarter predictive models with each refresh.
Depending on your application, it may be useful to measure additional dimensions of model performance, which could suggest ways to improve further.
Sensitivity measures the number of correct “yes” predictions as a percentage of actual “yes” outcomes. It is an important measure for a healthcare application, where the model might, for example, be predicting patients at high risk for cancer. A 15 percent error rate—where the model predicts NO CANCER for patients who actually get cancer—could prevent crucial screening and early intervention treatments.
Specificity measures the number of correct “no” predictions as a percentage of actual “no” outcomes. Usually there is a tradeoff between sensitivity and specificity. So if sensitivity is more important for your application, you can increase it by accepting lower specificity (for example, by lowering the score cutoff at which the model predicts “yes”), and vice versa.
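To make the two measures and their tradeoff concrete, here is a small Python sketch. It assumes the model outputs a score between 0 and 1 and predicts “yes” when the score meets a cutoff. The function name and the scores are made up for illustration.

```python
def sensitivity_specificity(scores, actuals, threshold):
    """Compute (sensitivity, specificity) at a given score cutoff.

    scores: model scores in [0, 1]; actuals: 1 for "yes", 0 for "no".
    """
    tp = fn = tn = fp = 0
    for score, actual in zip(scores, actuals):
        predicted = 1 if score >= threshold else 0
        if actual == 1 and predicted == 1:
            tp += 1  # correct "yes"
        elif actual == 1:
            fn += 1  # missed "yes"
        elif predicted == 0:
            tn += 1  # correct "no"
        else:
            fp += 1  # false alarm
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one actual yes and one actual no")
    return tp / (tp + fn), tn / (tn + fp)


# Invented scores for four actual "yes" cases and six actual "no" cases.
scores = [0.9, 0.8, 0.6, 0.4, 0.7, 0.5, 0.3, 0.2, 0.1, 0.05]
actuals = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]

high_cutoff = sensitivity_specificity(scores, actuals, 0.65)
low_cutoff = sensitivity_specificity(scores, actuals, 0.35)
print(high_cutoff)  # sensitivity 0.5, specificity about 0.83
print(low_cutoff)   # sensitivity 1.0, specificity about 0.67
```

Lowering the cutoff from 0.65 to 0.35 catches every actual “yes” in this toy data, at the price of more false alarms among the actual “no” cases.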
One way to make such refinements is through data sampling methods, which are commonly used to correct data imbalance problems. These same techniques can also be used to refine model performance. In a churn model, for example, you could increase the model’s sensitivity by down-sampling the number of data points in the majority class (will not churn) or up-sampling the number of data points in the minority class (will churn).
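The two sampling approaches can be sketched with Python’s standard library alone. The `resample` helper, its parameters, and the customer rows below are hypothetical, and balancing the classes to an exact 1:1 ratio is an assumption of this sketch rather than a requirement.

```python
import random


def resample(rows, label_of, minority_label, method="down", seed=0):
    """Balance two classes by down-sampling the majority or up-sampling the minority."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    minority = [r for r in rows if label_of(r) == minority_label]
    majority = [r for r in rows if label_of(r) != minority_label]
    if not minority or not majority:
        raise ValueError("both classes must be present")
    if method == "down":
        # Keep a random subset of the majority class, matching the minority size.
        majority = rng.sample(majority, min(len(minority), len(majority)))
    elif method == "up":
        # Add randomly repeated minority rows until the classes are the same size.
        extra = max(len(majority) - len(minority), 0)
        minority = minority + rng.choices(minority, k=extra)
    else:
        raise ValueError("method must be 'down' or 'up'")
    balanced = majority + minority
    rng.shuffle(balanced)
    return balanced


# Invented data: 10 customers who churned and 90 who stayed.
rows = [(f"cust{i}", "churn" if i < 10 else "stay") for i in range(100)]

down = resample(rows, lambda r: r[1], "churn", method="down")
up = resample(rows, lambda r: r[1], "churn", method="up")
print(len(down), len(up))  # 20 rows after down-sampling, 180 after up-sampling
```

Resampling is normally applied to the training data only, so that accuracy, sensitivity, and specificity are still measured on data with the real-world class mix.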
How to Adapt to Changing Business Requirements
To maintain and improve model performance, application teams should build in periodic re-training. But that may just be the start. Any time you want to use your model in a new way—such as in a market where customer behavior is likely to be different from the data you originally used for training and validation—you’ll need to re-train your models.
For instance, consider a model you trained using historical data from the East Coast. If you now want to roll your predictive analytics features out to customers in other parts of the country, you’ll need to re-train with data from those regions. The same goes for adapting models to different customer segments, or to a new product or line of business.
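One way to see how far a new region departs from the original training data is to score a labeled sample from each region and compare accuracy side by side. The sketch below assumes you already have predictions and actual outcomes tagged by region. The function name, region labels, and numbers are invented for illustration.

```python
from collections import defaultdict


def accuracy_by_segment(records):
    """Return accuracy per segment for (segment, predicted, actual) records."""
    totals = defaultdict(int)
    correct = defaultdict(int)
    for segment, predicted, actual in records:
        totals[segment] += 1
        if predicted == actual:
            correct[segment] += 1
    return {segment: correct[segment] / totals[segment] for segment in totals}


# Invented results: the model was trained on "east" data and is being
# tried out on a sample of "west" customers.
records = (
    [("east", 1, 1)] * 8 + [("east", 1, 0)] * 2
    + [("west", 1, 1)] * 6 + [("west", 0, 1)] * 4
)
by_region = accuracy_by_segment(records)
print(by_region)
```

A clear gap between regions, as in this toy data, would support re-training with data from the new region before rolling the feature out there.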
Remember, predictive analytics is not a static report that describes the past. It uses machine learning to predict future outcomes based on what it can learn from the past. Predictive analytics is only as good as the underlying model and the most recent data, which is generated by people with choices and habits that change frequently. Best-in-class organizations constantly monitor their predictive analytics, both to improve the models and to find new opportunities to lift business performance, and it shows in their financial success.