How do you maintain the performance of Machine Learning models?
- sarah61533
- May 26

There is a boom in the use of Machine Learning by companies around the world. However, these models need regular monitoring to prevent any dip in performance. Erwan Scornet, Senior Lecturer at the Centre for Applied Mathematics at the École Polytechnique in Paris, describes the measures needed to keep these models in optimal condition.
Maintaining the performance of a model: A constant challenge
Machine Learning is based on collecting data and using it to make predictive analyses. “An algorithm must first be trained on data before it can be applied,” summarizes Erwan Scornet. “Once the model has been trained, new data can then be collected over time and analysed by this fully-trained model.”
However, the performance of that model can decline over time, for two main reasons. The first is the appearance of data very different from that used during the training phase. For example, if the training was based on data about apartments in Paris and the model is subsequently used to predict house prices in the United States, the gap between the two types of data is too great to achieve a good level of performance.
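To make this concrete, here is a minimal sketch, not taken from the interview, of how such a mismatch between the training data and the newly collected data might be flagged in code. It assumes the data sits in pandas DataFrames and uses a two-sample Kolmogorov-Smirnov test from SciPy; the property-related feature names are purely illustrative.

```python
# Sketch: flag features whose distribution in the new data differs markedly
# from the training data, a simple proxy for the "very different data" problem
# described above. Feature names and thresholds are hypothetical.
from scipy.stats import ks_2samp

def detect_drift(train_df, new_df, features, p_threshold=0.01):
    """Return the features whose distributions appear to have shifted."""
    drifted = []
    for feature in features:
        # Two-sample Kolmogorov-Smirnov test: a small p-value suggests the
        # new data no longer looks like the data the model was trained on.
        statistic, p_value = ks_2samp(train_df[feature], new_df[feature])
        if p_value < p_threshold:
            drifted.append((feature, statistic))
    return drifted

# Example usage with hypothetical property features:
# drifted = detect_drift(train_df, new_df, ["surface_m2", "price_per_m2", "rooms"])
# if drifted:
#     print("Possible data drift on:", drifted)
```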
The second reason, far more difficult to correct, is a change in the market itself. The situation being modelled with Machine Learning could change, due to a financial crisis or a pandemic, for example. “Generally speaking, performance instability is fairly unavoidable,” says Erwan Scornet. “In theory, to prevent it, you would have to make predictions based on similar data to the data used to train the algorithm, while also making sure there was little variation in the underlying conditions over time. But in the property market, those conditions are unlikely to occur.”
Detecting and anticipating change
Though difficult to avoid, changes in a model’s accuracy can at least be detected and corrected. “One solution is to monitor the model’s predictive performance and to make regular updates. You can assess how far the model is from the true state of the market by comparing its predictions with the data collected every month or two, for example.” Collecting data on a regular basis is therefore a key point. “If you let six months go by, the model could be completely adrift of the actual market, meaning a significant loss of performance,” Erwan Scornet warns. “In the property sector, it’s good to monitor price trends on a monthly basis, using online databases of sales and rentals.”
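As a rough illustration of what such a monthly check could look like in code (the interview does not prescribe any particular tooling), the sketch below assumes a regression model with a scikit-learn-style predict method; the error metric, threshold, and object names are assumptions.

```python
# Sketch of the monitoring loop described above: each month, score the model
# on freshly collected data and compare the error with the level observed at
# training time. Names (model, baseline_error, tolerance) are illustrative.
import numpy as np

def monthly_error(model, X_new, y_new):
    """Mean absolute percentage error of the model on this month's data."""
    predictions = model.predict(X_new)
    return np.mean(np.abs(predictions - y_new) / y_new)

def needs_update(model, X_new, y_new, baseline_error, tolerance=0.20):
    """Flag the model for retraining if its error has drifted too far above
    the error measured on the validation set at training time."""
    current_error = monthly_error(model, X_new, y_new)
    return current_error > baseline_error * (1 + tolerance)

# Example usage (hypothetical objects):
# if needs_update(model, X_month, y_month, baseline_error=0.08):
#     print("Model is adrift of the market: schedule a retraining run.")
```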
That said, this indicator alone is not sufficient, as it can hide major disparities depending on the data. As in the earlier example, the model may be very good at valuing Paris apartments, but much less so with houses in the United States. A more detailed analysis, drawing on specific information, is needed. “When it comes to mortgage lending, we don’t want the prediction to depend on a borrower’s gender,” Erwan Scornet points out. “However, biases arise because of the dataset on which the algorithm has been trained. So, you need to regularly monitor not only the prediction error of the algorithm, but also the influence of so-called sensitive variables, to avoid any discrimination. Keeping these influences in check is vital if people are to feel comfortable using Machine Learning, but the task becomes more difficult as the number of sensitive variables increases.”
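In code, such a check on sensitive variables could look roughly like the sketch below, which breaks the error down by group and compares average predictions across groups. The column names and data objects are hypothetical, and a real fairness audit would go well beyond these two numbers.

```python
# Sketch: monitor not just the overall error, but also whether predictions
# differ systematically across groups defined by a sensitive variable.
# Column names (e.g. "gender") and data objects are hypothetical.
import numpy as np
import pandas as pd

def error_by_group(y_true, y_pred, sensitive):
    """Mean absolute error broken down by the sensitive attribute."""
    df = pd.DataFrame({"error": np.abs(np.asarray(y_true) - np.asarray(y_pred)),
                       "group": np.asarray(sensitive)})
    return df.groupby("group")["error"].mean()

def prediction_gap(y_pred, sensitive):
    """Difference in average prediction between groups; a large gap can
    indicate that the sensitive variable is influencing the output."""
    df = pd.DataFrame({"pred": np.asarray(y_pred),
                       "group": np.asarray(sensitive)})
    means = df.groupby("group")["pred"].mean()
    return means.max() - means.min()

# Example usage (hypothetical mortgage data):
# print(error_by_group(y_true, y_pred, applicants["gender"]))
# print(prediction_gap(y_pred, applicants["gender"]))
```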
There is no substitute for the human touch
For companies, the main constraint on improving the usability of Machine Learning is money. “To prevent the algorithm from replicating and accentuating any of these biases – thereby creating errors – you need human intervention,” explains Erwan Scornet. Companies therefore need to create new positions for people who assess the performance of their Machine Learning models and collect data on a regular basis.
To meet that demand, new roles are now emerging, such as the MLOps (Machine Learning Operations) engineer. “It’s a mixture of DevOps and the data scientist’s role,” according to Erwan Scornet. The MLOps engineer complements the work of the data architect, who builds the infrastructure for storing the data, and the data scientist, who creates the Machine Learning model. In short, the role of MLOps is to continuously monitor and retrain models to maintain their performance.