How much savings will we achieve if we improve forecasting accuracy by 10%?
— Supply Chain VPs around the world
Improving forecasting accuracy should help your inventory quality: you will have a better idea of what, when, and how much to produce. Ultimately, better forecasts mean higher service levels, lower inventory levels, less dead stock, and fewer expedites and redeployments. Unfortunately, supply chains are too complex to be approximated by a few equations. So, as explained in our previous article, no simple formula can tell you how much you will save thanks to better forecasts (or the extra service level or inventory reduction you can expect).
Instead, at SupChains, we advise running inventory simulations (also known as digital twins) to measure the impact of an improved forecasting engine (or forecasting process).
How Does a Digital Twin Work?
A digital twin is an elaborate data-driven simulation of your supply chain replicating its core processes. Running such a simulation allows you to test what-if scenarios using historical data. Using these, you can assess the impact of specific model refinements (such as investing in a new forecasting model) or business changes (such as opening a warehouse).
To answer the question "How much savings will we achieve if we improve forecasting accuracy by 10%?", we can replay specific time sequences (we advise simulating multiple years) to simulate the impact of using machine learning forecasting instead of simplistic models. We will use this historical data to replicate how your supply chain would have reacted to different forecasts. As these simulations reproduce your inventory policies (powered by different forecasts), they will track would-have-been service levels, shortages, and stock levels (including excess inventory).
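The replay idea can be illustrated with a deliberately minimal sketch. This is not SupChains' simulation engine; it assumes a single product, weekly buckets, a fixed lead time, and an order-up-to policy, with the forecasting engine under test supplied as a `forecast(week, horizon)` function (all hypothetical names).

```python
# Minimal digital-twin replay sketch (illustrative only).
# Assumptions: one product, weekly buckets, fixed lead time,
# order-up-to policy, and a user-supplied `forecast` function.

def replay(demand, forecast, lead_time=2, up_to_weeks=4):
    """Re-play historical demand and track would-have-been KPIs."""
    on_hand, pipeline = 0.0, [0.0] * lead_time
    served = shortages = stock_sum = 0.0
    for week, actual in enumerate(demand):
        on_hand += pipeline.pop(0)        # receive the order placed lead_time ago
        sales = min(on_hand, actual)      # serve what the stock allows
        served += sales
        shortages += actual - sales
        on_hand -= sales
        stock_sum += on_hand
        # order up to the forecast over the risk horizon
        target = sum(forecast(week, h) for h in range(up_to_weeks))
        position = on_hand + sum(pipeline)
        pipeline.append(max(0.0, target - position))
    return {"fill_rate": served / sum(demand),
            "avg_stock": stock_sum / len(demand),
            "shortages": shortages}
```

Running this loop once per candidate forecast model over the same historical demand yields comparable would-have-been fill rates and stock levels, which is exactly the comparison described above.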
How Important Is Forecasting Accuracy?
To assess the impact of forecasting accuracy on business metrics (such as service level and inventory turns), we set up our simulation engine with a real dataset with around 7000 products from a pharmaceutical distributor. (You can read more about this dataset and the supply chain in our previous case study.)
Using our digital twin, we simulated how inventory levels would have fluctuated in 2022 if the distributor had used one of our forecast engines to replenish its stock. In this realistic simulation, we used actual order values and generated new forecasts and supply orders every week using historical data available at the time.
In the simulation, we compared three forecasts: MA, ML1, and ML2. Before looking at the business results, let’s compare these three forecasts:
- Forecast MA was generated using a moving average of historical sales values.
(I like to use moving averages as benchmarks as they are simple to use, and they often beat statistical engines — often to the surprise of planners and software vendors.)
- Forecasts ML1 and ML2 were both generated using different machine learning algorithms. You can read more about our models here.
As you can see in the previous table, we compared our three forecasts using various metrics (Bias, MAE, RMSE, and WMAPE). Long story short, I only advise tracking MAE and Bias. WMAPE is simply inappropriate, and RMSE doesn’t scale well to product portfolios. (You can read more about these pros and cons in my new book, Demand Forecasting Best Practices.)
As explained in my book, I like combining MAE and Bias into a single “Score” value for a more straightforward discussion.
In short, ML1 reduced the score compared to the benchmark by 16%, and ML2 reduced it by 23%.
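These KPIs can be computed in a few lines. Note that the exact weighting behind the book's "Score" is not given here; as a simple illustrative proxy, the sketch below adds the scaled MAE to the absolute scaled bias.

```python
# Illustrative forecast KPIs. The exact "Score" weighting from the book
# may differ; here we simply add scaled MAE and absolute scaled Bias.

def forecast_kpis(demand, forecast):
    errors = [f - d for f, d in zip(forecast, demand)]
    avg_demand = sum(demand) / len(demand)
    mae = sum(abs(e) for e in errors) / len(errors) / avg_demand  # scaled MAE
    bias = sum(errors) / sum(demand)          # positive = over-forecasting
    return {"MAE%": mae, "Bias%": bias, "Score": mae + abs(bias)}
```

For example, `forecast_kpis([100, 120, 80, 100], [110, 110, 90, 100])` yields a scaled MAE of 7.5%, a bias of 2.5%, and thus a score of 10%.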
Our inventory framework can test up to 23 different inventory models. But, for this analysis, we decided to stick to the usual safety stock model:
- Order up to level = Forecast over risk horizon + Safety stocks
- Safety stocks = zσ√(L+R), where z is the service-level factor, σ is the forecast error (estimated by its RMSE), L is the lead time, and R is the review period.
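The policy above translates directly into code. This sketch assumes normally distributed forecast errors (which is what the z-factor implies) and expresses the lead time and review period in weeks; the parameter values are placeholders, not those used in the case study.

```python
from math import sqrt
from statistics import NormalDist

# Order-up-to policy sketch, assuming normally distributed forecast
# errors. z = service-level factor, sigma = RMSE of the forecast error,
# L = lead time and R = review period (both in weeks).

def order_up_to_level(forecast_per_week, sigma, service_level=0.95,
                      lead_time=2, review_period=1):
    z = NormalDist().inv_cdf(service_level)     # e.g. ~1.64 for 95%
    risk_horizon = lead_time + review_period    # L + R
    safety_stock = z * sigma * sqrt(risk_horizon)
    return forecast_per_week * risk_horizon + safety_stock
```

With a forecast of 100 units/week, an RMSE of 20, and a 95% target over a 3-week risk horizon, the order-up-to level lands at roughly 357 units (300 of cycle stock plus about 57 of safety stock).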
You can see below the overall result of our digital twin. We presented these results by drawing an efficient frontier between service and inventory levels. (Reading this graph will tell you how much service level you can achieve with a given average weekly inventory coverage.)
In the following table, we tracked, for each forecasting model, how much inventory coverage (expressed as a number of weeks) was required in the simulation to achieve a specific service level.
As you can see, ML1 can deliver the same overall fill rate as the benchmark but with 6.6% less inventory. ML2 provides an even higher 9.6% inventory reduction. The relative inventory reduction is particularly high for low fill rates, with a slight decline for higher service level targets.
The following table shows how much fill rate you would achieve with a specific global average weekly stock coverage. For the same inventory coverage, ML1 enjoys a 9.5% shortage reduction compared to the benchmark, whereas ML2 delivers a 14.0% reduction. As you can see, the relative shortage reduction is fairly consistent despite a slight decrease with high stock coverages.
We summarized the results in the table below.
Overall — and the results are consistent for both ML1 and ML2 — each 1% forecast error reduction results in a 0.4% inventory reduction or a 0.6% shortage reduction. This supports the idea that a good forecasting score (high accuracy and low bias) results in strong, positive business outcomes. So we can now answer our initial question: How much can we expect from a 10% forecasting improvement? The effect would be around a 4% inventory reduction or a 6% shortage reduction. Of course, these numbers could change based on your products' forecastability, lead times, and various other factors (MOQ and MOV, among others), so they should be confirmed for each supply chain.
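The arithmetic behind this answer is a simple linear rule of thumb. The helper below hard-codes the two elasticities observed in this single simulation (0.4% inventory, 0.6% shortage per 1% forecast-score improvement); as stressed above, these ratios should be re-estimated for your own supply chain before being used.

```python
# Back-of-the-envelope impact estimate using the elasticities observed
# in this case study. These ratios come from one dataset and one policy;
# re-estimate them with your own digital twin before relying on them.

def expected_impact(forecast_improvement_pct,
                    inventory_ratio=0.4, shortage_ratio=0.6):
    return {
        "inventory_reduction_pct": forecast_improvement_pct * inventory_ratio,
        "shortage_reduction_pct": forecast_improvement_pct * shortage_ratio,
    }
```

For the 10% improvement discussed above, this returns roughly a 4% inventory reduction or a 6% shortage reduction.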
The next question would be,
Is a 10% forecasting improvement a realistic outcome for an improvement project?
Our recent detailed case studies should give you some ideas of what to expect: