Who is better at forecasting: AI or Humans?
As AI becomes ever smarter, a natural question arises: who is better at demand forecasting, AI or humans?
In practice, demand forecasts are either largely human-driven or largely machine-driven, and each approach has its own strengths and weaknesses.
What's Good and Bad about Human Forecasts?
Human forecasts are expert opinions about the future, drawing on instinct and experience rather than just a cold review of historical data. This approach can be highly valuable in volatile environments where external factors, such as economic downturns or geopolitical conflicts, can rapidly shift demand. Human forecasters can incorporate market insights from Sales, Marketing, or Key Customers, which is crucial in such unpredictable conditions.
However, human forecasts are not without flaws. Bias is an inherent part of human judgment, leading to overly optimistic predictions, inconsistent adjustments to new information, and even politically charged manipulation. For example, forecasts might be inflated to meet sales quotas, left insufficiently revised after losing a significant deal, or padded during the 'judgment' phase to bridge gaps versus the budget (what we would like to see happen, as opposed to what is most likely to happen).
What's Good and Bad about Machine Forecasts?
In contrast, machine forecasts are based on extrapolating historical data, or on modeling the impact of historical driver data on historical sales and then constructing a future demand forecast from forecasted or planned driver values. Machine models, whether purely extrapolative or driver-based, offer a scalable approach that is free from gaming. Machines can analyze vast amounts of data across numerous product-location-customer combinations and can handle both linear and non-linear relationships among a large number of drivers. Driver-based demand models also enable sophisticated "what-if" scenario simulation for manual or machine-driven demand shaping, helping Demand Planners and Commercial Teams understand the full spectrum of likely outcomes through sensitivity analysis, or find optimal values for sales prices, trade promotions, and so on.
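As a minimal sketch of what a driver-based model and a "what-if" scenario can look like (the column names, linear model, and numbers here are illustrative assumptions, not any specific engine's implementation):

```python
# A minimal, illustrative driver-based demand model: fit demand against
# historical driver data, then run a "what-if" price/promotion scenario.
# Column names, values, and the linear model are assumptions for illustration.
import pandas as pd
from sklearn.linear_model import LinearRegression

history = pd.DataFrame({
    "price":      [10.0, 10.0, 9.5, 9.0, 10.5, 9.5],
    "promo_flag": [0,    1,    1,   0,   0,    1],
    "units_sold": [100,  140,  155, 120, 90,   150],
})

model = LinearRegression().fit(history[["price", "promo_flag"]],
                               history["units_sold"])

# What-if scenario: plan a 5% price cut with a promotion next period.
scenario = pd.DataFrame({"price": [9.5 * 0.95], "promo_flag": [1]})
print(f"Simulated demand: {model.predict(scenario)[0]:.0f} units")
```

A real engine would of course use far richer driver sets and non-linear models, but the pattern is the same: fit demand to drivers, then feed planned driver values through the model to simulate outcomes.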
However, machine forecasts come with their own set of challenges. They rely on the assumption that the past can be extrapolated into the future, or that historical relationships between demand drivers and sales will continue to hold. They also need comprehensive, clean data sets. AI/ML-based machine forecasts require Data Science expertise, given the significant ongoing effort in hyperparameter tuning needed to avoid model drift. And machine forecasts are unlikely to respond effectively to black swan events, such as trade routes disrupted by hostile action.
What's wrong with Human Judgment applied to Machine Forecasts?
Demand Planning processes have usually been designed to start from a baseline statistical forecast, let humans apply judgment to it, and then manually blend all the human-judged forecasts across sales, marketing, and demand planning into a single one-number consensus demand (the forecast review and approval phase). This approach looks solid in theory, but in practice it quickly breaks down, because humans can freely introduce bias as they override the baseline statistical forecast. Forecast Value Added was introduced as a concept to stem this value leakage by identifying where overrides destroy value. Again, this looks sound in theory; the challenge is that any overrides made and retained in last cycle's consensus demand are ignored, because the baseline statistical forecast is generated fresh each cycle. This necessitates a painful process of re-inserting the same overrides just to get back to the consensus demand agreed in the past cycle. Given the redundant manual effort involved, many companies simply provide the baseline statistical forecast as a reference point, but default the prior cycle's human forecasts into the current cycle, which risks the statistical forecast being largely ignored.
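At its core, Forecast Value Added just compares error before and after a human touch. A hypothetical sketch (the MAPE-based error metric is an assumption; FVA can be defined on any error measure):

```python
# Hypothetical Forecast Value Added (FVA) check: did the human override
# reduce error relative to the baseline statistical forecast?
import numpy as np

actuals  = np.array([100, 120, 90, 110])
baseline = np.array([ 95, 130, 90, 100])   # statistical forecast
override = np.array([105, 125, 80, 115])   # human-judged forecast

def mape(forecast, actual):
    return np.mean(np.abs(forecast - actual) / actual)

fva = mape(baseline, actuals) - mape(override, actuals)
print(f"FVA: {fva:+.1%}")  # positive = the override added value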
What's the Best Practice to leverage both Humans and AI?
To truly leverage the strengths of both human and machine forecasts, it’s crucial to integrate them rather than treating them as separate entities or sequential steps.
What is needed is a process where the baseline statistical forecast is generated as an optimal blend of current-cycle machine forecasts and last-cycle human forecasts. This forecast retains all the goodness of human overrides (where humans have a track record of making value-adding overrides) and ignores human bias (where overrides are trivial, within the noise range, or where the override track record has not been helpful). Every stakeholder then receives a starting point that is the best possible blend of human and machine intelligence, yet remains free to judge and enrich the forecast before the consensus demand review process.
How Optimal Blending significantly reduces both forecast error and planner effort
The idea is to combine the current-cycle machine forecast and the previous-cycle human forecast simultaneously in an optimal blend, which dynamically allocates weights to each machine or human forecast for a given product/location/customer/lag by optimizing an n-period moving average error or bias over the training part of history (in-sample data). The blend's performance is then tested on the validation part of history (holdout data), and the validated model is used to make predictions about the future, as sketched below.
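A simplified sketch of the blending idea for a single product/location/customer/lag series; the coarse grid search, the 3-period window, and the toy data are illustrative simplifications of what a production engine would do:

```python
# Simplified optimal-blend sketch: pick convex weights over one machine
# and one human forecast that minimize an n-period moving average of
# absolute error on the training slice, then validate on holdout data.
import numpy as np

actuals = np.array([100, 110, 105, 120, 115, 125, 130, 128, 135])
machine = np.array([ 98, 108, 110, 118, 110, 120, 126, 124, 132])  # current-cycle stat forecast
human   = np.array([105, 115, 100, 125, 118, 130, 133, 131, 138])  # last-cycle human forecast
train, holdout = slice(0, 6), slice(6, 9)
n = 3  # moving-average window

def ma_error(weight, sl):
    blend = weight * machine[sl] + (1 - weight) * human[sl]
    err = np.abs(blend - actuals[sl])
    # n-period moving average of absolute error
    return np.mean(np.convolve(err, np.ones(n) / n, mode="valid"))

# Coarse grid search over the machine weight (a real engine would optimize
# per product/location/customer/lag and over many candidate forecasts).
weights = np.linspace(0, 1, 101)
best = min(weights, key=lambda w: ma_error(w, train))
print(f"Best machine weight: {best:.2f}, "
      f"holdout MA error: {ma_error(best, holdout):.2f}")
```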
This approach is flexible enough to include any number of machine forecasts (traditional extrapolative forecasts such as exponential smoothing, advanced Machine Learning or Deep Learning forecasts such as Neural Networks, or demand-driver-based forecasts) and any number of human forecasts (customer forecasts, sales forecasts, marketing forecasts, demand planner forecasts, demand planning manager judged forecasts, consensus demand, etc.).
You need a state-of-the-art engine that can perform this optimal blending at all hierarchy-level combinations (product family by customer group vs. product group by customer region, etc.) and all time buckets (daily, weekly, monthly, or quarterly forecasts) to autonomously choose the best level in terms of forecastability while still retaining all the goodness of human overrides that have helped in the past.
How to accept human wisdom and reject human bias: guardrails for human overrides
You can now offer this current-cycle optimal forecast as a starting point for human enrichment. Because the human intelligence from the last cycle is already baked in (unlike a traditional statistical baseline, where it is not), overrides can be limited to where new information has been gathered since the last cycle. Humans remain free to override the optimal forecast baseline, but the value added by each override is tracked (the error reduction of the human forecast relative to the optimal baseline). Decision support is provided as recommended override directions and ranges based on each stakeholder's track record of making value-added overrides. You then have the option of automatically flagging overrides that run opposite to the recommended direction or exceed the recommended override bands.
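A hypothetical guardrail check; in practice the recommended direction and band would be derived from each stakeholder's override track record rather than hard-coded:

```python
# Hypothetical override guardrail: flag overrides that oppose the
# recommended direction or exceed the recommended band, both of which
# would come from the stakeholder's historical override value-add.
def check_override(baseline: float, override: float,
                   recommended_direction: str,  # "up" or "down"
                   max_band_pct: float) -> list[str]:
    flags = []
    delta = override - baseline
    if recommended_direction == "up" and delta < 0:
        flags.append("override opposes recommended direction")
    if recommended_direction == "down" and delta > 0:
        flags.append("override opposes recommended direction")
    if abs(delta) / baseline > max_band_pct:
        flags.append("override exceeds recommended band")
    return flags

# A planner cuts the forecast 20% despite an "up" recommendation and a 10% band.
print(check_override(baseline=1000, override=800,
                     recommended_direction="up", max_band_pct=0.10))
# -> both flags raised
```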
This approach reduces frivolous and incorrect overrides, and lets the optimal blending model self-learn and self-correct, since human overrides are themselves one of the demand drivers. This lean forecasting approach retains all the goodness of human input while limiting the damage from human bias (value-destroying overrides, e.g., inflating the forecast to bridge the gap to a sales quota or the annual operating plan). It also enables understanding of forecast performance across stakeholder groups (Forecast Value Added trends by lag).
Mirror, Mirror on the Wall: how to internally benchmark Human and Machine Performance against Forecastability (great/good/bad/ugly)
Such a system should also establish internal benchmarks in the form of Forecastability Bands (the lowest error possible vs. the highest error acceptable) and visualize the relative performance of any human or machine forecast (Forecast Value Added or Relative Absolute Error) at any lag against those bands.
Forecast Value Added, Forecastability Bands, and revenue/margin-based segmentation, combined, are the best way to understand performance, set validated targets for accuracy improvement, and focus planner effort where the balance between economic value and accuracy-improvement headroom is right.
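One common convention (an assumption here, not necessarily the exact definition used above) is to compute Relative Absolute Error against a naive last-value forecast and place it within assumed forecastability bands:

```python
# Illustrative benchmark: Relative Absolute Error (RAE) versus a naive
# last-value forecast, compared against assumed forecastability bands.
import numpy as np

actuals  = np.array([100, 110, 105, 120, 115, 125])
forecast = np.array([102, 108, 109, 117, 116, 122])
naive    = np.roll(actuals, 1)[1:]          # previous period's actual

rae = (np.sum(np.abs(forecast[1:] - actuals[1:]))
       / np.sum(np.abs(naive - actuals[1:])))

# Assumed bands: lowest error possible vs. highest error acceptable.
best_possible, worst_acceptable = 0.5, 1.0
verdict = ("great" if rae <= best_possible else
           "good" if rae <= worst_acceptable else "bad/ugly")
print(f"RAE: {rae:.2f} -> {verdict}")
```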
We have tested the predictive power of the vyan.ai engine's ensemble forecasting approach on large data sets and seen significant value from its ability to identify complex relationships across a large number of variables and forecasts. We have found that optimizing the ensemble model on an n-period moving average error/bias, as opposed to period-specific error/bias, delivers a stable, self-correcting, high-performance model that produces much better forecasts than any of the underlying human or machine forecasts.
What's the hard benefit of Optimally Blending Human and Machine Intelligence?
Now you can autonomously predict shifting demand patterns at scale, provide override guidance to your demand planners, and significantly reduce effort and error as you move to an AI-powered lean forecasting process. Vyan.ai delivers all of this: it enhances process efficiency and forecast quality, reduces the hard costs of forecast error (inventory carrying costs, excess inventory liquidation costs, production and transportation expediting costs), and increases revenue, margin, and market share (fewer lost sales, higher customer service levels and trust). There is significant ROI and improved stakeholder satisfaction on the table: 10-30% forecast error reduction and a 3-10% reduction in the cost of goods sold attributable to poor forecasting.