27 Feb My Prediction for Predicting
Many of our clients have been on the “Operational Data Journey” for a number of years. They have reached a point where they are gathering accurate, timely data from across the value chains of their organisations, delivering excellent insights and driving improvement. They continually update their Data Analytics in line with changes in their business, and are expanding staff access to Analytics so as to maximise the number of data workers with true insight into performance drivers.
The next logical step is to look at Advanced Analytics so as to predict business needs and outcomes. We see companies attempting to develop budgets and forecasts based upon an in-depth knowledge of performance over the last number of years. We see hospitals trying to predict attendance, admissions and lengths of stay, based upon predictor variables they have determined from studying data from a number of years’ records.
But how accurate, and therefore valuable, can these predictions be? What is the risk associated with relying upon predictive analytics?
Perhaps the worst example in recent history was the prediction of the likelihood that an earthquake of magnitude 9.0 would occur in the Fukushima region of Japan (from Nate Silver’s “The Signal and the Noise”).
The Gutenberg-Richter law describes a linear relationship between earthquake magnitude and the logarithm of frequency. It would predict a 9.0 magnitude earthquake happening once every 300 years in Japan, which is not an insignificant risk.
However, above magnitude 7.5, there’s a kink in the graph, and with no magnitude 8.0 earthquake in the region since 1864, it is unclear how best to model the largest events.
Gutenberg-Richter would imply that the kink should be ignored and that future data will fall in line with the prediction. Seismologists instead deployed a characteristic fit, chosen to describe the historical frequencies.
The result was a prediction that a magnitude 9.0 earthquake would occur only once in 13,000 years, hence the decision to build the Fukushima plant to withstand up to magnitude 8.6, with catastrophic consequences.
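To see how much the choice of model matters, here is a minimal sketch in Python. It assumes the usual form of the Gutenberg-Richter law, log10(N) = a - b*M, with a slope b of 1 (a commonly quoted typical value, not a figure taken from the Fukushima studies), and calibrates a so that a magnitude 9.0 event recurs every 300 years as quoted above. The 40-year planning horizon and the Poisson assumption are illustrative choices, not part of the original analysis.

```python
import math

# Assumption: slope b = 1.0, a commonly quoted typical value.
B_VALUE = 1.0
# From the article: Gutenberg-Richter implies one magnitude 9.0 event per 300 years.
RETURN_PERIOD_M9 = 300.0
# Calibrate the intercept a so the law reproduces that figure.
A_VALUE = math.log10(1.0 / RETURN_PERIOD_M9) + B_VALUE * 9.0


def annual_rate(magnitude):
    """Expected events per year at or above this magnitude: log10(N) = a - b*M."""
    return 10 ** (A_VALUE - B_VALUE * magnitude)


def return_period(magnitude):
    """Average number of years between events at or above this magnitude."""
    return 1.0 / annual_rate(magnitude)


def chance_within(years, period):
    """Chance of at least one event in `years`, assuming events arrive as a Poisson process."""
    return 1.0 - math.exp(-years / period)


HORIZON_YEARS = 40  # hypothetical planning horizon
models = [
    ("Gutenberg-Richter", return_period(9.0)),
    ("Characteristic fit", 13_000.0),  # figure quoted in the article
]
for label, period in models:
    chance = chance_within(HORIZON_YEARS, period)
    print(f"{label}: 1 in {period:,.0f} years, {chance:.1%} chance within {HORIZON_YEARS} years")
```

Under these assumptions, the chance of a magnitude 9.0 earthquake within 40 years is roughly 12.5% on the straight-line model against roughly 0.3% on the characteristic fit. The same history, modelled two ways, gives very different answers.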
In most of our businesses and organisations, the downside of a wrong prediction will never be so catastrophic. However, the consequences of persistently inaccurate predictions will include a loss of faith in data and, eventually, an end to the use of predictions.
Therefore, it’s vital to understand the limitations of predictions and to build in and accept a degree of uncertainty. A paediatric hospital in New South Wales predicts the average length of stay of its patients to within 1 day, with 80% accuracy. An NHS Trust uses an algorithm to predict attendance at A&E to help rostering, but with an inbuilt safety mechanism to guard against under-staffing.
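One way to build in that uncertainty can be sketched in Python. The attendance figures below are invented purely for illustration, and the buffer rule (rostering to cover a chosen share of past shortfalls) is an assumption about how such a safety mechanism might work, not the method used by either organisation.

```python
import math


def within_tolerance_rate(predicted, actual, tolerance):
    """Share of predictions that landed within `tolerance` of the actual value."""
    hits = sum(1 for p, a in zip(predicted, actual) if abs(p - a) <= tolerance)
    return hits / len(predicted)


def safety_buffer(predicted, actual, coverage_pct):
    """Smallest buffer that would have covered `coverage_pct` percent of past days.

    Uses the nearest-rank quantile of historical under-predictions.
    """
    shortfalls = sorted(max(a - p, 0) for p, a in zip(predicted, actual))
    rank = math.ceil(coverage_pct * len(shortfalls) / 100)
    return shortfalls[rank - 1]


# Invented daily A&E attendance figures, for illustration only.
predicted = [200, 210, 190, 220, 205, 195, 215, 225, 200, 210]
actual = [195, 230, 188, 240, 200, 205, 210, 260, 198, 215]

accuracy = within_tolerance_rate(predicted, actual, tolerance=10)
buffer = safety_buffer(predicted, actual, coverage_pct=90)
print(f"Within 10 attendances on {accuracy:.0%} of days")
print(f"Roster for forecast + {buffer} to cover 90% of past shortfalls")
```

The point of the design is that the forecast and its error history are reported together: the roster is driven by the prediction plus a margin sized from how wrong the prediction has been before.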
But perhaps the biggest benefit we’ve seen from projects we’ve been involved in has been the determination of the most significant predictor variables. This allows organisations to differentiate between the signal and the noise in the vast arrays of data they are processing and trying to decipher.
Understanding the most significant predictor variables allows organisations to focus on them, and to spend less effort on the extraneous ones. For example, is competitive pricing or brand value the most accurate predictor of FMCG sales? Is attendance at school the most significant predictor of grades?
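A very simple first pass at this kind of question is to rank candidate predictors by the strength of their correlation with the outcome. The sketch below does that in Python with invented weekly figures and hypothetical variable names (`price_discount`, `brand_score`). Correlation is only a screening step chosen here for brevity; it is not the technique used in the projects mentioned above, and it says nothing about causation.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


def rank_predictors(outcome, candidates):
    """Sort candidate predictors by absolute correlation with the outcome, strongest first."""
    scores = [(name, pearson(values, outcome)) for name, values in candidates.items()]
    return sorted(scores, key=lambda item: abs(item[1]), reverse=True)


# Invented weekly figures, for illustration only.
sales = [100, 120, 90, 150, 130, 110]
candidates = {
    "price_discount": [5, 10, 2, 20, 12, 7],
    "brand_score": [60, 58, 62, 61, 59, 63],
}

ranked = rank_predictors(sales, candidates)
for name, score in ranked:
    print(f"{name}: correlation {score:+.2f}")
```

With this toy data the discount tracks sales far more closely than the brand score does, which is the kind of signal-versus-noise distinction described above.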
It will be interesting to assess the data arising from retail sales in the run-up to and after Storm Emma. Should bakers have predicted that a forecast of -10°C and snow showers would be sufficient to empty stores nationwide?
So my Prediction for Predicting is…?
The key benefit arising is an understanding of the predictor variables. What’s really driving the outcomes we see in a business? How can we address and optimise them? My worry relates to over-confidence in predictions and a lack of awareness of the remaining uncertainty. An 80% prediction of an occurrence is of huge benefit, but only so long as there remains the awareness that there’s still a 1 in 5 chance of being incorrect.
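That 1 in 5 compounds quickly when several decisions rest on predictions of the same quality. A minimal Python sketch, assuming each prediction is right or wrong independently of the others (a simplification that will not always hold in practice):

```python
def chance_of_any_miss(accuracy, decisions):
    """Chance that at least one of `decisions` independent predictions is wrong."""
    return 1.0 - accuracy ** decisions


for decisions in (1, 5, 10):
    chance = chance_of_any_miss(0.8, decisions)
    print(f"{decisions} decision(s) at 80% accuracy: {chance:.0%} chance of at least one miss")
```

At 80% accuracy, the chance of at least one wrong call rises from 20% for a single decision to about 67% across five and about 89% across ten.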
To find out more about Advanced Analytics, come to the Capventis Event, “Whiskey and Data: Chasing Time”, in Teeling Whiskey Distillery on Wednesday the 14th of March. To book your place today, click here.