Strategy and Policy

Having a strategy is key to developing a roadmap for all government divisions.

The example below indicates the thinking that goes into a strategy document when it comes to prediction. The same thinking can be applied to Cloud and Data services as well.

Strategy and Policy Services

Cost modelling for delivery and ongoing costs

Business case development

Use case development

Data strategy

Resourcing

Better data, models and computers are at the core of progress in prediction. To understand the value, let us look at a long-standing prediction (i.e. forecasting) problem: what marketers call customer churn.

For many businesses, customers are expensive to acquire, so losing them through churn is costly. In service industries like insurance, financial services and telecommunications, managing churn is perhaps the most important marketing activity. The first step in reducing churn is identifying customers at risk. Historically, the core method for predicting churn was a statistical technique called regression, and research focused on improving it: hundreds of different regression methods were proposed and tested in academic journals and in practice.

What is regression?

Regression finds a prediction based on the average of what has occurred in the past. Before deep learning, multivariate regression provided an efficient way to condition a prediction on multiple data points. Regression takes the data and finds the estimate that minimises prediction mistakes on average, maximising goodness of fit; it punishes large errors more than small ones. It is a powerful method on small datasets. Unlike regression, deep learning predictions might be wrong on average, but when the predictions miss, they often don't miss by much. Throughout the 1990s and early 2000s, experiments with deep learning had limited success: deep learning models were improving, but regression performed better.
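
As a minimal illustrative sketch, and not the method used in any of the studies mentioned here, the example below fits a multivariate linear regression by least squares to a synthetic churn dataset: the coefficients are chosen to minimise the average squared prediction error, so large misses are punished more than small ones. The variable names and the data itself are hypothetical.

```python
import numpy as np

# Hypothetical, synthetic "churn" data for illustration only.
rng = np.random.default_rng(0)
n = 500
tenure = rng.uniform(1, 60, n)         # months as a customer (assumed variable)
complaints = rng.poisson(1.0, n)       # recent complaints (assumed variable)
propensity = 1 / (1 + np.exp(-(1.5 - 0.05 * tenure + 0.8 * complaints)))
churned = rng.binomial(1, propensity)  # observed outcome: 1 = churned, 0 = stayed

# Multivariate regression: pick coefficients that minimise the average
# squared prediction error, which punishes large errors more than small ones.
X = np.column_stack([np.ones(n), tenure, complaints])
beta, *_ = np.linalg.lstsq(X, churned, rcond=None)
predicted_risk = X @ beta              # an average of past outcomes for similar customers

print("coefficients:", beta)
print("mean squared error:", np.mean((churned - predicted_risk) ** 2))
```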

The data and computers weren't good enough to take advantage of what deep learning could do. By the early 2000s that had all changed: the best churn models used deep learning and generally outperformed the others.

What Changed?

First, data and computers became good enough to enable deep learning to evolve and dominate. In the 1990s it was difficult to build large enough datasets: a classic study of churn used 650 data points and fewer than 30 variables.

By 2004, computer processing and storage had improved. In the Duke tournament (a data science tournament to predict churn, run by the Duke University Teradata Center in 2004), the training dataset contained information on hundreds of variables and tens of thousands of data points. With these additional data points and variables, deep learning started to perform as well as, if not better than, regression. Today, researchers base churn predictions on thousands of variables and millions of data points, and improvements in deep learning make it possible to include enormous amounts of data, including images.

Improvements to machine learning models, and deep learning in particular, mean that it is now possible to efficiently turn the available data into accurate predictions of churn, and deep learning models clearly dominate regression and various other techniques.
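
The comparison below is an illustrative sketch only, using synthetic data rather than the Duke tournament dataset or any real churn data. It fits a regression baseline and a small neural network to a dataset with many variables and many data points; at this kind of scale, and with non-linear structure in the data, the neural network typically matches or beats the regression. The library choices and all parameters are assumptions made for the sketch.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

# Synthetic stand-in for a modern churn dataset: many rows, many variables.
rng = np.random.default_rng(1)
n, d = 50_000, 200
X = rng.normal(size=(n, d))
# Churn depends on the variables in a non-linear way (assumed relationship).
logit = X[:, 0] * X[:, 1] + np.sin(X[:, 2]) + 0.5 * X[:, 3]
y = rng.binomial(1, 1 / (1 + np.exp(-logit)))

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Regression baseline: strong when datasets are small and relationships are simple.
reg = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# A small neural network: benefits from more data points, more variables,
# and non-linear structure.
net = MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=200, random_state=0)
net.fit(X_train, y_train)

print("regression AUC :", roc_auc_score(y_test, reg.predict_proba(X_test)[:, 1]))
print("neural net AUC :", roc_auc_score(y_test, net.predict_proba(X_test)[:, 1]))
```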