Mind+Machine. Vollenweider Marc
the variability of possible definitions, so there is a margin of error to the map, although I believe that the order of magnitude is not too far off.
Figure I.2 Demographics of Use Cases
This map illustrates my first key point: big data is a relatively small part of the analytics world. Let's take a look at the main results of this assessment of the number of use cases.
1. Globally, there are a staggering estimated one billion implementations of primary use cases, of which about 85 percent are in B2B and about 15 percent in B2C companies. A primary use case is defined as a generic business issue that needs to be analyzed by a business function (e.g., marketing, R&D) of a company in a given industry and geography. An example could be the monthly analysis of the sales force performance for a specific oncology brand in the pharmaceutical industry in Germany. Similar analyses are performed in pretty much every pharmaceutical company selling oncology drugs in Germany.
2. Around 30 percent of companies require high analytics intensity and account for about 90 percent of the primary analytics use cases. International companies with multiple country organizations and global functions and domestic companies with higher complexity are the main players here.
3. The numbers increase to a staggering 50 to 60 billion use cases globally when looking at secondary implementations, which are defined as micro-variations of primary use cases throughout the business year. For example, slightly different materials or sensor packages in different packaging machines might require variant analyses, but the underlying use case of “preventive maintenance for packaging machines” would still remain the same. While not a precise science, this primary versus secondary distinction will be very relevant for counting the number of analytics use cases in the domain of Internet of Things and Industry 4.0. A simple change in sensor configurations might lead to large numbers of completely new secondary use cases. This in turn would cause a lot of additional analytics work, especially if not properly managed for reuse.
4. Only an estimated 5 to 6 percent of all primary use cases really require big data and the corresponding methodologies and technologies. This finding is completely contrary to the image of big data in the media and public perception. While the number of big data use cases is growing, it can be argued that the same holds true for small data use cases.
The conclusion is that data analytics is mainly a logistical challenge rather than just an analytical one. Managing the growing portfolios of use cases in sustainable and profitable ways is the true challenge and will remain so. In meetings, many executives tell us that they are not leveraging the small data sets their companies already have. We've seen that about 94 percent of use cases are really about small data. But do they provide lower ROI because they are based on small data sets? The answer is no, which again is totally contrary to the image portrayed in the media and the sales pitches of big data vendors.
Let me make a bold statement that is inevitably greeted by some chuckles during client meetings: “Small data is beautiful, too.” In fact, I would argue that the average ROI of a small data use case is much higher due to the significantly lower investment. To illustrate my point, I'd like to present Subscription Management: “The 800 Bits Use Case,” which I absolutely love as it is such an extreme illustration of the point I'm making.
Using just 800 bits of HR information, an investment bank saved USD 1 million every year, generating an ROI of several thousand percent. How? Banking analysts use a lot of expensive data from databases paid for through individual seat licenses. After bonus time in January, the musical chairs game starts and many analyst teams join competitor institutions, at which point their seat licenses should be canceled. In this case, this process step simply did not happen, as nobody thought about sending the corresponding instructions to the database companies in time. Therefore, the bank kept unnecessarily paying about USD 1 million annually. Why 800 bits? Clearly, whether someone is employed (“1”) or not (“0”) is a binary piece of information called a “bit.” With 800 analysts, the bank had 800 bits of HR information. The analytics rule was almost embarrassingly simple: “If no longer employed, send email to terminate the seat license.” All that needed to happen was a simple search for changes in employment status in the employment information from HR.
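To make the rule concrete, here is a minimal sketch in Python of how such a check might look. It is an illustration only: the record layout, the names `SeatLicense`, `licenses_to_cancel`, and `cancellation_email`, and the sample data are all assumptions, since the bank's actual implementation is not described here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SeatLicense:
    """Hypothetical record linking one analyst to one paid database seat."""
    analyst_id: str
    database: str


def licenses_to_cancel(previous, current, licenses):
    """Return the licenses of analysts whose employment bit flipped from 1 to 0.

    `previous` and `current` map analyst IDs to True (employed) or False.
    An analyst missing from `current` is treated as no longer employed.
    """
    departed = {
        analyst_id
        for analyst_id, was_employed in previous.items()
        if was_employed and not current.get(analyst_id, False)
    }
    return [lic for lic in licenses if lic.analyst_id in departed]


def cancellation_email(lic):
    """Build the text of the instruction to the database company (illustrative wording)."""
    return (
        f"Please terminate the seat license for analyst {lic.analyst_id} "
        f"on {lic.database}."
    )


# Made-up sample data: one employment bit per analyst, before and after bonus time.
previous_status = {"A001": True, "A002": True, "A003": True}
current_status = {"A001": True, "A002": False}  # A003 no longer appears in HR data
seat_licenses = [
    SeatLicense("A001", "MarketDataX"),
    SeatLicense("A002", "MarketDataX"),
    SeatLicense("A003", "ResearchDB"),
]

for lic in licenses_to_cancel(previous_status, current_status, seat_licenses):
    print(cancellation_email(lic))
```

The point of the sketch is how little is needed: two snapshots of the employment bits from HR, the list of seat licenses from procurement, and a single comparison that joins the two.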
The amazing thing about this use case is that it required only some solid thinking, linking a bit of employment information with the database licenses. Granted, not every use case is as profitable as this one, but years of experience suggest that good thinking combined with the right data can create a lot of value in many situations.
This use case illustrates another important factor: the silo trap. Interesting use cases often remain unused because data sets are buried in two or more organizational silos, and nobody thinks about joining the dots. We will look at this effect again later.
Summing up the first fallacy: not everything needs to be big data. In fact, far more use cases are about small data, and the focus should be on managing portfolios of profitable analytics use cases regardless of what type of data they are based on.
FALLACY #2
MORE DATA MEANS MORE INSIGHT
Companies complain that they have far more data than insight. In 2014, the International Data Corporation (IDC) predicted that the amount of data available to companies will increase tenfold by 2020, doubling every two years.4 In one conversation, a client compared this situation to a desert with the occasional oasis of insight.
“Marc, we are inundated with reports and tables, but who'll give me the ‘so what’? I don't have enough time in the day to study all the data, and my junior people don't have the experience to come up with interesting insights.”
The ratio seems to worsen daily, as the amount of available data rises rapidly, while the level of insight remains constant or increases only slightly. The advent of the Internet of Things puts us at risk of making this ratio even worse, with more devices producing more data.
As the Devex consulting practice writes:
Stanley