Data Management: a gentle introduction. Bas van Gils
copies of data assets as you like without affecting the original. If this were the case for physical assets then we would all be as rich as Croesus for sure. This property of data is important in chapters to come when we talk about storing, using, transferring, and managing data.
■ 2.3 DATA AND PROCESS
This brings me to the final part of this chapter: the relationship between data and process. It is safe to say that data does not magically spring into existence. On the contrary: creating data takes effort by business professionals, for example by adding data into computer systems or by manipulating existing data to create new data.
The fact that we are not so (consciously) aware of this is not surprising. Years ago – before the computer era – a lot of our data sat in paper files and records. Creating data meant getting in there and updating the files. More data meant more paper. More paper meant more space required to store the data. This, eventually, led to bigger and bigger libraries¹. In the computer age this is different: most data is now stored digitally and adding more bits and bytes requires very little extra physical space.
Producing data in business processes is useful in itself. Things become more interesting when we consider where else that data can be put to good use to create value. Example 5 illustrates this point.
Example 5. Data and processes
Suppose you work at a company that leases expensive medical equipment to hospitals. Each time the company closes a new deal with a hospital, its records are updated (new data is added to its systems). The value of this data is that it proves that the transaction took place and that the company is owed a certain fee for each deal.
The data is likely to be used in other parts of the company as well. For example, sales and marketing representatives are interested in the data to investigate whether they can cross-sell insurance products with the newly leased equipment, whilst management will be interested in monthly sales reports to see how well the company is doing.
This example illustrates a point that I cannot make enough: there is a strong relationship between business processes and data (see e.g. [BRS19] for a recent discussion of this topic, bridging the gap between research and practice). Data without use in processes has no value. Processes without data cannot happen: if processes are the value creation engine of the organization, then data is its fuel. As a corollary of this discussion, this book will also have much to say about processes and not just about data.
Data can only be used if it is of the right quality and can be found. The former point is easily understood: just as poor materials will likely lead to the construction of a poor physical asset, so poor data leads to poor process performance. The latter point requires a bit more explanation. The general thinking seems to be: our data is stored in our systems and we know which systems we have – so how hard can it be to find our data? Example 6 shows that in practice this may not be as easy as it seems.
Example 6. Finding data
Let’s go back to the library case that was mentioned previously. Libraries are structured in such a way that, by and large, it should be straightforward to find the books and articles that you need. In the old days this was done through extensive cataloguing, classification, and index systems. These days all of this is automated¹. It is true that in most organizations all data is stored electronically in systems. In theory it should be easy to find. However, do you have any idea how many systems your organization has for storing data about customers or products? Chances are there are dozens! Finding the right information for use is one of the key challenges for many organizations.
1 If you want to know more about information retrieval, consider reading e.g. [Pai99] - which also has a good historical overview.
The point that this example tries to make is that data is often dispersed across many systems, which makes it harder to locate the right data for the right person doing his/her job at the right time. This, in turn, shows that the value of data depends on more than it being a correct representation of the real world: being able to use it in processes in a timely manner might be just as important. If your data is “correct” but it can’t be found in time to be used in a process then, in fact, its value is very low, or even zero.
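For readers who like to see this in a more tangible form, the following is a minimal sketch in Python of what dispersed data looks like. Everything in it is an assumption made for illustration: the three system names, the customer identifiers, and the fields are invented and do not describe any real organization.

```python
# Hypothetical systems, each holding only part of the data about customers.
# All names, identifiers, and fields are invented for illustration.
SYSTEMS = {
    "crm": {"C-001": {"name": "Example Hospital", "contact": "j.doe@example.org"}},
    "billing": {"C-001": {"monthly_fee": 12500}},
    "support": {"C-002": {"open_tickets": 3}},
}


def find_customer_data(customer_id, systems):
    """Return, per system, whatever that system holds about one customer."""
    return {
        system_name: records[customer_id]
        for system_name, records in systems.items()
        if customer_id in records
    }


# Data about customer C-001 turns out to live in two of the three systems.
found = find_customer_data("C-001", SYSTEMS)
print(sorted(found))
```

Even in this toy setting, someone who needs the full picture of a customer has to know which systems to ask. With dozens of systems, and without a shared identifier, the same lookup quickly becomes a real effort.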
■ 2.4 VISUAL SUMMARY
1 An interesting overview of the history of libraries can be found in [Mur09].
Synopsis - This chapter picks up where the previous chapter left off: if data is an important asset, then it should be managed as such. In this chapter, I will briefly introduce the Data Management Body of Knowledge (DMBOK), the reference work on data management upon which part I of this book is based. I will use this as a backdrop to discuss some of the key challenges for data management. The challenges are illustrated with small examples.
■ 3.1 A DEFINITION OF DATA MANAGEMENT
In the previous chapter, I discussed the concept of data as an asset to signify the importance of data for an organization. We pick up the discussion with a claim: if data is such an important asset to the organization, then it should be managed as such. This is the realm of data management.
Simply put, data management (DM) is the capability that is concerned with managing data as an asset. This definition is still somewhat vague and requires further clarification. In [AB13], Peter Aiken points out that “any holistic examination of the information technology field will reveal that it is largely about technology – not about information”. We begin by stating that data management is largely about putting the “I” back in “IT”. This observation shows that DM is not solely an IT capability.
Sidebar 2. Interview with Marc van den Berg (summer 2019)
Many organizations are currently experiencing challenges with data due to past decisions and are paying the price because of the investment they have to make to fix their data after the fact. At the same time, these organizations want to make a quantum leap forward and reap the benefits from new technologies such as big data and artificial intelligence. This will not work, as first you must have your house in order. In my view this means: make sure you have shared goals about what you want to achieve with data, and subsequently align business and IT to attain those goals.
Marc van den Berg is managing director of IT and Innovation at PGGM, a Dutch pension provider.
It appears that in most organizations there is no longer a real, meaningful difference between “the business side of the organization” and “the IT side of the organization”, at least not in the classic sense of business/IT alignment literature from the 1980s and 1990s [PB89, HV93]. With the rise of process automation, digital/digitalization we see that the two perspectives are now intertwined to such a degree that the distinction is fading rapidly (see e.g. [