Data Management: a gentle introduction. Bas van Gils
each other. Let me offer two examples to illustrate:
■ One of the key processes in IT management is incident management. Through this process, organizations attempt to ensure that IT services are restored as soon as possible after an incident. This process is very similar to data quality issue management (see chapter 16). Given how closely related data and systems are, it may make sense to align these two processes.
■ One of the key considerations in systems development is user interface design. Traditionally, this discipline focuses on making sure that user interfaces are easy to use and that the interaction between user and system allows work to be performed effectively. From a data management perspective, this would include such aspects as consistent use of language, intuitive use/ergonomics, and ensuring that there are “guard rails” in place that prevent users from entering incorrect input that the system will not be able to process correctly.
Here, too, it is safe to conclude that, in practice, both disciplines are important and tightly linked.
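To make the “guard rails” from the second example a little more concrete, here is a minimal sketch of input validation at the boundary between user interface and system. The names PersonInput and parse_person, the required fields, and the date format are assumptions chosen for illustration only; they do not come from any particular system.

```python
from dataclasses import dataclass
from datetime import date, datetime


@dataclass(frozen=True)
class PersonInput:
    """Validated form data (hypothetical structure for illustration)."""
    first_name: str
    last_name: str
    birth_date: date


def parse_person(first_name: str, last_name: str, birth_date: str) -> PersonInput:
    """Check raw form input and raise ValueError with a user-facing message."""
    if not first_name.strip():
        raise ValueError("First name is required.")
    if not last_name.strip():
        raise ValueError("Last name is required.")
    try:
        # Assumed convention: dates are entered as YYYY-MM-DD.
        parsed = datetime.strptime(birth_date, "%Y-%m-%d").date()
    except ValueError:
        raise ValueError("Birth date must be in the form YYYY-MM-DD.") from None
    if parsed > date.today():
        raise ValueError("Birth date cannot be in the future.")
    return PersonInput(first_name.strip(), last_name.strip(), parsed)
```

The point of the sketch is that incorrect input is rejected with a clear message before it reaches the database, which is where user interface design and data quality meet.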
■ 4.4 INFORMATION/DATA ANALYSIS
The terms information analysis, data analysis, and information management are closely related and – as with so many terms – are defined differently depending on the context and author. For example, in the Netherlands, the term information management currently has very little to do with the management of information. Instead, it tends to mean the capability to understand and manage IT requirements and the associated portfolio of required projects to implement them. The more general definition of this term is the organizational capability to manage the lifecycle of data, which is quite close to how DM is defined.
In my view, information/data analysis is a capability that operates on a completely different level of abstraction. Given the information needs of a stakeholder or group of stakeholders, the purpose of this type of analysis is to analyze the interplay between process, information/data, and systems, and to document a functional/technical design that can be used to implement these requirements through the development or adaptation of IT systems.
Many of the techniques that I will discuss in chapter 11 are also used for information/data analysis. Classic approaches that fall into this category – developed in the 1990s and still highly relevant today – are structured analysis and design [You89] and information engineering [Mar89, Mar90a, Mar90b].
■ 4.5 DATABASE MANAGEMENT
Database management is the capability that is concerned with designing, implementing, and running databases that help to make data available to the right person, at the right time. This is a fairly technical discipline and for this reason I have chosen not to give it a chapter of its own in this book.
Databases come in many shapes and forms. The relational model, developed by Codd in the 1970s, is still by far the most popular approach for structuring and storing data [Cod70, Cod79]. This model is based on the notion of mathematical relations. A relation can be seen as a table with a heading that lists the attributes of the relation (e.g. a Person relation may have First name, Last name, and Birth date as attributes) and a body consisting of tuples/rows with values that represent the population of the table (e.g. {‘Bas’, ‘van Gils’, ‘06-dec-1976’}).
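The idea of a relation as a heading plus a body of tuples can be sketched in a few lines of Python. This is an illustration only, not how a relational database is implemented: the helper functions select and project are hypothetical names, and the second person in the body is made-up example data.

```python
from typing import NamedTuple


class Person(NamedTuple):
    # The heading: the attribute names of the relation.
    first_name: str
    last_name: str
    birth_date: str


# The body: a set of tuples, so duplicate rows cannot occur.
persons = {
    Person("Bas", "van Gils", "06-dec-1976"),
    Person("Jane", "Doe", "01-jan-1980"),  # hypothetical example row
}


def select(relation, predicate):
    """Keep only the tuples for which the predicate holds."""
    return {row for row in relation if predicate(row)}


def project(relation, *attributes):
    """Keep only the named attributes of every tuple."""
    return {tuple(getattr(row, name) for name in attributes) for row in relation}


heading = Person._fields
van_gils = select(persons, lambda row: row.last_name == "van Gils")
first_names = project(van_gils, "first_name")
```

Here heading lists the three attributes, and first_names is the result of combining a selection with a projection, roughly what a query language such as SQL expresses with WHERE and a column list.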
One of the key points of the relational model is that data structures are designed a priori in such a way that they can be queried in many different ways to answer any question that people may have about the data. In other words, the “cost” of time spent in designing the data structures is balanced by the “value” of flexible querying. Data structures are rigorously designed and tend to be fairly static. Adjusting them tends to have a major impact on IT systems. A more recent development is to work with database systems where the line of reasoning is the inverse: get data into the system and do not worry too much about structuring the data a priori. Instead, the structure of the data in the database is analyzed when the system is queried. In this case, the benefit of “ease of getting data into the system” is balanced by the cost of “querying becomes a little harder”. Several types of databases fall into this category of NoSQL systems (see e.g. [RW12] for a good overview as well as the advantages/disadvantages of each).
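The contrast between the two lines of reasoning can be sketched as follows. The first half uses Python's built-in sqlite3 module as a stand-in for a relational database; the second half stores JSON documents in a plain list as a stand-in for a document-oriented NoSQL system. The table layout, the documents, and the function first_names are assumptions made for the sake of the example.

```python
import json
import sqlite3

# Structure first: the table is designed before any data arrives.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE person ("
    "first_name TEXT NOT NULL, last_name TEXT NOT NULL, birth_date TEXT NOT NULL)"
)
conn.execute(
    "INSERT INTO person VALUES (?, ?, ?)", ("Bas", "van Gils", "1976-12-06")
)
rows = conn.execute(
    "SELECT first_name FROM person WHERE last_name = ?", ("van Gils",)
).fetchall()

# Data first: documents are stored as they come, whatever their shape.
documents = [
    json.dumps({"first_name": "Bas", "last_name": "van Gils"}),
    json.dumps({"name": "Jane Doe", "email": "jane@example.org"}),  # other shape
]


def first_names(raw_documents):
    """Interpret the structure of each document at query time."""
    names = []
    for raw in raw_documents:
        doc = json.loads(raw)
        if "first_name" in doc:
            names.append(doc["first_name"])
        elif "name" in doc:
            names.append(doc["name"].split()[0])
    return names
```

Getting the second document into the list took no design effort at all, but the query function now has to cope with every shape that was ever stored. That is the trade-off described above in miniature.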
On an (even) more technical level, database management concerns decisions about how to set up the infrastructure to host databases, whether systems should have a failover option (i.e. if one system is unavailable, then the other will take over), or what the implications are of hosting the data “in the cloud.”
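As a rough illustration of the failover idea, the sketch below tries a list of database hosts in order and moves on to the next one when a connection fails. In practice this is usually handled by the database platform or its drivers rather than by hand-written code; the function query_with_failover and the run_query callback are hypothetical names used for illustration.

```python
from typing import Callable, Sequence


def query_with_failover(hosts: Sequence[str], run_query: Callable[[str], str]) -> str:
    """Run the query on the first host that is available.

    run_query is assumed to raise ConnectionError when a host is unavailable.
    """
    last_error = None
    for host in hosts:
        try:
            return run_query(host)
        except ConnectionError as error:
            # This host is down; remember why and try the next one.
            last_error = error
    raise ConnectionError("all database hosts are unavailable") from last_error
```

The order of the hosts encodes the design decision: the primary system comes first, and the standby is only used when the primary cannot be reached.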
■ 4.6 DM AND ENTERPRISE ARCHITECTURE MANAGEMENT
Enterprise architecture (EA) is a capability that considers organizations from a “big picture view”. The capability evolved from both the business/ IT alignment literature [PB89, HV93] and IT engineering/ architecture [Zac87, ISO11, The11, The16a, GD14, GD15, RWR06, RBM19].
It appears that each architecture approach uses its own definition of architecture. Most of these approaches at least relate to the definition that is presented in the ISO/IEC/IEEE 42010 standard, Systems and software engineering – Architecture description, which states that the architecture of a system is about two things: (1) the fundamental organization of that system and (2) the principles guiding the design and evolution of that system [ISO11]. A more elaborate discussion is presented in chapter 12.
The “big picture view of the enterprise” relates to the first aspect, and gives a clear overview of the relationship between key elements in the organization. Typically, this is about the “golden triangle”: business process, data, and systems. Architecture modeling languages (e.g. ArchiMate) are capable of visualizing this big picture view. One discussion that crops up frequently is: “where does ‘architecture’ stop and where do more detailed analyses (of processes, systems, and data) begin?” There is no simple answer to this question: the word “fundamental” from the definition of architecture is a subjective term. What might be fundamental for one stakeholder may be a (potentially irrelevant) detail for another. Figure 4.2 illustrates how architecture (models) are linked to more detailed designs.
Figure 4.2 From architecture to a more “detailed design”
From the perspective of enterprise architecture, data (architecture) is but one of the aspects that is to be considered. To put it differently, data architecture is considered to be a part of enterprise architecture. Switching perspectives, one could argue that data architecture (chapter 12) is but one aspect