Designing Geodatabases for Transportation. J. Allison Butler
work and increase the chance for inconsistencies.
Right about now, you may be thinking that you never want data redundancies, so why is this a big deal? The answer is that no single application may impose the need for data redundancy, but the collection of applications supported by the editing process may. For example, you could have several applications that want to know the length of a facility, some in meters and others in miles. Even if all the applications want the data in the same form, they are likely to expect the data to be stored in a data field under a specific name, such as LEN, LENGTH, or DISTANCE. Rather than store the length in all these different forms and field names within the editing database, you want to store it there once and then create the different versions needed by the various supported applications. There may also be applications that need data derived from other data. For instance, sums, averages, minimums, maximums, and counts may be employed by various applications, such as the total number of highway lane miles or the minimum length of all passing sidings located along a rail line. It is much better to have these values derived rather than enter them directly because it saves time and reduces the risk of error.
These practices mean the process of moving data from the editing to the published geodatabase will likely involve data transformations, calculation of derived fields, data replication, and other actions. But this process can be automated. In contrast, data editing is a primarily manual task. Work smarter, not harder. You will get better data with less work.
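The idea of storing a value once and deriving what each application needs can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a prescribed implementation: the record layout, the application names, and the helper functions (`convert`, `publish`, `total_lane_miles`) are all hypothetical, and the editing data is assumed to hold each facility's length once, in meters.

```python
# Hypothetical sketch: length is stored once in the editing data and the
# application-specific versions are derived when the data is published.
METERS_PER_MILE = 1609.344

# Editing records (assumed layout): one length per facility, in meters.
editing_rows = [
    {"facility_id": "R-1", "length_m": 1609.344, "lanes": 2},
    {"facility_id": "R-2", "length_m": 3218.688, "lanes": 4},
]

# Each supported application expects its own field name and unit.
# These application names are invented for illustration.
app_specs = {
    "pavement_app": {"field": "LEN", "unit": "m"},
    "planning_app": {"field": "LENGTH", "unit": "mi"},
    "routing_app": {"field": "DISTANCE", "unit": "mi"},
}


def convert(length_m, unit):
    """Convert a length in meters to the requested unit."""
    if unit == "m":
        return length_m
    if unit == "mi":
        return length_m / METERS_PER_MILE
    raise ValueError(f"unsupported unit: {unit}")


def publish(rows, spec):
    """Build the rows one application expects from the editing rows."""
    return [
        {
            "facility_id": row["facility_id"],
            spec["field"]: convert(row["length_m"], spec["unit"]),
        }
        for row in rows
    ]


def total_lane_miles(rows):
    """Derived value: sum of length in miles times number of lanes."""
    return sum(convert(row["length_m"], "mi") * row["lanes"] for row in rows)


# One automated pass produces every application's version of the data.
published = {app: publish(editing_rows, spec) for app, spec in app_specs.items()}
```

Because every published field is computed from the single stored length, a correction made once in the editing data reaches all three applications the next time the publishing step runs.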
Going back to the earlier discussion of agile methods in enterprise geodatabase design, the editing environment is typically the last one to be designed and the first to be built. It is designed last because you will not know what data must be maintained, or the geodatabase design that best supports that data, until all the application inputs are defined. It is delivered first because none of the applications that use the data can function until their inputs are provided. Assuming you cannot design and build everything at once, this chronology presents an impossible task, because the agile method assumes that an application’s final requirements evolve. As a result, the geodatabase must itself be agile.
The core concept of agility is flexibility combined with robustness. Separating the editing and usage portions of the complete enterprise database allows each to evolve independently and to use a structure optimally suited to its needs. Editing involves lots of small transactions that change the geodatabase, coupled with a strong need to coordinate edits made by different persons over time; in other words, a need to maintain database integrity. In contrast, applications involve extractions of relatively large chunks of data. Each application defines a set of data needs and imposes requirements on the geodatabase that it uses. That geodatabase should be part of the published dataset, which receives its content from the editing geodatabase. If you use the editing geodatabase directly, then your application would have to do all the heavy lifting associated with getting the data into the right form. Conversely, if you edit the application’s geodatabase directly, then the editing process has to deal with the data structure the application needs. In both cases, you have editors and users churning the same data, which can often produce surprising results because of a loss of referential integrity; i.e., differences in values across the geodatabase.
This book provides detailed instructions for how to structure and process data so that it can support applications without users having to maintain separate, duplicate copies of the data. This book is about enterprise data editing, not within a single office, but across the organization. The data it embraces is defined by other applications. The editing environment is the sum of all application data requirements, so editing geodatabases evolve more frequently than do application geodatabases. As a result, the editing geodatabase will probably need to be modified each time any application changes or is added to the list of supported work processes.
Designing Geodatabases for Transportation describes a geodatabase design process founded on content rather than specific applications. The design of the editing application is determined by the nature of the data to be edited. Thus, the solutions presented in this book follow the general structure of, “If your user needs this kind of data, then build the editing geodatabase this way.” Many of the geodatabase design principles presented here are widely applicable and need not be restricted to transportation themes. All are consistent with good data-management practices and current technology.
Book organization
This book is divided into three parts. Part 1 covers the basics of geodatabase design. Part 2 explores the various ways transportation geodatabases may be structured. Part 3 offers a variety of advanced topics on transportation geodatabase design.
As with any book intended for a wide range of readers, Designing Geodatabases for Transportation covers a lot of foundational concepts dealing with database design in general and geodatabases in particular. While it may be tempting for a more knowledgeable reader to skip the first few chapters, even the advanced data modeler should review the content of part 1 in order to be familiar with the terms and presentation employed in this book. Similarly, you may want to explore the modal chapters in part 3 related to forms of transportation not included in your own geodatabase because there may be ideas you can use.
One of the more obvious demarcations in the book is the distinction between the segmented data structures used mainly by local governments and commercial database vendors and the route-based structures used primarily by state and provincial transport agencies. Because they are conceptually less complex, design concepts more applicable to segmented data models are generally presented in earlier chapters and those concepts with greater applicability to route-based models are covered in later chapters. Do not skip the content directed to one side of this dividing line because this distinction in application is often one of convenience. Many design techniques are applicable to both basic data structures.
Some content is targeted to a specific audience. These passages will be placed in sidebars identified by one of two icons.
The building block icon identifies basic knowledge about a fundamental aspect of the topic.
The rocket icon denotes information suitable for advanced readers that describes what is happening behind the scenes, gets into the details of a topic, or offers guidance for specific tasks.
Transportation geodatabases have been difficult to construct in the past, in large part because of a lack of basic guidance on how to address the many problems presented by this unique data and the business processes it supports. Designing Geodatabases for Transportation is intended to provide basic guidance on how to construct transportation geodatabases in a manner that addresses these inherent problems.
1 Although pipelines and utility systems have a similar structure and do transport materials or energy from place to place, they are not included in this book. Other ESRI Press publications and data models address the spatial database design needs of pipelines, telecommunications, water utilities, and electric power systems.
Data modeling
• Files
• Tables