Digital Forensic Science. Vassil Roussev
At the same time, most IT systems are not specifically engineered to facilitate the forensic acquisition and analysis of their data. Therefore, there is a need to continuously develop forensic methods that keep up with the rapid growth in data volume and system complexity.
The main goal of this book is to provide a relatively brief, but systematic, technical overview of digital forensic methods, as they exist today, and to outline the main challenges that need to be addressed in the immediate future.
1.1 SCOPE OF THIS BOOK
By its nature, digital forensics is a multi-disciplinary undertaking, combining varied expertise: software developers provide the tools, investigators apply their analytical skills, and lawyers frame the goals and bounds of the investigation. Nevertheless, the almost singular focus of this book is on the technical aspects of the process—the algorithmic techniques used in the acquisition and analysis of different systems and artifacts.
In other words, the goal is to provide a computer science view of digital forensic methods. This is in sync with Fred Brooks’ thesis that the primary purpose of computer science research is to build computational tools to solve problems emanating from other domains: “Hitching our research to someone else’s driving problems, and solving those problems on the owners’ terms, leads us to richer computer science research.” [66]
This means that we will only superficially touch upon the various legal concerns, as well as issues of tool use, procedural training, and other important components of digital forensic practice. In part, this is due to the brevity of the book format, and to the high-quality coverage these topics already receive in the existing literature.
However, the primary reason is that we seek to present digital forensics from a different perspective that has been missing. It is an effort to systematize the computational methods that we have acquired over the last three decades, and put them in a coherent and extensible framework.
Target audience. The treatment of the topics is based on the author’s experience as a computer science educator and researcher. It is likely to fit better as part of a special topics course in a general computer science curriculum than as part of specialized training toward certification or a digital forensics degree.
We expect this text to be most appropriate in an advanced, or a graduate, course in digital forensics; it could also be used as supplemental material in an introductory course, as some of the topic treatment is different from other textbooks. We hope that faculty and graduate students will find it helpful as a starting point in their research efforts, and as a good reference on a variety of topics.
Non-goals. It may be useful to point out explicitly what we are not trying to achieve. Broadly, we are not trying to replace any of the established texts. These come in two general categories:
• comprehensive introduction to the profession of the forensic investigator (often used as a primary textbook in introductory courses) such as Casey’s Digital Evidence and Computer Crime [32];
• in-depth technical reference books on specialized topics of interest, such as Carrier’s classic File System Forensic Analysis [23], The Art of Memory Forensics by Ligh et al. [108], or Carvey’s go-to books on Windows [29] and registry analysis [30].
Due to the limitations of the series format, we have also chosen to forego a discussion on multimedia data and device forensics, which is a topic worth its own book, such as the one edited by Ho and Li [92].
1.2 ORGANIZATION
The book’s structure is relatively flat with almost no dependencies among the chapters. The two exceptions are Chapter 3, which should be a prerequisite for any of the subsequent chapters, and Chapter 6, which will make most sense as the closing discussion.
Chapter 2 provides a brief history of digital forensics, with an emphasis on technology trends that have driven forensic development. The purpose is to supply a historical context for the methods and tools that have emerged, and to allow us to reason about current challenges, and near-term developments.
Chapter 3 looks at the digital forensics process from several different perspectives—legal, procedural, technical, and cognitive—in an effort to provide a full picture of the field. Later, these models are referenced to provide a framework to reason about a variety of challenges, from managing data volumes to improving the user interface of forensic tools.
Chapter 4 is focused on system forensics; that is, on the types of evidentiary artifacts that are produced during the normal operation of computer systems. Most of these are operating system and application data structures, but we also discuss the emerging problem of cloud system forensics.
Chapter 5 discusses artifact forensics: the analysis of autonomous data objects, usually files, that have a self-contained representation and meaningful interpretation outside the scope of a specific computer system. These include text, images, audio, video, and a wide variety of composite document formats.
Chapter 6 is an effort to outline a medium-term research agenda that is emerging from the broader trends in IT, such as fast data growth, cloud computing, and IoT. The focus is on the difficult problems that need to be addressed in the next five years rather than on the specific engineering concerns of today, such as finding ways to get around encryption mechanisms.
CHAPTER 2
Brief History
The beginning of the modern era of digital forensics can be dated to the mid-1980s, which saw the adoption of 18 U.S.C. § 1030 [1] as part of the Comprehensive Crime Control Act of 1984 [33]. The Computer Fraud and Abuse Act of 1986 was enacted by the U.S. Congress as the first of several amendments to clarify and expand the scope of the provisions. In 1984, the FBI initiated its Magnetic Media Program [72], which can be viewed as a watershed moment in recognizing the importance of digital evidence, and the need for professionalization of the field.
Prior to that, computer professionals used ad hoc methods and tools, primarily for the purposes of data extraction and recovery after unforeseen failures and human errors; to this day, data recovery remains a cornerstone of digital forensic methodology. In the pre-1984 days, there was little effort to build a systematic body of knowledge, or specialized expertise. This is not surprising, as there was little societal need—computers were centralized time-sharing systems used by businesses, and held little information of use to the legal system. This all began to change with the massive surge in popularity of personal computers, and the introduction of dial-up networking for consumers.
2.1 EARLY YEARS (1984–1996)
The following dozen years (1984–1996) saw a rapid increase in personal computer use, along with fast growth in private network services like CompuServe, Prodigy, and AOL. This exploration period is characterized by substantial diversity of hardware and software, and saw the emergence of early de facto standard file formats (e.g., GIF [45]), most of which were poorly documented and rarely formally described [72].
Toward the end of the period, the meteoric rise in popularity of the Netscape Navigator web browser marked the tipping point for the transition to standards-based internetworking. At the same time, the combination of the Intel x86 architecture and the Microsoft Windows operating system became the dominant platform on the PC desktop. Taken together, these developments rapidly reduced platform diversity and enabled a coherent view of the digital forensic process to gradually emerge. It also became feasible to become an expert by focusing on just one platform, which (at the time) had minimal security and privacy provisions to impede the analysis. For example, one of the major forensic vendors today, AccessData, advertised itself as “leaders in cryptography and password recovery since 1987,” and offered a set of tools for that purpose [2].
In