Digital Forensic Science. Vassil Roussev
it does not, it was deleted
Table 3.2: Abstract rules for transforming A → B (A into B) based on observed changes to features, f, feature locations Loc (A, f), and feature names NAME (A, f). Note: The RENAME primitive is not strictly needed (as it can be modeled as ADDNAME followed by DELNAME), but it is useful to convey higher-level semantics ([75], Table 2).
Tampering. This is the easiest and most obvious explanation although it is not necessarily the most likely one; common examples include planting of new files with old timestamps, and system clock manipulation.
System operation. The full effects of the underlying operation, even as simple as copying a file, are not always obvious and require careful consideration. For example, the Unix cp command sets the creation time of the new copy to the time of the operation but will keep the original modification time if the -p option is used.
Time tracking errors. It has been shown [127, 167] that operating systems can introduce inconsitencies during normal operation due to rounding and implementation errors. It is worth noting that, in many cases, the accuracy of a recorded timestamp is of little importance to the operation of the system; therefore, perfection should not be assumed blindly.
Tool error is always a possibility; like all software, forensic tools have bugs and these can manifest themselves in unexpected ways.
One important practical concern is how to report the extracted feature changes. Performing a comprehensive differential analysis, for example, of two hard disk snapshots is likely to result in an enormous number of individual results that can overwhelm the investigator. It is critical that differencing tools provide the means to improve the quality and relevance of the results. This can be accomplished in a number of ways: (a) filtering of irrelevant information; (b) aggregation of the results to highlight both the common and the exceptional cases; (c) progressive disclosure of information, where users can start at the aggregate level and use queries and hierchies to drill down to the needed level of detail; (d) timelining—provide a chronological ordering of the (relevant) events.
3.3.2 COMPUTER HISTORY MODEL
Differential analysis offers a relatively simple view of forensic inference, by focusing on the beginning and end state of the data, and by expressing the difference in terms of a very small set of primitive operations. The computer history model (CHM) [27]—one of Carrier’s important contributions to the field—seeks to offer a more detailed and formal description of the process. The model employs finite state machines to capture the state of the system, as well as its (algorithmic) reaction to outside events. One of the main considerations is the development of an investigative model that avoids human bias by focusing on modeling the computation itself along with strict scientific hypothesis testing. The investigation is defined as a series of yes/no questions (predicates) that are evaluated with respect to the available history of the computation.
Primitive Computer History Model
This assumes that the computer being investigated can be represented as a finite state machine (FSM), which transitions from one state to another in reaction to events. Formally, the FSM is a quintuple (Q, Σ, δ, s0, F), where Q is a finite set of states, Σ is a finite alphabet of event symbols, δ is the transition function δ : Q × Σ → Q, s0 ∊ Q is the starting state of the machine, and F ⊆ Q is the set of final states.
The primitive history of a system describes the lowest-level state transitions (such as the execution of individual instructions), and consists of the sequence of primitive states and events that occurred.
The primitive state of a system is defined by the discrete values of its primitive, uniquely addressable storage locations. These may include anything from a CPU register to the content of network traffic (which is treated as temporary storage). As an illustration, Figure 3.1 shows an event E1 reading the values from storage locations R3 and R6 and writing to locations R3 and R4.
Figure 3.1: Primitive computer history model example: event E1 is reading the values from storage locations R3 and R6 and writing to locations R3 and R4 [27].
The primitive history is the set T containing the times for which the system has a history. The duration between each time in T, Δt, must be shorter than the fastest state change in the system. The primitive state history is function hps : T → Q that maps a time t ∊ T to the primitive state that existed at that time. The primitive event history is a function hpe : T → Σ that maps a time t ∊ T to a primitive event in the period (t – Δt, t + Δt).
The model described so far is capable of describing a static computer system; in practice, this is insufficient as a modern computing system is dynamic—it can add resources (such as storage) and capabilities (code) on the fly. Therefore, the computer history model uses a dynamic FSM model with sets and functions to represent the changing system capabilities. Formally, each of the Q, Σ, and δ sets and functions can change for each t ∊ T.
Complex Computer History Model
The primitive model presented is rarely practical on contemporary computer systems executing billions of instructions per second (code reverse engineering would be an exceptional case). Also, there is a mismatch between the level of abstraction of the representation and that of the questions that an investigator would want to ask (e.g., was this file downloaded?). Therefore, the model provides the means to aggregate the state of the system and ask questions at the appropriate level of abstraction.
Complex events are state transitions that cause one or more lower-level complex or primitive events to occur; for example, copying a file triggers a large number of primitive events. Complex storage locations are virtual storage locations created by software; these are the ephemeral and persistent data structures used by software during normal execution. For example, a file is a complex storage location and the name value attribute pairs include the file name, several different times-tamps, permissions, and content.
Figure 3.2 shows a complex event E1 reading from complex storage locations D1 and D2 and writing a value to D1. At a lower level, E1 is performed using events E1a and E1b, such as CPU, or I/O instructions. The contents of D1 and D2 are stored in locations (D1a, D1b) and (D2a, D2b), respectively.
Figure 3.2: Complex history event examples: event E1 with two complex cause locations and one complex effect location [27].
General Investigation Process
The sequence of queries pursued by the investigator will depend on the specific objectives of the inquiry, as well as the experience and training of the person performing it. The CHM is agnostic with respect to the overall process followed (we will discuss the cognitive perspective in Section 3.3.3) and does not assume a specific sequence