Statistics and the Evaluation of Evidence for Forensic Scientists. Franco Taroni
1000 times more likely that the PoI is the offender than is not the offender. Some of the difficulties associated with assessments of probabilities are discussed by Tversky and Kahneman (1974) and are further described in Section 2.5. An appropriate representation of probabilities is useful because it fits the analytic device most used by lawyers, namely, the creation of a story. This is a narration of events ‘abstracted from the evidence and arranged in a sequence to persuade the fact‐finder that the story told is the most plausible account of “what really happened” that can be constructed from the evidence that has been or will be presented’ (Anderson and Twining 1998, p. 166). Also of relevance is Kadane and Schum (1996), which provides a Bayesian analysis of evidence in the Sacco–Vanzetti case (Sacco 1969) based on subjectively determined probabilities and assumed relationships amongst evidential events. A similar approach is presented in Section 2.9.
1.3 Uncertainty in Scientific Evidence
Scientific evidence requires considerable care in its interpretation (Evett 2009). Emphasis needs to be put on the importance of asking the question: what do the results mean in this particular case? (Jackson 2000). Kirk and Kingston (1964) emphasised:
Suppose that the fibres do match – what does it mean? Suppose that there is a defined degree of similarity in the bullet marking, or the handwriting, does it prove identity of origin, or does it merely give a sometimes controversial basis for making a decision as to the identity of origin? (p. 439)
Scientists and jurists have to ‘[…] abandon the idea of absolute certainty so that a fully objective approach to the problem can be made. […] If it can be accepted that nothing is absolutely certain then it becomes logical to determine the degree of confidence that may be assigned to a particular belief’ (Kirk and Kingston 1964, p. 435). Following the same line of reasoning, the same authors (Kingston and Kirk 1964) addressed uncertainty and emphasised:
A statistical analysis is used when uncertainty must exist. If there were a way of arriving to a certain answer to a problem, statistical methods would not be used. But when uncertainty does exist, and a statistical approach is possible, then this approach is the best one available since it offers an index on the uncertainty based upon a precise and logical line of reasoning. […] It is undoubtedly true that serious errors have been made in applying incorrect statistical methods to the evaluation of physical evidence, but such misuse does not support the generalisation that statistics cannot be properly used in criminalistics at all. (p. 516)
There are various kinds of problems concerned with the random variation naturally associated with scientific observations. There are problems concerned with the definition of a suitable reference population against which concepts of rarity or commonality may be assessed. There are problems concerned with the choice of a measure of the value of the evidence.
The effect of the random variation can be assessed with the appropriate use of probabilistic and statistical ideas. There is variability associated with scientific observations. Variability is a phenomenon that occurs in many places. People are of different sexes, determination of which is made at conception. People are of different height, weight, and intellectual ability, for example. The variation in height and weight is dependent on a person's sex. In general, females tend to be lighter and shorter than males. However, variation is such that there can be tall, heavy females and short, light males. At birth, it is uncertain how tall or how heavy the baby will be as an adult. However, at birth, it is usually known whether the baby is a boy or a girl. This knowledge affects the uncertainty associated with the predictions of adult height and weight.
People are of different blood groups. A person's blood group does not depend on the age or sex of the person but does depend on the person's ethnicity. The refractive index of glass varies within and between windows. Observation of glass as to whether it is window or bottle glass will affect the uncertainty associated with the prediction of its refractive index and that of other pieces of glass, which may be thought to come from the same origin.
It may be thought that, because there is variation in scientific observations, it is not possible to make quantitative judgements regarding any comparisons between two sets of observations. The two sets are either different or they are not and there is no more to be said. However, this is not so. There are many phenomena that vary but they vary in certain specific ways. It is possible to represent these specific ways mathematically. Various probability distributions to represent variation are introduced in Appendix A. It is then possible to assess differences quantitatively and to provide a measure of uncertainty associated with such assessments.
It is useful to recognise the distinction between statistics and probability. Probability is a deductive process that argues from the general to the particular. Consider a fair coin, i.e. one for which, when tossed, the probability of a head landing uppermost and the probability of a tail landing uppermost are both 1/2. A fair coin is tossed 10 times. Probability theory enables the probability of a specified outcome, say three heads and seven tails, to be determined. The general concept of a fair coin is used to determine something about the outcome of the particular case in which it was tossed 10 times.
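The deductive calculation for the fair coin can be sketched in a few lines of Python. This is only an illustration: the function name `binomial_probability` and its parameters are hypothetical, and the sketch assumes that the tosses are independent and that only the number of heads matters, not the order in which they occur.

```python
from math import comb


def binomial_probability(heads: int, tosses: int, p_head: float = 0.5) -> float:
    """Probability of exactly `heads` heads in `tosses` independent tosses.

    Assumes each toss lands heads with the same probability `p_head`.
    """
    # Number of distinct orderings that contain exactly `heads` heads.
    orderings = comb(tosses, heads)
    # Each such ordering has the same probability.
    return orderings * p_head**heads * (1 - p_head) ** (tosses - heads)


# General assumption (a fair coin) applied to a particular case (10 tosses).
p_three_heads = binomial_probability(3, 10)
print(p_three_heads)  # 120/1024, about 0.117
```

Under these assumptions, three heads and seven tails is expected in a little under 12% of sequences of 10 tosses of a fair coin.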
On the other hand, statistics is an inductive process that argues from the particular to the general. Consider a coin that is tossed ten times and there are three heads and seven tails. Statistics enables the question as to whether the coin is fair or not to be addressed. The particular outcome of three heads and seven tails in ten tosses is used to determine something about the general case of whether the coin was fair or not.
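The inductive direction can be illustrated in the same way. The text does not prescribe a method at this point, so the sketch below uses one conventional choice as an assumption: an exact two-sided binomial test of the hypothesis that the coin is fair, together with the observed proportion of heads as a simple estimate. The names `binom_pmf` and `two_sided_p_value` are hypothetical.

```python
from math import comb


def binom_pmf(k: int, n: int, theta: float) -> float:
    """Probability of k heads in n independent tosses with P(head) = theta."""
    return comb(n, k) * theta**k * (1 - theta) ** (n - k)


def two_sided_p_value(k_obs: int, n: int, theta0: float = 0.5) -> float:
    """Exact two-sided p-value: total probability, under theta0, of every
    outcome that is no more probable than the observed one."""
    p_obs = binom_pmf(k_obs, n, theta0)
    tolerance = 1e-9  # guards against floating-point ties
    return sum(
        p
        for p in (binom_pmf(k, n, theta0) for k in range(n + 1))
        if p <= p_obs * (1 + tolerance)
    )


# Particular outcome: three heads in ten tosses.
tosses, heads = 10, 3
theta_hat = heads / tosses                 # observed proportion of heads
p_value = two_sided_p_value(heads, tosses)  # assessed against a fair coin
print(theta_hat, p_value)
```

With three heads in ten tosses the p-value is 352/1024, roughly 0.34, so on this measure the outcome gives little reason to doubt that the coin is fair. Other measures of the support the data provide could equally be used; this sketch illustrates only the direction of the argument, from the particular outcome to the general question.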
Fundamental to both statistics and probability is uncertainty. Given a fair coin, the number of heads and tails in 10 tosses is uncertain. The probability associated with each outcome may be determined but the actual outcome itself cannot be predicted with certainty. Given the outcome of a particular sequence of 10 tosses, information is then available about the fairness or otherwise of the coin. For example, if the outcome were 10 heads and no tails, one may believe that the coin is double‐headed but it is not certain that this is the case. There is still a non‐zero probability (1/1024) that 10 tosses of a fair coin will result in 10 heads. Indeed this has occurred in the first author's own experience. A class of some 130 students were asked to each toss a coin 10 times. One student tossed 10 consecutive heads from what it is safe to assume was a fair coin. The probability that at least one of 130 students does so with a fair coin is 1 − (1 − 1/1024)^130 ≈ 0.12
. Probability is therefore the measure of choice for the quantification of uncertainty. It is therefore important to define probability carefully. This point is clarified in Section 1.7. At present, it suffices to mention a brief definition by de Finetti (1968).
[A probability is] subjective [and it] means the degree of belief (as actually held by someone, on the ground of his whole knowledge, experience, information) regarding the truth of a sentence or event, (a fully specified single event or sentence, whose truth or falsity is, for whatever reason, unknown to that person). (p. 45)
1.3.1 The Frequentist Method
Consider a consignment of compact disks (CDs), containing N disks. The consignment is said to be of size N. It is desired to make inferences about the proportion (