Limits of Science?. John E. Beerbower
concept of probability will appear frequently in subsequent chapters. Therefore, I think that it is useful to say a few things about the concept and the branch of mathematics that is often referred to as “probability and statistics.”
We all have some intuitive notion of probabilities. Many things in our lives are, at any given time, more or less likely. We feel with virtual certainty that the Sun will rise tomorrow and can predict with confidence the precise time at which it will do so. Whether we will see the Sun is less certain. We might estimate the likelihood based upon the time of year, this evening’s weather and the forecast for the morning.
We also have an intuitive feeling for the concept of chance (and luck and risk). It appears that games of chance have a very long history in human society, perhaps predating history itself. See Amir D. Aczel, Chance (2004), pp.viii–ix. The use of presumably random events to learn the will of the gods or to divine the future is also of ancient origin. Id., pp.xi–xii; Hacking, The Emergence of Probability (1975), pp.1–3.
Luck, risk and chance
Not surprisingly, the rigorous investigation and analysis of probabilities arose through the mercenary interest in winning at games of chance. That is still likely the context in which probability theory has the most obvious meaning to many of us. Elaborate models for calculating the probabilities of various complicated outcomes in such games were developed in the seventeenth century by Galileo, Blaise Pascal and Pierre de Fermat and in the eighteenth century by Jacob Bernoulli (whose principal work appeared posthumously in 1713), Abraham de Moivre, Thomas Bayes and Pierre Simon de Laplace. See, e.g., Aczel, Chance, p.xii; Carnap, Philosophical Foundations of Physics, pp.23–4; Hacking, The Emergence of Probability, pp.11–2, 143–53. The rules students learn about the calculations of probabilities of various permutations and combinations derive from these developments.
As in other areas of mathematics, the creation of a rigorous logical system of probability theory that began with propositions that seemed obvious and natural led to formulae and results that were surprising and even counter-intuitive. For example, the probability that at least one of several independent things might happen is properly calculated not by adding the probabilities of the various possibilities but by subtracting from 1.0 the probability that none of the things would occur. (The probabilities of all possible occurrences must add up to 1.0 or certainty.) Thus, the probability that heads will appear at least once in three tosses of a coin is not three times 1/2 (the odds of getting heads on any one toss), which would yield an impossible 1.5. Instead, it equals 1.0 minus the probability that the coin toss results in three tails. The probability of tossing three tails is 0.125 (the result of 0.5 × 0.5 × 0.5), so the probability of getting at least one heads is 0.875 (or 87.5%). The reason for approaching the question from the opposite end is the need to account for and eliminate the overlaps (e.g., getting a heads on any one or all of the tosses meets the criterion of at least one heads). This rule is referred to as the Law of Unions of Independent Events. See Aczel, Chance, pp.25–39.
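The coin-toss arithmetic can be checked with a short computation. The following is only an illustrative sketch in Python; the function name prob_at_least_one is a hypothetical label chosen for this example and does not come from the sources cited. It computes the answer by the complement rule and then confirms it by listing all eight equally likely sequences of three tosses.

```python
from fractions import Fraction
from itertools import product


def prob_at_least_one(p, n):
    """Probability that an event of probability p occurs at least once
    in n independent trials, computed as 1 minus P(it never occurs)."""
    return 1 - (1 - p) ** n


# Complement rule for three tosses of a fair coin
half = Fraction(1, 2)
by_complement = prob_at_least_one(half, 3)

# Cross-check: enumerate every sequence of three tosses and count
# those containing at least one heads
outcomes = list(product("HT", repeat=3))
with_heads = [o for o in outcomes if "H" in o]
by_enumeration = Fraction(len(with_heads), len(outcomes))

print(by_complement, by_enumeration)  # 7/8 7/8
```

Both routes give 7/8, that is, 0.875. The enumeration makes the overlap visible: a sequence such as heads-heads-tails is counted once, whereas simple addition of 1/2 for each toss would count it twice.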
It may happen that the casual practitioner of such calculations will discover that the theory does not apply very well in many real world situations. The reason is that games of chance have some rather special characteristics. The unusual feature of the classical view of probability is that one would (in an honest game, at least) confront a variety of outcomes each of which is equally likely to occur—the flip of a coin, the roll of dice, the hand of cards. Situations in which alternative outcomes could be intuitively perceived as equally likely, that is, where no reason could be ascertained why one number or card would be more likely to appear than another, are, however, rather limited. Most of our real world experiences are much more complicated and ambiguous.
A subsequent and more general theory of probability focused on the expected frequency of occurrences, generally based upon or supported by experience or experimentation. If one did something 100 times, a particular outcome would tend to occur some particular number of times out of the 100 (or 1000, etc.). But, again, in the real world, we encounter single events where the notion of frequency of outcomes is not directly applicable. What is the meaning of probabilities in such cases? Id., pp.25–8.
Actually, we use a concept of probability that is much more nuanced than the comparisons of relative frequencies. We regularly conclude that one outcome or state of affairs is more probable than another. We may say that something is possible, likely, very likely or, even, almost certain. Such usage often cannot be restated in terms of relative frequencies. Indeed, in many such cases we would not entertain the prospect of putting a numerical value on the probability we have expressed. We may say that it is probable (or very probable) that our child is sick. That is a meaningful statement. Yet, it would seem quite strange to try to assess whether that condition is 70% or 80% probable.
Carnap proposes the phrase “logical probability” or “inductive probability” to refer to this type of usage. We derive logically a sense of the likelihood of something based upon a set of facts or evidence, drawing inferences from the evidence (facts such as a lack of appetite, a fever, a cough, etc.). See Carnap, Philosophical Foundations of Physics, pp.20–22, 34–5. See also, Hacking, The Emergence of Probability, pp.13–5.
In all events, probability theories developed out of the particular interests and mind-sets of the theories’ creators. Yet, such theories are often put to use in contexts far different from those in which they emerged—for example, in statistical inferences and in quantum mechanics. See Hacking, The Emergence of Probability, p.9.
Bayes’ Theorem
There is another type of example of probability that I want to discuss. It is based on Bayes’ Theorem, a formula discovered by the Reverend Thomas Bayes in the eighteenth century, and is one that can generate some rather surprising conclusions. See, e.g., Aczel, Chance, pp.95–6. It can best be explained by a famous example. Suppose there are three people, one of whom has won a prize through a random drawing, the result of which is still undisclosed. Each had an equal 1/3 chance of winning, so each assumes that the probability that his name was drawn is 1/3 or 33.33%. A then says to the moderator, “Obviously, at least one of B and C cannot have won. Tell us the identity of one of the two who did not win, so that that person can leave.”
The moderator announces that B did not win. What now, with that additional information, is the probability that A is the winner? Has it become 1/2? Actually, Bayes’ Theorem tells us that the probability that A won is still 1/3. Perhaps more surprisingly, however, according to the Theorem, the probability that C has won has become 2/3. See id., pp.95–103. How is that possible?
Well, before the moderator provided the additional information, the probability that either B or C had won was 2/3 (1.0 minus the 1/3 probability that A won). With the information that B had not won, the probability that one of the two of them had won did not change, but the only one that could now have won was C. So C enjoys the 2/3 chance that B and C had together.
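For readers who wish to see the Theorem applied explicitly, the calculation can be sketched in a few lines of Python. This is an illustration under a stated assumption, not a reproduction of any cited author's presentation: it assumes that, if A is in fact the winner, the moderator names B or C with equal probability. The function name posterior and the variable names are hypothetical labels chosen for the example.

```python
from fractions import Fraction


def posterior(prior, likelihood):
    """Bayes' Theorem: P(H | E) = P(E | H) * P(H) / P(E), where P(E)
    is the sum of P(E | H') * P(H') over every hypothesis H'."""
    joint = {h: prior[h] * likelihood[h] for h in prior}
    total = sum(joint.values())  # P(E)
    return {h: joint[h] / total for h in joint}


# Before the announcement, each person is equally likely to have won
prior = {"A": Fraction(1, 3), "B": Fraction(1, 3), "C": Fraction(1, 3)}

# P(moderator names B | winner), when asked to name a loser among B and C.
# ASSUMPTION: if A won, the moderator picks B or C with equal probability.
# If B won, the moderator cannot name B; if C won, B must be named.
names_b = {"A": Fraction(1, 2), "B": Fraction(0), "C": Fraction(1)}

result = posterior(prior, names_b)
print(result["A"], result["B"], result["C"])  # 1/3 0 2/3
```

The announcement "B did not win" is twice as likely to be heard when C is the winner as when A is the winner, and that ratio is what moves C to 2/3 while leaving A at 1/3.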
Note that if the request to the moderator had simply been to identify one of the three who had not won, the identification of B as a loser would have improved the chances of A and C (both would then have a 1/2 chance of being the winner). The information contained in the responses to the two requests is different, affecting the calculation of the probabilities with the benefit of the additional information.
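The difference between the two requests can also be seen by playing the game many times and counting. The simulation below is a rough sketch in Python under stated assumptions: the moderator chooses at random whenever more than one answer is permitted, and the function name simulate, the number of trials and the seed are arbitrary choices made for this illustration.

```python
import random


def simulate(restricted, trials=200_000, seed=0):
    """Estimate P(A won | moderator named B) and P(C won | moderator named B).

    restricted=True:  the moderator must name a loser among B and C only.
    restricted=False: the moderator may name any one of the three who lost.
    ASSUMPTION: the moderator picks at random among the permitted losers.
    """
    rng = random.Random(seed)
    counts = {"A": 0, "C": 0}
    named_b = 0
    for _ in range(trials):
        winner = rng.choice("ABC")
        pool = "BC" if restricted else "ABC"
        named = rng.choice([p for p in pool if p != winner])
        if named == "B":  # keep only the games in which B was named
            named_b += 1
            counts[winner] += 1  # the winner here is necessarily A or C
    return counts["A"] / named_b, counts["C"] / named_b


print(simulate(restricted=True))   # roughly (1/3, 2/3)
print(simulate(restricted=False))  # roughly (1/2, 1/2)
```

Under the first request the relative frequencies settle near 1/3 for A and 2/3 for C; under the second they settle near 1/2 each. The same announcement carries different information depending on the rule that produced it.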
Note further that this analysis depends upon the fact that the winner had already been determined but was not disclosed (otherwise, the moderator could not have granted the request). That situation is different from the prospective flip of a coin. In the case of the coin, the question is: What is the likelihood that the toss will yield heads (or tails)? In the game described above, we are not predicting the future. Indeed, the probabilities with respect to who won are no longer even probabilities. Someone did win; two others lost. No additional information or event will change the outcome. It is 100% certain that one of the three did win.
Instead, the real question being presented is: What is the probability that the person we pick is the person who actually did win? With respect to that question, the additional information is relevant. In other words, the probability that C would become the winner was 1/3, and that probability was