The Money Formula. Wilmott Paul
markets. One approach to prediction is to build deterministic Newtonian models of the system. Alternatively, one can make probabilistic models based on statistics. In practice, scientists usually use a combination of these approaches. For example, weather predictions are made using deterministic models, but because the predictions are prone to error, meteorologists use statistical techniques to make probabilistic forecasts (e.g., a 20 % chance of rain). Quants do the same for the markets, but then bet large amounts of money on the outcome. This chapter looks at how probability theory is applied to forecast the financial weather.
In 1724, after the collapse of his French monetary experiment, John Law supported himself in Venice by gambling. He would sit at a table at the Ridotto casino with 10,000 gold pistole coins arranged in stacks like casino chips, and offer any challenger the chance to make a wager of a single pistole. If they rolled six dice and got all sixes, then they could keep the lot. Law knew the odds of this happening were only 1 in 46,656 (6 raised to the sixth power, that is, 6 × 6 × 6 × 6 × 6 × 6). So people always lost, but would go away happy at having gambled with the notorious John Law.
A key concept from probability theory is the idea of expected value, which equals the payout multiplied by the probability of receiving it. For Law's gamble, this was 10,000 multiplied by 1/46,656, or about 0.21 gold pistoles. Since the stake was 1 pistole, Law had an edge (a fair payout would have been 46,656 coins instead of 10,000). It was his money, after all, so he wanted to make a profit. We'll see later that he could still have made money even if he had offered the punters better odds, odds that gave them a positive expectation. The solution to this apparent paradox is that he would have to do his gambling via a financial vehicle, a hedge fund, and he'd have to be betting with other people's money.
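The arithmetic behind Law's wager can be checked in a few lines. What follows is only a sketch in Python: the variable names are our own, and we assume fair, independent dice and take the expected value to be the pot multiplied by the chance of winning it, as in the text.

```python
from fractions import Fraction

# Figures from the text; the names are illustrative, not from the source.
DICE = 6          # number of dice rolled
SIDES = 6         # faces per die (assumed fair and independent)
POT = 10_000      # pistoles Law placed on the table
STAKE = 1         # pistoles wagered by the challenger

p_win = Fraction(1, SIDES) ** DICE      # chance of six sixes: 1/46,656
expected_payout = POT * p_win           # what a single wager is worth on average
fair_pot = STAKE / p_win                # pot that would make the wager fair
house_edge = STAKE - expected_payout    # Law's average gain per wager

print(f"odds of winning: 1 in {p_win.denominator:,}")
print(f"expected payout: {float(expected_payout):.2f} pistoles")
print(f"fair pot:        {int(fair_pot):,} pistoles")
print(f"house edge:      {float(house_edge):.2f} pistoles per wager")
```

Using exact fractions rather than floating-point numbers keeps the odds as 1/46,656 until the final step, so rounding only happens when the results are printed.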
The connection between basic probability theory and something like the stock market becomes clear when we consider the result of a sequence of coin tosses, as in Figure 2.1. Here the paths start at the left and branch out to the right with time. If the coin comes up heads, you win one point, but if it is tails, you lose a point. The heavy line shows one particular trajectory, known as a random walk, against the background of all possible trajectories. At each time step, the path takes a random step up or down. Most paths remain near the center. Figure 2.2 shows how the final distribution looks after 14 time steps. The mean or average displacement is zero, and over 20 % of the paths end with no displacement. If this were a plot of price changes for a stock, and the horizontal axis represented time in days, we would say that the expected value of the stock after 14 days would be unchanged from its initial value.
Figure 2.1 Coin toss results
The black line shows one possible random walk, with a vertical step of plus 1 (up) or minus 1 (down) at each iteration. The light gray lines are an overlay of all possible paths through 14 iterations. The plot shows how the future becomes more uncertain as the possible paths multiply.
Figure 2.2 A histogram showing the final distribution after 14 iterations
The range is −14 to 14, but over 20 % of paths end with no change in position (center bar). The shape approximates the bell curve or normal distribution from classical statistics.
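The histogram can be reproduced exactly by counting paths. The sketch below assumes a fair coin, so that each of the 2^14 sequences of tosses is equally likely; the function name `final_distribution` is our own invention. A path with h heads out of n tosses ends at a displacement of 2h − n, and the number of such paths is the binomial coefficient "n choose h".

```python
from math import comb

def final_distribution(n_steps):
    """Map each final displacement of a +/-1 coin-toss walk to its probability."""
    total_paths = 2 ** n_steps
    return {
        2 * heads - n_steps: comb(n_steps, heads) / total_paths
        for heads in range(n_steps + 1)
    }

dist = final_distribution(14)
mean = sum(displacement * p for displacement, p in dist.items())

print(f"range: {min(dist)} to {max(dist)}")
print(f"probability of no net change: {dist[0]:.1%}")
print(f"mean displacement: {mean:.1f}")
```

Only even displacements appear, because after an even number of tosses the difference between heads and tails is always even.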
After n iterations, the maximum deviation from 0 is equal to n – so after 14 steps, the range is from −14 to 14. But most paths stay near the center, so the average displacement is much smaller.27 A longer random walk, of 100 steps, is shown by the solid line in Figure 2.3. The light-gray lines are the bounds for possible paths: the upper bound is the path with an increase of 1 at every step, while the lower bound is the path with a decrease of 1 at every step (the probability of these paths is extremely low, since they are the same as tossing a coin and getting heads 100 times in a row). In the background, the density of the grayscale at any point corresponds to the probability of a random walk going through that point. Note how this probability density spreads out with time, rather like an idealized, turbulence-free version of a plume of smoke emitted from a chimney. Random walks sound wild, but on average they are very well-behaved.
Figure 2.3 100-Step random walk
The solid line is a random walk of 100 steps, starting at 0, with a displacement of plus or minus 1 at each step. The light-gray lines show the upper and lower bounds, corresponding to the paths in which the displacement at every step is plus or minus 1, respectively. The density of the grayscale at any point corresponds to the probability of a random walk going through that point. This is highest for paths with small displacement. The probability of a path entering the white area is very low, or zero outside the light-gray lines.
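A walk like the one in Figure 2.3 is easy to simulate. The sketch below is an illustration only: the function names, the seed, and the number of simulated walks are arbitrary choices of ours. It also estimates the root-mean-square final displacement, which for a fair ±1 walk is the square root of the number of steps (a standard result that we bring in here; it is not stated in the passage above). For 100 steps this is 10, far below the maximum possible displacement of 100.

```python
import random
from math import sqrt

def random_walk(n_steps, rng):
    """Return the positions visited by a +/-1 walk starting at 0."""
    position = 0
    path = [position]
    for _ in range(n_steps):
        position += rng.choice((-1, 1))   # heads: up one point, tails: down one
        path.append(position)
    return path

def rms_final_displacement(n_steps, n_walks, seed=0):
    """Estimate the root-mean-square end point over many simulated walks."""
    rng = random.Random(seed)             # fixed seed so runs are repeatable
    total_sq = 0
    for _ in range(n_walks):
        total_sq += random_walk(n_steps, rng)[-1] ** 2
    return sqrt(total_sq / n_walks)

walk = random_walk(100, random.Random(1))
print("final position of one walk:", walk[-1])
print("estimated RMS displacement:", round(rms_final_displacement(100, 2000), 2))
print("theoretical value sqrt(100):", sqrt(100))
```

The estimate varies slightly with the seed and the number of walks, but it should stay close to 10.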
Such computations become unwieldy when there are a very large number of games or iterations; however, in 1738 the mathematician Abraham de Moivre showed that after an infinitely large number of iterations, the results would converge on the so-called normal distribution, or bell curve. This is specified by two numbers: the mean or average and the standard deviation, which is a measure of the curve's width.28 About 68 % of the data fall within one standard deviation of the mean, and about 95 % are within two standard deviations. The homme moyen of statistics, this formula got its name because of its ubiquity in the physical and social sciences. The distinguishing feature of the normal distribution is that, according to the central limit theorem, which was partially proven by de Moivre, it can be used to model the sum of any random processes, provided that a number of conditions are met. In particular, the separate processes have to be independent of one another, and identically distributed. So, for example, if 18th-century astronomers made many measurements of the position of Saturn in the night sky, then each measurement would be subject to errors, but they could hope that a plot of the measurements would look like a bell curve, with the correct answer close to the middle.
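Both the 68 % and 95 % figures, and de Moivre's convergence, can be checked numerically. The sketch below rests on some assumptions of ours: the function names are invented, the standard deviation of an n-step ±1 walk is taken to be the square root of n, and 10,000 steps is an arbitrary stand-in for "a very large number". The normal probabilities come from the error function, using the identity P(|Z| ≤ k) = erf(k/√2) for a standard normal variable Z.

```python
from math import comb, erf, sqrt

def normal_within(k):
    """Probability that a normal variable lies within k standard deviations of its mean."""
    return erf(k / sqrt(2))

def walk_within(n_steps, k):
    """Exact probability that a fair +/-1 walk ends within k standard deviations of 0."""
    limit = k * sqrt(n_steps)             # standard deviation of the end point is sqrt(n)
    favorable = sum(
        comb(n_steps, heads)
        for heads in range(n_steps + 1)
        if abs(2 * heads - n_steps) <= limit
    )
    return favorable / 2 ** n_steps       # exact integer ratio, converted to a float

for k in (1, 2):
    print(f"within {k} sd: normal {normal_within(k):.4f}, "
          f"10,000-step walk {walk_within(10_000, k):.4f}")
```

The walk's figures do not match the bell curve exactly, because the end point can only take even values, but the gap shrinks as the number of steps grows.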
The normal distribution is perhaps the closest the field of statistics comes to a Newtonian formula. The equation is simple and elegant, there are only two parameters that need to be measured (the mean and the standard deviation), and it can be applied to a wide range of phenomena. The “Law of Unreason,” as its Victorian popularizer Francis Galton called it, would find perhaps its greatest application in mastering, or appearing to master, the chaos of the markets.29
The desire to bring order out of chaos, and to see the hidden pattern in the noise, is basic to human nature. In mathematics, even chaos theory is not so much about chaos as about showing that what appears to be wild and unruly behavior can actually be explained by a simple equation. As the field's founder, French mathematician Henri Poincaré, told one of his PhD students: “what is chance for the ignorant is not chance for the scientists. Chance is only the measure of our ignorance.”30
The student who earned this rebuke was called Louis Bachelier. His mistake, perhaps, was choosing a thesis subject that was a little too chaotic – the buying and selling of securities that took place within the mock-Greek temple building of the Paris Exchange, or Bourse. He was awarded a good but undistinguished grade on his 1900 dissertation, entitled Théorie de la Spéculation, and his work failed to unite the academic community in a frenzy of excitement (it took him 27 years to find a permanent job).31
Bachelier began his thesis as follows (imaginary editorial remarks in italics):
The influences which determine the movements of the Stock Exchange are innumerable.
29 Galton (1889).
30 Quoted in Bernstein (1998, p. 200).
31 Bachelier (1900).