The Failure of Risk Management. Douglas W. Hubbard
Of course, a giant experiment is not usually very practical, at least for individual companies to conduct by themselves. Fortunately, we have some other ways to answer this question without necessarily conducting our own massive controlled experiments. For example, there are some situations in which the risk management method caught what obviously would have been a disaster, such as detecting a bomb in a suitcase, only because of the implementation of a new plastic explosives–sniffing device. Another example would be where an IT security audit uncovered an elaborate embezzling scheme. In those cases, we know it would have been extremely unlikely to have discovered—and addressed—the risk without that particular tool or procedure. Likewise, there are examples of disastrous events that obviously would have been avoidable if some prudent amount of risk management had been taken. For example, if a bank was overexposed on bad debts and reasonable procedures would never have allowed such an overexposure, then we can confidently blame the risk management procedures (or lack thereof) for the problem.
But direct evidence of cause and effect is not as straightforward as it might at first seem. There are times when it appears that a risk management effort averted one risk but exacerbated another that was harder to detect. For example, the FAA currently allows parents traveling with a child under the age of two to purchase only one ticket for the adult who holds the child on his or her lap. Suppose the FAA is considering requiring parents to purchase seats for each child, regardless of age. If we looked at a crash where every separately seated toddler survived, is that evidence that the new policy reduced risk? Actually, no, even if we assume it is clear that particular children are alive because of the new rule. A study already completed by the FAA found that changing the “lap children fly free” rule would increase total fares for the traveling families by an average of $185, causing one-fifth of them to drive instead of fly. When the higher travel fatalities of driving are considered, it turns out that changing the rule would cost more lives than it saves. It appears we still need to check even the apparently obvious instances of cause and effect against some other independent measure of overall risk. The danger of relying on direct evidence is that even when a cause-effect relationship appears clear, it may be no more than anecdotal evidence. We still need other ways to check our conclusions about the effectiveness of risk management methods.
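The lap-child trade-off can be made concrete with a back-of-the-envelope calculation. The sketch below is illustrative only: the one-fifth diversion share comes from the FAA study described above, but the number of families, the extra fatality risk per diverted trip, and the lives saved in the air are hypothetical placeholders, not figures from that study. The function name is also an assumption made for this example.

```python
def net_expected_deaths(families, share_diverted, extra_risk_per_diverted_trip,
                        lives_saved_in_air):
    """Net change in expected deaths caused by the rule (positive = more deaths).

    Deaths added on the road by families who now drive, minus the
    deaths avoided in the air by seating children separately.
    """
    added_road_deaths = families * share_diverted * extra_risk_per_diverted_trip
    return added_road_deaths - lives_saved_in_air


# One-fifth diversion is from the text; every other number is HYPOTHETICAL.
families = 1_000_000        # hypothetical: affected family trips per year
share_diverted = 0.20       # from the text: one-fifth drive instead of fly
extra_risk = 5e-6           # hypothetical: added fatality risk per diverted trip
lives_saved = 0.4           # hypothetical: expected deaths avoided in the air

net = net_expected_deaths(families, share_diverted, extra_risk, lives_saved)
print(f"Net change in expected deaths per year: {net:+.2f}")
```

With these placeholder inputs the rule adds more expected deaths on the road than it prevents in the air. The point is not the specific numbers but that the visible benefit (survivors in a crash) has to be netted against the less visible cost.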
Component Testing
Lacking large controlled experiments, or obvious instances of cause and effect, we still have ways of evaluating the validity of a risk management method. The component testing approach looks at the gears of risk management instead of the entire machine. If the entire method has not been scientifically tested, we can at least look at how specific components of the method have fared under controlled experiments. Even if the data is from different industries or laboratory settings, consistent findings from several sources should give us some information about the problem.
As a matter of fact, quite a lot of individual components of larger risk management methods have been tested exhaustively. In some cases, it can be conclusively shown that a component adds error to the risk assessment or at least doesn't improve anything. We can also show that other components have strong theoretical backing and have been tested repeatedly with objective, scientific measures. Here are a few examples of component-level research that are already available:
The synthesis of data: One key component of risk management is how we synthesize historical experience. Where we rely on experts to synthesize data and draw conclusions, we should look at research into the relative performance of expert opinion versus statistical models.
Known human errors and biases: If we rely on expert opinion to assess probabilities, we should be interested in reviewing the research on how well experts do at assessing the likelihood of events, their level of inconsistency, and common biases. We should consider research into how hidden or explicit incentives or irrelevant factors affect judgment. We should know how estimates can be improved by accounting for these issues.
Aggregation of estimates: In many cases several experts will be consulted for estimates, and their estimates will be aggregated in some way. We should consider the research about the relative performance of different expert-aggregation methods.
Behavioral research into qualitative scales: If we rely on various scoring or classification methods (e.g., a scale of 1 to 5 or high/medium/low), we should consider the results of empirical research on how these methods are actually used and how much arbitrary features of the scales affect how they are used.
Decomposition: We can look into research about how estimates can be improved by how we break up a problem into pieces and how we assess uncertainty about those pieces.
Errors in quantitative models: If we are using more quantitative models and computer simulations, we should be aware of the most common known errors in such models. We also need to check to see whether the sources of the data in the model are based on methods that have proven track records of making realistic forecasts.
If we are using models such as AHP, MAUT, or similar systems of decision analysis for the assessments of risk, they should meet the same standard of a measurable track record of reliable predictions. We should also be aware of some of the known mathematical flaws introduced by some methods that periodically cause nonsensical results.
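One way to make "a measurable track record" concrete is to score past probability estimates against what actually happened. The sketch below uses the Brier score (the mean squared difference between forecast probabilities and 0/1 outcomes), a standard scoring rule that the text above does not itself name; it is offered as one possible measure, not as the method the research relied on. The forecasts and outcomes are made-up illustrative data.

```python
def brier_score(forecasts, outcomes):
    """Mean squared error between forecast probabilities and 0/1 outcomes.

    Lower is better; always answering 0.5 scores exactly 0.25.
    """
    if not forecasts or len(forecasts) != len(outcomes):
        raise ValueError("forecasts and outcomes must be non-empty and equal length")
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)


# HYPOTHETICAL data: an overconfident expert's stated probabilities,
# and whether each event actually occurred (1) or not (0).
expert_forecasts = [0.9, 0.8, 0.7, 0.9, 0.6]
outcomes = [1, 0, 1, 0, 1]

expert_score = brier_score(expert_forecasts, outcomes)
baseline_score = brier_score([0.5] * len(outcomes), outcomes)  # uninformative guess

print(f"expert:   {expert_score:.3f}")
print(f"baseline: {baseline_score:.3f}")
```

In this invented example the expert scores worse than someone who says "50 percent" every time, which is the kind of finding a component test is designed to surface. The same function could be used to compare an expert against a statistical model on the same set of events.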
Formal Errors
Outright math errors should be the most obvious disqualifiers of a method, and we will find them in some cases. This isn't just a matter of making simplifying assumptions or using shortcut rules of thumb. Those can be useful as long as there is at least empirical evidence that they are helpful. But where we deviate from the math, empirical evidence is even more important. This is especially true when deviations from known mathematics provide no benefits in simplicity compared to perfectly valid mathematical solutions—which is often the main case for taking mathematical shortcuts.
In some cases, it can be shown that mathematically irregular methods may actually lead to dangerously misguided decisions. For example, we shouldn't be adding and multiplying ordinal scales, as is done in many risk assessment methods. Later we will present formal analysis showing how such procedures lead to misguided conclusions.
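A small example shows how multiplying ordinal scores can reverse a ranking. The sketch below assumes a hypothetical 1-to-5 likelihood scale and a hypothetical 1-to-5 impact scale with made-up bin edges, and two invented risks; none of these values come from a specific published method. It compares the product of the ordinal scores with the expected loss (probability times dollar loss).

```python
def ordinal_score(value, upper_edges):
    """Map a value to a score from 1 to len(upper_edges) + 1.

    The score goes up by one for each bin edge the value exceeds.
    """
    return 1 + sum(1 for edge in upper_edges if value > edge)


# HYPOTHETICAL bin edges for the two 1-to-5 scales.
LIKELIHOOD_EDGES = [0.05, 0.20, 0.50, 0.80]                    # probability
IMPACT_EDGES = [100_000, 1_000_000, 10_000_000, 100_000_000]   # dollars

# HYPOTHETICAL risks: (probability of occurring, loss in dollars if it occurs)
risks = {
    "A": (0.21, 1_100_000),
    "B": (0.19, 9_000_000),
}

scores = {}
expected_loss = {}
for name, (prob, loss) in risks.items():
    likelihood = ordinal_score(prob, LIKELIHOOD_EDGES)
    impact = ordinal_score(loss, IMPACT_EDGES)
    scores[name] = likelihood * impact     # the "risk score" many methods use
    expected_loss[name] = prob * loss      # a simple quantitative measure

for name in risks:
    print(f"{name}: score={scores[name]}, expected loss=${expected_loss[name]:,.0f}")
```

Risk A lands just above a bin edge on likelihood and just above one on impact, so it gets the higher score, while risk B has an expected loss more than seven times larger. The scale hides differences within a bin and exaggerates differences across an edge, and multiplying the scores compounds both effects.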
A Check of Completeness
Even if we use the best methods, we can't apply them to a risk if we don't even think to identify it as a risk. If a firm thinks of risk management as “enterprise risk management,” then it ought to be considering all the major risks of the enterprise—not just legal, not just investment portfolio, not just product liability, not just worker safety, not just business continuity, not just security, and so on. This criterion is not, however, the same as saying that risk management can succeed only if all possible risks are identified. Even the most prudent organization will exclude risks that nobody could conceivably have considered.
But there are widely known risks that are excluded from some risk management for no other reason than an accident of organizational scope or background of the risk manager. If the scope of risk management in the firm has evolved in such a way that it considers risk only from a legal or a security point of view, then it is systematically ignoring many significant risks. A risk that is not even on the radar can't be managed at all.
The surveys previously mentioned and many “formal methodologies” developed detailed taxonomies of risks to consider, and each taxonomy is different from the others. But completeness in risk management is a matter of degree. The use of a detailed taxonomy is helpful, but it is no guarantee that relevant risks will be identified.
More important, risks should not be excluded simply because the people who assess them describe risk in completely different languages. For example, cyber risk, financial portfolio risk, safety risk, and project risk do not need to use fundamentally different languages when discussing risk. If project risks are 42, cyber risks are yellow, safety risks are moderate, portfolio risks have a Sharpe Ratio of 1.1, and there is a 5 percent chance a new product will fail to break even, what is the total risk? They can and should be using the same types of metrics so risks