System Reliability Theory. Marvin Rausand
shown in Figure 3.13.
Figure 3.13 Cause and effect diagram for the event “car will not start.”
When the team members agree that an adequate amount of detail has been provided under each major category, they analyze the diagram, and group the causes. An important part of this analysis is to eliminate irrelevant causes from the diagram and tidy it up. One should especially look for causes that appear in more than one category. For those items identified as the “most likely causes,” the team should reach consensus on listing those causes in priority order with the first cause being the “most likely cause.”
Some cause and effect analyses also include an evaluation of how easy it is to verify each of the identified causes in the diagram. Three classes are sometimes used: (i) very easy, (ii) somewhat easy, and (iii) not easy. A final step, proposing actions to rectify the identified causes, may or may not be included in the analysis.
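The bookkeeping described above can be sketched in a few lines of Python. This is a minimal illustration only, not a quantitative analysis: the category names, the causes, the likelihood ranks, and the choice of ease of verification as a tie-breaker are all hypothetical assumptions and are not taken from Figure 3.13.

```python
from collections import defaultdict

# Hypothetical diagram: category -> causes (illustrative, not from Figure 3.13).
diagram = {
    "Electrical": ["dead battery", "corroded terminals", "faulty starter"],
    "Fuel": ["empty tank", "clogged fuel filter"],
    "Operator": ["empty tank", "lights left on"],
}


def causes_in_multiple_categories(diagram):
    """Return {cause: sorted categories} for causes listed under more than one category."""
    seen = defaultdict(set)
    for category, causes in diagram.items():
        for cause in causes:
            seen[cause].add(category)
    return {cause: sorted(cats) for cause, cats in seen.items() if len(cats) > 1}


# The three ease-of-verification classes from the text, mapped to sort keys.
EASE = {"very easy": 0, "somewhat easy": 1, "not easy": 2}

# Hypothetical team consensus: (cause, likelihood rank with 1 = most likely, ease class).
shortlist = [
    ("faulty starter", 2, "not easy"),
    ("dead battery", 1, "very easy"),
    ("empty tank", 2, "very easy"),
]


def priority_order(shortlist):
    """Order causes by likelihood rank; ties are broken by ease of verification (an assumption)."""
    ranked = sorted(shortlist, key=lambda item: (item[1], EASE[item[2]]))
    return [cause for cause, _, _ in ranked]


duplicates = causes_in_multiple_categories(diagram)
ordered = priority_order(shortlist)
print(duplicates)
print(ordered)
```

Causes that show up under several categories are candidates for merging when the diagram is tidied up, and the ordered list is simply a record of the team's consensus.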
The cause and effect diagram cannot be used for quantitative analyses, but it is generally considered an excellent aid for problem solving and for illustrating the potential causes of an item failure/fault. Cause and effect analysis is also a recommended step in a more comprehensive root cause analysis (see Section 3.7.2).
Example 3.14 (Car will not start)
Consider a car that will not start after having been idle for a period. The causes suggested by the team are shown in the cause and effect diagram in Figure 3.13. A number of similar cause and effect diagrams may be found on the Internet.
3.7.2 Root Cause Analysis
A root cause analysis may be defined as:
Definition 3.8 (Root cause analysis)
A systematic investigation of a failure or a fault to identify its likely root causes, such that they can be removed by design, process, or procedure changes.
The root cause analysis is reactive, starting with (i) a failure that has happened, or (ii) a potential failure that has been identified. The root cause analysis should continue until organizational factors have been identified, or until data are exhausted. Root cause analysis may be used to investigate a wide range of undesired events, not only failures and faults, but our description is delimited to failure/fault analysis.
The main steps of a root cause (failure) analysis are:
1 Clearly define the failure or fault. Explain clearly what went wrong.
2 Gather data/evidence. The evidence should provide answers to the following questions:
   - When did the failure occur?
   - Where did it occur?
   - What conditions were present prior to its occurrence?
   - What controls or barriers could have prevented its occurrence but did not?
   - What are the potential causes? (Make a preliminary list of likely causes.)
   - Which actions can prevent recurrence?
3 Ask why and identify the true root cause associated with the defined failure/fault.
4 Check the logic and eliminate items that are not causes.
5 Identify corrective action(s) that will prevent recurrence of the failure/fault – and that address both proximate and root causes.
6 Implement the corrective action(s).
7 Observe the corrective actions to ensure effectiveness.
8 If necessary, reexamine the root cause analysis.
The root cause analysis is done by a team using idea generation techniques, such as brainstorming, and is often started by a cause and effect analysis (see Section 3.7.1). To identify root causes, it is usually recommended to ask “why?” at least five times for each main cause identified. The five whys are illustrated in Figure 3.14.
The root causes must be thoroughly understood before corrective actions are proposed. By correcting root causes, it is hoped that the likelihood of failure recurrence is minimized.
Figure 3.14 Repeatedly asking why?
Example 3.15 (Car will not start)
Reconsider the car that will not start in Example 3.14. The following sequence of five questions and answers may illustrate the analysis process.
1 Why will the car not start? Cause: The engine will not turn over.
2 Why will the engine not turn over? Cause: The battery is dead.
3 Why is the battery dead? Cause: The alternator is not functioning.
4 Why is the alternator not functioning? Cause: The belt is broken.
5 Why is the alternator belt broken? Cause: The belt was not replaced according to the manufacturer's maintenance schedule.
This example is strongly influenced by the presentation “Corrective action and root cause analysis” by David S. Korcal (found on the Internet).
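The chain of questions in Example 3.15 can also be written down as a simple data structure, which may help when several main causes are each followed through their own sequence of whys. The sketch below is illustrative only: the class name `Why` and the helper functions `proximate_cause` and `root_cause` are hypothetical names introduced here, and treating the last answer in the chain as the root cause assumes that the team has judged the questioning to be complete.

```python
from dataclasses import dataclass


@dataclass
class Why:
    """One question/answer pair in a five-whys chain."""
    question: str
    cause: str


# The five whys of Example 3.15, in the order they were asked.
chain = [
    Why("Why will the car not start?", "The engine will not turn over."),
    Why("Why will the engine not turn over?", "The battery is dead."),
    Why("Why is the battery dead?", "The alternator is not functioning."),
    Why("Why is the alternator not functioning?", "The belt is broken."),
    Why(
        "Why is the alternator belt broken?",
        "The belt was not replaced according to the manufacturer's maintenance schedule.",
    ),
]


def proximate_cause(chain):
    """Return the answer to the first why (the cause closest to the failure)."""
    if not chain:
        raise ValueError("the chain is empty")
    return chain[0].cause


def root_cause(chain):
    """Return the answer to the last why, assumed here to be the root cause."""
    if not chain:
        raise ValueError("the chain is empty")
    return chain[-1].cause


# Print the chain with increasing indentation to show the depth of each why.
for depth, step in enumerate(chain):
    print("  " * depth + f"{step.question} Cause: {step.cause}")

print("Proximate cause:", proximate_cause(chain))
print("Root cause:", root_cause(chain))
```

Keeping both ends of the chain visible reflects step 5 of the procedure, where corrective actions should address both proximate and root causes.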
Careful studies of failures that occur should add to our “lessons learned,” and we therefore end this chapter optimistically by quoting Henry Ford (1863–1947):
Failure is the opportunity to begin again more intelligently.
3.8 Problems
3.1 Consider the exterior door of a family house. The door is locked/unlocked by using a standard key.
   (a) List all relevant functions of the door (including lock).
   (b) List all relevant failure modes of the door.
   (c) Classify the failure modes by using the classification system outlined in this chapter.
   (d) Do you consider it relevant to include misuse failures? If “yes,” provide examples.
3.2 Consider a filter coffee maker/brewer that you are familiar with.
   (a) List all potential failure modes of the coffee brewer.
   (b) Identify potential causes of each failure mode.
   (c) Identify potential effects of each failure mode.
3.3 Identify and describe possible failure modes of a (domestic) refrigerator.
3.4 Assume that your mobile phone is “dead.” Illustrate the possible causes of this fault by a cause and effect diagram.
3.5 Consider a smoke detector used in a private home and list possible causes of systematic faults of this detector.
3.6 Explain the differences between the terms failure and fault. Illustrate your explanation by practical examples.
3.7 Consider a domestic washing machine.
   (a) Identify as many causes of potential failures as possible.
   (b) Define categories of failure causes.
   (c) Use these categories to classify the identified failure causes.
3.8 Suggest a technical system that can be divided into several levels of indenture. If you cannot propose anything better, you may use a family car. Assume that a specific component failure mode occurs in the system and exemplify the relationships that are illustrated in Figure