System Reliability Theory. Marvin Rausand
does not deteriorate, and faults or bugs remain dormant and undetected until the software is modified or a specific condition or trigger activates the bug, leading to item failure. Software bugs are manifestations of mistakes made in specification, design, and/or implementation. Reliability analysis of a software program is done by checking the code syntax according to specific rules and by testing (debugging) the software for a variety of input data. This process is not discussed further in this book. Interested readers may consult ISO 25010.
1.3.2 Maintainability and Maintenance
Many items have to be maintained to perform as required. Two different concepts are important: maintainability and maintenance. Maintainability is a design feature of the item and indicates how easy it is to get access to the parts that are to be maintained and how fast a specific maintenance task can be done. Maintenance describes the actual work that is done to maintain an item. Maintainability is defined as follows:
Definition 1.5 (Maintainability)
The ability of an item, under stated conditions of use, to be retained in, or restored to, a state in which it can perform as required, when maintenance is performed under stated conditions and using prescribed procedures and resources.
Maintainability is further discussed in Chapter 9. Maintenance is defined as follows:
Definition 1.6 (Maintenance)
The combination of all technical and management actions during the life cycle of an item intended to retain the item in, or restore it to, a state in which it can perform as required (IEV 192‐06‐01).
Hardware maintenance is discussed in more detail in Chapters 9 and 12. Software maintenance is not treated in this book.
1.3.3 Availability
Availability measures the degree to which an item is able to operate at some future time
1.3.4 Quality
The term “quality” is closely related to reliability and is defined as follows:
Definition 1.7 (Quality)
The totality of features and characteristics of a product or service that bear on its ability to satisfy stated or implied needs.
Quality is sometimes defined as conformity to specifications and a quality defect is referred to as a nonconformity. According to common usage, quality denotes the conformity of the item to its specification as manufactured, whereas reliability denotes its ability to continue to comply with its specification over its useful life. With this interpretation, reliability may be considered as an extension of quality into the time domain.
1.3.5 Dependability
Dependability is a more recent concept that embraces the concepts of reliability, maintainability, and availability, and in some cases also safety and security. Dependability has become known especially through the important series of standards IEC 60300 “Dependability management.” The IEV defines dependability as follows:
Definition 1.8 (Dependability)
The ability (of an item) to perform as and when required (IEV 192‐01‐01).
Another commonly used definition is “Trustworthiness of a system such that reliance can justifiably be placed on the service it delivers” (Laprie 1992).
Remark 1.1 (Translating the word “dependability”)
Many languages, such as Norwegian and Chinese, do not have separate words for reliability and dependability, so both terms are translated to the same word.
1.3.6 Safety and Security
General safety is outside the scope of this book, and we deal only with the safety aspects of a specified technical item and define safety as follows:
Definition 1.9 (Safety)
Freedom from unacceptable risk caused by the technical item.
This definition is a rephrasing of definition IEV 351‐57‐05. The concept of safety is mainly used in relation to random hazards, whereas the concept of security is used in relation to deliberate hostile actions. We define security as:
Definition 1.10 (Security)
Dependability with respect to prevention of deliberate hostile actions.
The deliberate hostile action can be a physical attack (e.g. arson, sabotage, and theft) or a cyberattack. The generic categories of attacks are called threats, and the entity using a threat is called a threat actor, a threat agent, or an adversary. Arson is therefore a threat, and an arsonist is a threat actor. The threat actor may be a disgruntled employee, a single criminal, a competitor, a group, or even a country. A threat actor who attacks seeks to exploit some weakness of the item. Such a weakness is called a vulnerability of the item.
Remark 1.2 (Natural threats)
The word “threat” is also used for natural events, such as avalanche, earthquake, flooding, landslide, lightning, tsunami, and volcanic eruption. We may, for example, say that an earthquake is a threat to our item. Threat actors are not involved in this type of threat.
1.3.7 RAM and RAMS
RAM, as an acronym for reliability, availability, and maintainability, is often used, for example, in the annual RAM Symposium. RAM is sometimes extended to RAMS where S is added to denote safety and/or security. The RAMS acronym is, for example, used in the railway standard IEC 62278.
Remark 1.3 (Broad interpretation of reliability)
In this book, the term “reliability” is used quite broadly, rather similar to RAM as defined above. The same interpretation is used by Birolini (2014).
1.4 Reliability Metrics
Throughout this book, it is assumed that the time‐to‐failure and the repair time of an item are random variables with probability distributions that describe the future behavior of the item. The future behavior may be evaluated based on one or more reliability metrics. A reliability metric is a “quantity” that is derived from the reliability model and is, as such, not directly measurable. When performance data become available, we may estimate or predict quantitative values for each reliability metric.
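As a rough illustration of estimating reliability metrics from performance data, the following Python sketch computes three simple estimates from observed times-to-failure and repair times. Everything in it is an assumption made for illustration: the data values are hypothetical, the names MTTF (mean time-to-failure) and MTTR (mean time to repair) and the function names are not introduced in this section, and the availability estimate uses the common ratio MTTF/(MTTF + MTTR), which presumes an item that alternates between operating and being repaired.

```python
from statistics import mean

# Hypothetical observations in hours (illustrative only, not from the book).
times_to_failure = [1200.0, 950.0, 1430.0, 1105.0, 1315.0]
repair_times = [8.0, 12.0, 6.0, 10.0, 14.0]


def estimate_metrics(ttf, ttr):
    """Estimate simple reliability metrics from observed data.

    Assumes the item alternates between up periods (ttf) and
    down periods (ttr), so that the long-run fraction of time
    in the up state is approximately MTTF / (MTTF + MTTR).
    """
    mttf = mean(ttf)  # sample mean time-to-failure
    mttr = mean(ttr)  # sample mean repair time
    availability = mttf / (mttf + mttr)
    return {"mttf": mttf, "mttr": mttr, "availability": availability}


def empirical_survival(ttf, t):
    """Fraction of observed items that survived beyond time t."""
    return sum(1 for x in ttf if x > t) / len(ttf)


metrics = estimate_metrics(times_to_failure, repair_times)
print(metrics)
print(empirical_survival(times_to_failure, 1150.0))
```

The values returned here are estimates of the underlying metrics, not the metrics themselves, and with only five observations they would carry considerable uncertainty.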
A single reliability metric is not able to tell the whole truth. Sometimes, we need to use several reliability metrics to get a sufficiently clear