Category
Reliability engineering

maintenance
Figure: A tractor being mechanically repaired in Werneuchen, 1966
Figure: Field repair of an aircraft engine (1915–1916)

poka-yoke
Any mechanism in a process that helps an equipment operator avoid mistakes and defects by preventing, correcting, or drawing attention to human errors as they occur. The Japanese term means "mistake-proofing" or "error prevention"; such a mechanism is also sometimes called a forcing function or a behavior-shaping constraint.
anomaly detection
identification of rare items, events or observations that raise suspicion by differing significantly from the majority of the data or from expected behavior
safety engineering
engineering discipline that assures engineered systems provide acceptable levels of safety in use, both for the people and assets involved and for those nearby
risk assessment
analysis of risk against risk acceptance criteria or other decision parameters
failure mode and effects analysis
systematic technique for identification of potential failure modes in a system and their causes and effects
reliability engineering
sub-discipline of systems engineering that emphasizes dependability in the lifecycle management of a product or a system
survival analysis
branch of statistics for analyzing the expected duration of time until one or more significant events happen, such as death in biological organisms and failure in mechanical systems
single point of failure
part whose failure will disrupt the entire system
human error
action with unintended consequences, that is often the primary cause or contributing factor in disasters and accidents
fault tolerance
ability of a system to continue functioning despite erroneous inputs or faults within some of its components
failure rate
frequency with which an engineered system or component fails
long-term support release
software version that is stable and supported under a long-term or extended contract
bathtub curve
curve for failure rates over time
service life
total life of a product in use, from the point of sale to the point of discard
redundancy
use of a number of critical components for securing one or more functions of a system with the intention of increasing its reliability, usually in the form of a backup or fail-safe design
fault tree analysis
top-down, deductive failure analysis technique that traces an undesired system state to combinations of lower-level events using Boolean logic
high availability
systems design and implementation with a view to maximising service availability
robustness
property of a computer system to cope with faults in input or execution
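The entries above on failure rate, single point of failure, and redundancy can be tied together with a small numeric sketch. It assumes a constant failure rate (roughly the flat middle of the bathtub curve) and statistically independent failures; both are simplifying assumptions, and the function names are hypothetical, chosen only for this illustration.

```python
import math


def reliability(failure_rate: float, hours: float) -> float:
    """Probability that one component survives `hours`.

    Assumes a constant failure rate (failures per hour), i.e. an
    exponential lifetime model. Real components may not follow it.
    """
    return math.exp(-failure_rate * hours)


def series_reliability(reliabilities) -> float:
    """Reliability of a chain in which every part is a single point of failure.

    The system works only if all parts work (independence assumed).
    """
    result = 1.0
    for r in reliabilities:
        result *= r
    return result


def parallel_reliability(component_reliability: float, copies: int) -> float:
    """Reliability of `copies` redundant components, any one of which suffices.

    The system fails only if every copy fails (independence assumed).
    """
    return 1.0 - (1.0 - component_reliability) ** copies


if __name__ == "__main__":
    r = reliability(failure_rate=1e-4, hours=1000)
    print("one component:", r)
    print("two in series:", series_reliability([r, r]))
    print("two in parallel:", parallel_reliability(r, 2))
```

Under these assumptions a series arrangement is always less reliable than its weakest part, while adding a redundant copy raises reliability; correlated failures, which the sketch ignores, reduce that benefit in practice.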

fail-safe
In engineering, a fail-safe is a design feature or practice that, in the event of a failure of the design feature, inherently responds in a way that will cause minimal or no harm to other equipment, to the environment or to people. Unlike inherent safety to a particular hazard, a system being "fail-safe" does not mean that failure is naturally inconsequential, but rather that the system's design prevents or mitigates unsafe consequences of the system's failure. If and when a "fail-safe" system fails, it remains at least as safe as it was before the failure. Since many types of failure are poss
Kaplan–Meier estimator
non-parametric statistic used to estimate the survival function
antifragility
Antifragility is a property of systems in which they increase in capability to thrive as a result of stressors, shocks, volatility, noise, mistakes, faults, attacks, or failures. The concept was developed by Nassim Nicholas Taleb in his book, Antifragile, and in technical papers. As Taleb explains in his book, antifragility is fundamentally different from the concepts of resiliency (i.e. the ability to recover from failure) and robustness (that is, the ability to resist failure). The concept has been applied in risk analysis, physics, molecular biology, transportation planning, engineering, ae

reliability, availability, maintainability and safety
domain within technical and industrial safety engineering
site reliability engineering
discipline that incorporates aspects of software engineering and applies them to infrastructure and operations problems
network reliability
ability of a computer network protocol to notify the sender of whether delivery of data was successful
censoring
in statistics, engineering, medical research, and other technical disciplines, condition in which the value of a measurement or observation is only partially known
failure analysis
process of collecting and analyzing data to determine the cause of a failure, often with the goal of determining corrective actions or liability
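As an illustration of how censoring interacts with the Kaplan–Meier estimator listed earlier, the following is a minimal sketch of the product-limit calculation with right-censored observations. The function name and input layout are assumptions made for this example, not taken from any particular library.

```python
from collections import Counter


def kaplan_meier(durations, observed):
    """Return [(time, survival_probability)] at each time with an observed failure.

    durations[i]: time until failure, or until observation of unit i stopped
    observed[i]:  True if the failure was seen, False if right-censored
    """
    failures = Counter(t for t, seen in zip(durations, observed) if seen)
    exits = Counter(durations)  # every unit leaves the risk set at its own time
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = failures.get(t, 0)
        if d:
            # Product-limit step: multiply by the fraction that survived time t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        # Failed and censored units alike are no longer at risk after time t.
        at_risk -= exits[t]
    return curve


if __name__ == "__main__":
    # Five hypothetical units; the second unit at t=3 and the unit at t=8
    # were still working when observation ended (right-censored).
    times = [2, 3, 3, 5, 8]
    seen = [True, True, False, True, False]
    for t, s in kaplan_meier(times, seen):
        print(t, round(s, 3))
```

Censored units never count as failures, but they do count toward the number at risk up to the time they drop out, which is how the estimator uses partially known lifetimes.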

hazard analysis
identification of present hazards as the first step in a process to assess risk
reliability-centered maintenance
maintenance planning approach based on reliability and safety system assessment
damage tolerance
ability of a structure to safely withstand defects
mean time to recovery
measure of the maintainability of repairable items; average time from discovery of a failure to completion of repairs
triple modular redundancy
redundancy using three systems and voting to determine the result
failure mode, effects, and criticality analysis
systematic technique for failure analysis
process decision program chart
technique designed to help prepare contingency plans, by identifying the consequential impact of failure on activity plans
maintenance engineering
engineering discipline concerned with keeping equipment maintainable, reliable, and available
structural reliability
ensuring structures' safety through probabilistic analysis
hot spare
spare component that is an active and connected part of a working system, ready to take over functionality with little or no interruption
chaos engineering
in software engineering, deliberately experimenting on a running system, for example by injecting failures, to test how it copes with extreme conditions
cascading failure
process in a system of interconnected parts in which the failure of one or a few parts can trigger the failure of other parts, and so on
failure cause
defects in design, process, quality, or part application, which are the underlying cause of a failure or which initiate a process which leads to failure
data centre tier
defined levels of resiliency and redundancy for IT infrastructure
critical to quality
attribute of a part, product or process
idiot-proof
Figure: Paper cutting machine with two separate hand buttons and one leg pedal for its operation. Requiring most of the operator's limbs to be used to activate the machine prevents them from being in dangerous positions while it operates.
repair kit
artificial physical object
heartbeat
periodic signal generated by hardware or software to indicate normal operation or to synchronize other parts of a computer system
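The heartbeat entry can be illustrated with a minimal monitor that records the last signal from each source and reports those that have gone quiet. The class name, method names, and timeout value are hypothetical; the clock is injectable only so the behaviour can be checked without waiting in real time.

```python
import time


class HeartbeatMonitor:
    """Tracks the most recent heartbeat per source and flags silent ones."""

    def __init__(self, timeout_seconds, clock=time.monotonic):
        self.timeout_seconds = timeout_seconds
        self._clock = clock          # monotonic by default, immune to wall-clock jumps
        self._last_seen = {}

    def beat(self, source):
        """Record a heartbeat from `source` at the current time."""
        self._last_seen[source] = self._clock()

    def is_alive(self, source):
        """True if `source` has sent a heartbeat within the timeout."""
        last = self._last_seen.get(source)
        return last is not None and self._clock() - last <= self.timeout_seconds

    def missing(self):
        """Sorted list of known sources whose last heartbeat is too old."""
        now = self._clock()
        return sorted(
            s for s, last in self._last_seen.items()
            if now - last > self.timeout_seconds
        )


if __name__ == "__main__":
    monitor = HeartbeatMonitor(timeout_seconds=5.0)
    monitor.beat("worker-1")
    print(monitor.is_alive("worker-1"), monitor.missing())
```

A missed heartbeat only shows that no signal arrived in time; it cannot by itself distinguish a crashed sender from a slow or partitioned network, so the timeout is a design trade-off between fast detection and false alarms.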