Failure analysis is the systematic, scientific investigation of why a system or product failed. It aims to determine the root cause of the failure and to identify measures that prevent similar failures in the future. The process is essential in fields such as engineering, manufacturing, and technology, where failures can have significant consequences. This article describes how a failure analysis is conducted, the methods it uses, and why it matters for quality and reliability.
Failure analysis is the process of collecting and analyzing data to determine the cause of a failure. It is an important discipline in many branches of the manufacturing industry, such as electronics, where it is a vital tool in developing new products and improving existing ones. It relies on collecting failed components and examining them for the cause or causes of failure using a wide array of methods, especially microscopy and spectroscopy. Nondestructive testing (NDT) methods are valuable because they leave the failed product unaffected, so inspection starts with these methods.
Forensic investigation
Forensic inquiry into the failed process or product is the starting point of failure analysis. Such inquiry is conducted using scientific analytical methods such as electrical and mechanical measurements, or by analysing failure data such as product reject reports or examples of previous failures of the same kind. The methods of forensic engineering are especially valuable in tracing product defects and flaws, which may include fatigue cracks and brittle cracks produced by stress corrosion cracking or environmental stress cracking. Witness statements can be valuable for reconstructing the likely sequence of events and hence the chain of cause and effect. Human factors can also be assessed once the cause of the failure has been determined. Several methods help prevent product failures from occurring in the first place, including failure mode and effects analysis (FMEA) and fault tree analysis (FTA), both of which can be used during prototyping to analyse potential failures before a product is marketed.
Failure theories can only be constructed on such data, but when corrective action is needed quickly, the precautionary principle demands that measures be put in place before the inquiry is complete. In aircraft accidents, for example, all planes of the type involved can be grounded immediately pending the outcome of the inquiry.
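As an illustration of how FMEA can be applied during prototyping, the following is a minimal sketch that ranks failure modes by risk priority number (RPN), the product of severity, occurrence, and detection ratings. The 1 to 10 rating scales follow a common convention rather than anything stated in this article, and the failure modes, their ratings, and the names `FailureMode` and `rank_by_rpn` are hypothetical, chosen only for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureMode:
    """One row of a simplified FMEA worksheet (hypothetical structure)."""
    name: str
    severity: int    # 1 = negligible effect, 10 = hazardous effect
    occurrence: int  # 1 = remote likelihood, 10 = almost certain
    detection: int   # 1 = almost certainly detected, 10 = undetectable

    def __post_init__(self) -> None:
        # Reject ratings outside the assumed 1..10 scale.
        for field_name in ("severity", "occurrence", "detection"):
            value = getattr(self, field_name)
            if not 1 <= value <= 10:
                raise ValueError(f"{field_name} must be in 1..10, got {value}")

    @property
    def rpn(self) -> int:
        """Risk priority number: severity x occurrence x detection."""
        return self.severity * self.occurrence * self.detection


def rank_by_rpn(modes):
    """Return failure modes sorted from highest to lowest RPN."""
    return sorted(modes, key=lambda m: m.rpn, reverse=True)


# Hypothetical failure modes and ratings, for illustration only.
modes = [
    FailureMode("solder joint fatigue crack", severity=7, occurrence=4, detection=6),
    FailureMode("connector oxidation", severity=5, occurrence=6, detection=7),
    FailureMode("housing stress cracking", severity=8, occurrence=2, detection=3),
]

ranked = rank_by_rpn(modes)
for mode in ranked:
    print(f"{mode.rpn:4d}  {mode.name}")
```

The highest-ranked entries are the ones a design team would typically address first. RPN is only a prioritisation aid: a failure mode with a low RPN but a very high severity rating may still deserve attention.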
Another aspect of failure analysis concerns No Fault Found (NFF), a term describing a situation in which the originally reported mode of failure cannot be duplicated by the evaluating technician, so the potential defect cannot be fixed.
NFF can be attributed to oxidation, defective connections of electrical components, temporary shorts or opens in circuits, software bugs, and temporary environmental factors, but also to operator error. A large number of devices reported as NFF during the first troubleshooting session often return to the failure analysis lab with the same NFF symptoms or with a permanent mode of failure.
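One way a lab might track this pattern is to flag units that come back after an earlier NFF result. The sketch below assumes a chronological log of `(serial_number, outcome)` pairs. The log format, the outcome labels, the serial numbers, and the function name `returned_after_nff` are all hypothetical and do not describe any particular lab system.

```python
def returned_after_nff(visits):
    """Map each serial number to the outcomes of visits made after an NFF.

    `visits` is an iterable of (serial_number, outcome) pairs in
    chronological order. Only units with at least one visit following
    an earlier NFF result appear in the returned dictionary.
    """
    seen_nff = set()  # serial numbers that have had an NFF result so far
    returns = {}      # serial number -> outcomes of later visits
    for serial, outcome in visits:
        if serial in seen_nff:
            returns.setdefault(serial, []).append(outcome)
        if outcome == "NFF":
            seen_nff.add(serial)
    return returns


# Hypothetical visit log, for illustration only.
visits = [
    ("A100", "NFF"),
    ("B200", "FAULT_CONFIRMED"),
    ("A100", "NFF"),
    ("C300", "NFF"),
    ("A100", "FAULT_CONFIRMED"),
    ("B200", "NFF"),
]

print(returned_after_nff(visits))
```

In this log only unit `A100` is flagged: it returned once with the same NFF result and once with a confirmed fault. Units flagged this way are candidates for deeper investigation of intermittent causes.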
The term failure analysis also applies to other fields, such as business management and military strategy.