Efficient fault-injection-based assessment of software-implemented hardware fault tolerance

Schirmeier, Horst Benjamin

Authors:	Schirmeier, Horst Benjamin
Title:	Efficient fault-injection-based assessment of software-implemented hardware fault tolerance
Language (ISO):	en
Abstract:	With continuously shrinking semiconductor structure sizes and lower supply voltages, the per-device susceptibility to transient and permanent hardware faults is on the rise. A class of countermeasures with growing popularity is Software-Implemented Hardware Fault Tolerance (SIHFT), which avoids expensive hardware mechanisms and can be applied application-specifically. However, SIHFT can, against intuition, cause more harm than good, because its overhead in execution time and memory space also increases the figurative “attack surface” of the system – it turns out that application-specific configuration of SIHFT is in fact a necessity rather than just an advantage. Consequently, target programs need to be analyzed for particularly critical spots to harden. SIHFT-hardened programs need to be measured and compared throughout all development phases of the program to observe reliability improvements or deteriorations over time. Additionally, SIHFT implementations need to be tested. The contributions of this dissertation focus on Fault Injection (FI) as an assessment technique satisfying all these requirements – analysis, measurement and comparison, and test. I describe the design and implementation of an FI tool, named FAIL*, that overcomes several shortcomings in the state of the art, and enables research on the general drawbacks of simulation-based FI. As demonstrated in four case studies in the context of SIHFT research, FAIL* provides novel fine-grained analysis techniques that exploit the newly gained possibility to analyze FI results from complete fault-space exploration. These analysis techniques aid SIHFT design decisions on the level of program modules, functions, variables, source-code lines, or single machine instructions.
Based on the experience from the case studies, I address the problem of large computation efforts that accompany exhaustive fault-space exploration from two different angles: First, I develop a heuristic fault-space pruning technique that allows the total FI-experiment count to be traded freely for result accuracy, while still providing information on all possible fault-space coordinates. Second, I speed up individual TAP-based FI experiments by improving the fast-forwarding operation by several orders of magnitude for most workloads. Finally, I dissect current practices in FI-based evaluation of SIHFT-hardened programs, identify three widespread pitfalls in the result interpretation, and advance the state of the art by defining a novel comparison metric.
Subject Headings:	Fault injection; Transient memory faults; Software-implemented hardware fault tolerance; Criticality analysis; Fault-tolerance assessment; FAIL*; Fault-similarity pruning; Smart-hopping; Extrapolated absolute failure count; Software-based fault tolerance; Software test
Subject Headings (RSWK):	Fehlertoleranz (fault tolerance); Softwareentwicklung (software development)
URI:	http://hdl.handle.net/2003/35175 http://dx.doi.org/10.17877/DE290R-17222
Issue Date:	2016
Appears in Collections:	Eingebettete Systemsoftware

Files in This Item:

File	Description	Size	Format
Dissertation.pdf	DNB	4.65 MB	Adobe PDF

This item is protected by original copyright
