Presentation

Demystifying and Mitigating Cross-Layer Deficiencies of Soft Error Protection in Instruction Duplication
Description
Soft errors are prevalent in modern High-Performance Computing (HPC) systems and can result in silent data corruptions (SDCs) that compromise system reliability. Instruction duplication is a widely used software-based protection technique against SDCs. Existing instruction duplication techniques are mostly implemented at the LLVM level and may suffer from low SDC coverage at the assembly level. In this paper, we evaluate instruction duplication at both the LLVM and assembly levels. Our study shows that existing instruction duplication techniques have protection deficiencies at the assembly level and that their LLVM-level evaluations are usually over-optimistic about the protection achieved. We investigate the root causes of the protection deficiencies and propose a mitigation technique, Flowery, to address them. Our evaluation shows that Flowery can effectively protect programs from SDCs when evaluated at the assembly level.
Event Type
Paper
Time
Thursday, 16 November 2023
2pm - 2:30pm MST
Location
401-402
Tags
Accelerators
Artificial Intelligence/Machine Learning
Codesign
Fault Handling and Tolerance
Performance Measurement, Modeling, and Tools
Post-Moore Computing
Registration Categories
TP