{"title":"Transient and intermittent fault recovery without rollback","authors":"S. Hamilton, A. Orailoglu","doi":"10.1109/DFTVS.1998.732173","DOIUrl":null,"url":null,"abstract":"Increasing chip density combined with heightened reliability expectations has spawned greater interest in fault tolerant design. In recent years, research into rollback and retry techniques has established them as an effective approach to recovery from transient and intermittent faults. For applications with strict timing requirements, however, the high error latency inherent in retry approaches is unacceptable. We have developed an alternative recovery method with strict error latency boundaries. In addition, the bulky state storage hardware required in rollback designs has been eliminated. The result is a more efficient, more broadly applicable approach to fault tolerant design.","PeriodicalId":245879,"journal":{"name":"Proceedings 1998 IEEE International Symposium on Defect and Fault Tolerance in VLSI Systems (Cat. No.98EX223)","volume":"52 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"1998-11-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"9","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings 1998 IEEE International Symposium on Defect and Fault Tolerance in VLSI Systems (Cat. No.98EX223)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/DFTVS.1998.732173","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 9
Abstract
Increasing chip density combined with heightened reliability expectations has spawned greater interest in fault tolerant design. In recent years, research into rollback and retry techniques has established them as an effective approach to recovery from transient and intermittent faults. For applications with strict timing requirements, however, the high error latency inherent in retry approaches is unacceptable. We have developed an alternative recovery method with strict error latency boundaries. In addition, the bulky state storage hardware required in rollback designs has been eliminated. The result is a more efficient, more broadly applicable approach to fault tolerant design.