{"title":"无回滚的瞬时间歇故障恢复","authors":"S. Hamilton, A. Orailoglu","doi":"10.1109/DFTVS.1998.732173","DOIUrl":null,"url":null,"abstract":"Increasing chip density combined with heightened reliability expectations has spawned greater interest in fault tolerant design. In recent years, research into rollback and retry techniques has established them as an effective approach to recovery from transient and intermittent faults. For applications with strict timing requirements, however, the high error latency inherent in retry approaches is unacceptable. We have developed an alternative recovery method with strict error latency boundaries. In addition, the bulky state storage hardware required in rollback designs has been eliminated. The result is a more efficient, more broadly applicable approach to fault tolerant design.","PeriodicalId":245879,"journal":{"name":"Proceedings 1998 IEEE International Symposium on Defect and Fault Tolerance in VLSI Systems (Cat. No.98EX223)","volume":"52 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"1998-11-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"9","resultStr":"{\"title\":\"Transient and intermittent fault recovery without rollback\",\"authors\":\"S. Hamilton, A. Orailoglu\",\"doi\":\"10.1109/DFTVS.1998.732173\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Increasing chip density combined with heightened reliability expectations has spawned greater interest in fault tolerant design. In recent years, research into rollback and retry techniques has established them as an effective approach to recovery from transient and intermittent faults. For applications with strict timing requirements, however, the high error latency inherent in retry approaches is unacceptable. We have developed an alternative recovery method with strict error latency boundaries. In addition, the bulky state storage hardware required in rollback designs has been eliminated. The result is a more efficient, more broadly applicable approach to fault tolerant design.\",\"PeriodicalId\":245879,\"journal\":{\"name\":\"Proceedings 1998 IEEE International Symposium on Defect and Fault Tolerance in VLSI Systems (Cat. No.98EX223)\",\"volume\":\"52 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"1998-11-02\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"9\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings 1998 IEEE International Symposium on Defect and Fault Tolerance in VLSI Systems (Cat. No.98EX223)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/DFTVS.1998.732173\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings 1998 IEEE International Symposium on Defect and Fault Tolerance in VLSI Systems (Cat. No.98EX223)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/DFTVS.1998.732173","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Transient and intermittent fault recovery without rollback
Increasing chip density combined with heightened reliability expectations has spawned greater interest in fault tolerant design. In recent years, research into rollback and retry techniques has established them as an effective approach to recovery from transient and intermittent faults. For applications with strict timing requirements, however, the high error latency inherent in retry approaches is unacceptable. We have developed an alternative recovery method with strict error latency boundaries. In addition, the bulky state storage hardware required in rollback designs has been eliminated. The result is a more efficient, more broadly applicable approach to fault tolerant design.