Recovery blocks and algorithm-based fault tolerance
A. Tyrrell
Proceedings of EUROMICRO 96. 22nd Euromicro Conference. Beyond 2000: Hardware and Software Design Strategies
Published: 1996-09-02
DOI: 10.1109/EURMIC.1996.546394 (https://doi.org/10.1109/EURMIC.1996.546394)
Citations: 24
Abstract
Algorithm-based fault-tolerance has been used for a number of years in the field of numerical processing. It has advantages over more 'explicit' fault-tolerant methods in that it operates concurrently with the application, thus reducing the time overhead associated with the added redundancy. Recovery blocks and similar fault-tolerant methods are critically dependent on the detection of errors in the system (as are all fault-tolerant methods). In the recovery block scheme, this error detection is performed by some form of acceptability check on the resultant data. Designing such a check is usually a non-trivial problem and one of the major issues that prevent recovery block schemes from being used more widely. This paper describes how algorithm-based fault-tolerant methods could be used to assist in the error detection process within the recovery block scheme and thus make it more appropriate for use in 'real' applications.
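The idea in the abstract can be illustrated with a minimal sketch. It is an assumption-laden example and not the paper's own implementation: the application is taken to be matrix multiplication, and the acceptance test is a column-checksum identity in the spirit of algorithm-based fault tolerance (the column sums of C = AB must equal the column sums of A multiplied by B). All names here (`recovery_block`, `checksum_accepts`, `faulty_matmul`, `matmul_alternate`) are hypothetical and chosen only for illustration.

```python
from typing import Callable, List, Sequence

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Straightforward matrix product, used here as the correct reference routine."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def faulty_matmul(a: Matrix, b: Matrix) -> Matrix:
    """Hypothetical primary alternate with an injected error in one element."""
    c = matmul(a, b)
    c[0][0] += 1.0  # simulated fault, for demonstration only
    return c


def matmul_alternate(a: Matrix, b: Matrix) -> Matrix:
    """Hypothetical second alternate, written with a different loop order."""
    rows, inner, cols = len(a), len(b), len(b[0])
    c = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for k in range(inner):
            for j in range(cols):
                c[i][j] += a[i][k] * b[k][j]
    return c


def checksum_accepts(a: Matrix, b: Matrix, c: Matrix, tol: float = 1e-9) -> bool:
    """Acceptance test: column sums of C must match (column sums of A) times B.

    This costs O(n^2) work instead of the O(n^3) needed to recompute the product.
    """
    col_sum_a = [sum(row[k] for row in a) for k in range(len(b))]
    expected = [sum(col_sum_a[k] * b[k][j] for k in range(len(b)))
                for j in range(len(b[0]))]
    actual = [sum(row[j] for row in c) for j in range(len(b[0]))]
    return all(abs(e - x) <= tol * max(1.0, abs(e))
               for e, x in zip(expected, actual))


def recovery_block(a: Matrix, b: Matrix,
                   alternates: Sequence[Callable[[Matrix, Matrix], Matrix]],
                   accept: Callable[[Matrix, Matrix, Matrix], bool]) -> Matrix:
    """Try each alternate in order and return the first result that is accepted."""
    for alternate in alternates:
        # The inputs are never mutated, so rolling back to the checkpoint
        # amounts to passing the same a and b to the next alternate.
        try:
            c = alternate(a, b)
        except Exception:
            continue  # a crash counts as a failed alternate
        if accept(a, b, c):
            return c
    raise RuntimeError("all alternates failed the acceptance test")


a = [[1.0, 2.0], [3.0, 4.0]]
b = [[5.0, 6.0], [7.0, 8.0]]
result = recovery_block(a, b, [faulty_matmul, matmul_alternate], checksum_accepts)
print(result)  # [[19.0, 22.0], [43.0, 50.0]]
```

In this sketch the checksum test rejects the faulty primary, and the block falls through to the second alternate. A single checksum of this kind can miss errors that cancel within a column, so it should be read as one possible acceptability check rather than a complete one.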