Title: An efficient method to reduce roundoff error in matrix multiplication with algorithm-based fault tolerance
Authors: Qihong Zhang, J.H. Kim
Venue: Proceedings of 1994 International Conference on Wafer Scale Integration (ICWSI)
Published: 1994-01-19
DOI: 10.1109/ICWSI.1994.291235 (https://doi.org/10.1109/ICWSI.1994.291235)
Citations: 5
Abstract
Algorithm-Based Fault Tolerance (ABFT) schemes have recently been proposed by a number of researchers. Although these techniques can in theory detect and correct all errors, practical problems, especially roundoff errors, degrade their performance drastically. In this paper, we propose a new scheme called the Extended Mantissa Checksum (EMC) test, in which the mantissa of the product of the two input matrices is divided into two sections and extended for fault detection and correction. With this scheme, the numbers of undetected errors and false alarms are greatly reduced, and the error coverage is improved significantly. In addition, the time latency is short and the hardware overhead is small compared with other schemes.
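The abstract does not describe how the EMC test works internally, so the following is not that scheme. It is a minimal sketch of the classic row/column checksum test that ABFT matrix multiplication is commonly built on, included only to illustrate why roundoff forces a tolerance into the comparison. The function names, the matrix size, the injected fault and the tolerance `1e-9` are all assumptions chosen for illustration.

```python
import random


def matmul(a, b):
    """Plain matrix product of two list-of-lists matrices."""
    inner, cols = len(b), len(b[0])
    return [[sum(row[t] * b[t][j] for t in range(inner)) for j in range(cols)]
            for row in a]


def with_column_checksum(a):
    """Return a copy of `a` with an extra row holding each column's sum."""
    return a + [[sum(col) for col in zip(*a)]]


def with_row_checksum(b):
    """Return a copy of `b` with an extra column holding each row's sum."""
    return [row + [sum(row)] for row in b]


def checksum_test(c_full, tol):
    """Compare recomputed row/column sums against the stored checksums.

    `c_full` is the full-checksum product: its last row and last column
    hold the checksums. Returns (bad_rows, bad_cols); one faulty element
    shows up as exactly one bad row and one bad column.
    """
    n, m = len(c_full) - 1, len(c_full[0]) - 1
    bad_rows = [i for i in range(n)
                if abs(sum(c_full[i][:m]) - c_full[i][m]) > tol]
    bad_cols = [j for j in range(m)
                if abs(sum(c_full[i][j] for i in range(n)) - c_full[n][j]) > tol]
    return bad_rows, bad_cols


random.seed(0)
n = 4  # assumed size, for illustration only
a = [[random.uniform(-1, 1) for _ in range(n)] for _ in range(n)]
b = [[random.uniform(-1, 1) for _ in range(n)] for _ in range(n)]

# Product of the column-checksum A and the row-checksum B.
c_full = matmul(with_column_checksum(a), with_row_checksum(b))
clean_result = checksum_test(c_full, tol=1e-9)

# Inject a hypothetical fault into one element of the product.
c_full[1][2] += 1.0
faulty_result = checksum_test(c_full, tol=1e-9)
```

In floating point, the recomputed sums and the stored checksums generally differ by a small roundoff amount even when no fault occurred. A tolerance that is too tight can therefore raise false alarms, while one that is too loose can let small faults go undetected. These are the two failure modes the abstract says the EMC test reduces.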