{"title":"A Proposal of an ECC-based Adaptive Fault-Tolerant Mechanism for 16-bit data words","authors":"Joaquín Gracia-Morán;Luis-J. Saiz-Adalid;J.-Carlos Baraza-Calvo;Daniel Gil-Tomas;Pedro-J. Gil-Vicente","doi":"10.1109/TLA.2024.10500715","DOIUrl":null,"url":null,"abstract":"With the integration scale level reached in CMOS technology, memory systems provide a great storage capacity, but at the price of an augment in their fault rate. In this way, the probability of experiencing Single Cell Upsets or Multiple Cell Upsets have risen. Error Correction Codes (ECC) are broadly employed to protect memory systems. Though, the inclusion of an ECC in a computer system adds, in each memory word, some extra bits used to detect and/or correct errors. In addition, encoding and decoding circuitries must be added, introducing overheads in area, delay, and power consumption. Usually, when an ECC-based fault tolerance mechanism is designed, its fault tolerance properties cannot be modified. However, in some applications, current memory systems can suffer a variable fault rate during their operation. Thus, it seems very interesting that this mechanism would be able to adapt to these variable fault conditions. This work proposes an Adaptive Fault-Tolerant mechanism based on ECC. This mechanism can adapt to different fault conditions, being able to correct and/or detect single and multiple bits in error. The Adaptive Fault-Tolerant mechanism proposed uses a unique encoder, that is, it is not necessary to re-encode the data to change the error coverage.","PeriodicalId":55024,"journal":{"name":"IEEE Latin America Transactions","volume":null,"pages":null},"PeriodicalIF":1.3000,"publicationDate":"2024-04-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10500715","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Latin America Transactions","FirstCategoryId":"5","ListUrlMain":"https://ieeexplore.ieee.org/document/10500715/","RegionNum":4,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
引用次数: 0
Abstract
With the integration scale level reached in CMOS technology, memory systems provide a great storage capacity, but at the price of an augment in their fault rate. In this way, the probability of experiencing Single Cell Upsets or Multiple Cell Upsets have risen. Error Correction Codes (ECC) are broadly employed to protect memory systems. Though, the inclusion of an ECC in a computer system adds, in each memory word, some extra bits used to detect and/or correct errors. In addition, encoding and decoding circuitries must be added, introducing overheads in area, delay, and power consumption. Usually, when an ECC-based fault tolerance mechanism is designed, its fault tolerance properties cannot be modified. However, in some applications, current memory systems can suffer a variable fault rate during their operation. Thus, it seems very interesting that this mechanism would be able to adapt to these variable fault conditions. This work proposes an Adaptive Fault-Tolerant mechanism based on ECC. This mechanism can adapt to different fault conditions, being able to correct and/or detect single and multiple bits in error. The Adaptive Fault-Tolerant mechanism proposed uses a unique encoder, that is, it is not necessary to re-encode the data to change the error coverage.
期刊介绍:
IEEE Latin America Transactions (IEEE LATAM) is an interdisciplinary journal focused on the dissemination of original and quality research papers / review articles in Spanish and Portuguese of emerging topics in three main areas: Computing, Electric Energy and Electronics. Some of the sub-areas of the journal are, but not limited to: Automatic control, communications, instrumentation, artificial intelligence, power and industrial electronics, fault diagnosis and detection, transportation electrification, internet of things, electrical machines, circuits and systems, biomedicine and biomedical / haptic applications, secure communications, robotics, sensors and actuators, computer networks, smart grids, among others.