Sukrat Gupta, Neel Gala, G. Madhusudan, V. Kamakoti
{"title":"一种容错微处理器体系结构","authors":"Sukrat Gupta, Neel Gala, G. Madhusudan, V. Kamakoti","doi":"10.1109/ATS.2015.35","DOIUrl":null,"url":null,"abstract":"Deeply scaled CMOS circuits are vulnerable to soft and hard errors. These errors pose reliability concerns, especially for systems used in radiation-prone environments like space and nuclear applications. This paper presents SHAKTI-F, a RISC-V based SEE-tolerant micro-processor architecture that provides a solution to the reliability issues mentioned above. The proposed architecture uses error correcting codes (ECC) to tolerate errors in registers and memories, while it employs a combination of space and time redundancy based techniques to tolerate errors in the ALU. Two novel re-computation techniques for detecting errors for the addition/subtraction and multiplication modules are proposed. The scheme also identifies parts of the circuitry that need to be radiation hardened thus providing a total protection to SEEs. The proposed scheme provides fine-grain error detection capability that help in localization of the error to a specific functional unit and isolating the same, rather than the entire processor or a large module within a processor. This provides a graceful degradation and/or fail-safe shutdown capability to the processor. The HDL model of the processor was validated by simulating it with randomly induced SEEs. The proposed scheme adds an extra penalty of only 20% on the core area and 25% penalty on the performance when compared with conventional systems. This is very less when compared to the penalty incurred by employing schemes including double modular and triple modular redundancy. Interestingly, there is a 45% reduction in power consumption due to introduction of fault tolerance. The resulting system runs at 330 MHz on a 55nm technology node, which is sufficient for the class of applications these cores are utilized for.","PeriodicalId":256879,"journal":{"name":"2015 IEEE 24th Asian Test Symposium (ATS)","volume":"308 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2015-11-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"26","resultStr":"{\"title\":\"SHAKTI-F: A Fault Tolerant Microprocessor Architecture\",\"authors\":\"Sukrat Gupta, Neel Gala, G. Madhusudan, V. Kamakoti\",\"doi\":\"10.1109/ATS.2015.35\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Deeply scaled CMOS circuits are vulnerable to soft and hard errors. These errors pose reliability concerns, especially for systems used in radiation-prone environments like space and nuclear applications. This paper presents SHAKTI-F, a RISC-V based SEE-tolerant micro-processor architecture that provides a solution to the reliability issues mentioned above. The proposed architecture uses error correcting codes (ECC) to tolerate errors in registers and memories, while it employs a combination of space and time redundancy based techniques to tolerate errors in the ALU. Two novel re-computation techniques for detecting errors for the addition/subtraction and multiplication modules are proposed. The scheme also identifies parts of the circuitry that need to be radiation hardened thus providing a total protection to SEEs. The proposed scheme provides fine-grain error detection capability that help in localization of the error to a specific functional unit and isolating the same, rather than the entire processor or a large module within a processor. This provides a graceful degradation and/or fail-safe shutdown capability to the processor. The HDL model of the processor was validated by simulating it with randomly induced SEEs. The proposed scheme adds an extra penalty of only 20% on the core area and 25% penalty on the performance when compared with conventional systems. This is very less when compared to the penalty incurred by employing schemes including double modular and triple modular redundancy. Interestingly, there is a 45% reduction in power consumption due to introduction of fault tolerance. The resulting system runs at 330 MHz on a 55nm technology node, which is sufficient for the class of applications these cores are utilized for.\",\"PeriodicalId\":256879,\"journal\":{\"name\":\"2015 IEEE 24th Asian Test Symposium (ATS)\",\"volume\":\"308 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2015-11-22\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"26\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2015 IEEE 24th Asian Test Symposium (ATS)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ATS.2015.35\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2015 IEEE 24th Asian Test Symposium (ATS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ATS.2015.35","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
SHAKTI-F: A Fault Tolerant Microprocessor Architecture
Deeply scaled CMOS circuits are vulnerable to soft and hard errors. These errors pose reliability concerns, especially for systems used in radiation-prone environments like space and nuclear applications. This paper presents SHAKTI-F, a RISC-V based SEE-tolerant micro-processor architecture that provides a solution to the reliability issues mentioned above. The proposed architecture uses error correcting codes (ECC) to tolerate errors in registers and memories, while it employs a combination of space and time redundancy based techniques to tolerate errors in the ALU. Two novel re-computation techniques for detecting errors for the addition/subtraction and multiplication modules are proposed. The scheme also identifies parts of the circuitry that need to be radiation hardened thus providing a total protection to SEEs. The proposed scheme provides fine-grain error detection capability that help in localization of the error to a specific functional unit and isolating the same, rather than the entire processor or a large module within a processor. This provides a graceful degradation and/or fail-safe shutdown capability to the processor. The HDL model of the processor was validated by simulating it with randomly induced SEEs. The proposed scheme adds an extra penalty of only 20% on the core area and 25% penalty on the performance when compared with conventional systems. This is very less when compared to the penalty incurred by employing schemes including double modular and triple modular redundancy. Interestingly, there is a 45% reduction in power consumption due to introduction of fault tolerance. The resulting system runs at 330 MHz on a 55nm technology node, which is sufficient for the class of applications these cores are utilized for.