Enhancing Fault Awareness and Reliability of a Fault-Tolerant RISC-V System-on-Chip

IF 2.6 | Zone 3, Engineering Technology | Q2 | COMPUTER SCIENCE, INFORMATION SYSTEMS
D. Santos, André M. P. Mattos, D. Melo, L. Dilillo
Journal: Electronics. Published: 2023-06-06. Publication type: Journal Article. DOI: 10.3390/electronics12122557 (https://doi.org/10.3390/electronics12122557)
Citations: 5

Abstract

Recent research has shown interest in adopting RISC-V processors for high-reliability electronics, such as aerospace applications. The openness of this architecture enables the implementation and customization of processor features to increase reliability. Studies on hardened RISC-V processors facing harsh radiation environments apply fault-tolerance techniques in the processor core and peripherals, exploiting system redundancies. In prior work, we presented a hardened RISC-V System-on-Chip (SoC) that could detect and correct radiation-induced faults, but with limited fault awareness. Therefore, in this work, we propose solutions to extend the fault observability of the SoC implementation by providing error detection and monitoring. For this purpose, we introduce observation features in the redundant structures of the system, enabling the reporting of valuable information that supports enhanced radiation testing and allows the application to take actions to recover from critical failures. Thus, the main contribution of this work is a solution to improve fault awareness and the analysis of the fault models in the system. To validate this solution, we performed complementary experiments in two irradiation facilities, covering atmospheric neutrons and a mixed-field environment, in which the system proved valuable for analyzing radiation effects on the processor core and its peripherals. In these experiments, we obtained a range of error reports that allowed us to gain a deeper understanding of the fault mechanisms and to improve the characterization of the SoC.
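The abstract does not say which redundancy scheme the SoC uses or how its error reports are structured. Purely as an illustration of what "observation features in redundant structures" could look like, the sketch below assumes triple modular redundancy with a bitwise majority voter that also reports which replicas disagreed. The names `VoteReport`, `vote_with_report`, and `FaultMonitor` are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass(frozen=True)
class VoteReport:
    """Hypothetical report emitted alongside the voted value."""
    value: int                    # bitwise majority of the three replicas
    disagreeing: Tuple[int, ...]  # indices of replicas that differ from the voted value

    @property
    def error_detected(self) -> bool:
        return len(self.disagreeing) > 0


def vote_with_report(a: int, b: int, c: int) -> VoteReport:
    """Bitwise majority vote over three replicas, plus fault observation.

    A plain voter would return only the voted value and silently mask the
    fault; here the disagreeing replicas are reported as well.
    """
    voted = (a & b) | (a & c) | (b & c)
    disagreeing = tuple(i for i, r in enumerate((a, b, c)) if r != voted)
    return VoteReport(voted, disagreeing)


@dataclass
class FaultMonitor:
    """Hypothetical software-side counter of reported mismatches per replica."""
    counts: Dict[int, int] = field(default_factory=lambda: {0: 0, 1: 0, 2: 0})

    def record(self, report: VoteReport) -> None:
        for index in report.disagreeing:
            self.counts[index] += 1


# Example: replica 2 has a flipped bit; the voter masks it and reports it.
monitor = FaultMonitor()
report = vote_with_report(0b0101, 0b0101, 0b0111)
monitor.record(report)
```

In this sketch the application can inspect `monitor.counts` to decide on a recovery action, for example when one replica accumulates repeated mismatches. The actual reporting mechanism in the paper may differ.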
Source journal
Electronics
Electronics Computer Science-Computer Networks and Communications
CiteScore: 1.10
Self-citation rate: 10.30%
Articles published: 3515
Review time: 16.71 days
Journal description: Electronics (ISSN 2079-9292; CODEN: ELECGJ) is an international, open access journal on the science of electronics and its applications published quarterly online by MDPI.