Fault resilience analysis of a RISC-V microprocessor design through a dedicated UVM environment

Marcello Barbirotta, A. Mastrandrea, F. Menichelli, F. Vigli, L. Blasi, Abdallah Cheikh, Stefano Sordillo, F. D. Gennaro, M. Olivieri
DOI: 10.1109/DFT50435.2020.9250871
URL: https://doi.org/10.1109/DFT50435.2020.9250871
Published in: 2020 IEEE International Symposium on Defect and Fault Tolerance in VLSI and Nanotechnology Systems (DFT)
Publication date: 2020-10-19
Citations: 6

Abstract

Fault tolerance is a key requirement in several application domains of embedded processor cores. In a wide variety of applications, however, full protection against faults occurring in any bit of the core may be excessive, and it has been demonstrated that the system-level impact of local faults in a microprocessor chip also depends on the program being executed. It is therefore relevant to study the fault resilience of a processor hardware design with an application-oriented methodology. Previous studies addressed either physical fault injection on FPGA prototypes, or RTL analysis and mixed-level approaches involving UVM, SystemC and DSL libraries. These methods rely on massive random error injection, which requires impractical amounts of time and is often limited to specific sub-parts of the architecture. In this work we present the advantages of an RTL, deterministic, bit-level, cycle-accurate fault injection analysis implemented in a pure UVM environment. The approach characterizes the fault resilience of each bit of the microarchitecture at application level, paving the way for customized protection based on the upper bound of the error probability. The characterization also identifies, for each bit of the microarchitecture, the time intervals that correspond to critical sections of the program execution, sometimes leading to unexpected results. We discuss the advantages of spanning the execution time with a hierarchy of time frames for fault injection rather than distributing faults uniformly in time, and we set up an error classification methodology based on how each faulty bit can damage the system in different sections of the execution time. We carry out our experiments on the Klessydra T03 RISC-V open-source processor core, covering all 5561 register bits and characterizing two benchmark program executions in less than 100 hours of simulation.
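
The abstract does not spell out how the hierarchical time frame span is constructed, so the following is only a minimal Python sketch of one possible reading, not the authors' UVM implementation. It assumes a toy 8-bit accumulator (`toy_run`) in place of an RTL simulation, a two-way outcome label ("masked" or "error") in place of the paper's error classification, and a bisection rule that subdivides a time window only when injections at its two endpoints give different outcomes. All names and constants (`toy_run`, `outcome`, `refine`, `N_CYCLES`, `N_BITS`) are hypothetical.

```python
from typing import List, Optional, Tuple

N_CYCLES = 16   # hypothetical program length in clock cycles
N_BITS = 8      # hypothetical register width


def toy_run(fault: Optional[Tuple[int, int]] = None) -> int:
    """Toy stand-in for one RTL simulation; fault = (cycle, bit) to flip."""
    acc = 0
    for cycle in range(N_CYCLES):
        if fault is not None and fault[0] == cycle:
            acc ^= 1 << fault[1]      # inject a single bit flip before this cycle's update
        if cycle == 6:
            acc = 1                   # register is overwritten: earlier flips are masked
        else:
            acc = (acc + 1) & 0xFF
    return acc


GOLDEN = toy_run()                    # fault-free reference result


def outcome(cycle: int, bit: int) -> str:
    """Classify one injection by comparing the result with the golden run."""
    return "masked" if toy_run((cycle, bit)) == GOLDEN else "error"


def exhaustive(bit: int) -> List[int]:
    """Deterministic scan: inject at every cycle and return the critical ones."""
    return [c for c in range(N_CYCLES) if outcome(c, bit) == "error"]


def refine(bit: int, lo: int, hi: int) -> List[Tuple[int, int, str]]:
    """Hierarchical scan over the inclusive window [lo, hi].

    Assumption (not guaranteed in general): if both endpoints give the same
    outcome, the whole window is labelled with that outcome.
    """
    first, last = outcome(lo, bit), outcome(hi, bit)
    if first == last:
        return [(lo, hi, first)]
    if hi - lo == 1:
        return [(lo, lo, first), (hi, hi, last)]
    mid = (lo + hi) // 2
    return refine(bit, lo, mid) + refine(bit, mid + 1, hi)


def critical_cycles(bit: int) -> List[int]:
    """Expand the windows labelled 'error' into a list of cycles."""
    cycles: List[int] = []
    for lo, hi, label in refine(bit, 0, N_CYCLES - 1):
        if label == "error":
            cycles.extend(range(lo, hi + 1))
    return cycles


if __name__ == "__main__":
    for bit in range(N_BITS):
        print(bit, critical_cycles(bit))
```

In this toy model the hierarchical scan recovers the same critical interval as the exhaustive scan while simulating fewer injection points. On a real design the endpoint assumption can miss short critical windows, so the coarsest window size would have to be chosen with care.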