MindFI: A Fault Injection Tool for Reliability Assessment of MindSpore Applications

Yang Zheng, Zhenye Feng, Zheng Hu, Ke Pei
DOI: 10.1109/ISSREW53611.2021.00068
Published in: 2021 IEEE International Symposium on Software Reliability Engineering Workshops (ISSREW), October 2021
Citations: 6

Abstract

With the emergence of big data and remarkable improvements in computational power, deep neural network (DNN) based intelligent systems, with superb performance on computer vision, natural language processing, optimization, and other tasks, have been rapidly replacing traditional software in many domains. However, due to the uncertainty of DNN modules learned from data, intelligent systems are more likely to exhibit incorrect behaviors. Faults in software and hardware are also inevitable in practice, where hidden defects can easily cause model failure. These can lead to severe accidents and losses in safety- and reliability-critical scenarios such as autonomous driving. Techniques to test the differences between actual and desired behaviors and to evaluate the reliability of DNN applications under faulty conditions are therefore significant for building trustworthy DNN systems. A popular method is fault injection, and various fault injection tools have been developed for ML frameworks such as TensorFlow and PyTorch. In this paper, we present a tool, MindFI, which targets a variety of faults in ML programs written in MindSpore. Data, software, and hardware faults can be easily injected into general MindSpore programs. We also use MindFI to evaluate the resilience of several commonly used ML programs against a set of assessment metrics.
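To make the hardware-fault category concrete, the sketch below shows the kind of fault such tools typically model: a single bit flip in a model weight tensor. This is an illustrative example only, not MindFI's actual API; it uses NumPy as a stand-in for a framework tensor, and the function name `flip_bit` is our own.

```python
import numpy as np

def flip_bit(weights: np.ndarray, index: int, bit: int) -> np.ndarray:
    """Return a copy of `weights` with one bit flipped in the float32
    element at flat position `index` (bit 0 = least significant)."""
    faulty = weights.astype(np.float32).copy()
    as_int = faulty.view(np.uint32)  # reinterpret the bits, no value conversion
    as_int[np.unravel_index(index, faulty.shape)] ^= np.uint32(1 << bit)
    return faulty

rng = np.random.default_rng(0)
w = rng.standard_normal((2, 3)).astype(np.float32)
w_faulty = flip_bit(w, index=4, bit=30)  # flip a high exponent bit
print((w != w_faulty).sum())             # exactly one element is corrupted
```

Flipping a high exponent bit, as here, can change a weight by many orders of magnitude, which is why such faults are a useful stress test: the model's output degradation under these perturbations is what resilience metrics then quantify.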