The Nature of the Times to Flight Software Failure during Space Missions

Javier Alonso, Michael Grottke, A. Nikora, Kishor S. Trivedi
Published in: 2012 IEEE 23rd International Symposium on Software Reliability Engineering (2012-11-01)
DOI: 10.1109/ISSRE.2012.32 (https://doi.org/10.1109/ISSRE.2012.32)
Citations: 17

Abstract

The growing complexity of mission-critical space mission software makes it prone to failures during operations. The success of space missions depends on the ability of the systems to deal with software failures, or to avoid them in the first place. To develop more effective mitigation techniques, it is necessary to understand the nature of the failures and the underlying software faults. Based on their characteristics, software faults can be classified into Bohrbugs, non-aging-related Mandelbugs, and aging-related bugs. Each type of fault requires different kinds of mitigation techniques. While Bohrbugs are usually easy to fix during development or testing, this is not the case for non-aging-related Mandelbugs and aging-related bugs, owing to their inherent complexity. Systems need mechanisms such as software restart, software replication, or software rejuvenation to deal with failures caused by these faults during the operational phase. In a previous study, we classified space mission flight software faults into the three above-mentioned categories based on problems reported during operations. That study concentrated on the percentages of the faults of each type and the variation of these percentages within and across different missions. This paper extends that work by exploring the nature of the times to software failure due to Bohrbugs and non-aging-related Mandelbugs for eight JPL/NASA missions. We start by applying trend tests to the times to failure to check whether there is any reliability growth (or decay) for each type of failure. For those time-to-failure sequences with no trend, we fit distributions to the data sets and carry out goodness-of-fit tests. The results will be used to guide the development of improved operational failure mitigation techniques, thereby increasing the reliability of space mission software.
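The two-step analysis described in the abstract (trend test first, distribution fitting only when no trend is found) can be sketched as follows. The abstract does not name the specific trend tests or candidate distributions, so this sketch assumes a Laplace trend test and an exponential distribution purely for illustration; the function names and the sample data in the usage note are hypothetical and not taken from the paper.

```python
import math


def laplace_trend_statistic(failure_times, observation_end):
    """Laplace trend statistic for cumulative failure times on (0, observation_end].

    Under the no-trend hypothesis (homogeneous Poisson process) the statistic
    is approximately standard normal. Clearly negative values suggest
    reliability growth; clearly positive values suggest reliability decay.
    """
    n = len(failure_times)
    if n == 0:
        raise ValueError("need at least one failure time")
    mean_time = sum(failure_times) / n
    scale = observation_end * math.sqrt(1.0 / (12.0 * n))
    return (mean_time - observation_end / 2.0) / scale


def fit_exponential_rate(interfailure_times):
    """Maximum-likelihood estimate of the exponential rate parameter."""
    return len(interfailure_times) / sum(interfailure_times)


def ks_statistic_exponential(interfailure_times, rate):
    """Kolmogorov-Smirnov distance between the data and Exp(rate).

    Only the distance is computed here. Because the rate is estimated from
    the same data, standard KS critical values do not apply directly.
    """
    xs = sorted(interfailure_times)
    n = len(xs)
    distance = 0.0
    for i, x in enumerate(xs):
        cdf = 1.0 - math.exp(-rate * x)
        distance = max(distance, (i + 1) / n - cdf, cdf - i / n)
    return distance
```

As a usage example under the same assumptions: for hypothetical failures at times 1, 2, 3, 4, 5 in an observation window of length 100, `laplace_trend_statistic([1, 2, 3, 4, 5], 100)` falls below -1.96, which at the 5% level would point to reliability growth, so no stationary distribution would be fitted to that sequence. Only sequences whose statistic stays within roughly ±1.96 would proceed to `fit_exponential_rate` and the goodness-of-fit step.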