{"title":"Stress testing software to determine fault tolerance for hardware failure and anomalies","authors":"J. Wu","doi":"10.1109/AUTEST.2012.6334582","DOIUrl":null,"url":null,"abstract":"Today's military systems rely for their performance on combinations of hardware and software. While testing of hardware performance during design, development and operation is well understood, the testing of software is less mature. In particular, the effect of hardware failures in the field on software performance, and therefore systems performance, is all-too-often overlooked or is tested in a far less rigorous manner that that applied to Hardware failures alone. Numerous examples exist of major system failures driven by software anomalies but triggered by Hardware failures, with consequences that range from degraded mission performance to weapons system destruction and operator fatalities. Measuring software development quality and fault tolerance is a challenging task. Many software test methods focus on source-code only approach (unit tests, modular test) and neglect the impacts caused by hardware anomalies or failures. Such missing test coverage can and will result in potential degraded software performance quality, thereby adding to project cost and delaying schedule. It can also result in far more disastrous consequences for the warfighters. This paper will discuss the general nature of the hardware-failure-software anomaly - system failure flow-down. It will then describe techniques that exist for system software testing and will highlight extensions of these techniques to focus on an effective and comprehensive software testing that includes performance prediction and hardware failure fault tolerance. The end result is a suite of test methods that, when properly applied, offer a systematic and comprehensive analysis of prime software behaviors under a range of hardware field failure conditions.","PeriodicalId":142978,"journal":{"name":"2012 IEEE AUTOTESTCON Proceedings","volume":"29 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2012-10-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2012 IEEE AUTOTESTCON Proceedings","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/AUTEST.2012.6334582","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 2
Abstract
Today's military systems rely for their performance on combinations of hardware and software. While testing of hardware performance during design, development and operation is well understood, the testing of software is less mature. In particular, the effect of hardware failures in the field on software performance, and therefore systems performance, is all-too-often overlooked or is tested in a far less rigorous manner that that applied to Hardware failures alone. Numerous examples exist of major system failures driven by software anomalies but triggered by Hardware failures, with consequences that range from degraded mission performance to weapons system destruction and operator fatalities. Measuring software development quality and fault tolerance is a challenging task. Many software test methods focus on source-code only approach (unit tests, modular test) and neglect the impacts caused by hardware anomalies or failures. Such missing test coverage can and will result in potential degraded software performance quality, thereby adding to project cost and delaying schedule. It can also result in far more disastrous consequences for the warfighters. This paper will discuss the general nature of the hardware-failure-software anomaly - system failure flow-down. It will then describe techniques that exist for system software testing and will highlight extensions of these techniques to focus on an effective and comprehensive software testing that includes performance prediction and hardware failure fault tolerance. The end result is a suite of test methods that, when properly applied, offer a systematic and comprehensive analysis of prime software behaviors under a range of hardware field failure conditions.