使用故障注入的有效测试恢复代码

IF 1.8 4区计算机科学 Q2 COMPUTER SCIENCE, THEORY & METHODS

ACM Transactions on Computer Systems Pub Date : 2011-12-01 DOI:10.1145/2063509.2063511

P. Marinescu, George Candea

{"title":"使用故障注入的有效测试恢复代码","authors":"P. Marinescu, George Candea","doi":"10.1145/2063509.2063511","DOIUrl":null,"url":null,"abstract":"A critical part of developing a reliable software system is testing its recovery code. This code is traditionally difficult to test in the lab, and, in the field, it rarely gets to run; yet, when it does run, it must execute flawlessly in order to recover the system from failure. In this article, we present a library-level fault injection engine that enables the productive use of fault injection for software testing. We describe automated techniques for reliably identifying errors that applications may encounter when interacting with their environment, for automatically identifying high-value injection targets in program binaries, and for producing efficient injection test scenarios. We present a framework for writing precise triggers that inject desired faults, in the form of error return codes and corresponding side effects, at the boundary between applications and libraries. These techniques are embodied in LFI, a new fault injection engine we are distributing http://lfi.epfl.ch. This article includes a report of our initial experience using LFI. Most notably, LFI found 12 serious, previously unreported bugs in the MySQL database server, Git version control system, BIND name server, Pidgin IM client, and PBFT replication system with no developer assistance and no access to source code. LFI also increased recovery-code coverage from virtually zero up to 60% entirely automatically without requiring new tests or human involvement.","PeriodicalId":50918,"journal":{"name":"ACM Transactions on Computer Systems","volume":"8 1","pages":"11:1-11:38"},"PeriodicalIF":1.8000,"publicationDate":"2011-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"54","resultStr":"{\"title\":\"Efficient Testing of Recovery Code Using Fault Injection\",\"authors\":\"P. Marinescu, George Candea\",\"doi\":\"10.1145/2063509.2063511\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"A critical part of developing a reliable software system is testing its recovery code. This code is traditionally difficult to test in the lab, and, in the field, it rarely gets to run; yet, when it does run, it must execute flawlessly in order to recover the system from failure. In this article, we present a library-level fault injection engine that enables the productive use of fault injection for software testing. We describe automated techniques for reliably identifying errors that applications may encounter when interacting with their environment, for automatically identifying high-value injection targets in program binaries, and for producing efficient injection test scenarios. We present a framework for writing precise triggers that inject desired faults, in the form of error return codes and corresponding side effects, at the boundary between applications and libraries. These techniques are embodied in LFI, a new fault injection engine we are distributing http://lfi.epfl.ch. This article includes a report of our initial experience using LFI. Most notably, LFI found 12 serious, previously unreported bugs in the MySQL database server, Git version control system, BIND name server, Pidgin IM client, and PBFT replication system with no developer assistance and no access to source code. LFI also increased recovery-code coverage from virtually zero up to 60% entirely automatically without requiring new tests or human involvement.\",\"PeriodicalId\":50918,\"journal\":{\"name\":\"ACM Transactions on Computer Systems\",\"volume\":\"8 1\",\"pages\":\"11:1-11:38\"},\"PeriodicalIF\":1.8000,\"publicationDate\":\"2011-12-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"54\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"ACM Transactions on Computer Systems\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://doi.org/10.1145/2063509.2063511\",\"RegionNum\":4,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"COMPUTER SCIENCE, THEORY & METHODS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACM Transactions on Computer Systems","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1145/2063509.2063511","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, THEORY & METHODS","Score":null,"Total":0}

引用次数: 54

摘要

开发可靠的软件系统的关键部分是测试其恢复代码。这段代码通常很难在实验室中进行测试，并且在现场很少运行;然而，当它运行时，它必须完美地执行，以便从故障中恢复系统。在本文中，我们提供了一个库级故障注入引擎，它支持在软件测试中高效地使用故障注入。我们描述了自动化技术，用于可靠地识别应用程序在与其环境交互时可能遇到的错误，用于自动识别程序二进制文件中的高值注入目标，以及用于生成有效的注入测试场景。我们提出了一个框架，用于编写精确的触发器，以错误返回码和相应的副作用的形式在应用程序和库之间的边界注入所需的错误。这些技术体现在LFI中，这是我们正在发布的一种新的故障注入引擎http://lfi.epfl.ch。本文包括我们使用LFI的初步经验报告。最值得注意的是，LFI在MySQL数据库服务器，Git版本控制系统，BIND名称服务器，Pidgin IM客户端和PBFT复制系统中发现了12个严重的，以前未报告的错误，没有开发人员的帮助，也没有访问源代码。LFI还完全自动地将恢复代码覆盖率从几乎为零提高到60%，而不需要新的测试或人工参与。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Efficient Testing of Recovery Code Using Fault Injection

A critical part of developing a reliable software system is testing its recovery code. This code is traditionally difficult to test in the lab, and, in the field, it rarely gets to run; yet, when it does run, it must execute flawlessly in order to recover the system from failure. In this article, we present a library-level fault injection engine that enables the productive use of fault injection for software testing. We describe automated techniques for reliably identifying errors that applications may encounter when interacting with their environment, for automatically identifying high-value injection targets in program binaries, and for producing efficient injection test scenarios. We present a framework for writing precise triggers that inject desired faults, in the form of error return codes and corresponding side effects, at the boundary between applications and libraries. These techniques are embodied in LFI, a new fault injection engine we are distributing http://lfi.epfl.ch. This article includes a report of our initial experience using LFI. Most notably, LFI found 12 serious, previously unreported bugs in the MySQL database server, Git version control system, BIND name server, Pidgin IM client, and PBFT replication system with no developer assistance and no access to source code. LFI also increased recovery-code coverage from virtually zero up to 60% entirely automatically without requiring new tests or human involvement.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

ACM Transactions on Computer Systems 工程技术-计算机：理论方法

CiteScore

4.00

自引率

0.00%

发文量

审稿时长

1 months

期刊介绍： ACM Transactions on Computer Systems (TOCS) presents research and development results on the design, implementation, analysis, evaluation, and use of computer systems and systems software. The term "computer systems" is interpreted broadly and includes operating systems, systems architecture and hardware, distributed systems, optimizing compilers, and the interaction between systems and computer networks. Articles appearing in TOCS will tend either to present new techniques and concepts, or to report on experiences and experiments with actual systems. Insights useful to system designers, builders, and users will be emphasized. TOCS publishes research and technical papers, both short and long. It includes technical correspondence to permit commentary on technical topics and on previously published papers.