Experience report: how is dynamic symbolic execution different from manual testing? a study on KLEE

Xiaoyin Wang, Lingming Zhang, Philip Tanofsky
{"title":"Experience report: how is dynamic symbolic execution different from manual testing? a study on KLEE","authors":"Xiaoyin Wang, Lingming Zhang, Philip Tanofsky","doi":"10.1145/2771783.2771818","DOIUrl":null,"url":null,"abstract":"Software testing has been the major approach to software quality assurance for decades, but it typically involves intensive manual efforts. To reduce manual efforts, researchers have proposed numerous approaches to automate test-case generation, which is one of the most time-consuming tasks in software testing. One most recent achievement in the area is Dynamic Symbolic Execution (DSE), and tools based on DSE, such as KLEE, have been reported to generate test suites achieving higher code coverage than manually developed test suites. However, besides the competitive code coverage, there have been few studies to compare DSE-based test suites with manually developed test suites more thoroughly on various metrics to understand the detailed differences between the two testing methodologies. In this paper, we revisit the experimental study on the KLEE tool and GNU CoreUtils programs, and compare KLEE-based test suites with manually developed test suites on various aspects. We further carried out a qualitative study to investigates the reasons behind the differences in statistical results. The results of our studies show that while KLEE-based test suites are able to generate test cases with higher code coverage, they are relatively less effective on covering hard-to-cover code and killing mutants. Furthermore, our qualitative study reveals that KLEE-based test suites have advantages in exploring error-handling code and exhausting options, but are less effective on generating valid string inputs and exploring meaningful program behaviors.","PeriodicalId":264859,"journal":{"name":"Proceedings of the 2015 International Symposium on Software Testing and Analysis","volume":"43 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2015-07-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"37","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 2015 International Symposium on Software Testing and Analysis","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2771783.2771818","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 37

Abstract

Software testing has been the major approach to software quality assurance for decades, but it typically involves intensive manual effort. To reduce this effort, researchers have proposed numerous approaches to automate test-case generation, one of the most time-consuming tasks in software testing. One of the most recent achievements in the area is Dynamic Symbolic Execution (DSE), and tools based on DSE, such as KLEE, have been reported to generate test suites achieving higher code coverage than manually developed test suites. However, beyond competitive code coverage, there have been few studies that compare DSE-based test suites with manually developed test suites more thoroughly, on various metrics, to understand the detailed differences between the two testing methodologies. In this paper, we revisit the experimental study of the KLEE tool on the GNU CoreUtils programs, and compare KLEE-based test suites with manually developed test suites on various aspects. We further carry out a qualitative study to investigate the reasons behind the differences in the statistical results. The results of our studies show that while KLEE-based test suites achieve higher code coverage, they are relatively less effective at covering hard-to-cover code and killing mutants. Furthermore, our qualitative study reveals that KLEE-based test suites have advantages in exploring error-handling code and exhausting options, but are less effective at generating valid string inputs and exploring meaningful program behaviors.
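To make the DSE workflow concrete, below is a minimal sketch of a KLEE driver; it is not taken from the paper. The parse_mode function is a hypothetical stand-in for CoreUtils-style option handling, while klee_make_symbolic and klee_assume are KLEE's documented intrinsics for marking and constraining symbolic input.

```c
/* Minimal sketch of a KLEE driver (not from the paper). parse_mode is a
 * hypothetical stand-in for CoreUtils-style option handling. */
#include <klee/klee.h>

/* Hypothetical option parser: two valid options plus two error paths. */
int parse_mode(const char *arg) {
    if (arg[0] != '-')
        return -1;               /* error path: not an option */
    switch (arg[1]) {
    case 'a': return 0;
    case 'l': return 1;
    default:  return -2;         /* error path: unknown option */
    }
}

int main(void) {
    char arg[3];
    klee_make_symbolic(arg, sizeof(arg), "arg");
    klee_assume(arg[2] == '\0'); /* constrain input to a short C string */
    return parse_mode(arg);      /* KLEE forks one path per feasible branch */
}
```

Compiled to LLVM bitcode (e.g. clang -emit-llvm -c -g driver.c) and run with klee driver.bc, each feasible path above, including both error branches, yields one generated test case. This mechanical path enumeration is what makes DSE strong at exhausting options and error-handling code, yet indifferent to whether an input is semantically meaningful.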
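The reported gap between code coverage and mutant killing can also be seen in a small hypothetical example, not drawn from the paper's subjects: a suite may achieve full branch coverage and still never observe a behavioral difference from a mutant, because operator mutants like the one below only misbehave on boundary inputs.

```c
/* Hypothetical illustration (not from the paper): full branch coverage
 * does not imply the '>=' -> '>' mutant below is killed. */
#include <assert.h>

int passes(int score) {
    if (score >= 60)   /* mutant: replace '>=' with '>' */
        return 1;
    return 0;
}

int main(void) {
    assert(passes(90) == 1); /* covers the true branch; mutant agrees  */
    assert(passes(10) == 0); /* covers the false branch; mutant agrees */
    /* Only the boundary input distinguishes original from mutant: */
    assert(passes(60) == 1); /* fails on the '>' mutant, killing it */
    return 0;
}
```

Inputs like 90 and 10 are exactly what a coverage-directed generator needs to satisfy both branches, whereas a human tester who knows "60 is a pass" naturally probes the boundary, which is consistent with the paper's finding that manual suites kill more mutants at comparable coverage.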