基于动态执行数据的自动化软件故障检测的机器学习技术:实证评估研究

Proceedings of the 6th International Conference on Engineering & MIS 2020 Pub Date : 2020-09-14 DOI:10.1145/3410352.3410747

Rafig Almaghairbe, M. Roper, T. Almabruk

{"title":"基于动态执行数据的自动化软件故障检测的机器学习技术:实证评估研究","authors":"Rafig Almaghairbe, M. Roper, T. Almabruk","doi":"10.1145/3410352.3410747","DOIUrl":null,"url":null,"abstract":"The biggest obstacle of automated software testing is the construction of test oracles. Today, it is possible to generate enormous amount of test cases for an arbitrary system that reach a remarkably high level of coverage, but the effectiveness of test cases is limited by the availability of test oracles that can distinguish failing executions. Previous work by the authors has explored the use of unsupervised and semi-supervised learning techniques to develop test oracles so that the correctness of software outputs and behaviours on new test cases can be predicated [1], [2], [10], and experimental results demonstrate the promise of this approach. In this paper, we present an evaluation study for test oracles based on machine-learning approaches via dynamic execution data (firstly, input/output pairs and secondly, amalgamations of input/output pairs and execution traces) by comparing their effectiveness with existing techniques from the specification mining domain (the data invariant detector Daikon [5]). The two approaches are evaluated on a range of mid-sized systems and compared in terms of their fault detection ability and false positive rate. The empirical study also discuss the major limitations and the most important properties related to the application of machine learning techniques as test oracles in practice. The study also gives a road map for further research direction in order to tackle some of discussed limitations such as accuracy and scalability. The results show that in most cases semi-supervised learning techniques performed far better as an automated test classifier than Daikon (especially in the case that input/output pairs were augmented with their execution traces). However, there is one system for which our strategy struggles and Daikon performed far better. Furthermore, unsupervised learning techniques performed on a par when compared with Daikon in several cases particularly when input/output pairs were used together with execution traces.","PeriodicalId":178037,"journal":{"name":"Proceedings of the 6th International Conference on Engineering & MIS 2020","volume":"34 14","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-09-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Machine Learning Techniques for Automated Software Fault Detection via Dynamic Execution Data: Empirical Evaluation Study\",\"authors\":\"Rafig Almaghairbe, M. Roper, T. Almabruk\",\"doi\":\"10.1145/3410352.3410747\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The biggest obstacle of automated software testing is the construction of test oracles. Today, it is possible to generate enormous amount of test cases for an arbitrary system that reach a remarkably high level of coverage, but the effectiveness of test cases is limited by the availability of test oracles that can distinguish failing executions. Previous work by the authors has explored the use of unsupervised and semi-supervised learning techniques to develop test oracles so that the correctness of software outputs and behaviours on new test cases can be predicated [1], [2], [10], and experimental results demonstrate the promise of this approach. In this paper, we present an evaluation study for test oracles based on machine-learning approaches via dynamic execution data (firstly, input/output pairs and secondly, amalgamations of input/output pairs and execution traces) by comparing their effectiveness with existing techniques from the specification mining domain (the data invariant detector Daikon [5]). The two approaches are evaluated on a range of mid-sized systems and compared in terms of their fault detection ability and false positive rate. The empirical study also discuss the major limitations and the most important properties related to the application of machine learning techniques as test oracles in practice. The study also gives a road map for further research direction in order to tackle some of discussed limitations such as accuracy and scalability. The results show that in most cases semi-supervised learning techniques performed far better as an automated test classifier than Daikon (especially in the case that input/output pairs were augmented with their execution traces). However, there is one system for which our strategy struggles and Daikon performed far better. Furthermore, unsupervised learning techniques performed on a par when compared with Daikon in several cases particularly when input/output pairs were used together with execution traces.\",\"PeriodicalId\":178037,\"journal\":{\"name\":\"Proceedings of the 6th International Conference on Engineering & MIS 2020\",\"volume\":\"34 14\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2020-09-14\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 6th International Conference on Engineering & MIS 2020\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3410352.3410747\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 6th International Conference on Engineering & MIS 2020","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3410352.3410747","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 0

摘要

自动化软件测试的最大障碍是测试oracle的构建。今天，可以为任意系统生成大量的测试用例，达到非常高的覆盖率，但是测试用例的有效性受到可以区分失败执行的测试oracle的可用性的限制。作者以前的工作已经探索了使用无监督和半监督学习技术来开发测试预言机，以便可以预测新测试用例上软件输出和行为的正确性[1]，[2]，[10]，实验结果证明了这种方法的前景。在本文中，我们通过动态执行数据(首先，输入/输出对，其次，输入/输出对和执行轨迹的合并)，通过将其有效性与规范挖掘领域(数据不变量检测器Daikon[5])的现有技术进行比较，提出了基于机器学习方法的测试预言机评估研究。在一系列中型系统上对这两种方法进行了评估，并比较了它们的故障检测能力和误报率。实证研究还讨论了机器学习技术在实践中作为测试预言机应用的主要限制和最重要的特性。该研究还为进一步的研究方向提供了路线图，以解决所讨论的一些限制，如准确性和可扩展性。结果表明，在大多数情况下，半监督学习技术作为自动化测试分类器的表现要比Daikon好得多(特别是在输入/输出对被其执行轨迹增强的情况下)。然而，有一个系统，我们的战略斗争和Daikon表现得更好。此外，在一些情况下，与Daikon相比，无监督学习技术的表现相当，特别是当输入/输出对与执行跟踪一起使用时。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Machine Learning Techniques for Automated Software Fault Detection via Dynamic Execution Data: Empirical Evaluation Study

The biggest obstacle of automated software testing is the construction of test oracles. Today, it is possible to generate enormous amount of test cases for an arbitrary system that reach a remarkably high level of coverage, but the effectiveness of test cases is limited by the availability of test oracles that can distinguish failing executions. Previous work by the authors has explored the use of unsupervised and semi-supervised learning techniques to develop test oracles so that the correctness of software outputs and behaviours on new test cases can be predicated [1], [2], [10], and experimental results demonstrate the promise of this approach. In this paper, we present an evaluation study for test oracles based on machine-learning approaches via dynamic execution data (firstly, input/output pairs and secondly, amalgamations of input/output pairs and execution traces) by comparing their effectiveness with existing techniques from the specification mining domain (the data invariant detector Daikon [5]). The two approaches are evaluated on a range of mid-sized systems and compared in terms of their fault detection ability and false positive rate. The empirical study also discuss the major limitations and the most important properties related to the application of machine learning techniques as test oracles in practice. The study also gives a road map for further research direction in order to tackle some of discussed limitations such as accuracy and scalability. The results show that in most cases semi-supervised learning techniques performed far better as an automated test classifier than Daikon (especially in the case that input/output pairs were augmented with their execution traces). However, there is one system for which our strategy struggles and Daikon performed far better. Furthermore, unsupervised learning techniques performed on a par when compared with Daikon in several cases particularly when input/output pairs were used together with execution traces.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Proceedings of the 6th International Conference on Engineering & MIS 2020

自引率

0.00%

发文量