Test Set Diameter: Quantifying the Diversity of Sets of Test Cases

2016 IEEE International Conference on Software Testing, Verification and Validation (ICST) | Pub Date: 2015-06-10 | DOI: 10.1109/ICST.2016.33

R. Feldt, Simon M. Poulding, D. Clark, S. Yoo


Citations: 106

Abstract


Test Set Diameter: Quantifying the Diversity of Sets of Test Cases

A common and natural intuition among software testers is that test cases need to differ if a software system is to be tested properly and its quality ensured. Consequently, much research has gone into formulating distance measures for how test cases, their inputs and/or their outputs differ. However, common to these proposals is that they are data type specific and/or calculate the diversity only between pairs of test inputs, traces or outputs. We propose a new metric to measure the diversity of sets of tests: the test set diameter (TSDm). It extends our earlier, pairwise test diversity metrics based on recent advances in information theory regarding the calculation of the normalized compression distance (NCD) for multisets. A key advantage is that TSDm is a universal measure of diversity and so can be applied to any test set regardless of data type of the test inputs (and, moreover, to other test-related data such as execution traces). But this universality comes at the cost of greater computational effort compared to competing approaches. Our experiments on four different systems show that the test set diameter can help select test sets with higher structural and fault coverage than random selection even when only applied to test inputs. This can enable early test design and selection, prior to even having a software system to test, and complement other types of test automation and analysis. We argue that this quantification of test set diversity creates a number of opportunities to better understand software quality and provides practical ways to increase it.
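The abstract does not state the definition of TSDm, so the following is only a minimal sketch of the idea under stated assumptions. It assumes the multiset NCD takes the form NCD1(X) = (C(X) - min_x C(x)) / max_x C(X \ {x}), where C is the compressed length, and it omits any refinement over subsets. It also assumes `zlib` as the compressor and plain byte concatenation as the encoding of a test set. The function names `compressed_size`, `ncd1` and `greedy_select` are hypothetical, and the greedy selection loop is an illustration of diversity-driven selection, not the procedure used in the paper's experiments.

```python
import zlib
from typing import List, Sequence


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed data (stand-in for C(x))."""
    return len(zlib.compress(data, 9))


def ncd1(items: Sequence[bytes]) -> float:
    """Assumed NCD1 form for a multiset of byte strings.

    (C(X) - min_x C(x)) / max_x C(X without x), with X encoded by
    concatenating its members in the given order.
    """
    items = list(items)
    if len(items) < 2:
        raise ValueError("need at least two items")
    whole = compressed_size(b"".join(items))
    smallest = min(compressed_size(x) for x in items)
    largest_rest = max(
        compressed_size(b"".join(items[:i] + items[i + 1:]))
        for i in range(len(items))
    )
    return (whole - smallest) / largest_rest


def greedy_select(candidates: Sequence[bytes], k: int) -> List[int]:
    """Illustrative greedy selection of k candidate indices.

    Seeds with the candidate of largest compressed size (an arbitrary
    choice for this sketch), then repeatedly adds the candidate that
    maximizes ncd1 of the set chosen so far.
    """
    if not 1 <= k <= len(candidates):
        raise ValueError("k must be between 1 and len(candidates)")
    remaining = list(range(len(candidates)))
    seed = max(remaining, key=lambda i: compressed_size(candidates[i]))
    chosen = [seed]
    remaining.remove(seed)
    while len(chosen) < k:
        current = [candidates[j] for j in chosen]
        best = max(remaining, key=lambda i: ncd1(current + [candidates[i]]))
        chosen.append(best)
        remaining.remove(best)
    return chosen


if __name__ == "__main__":
    # Test inputs are treated as raw bytes, so any data type that can be
    # serialized could be handled the same way.
    tests = [b"sort [1, 2, 3]", b"sort [1, 2, 4]", b"sort []", b"sort ['x', 'y']"]
    print(ncd1(tests))
    print(greedy_select(tests, 2))
```

Each candidate evaluation recompresses the whole set, which gives a feel for the computational cost the abstract mentions. With very short inputs such as those in the usage example, compressor overhead dominates, so the values are only indicative.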
