An Empirical Comparison of Similarity Measures for Abstract Test Case Prioritization

2017 IEEE 41st Annual Computer Software and Applications Conference (COMPSAC), pp. 3-12. Pub Date: 2017-07-01. DOI: 10.1109/COMPSAC.2017.271

Rubing Huang, Yunan Zhou, Weiwen Zong, D. Towey, Jinfu Chen


Citations: 11

Abstract

Test case prioritization (TCP) attempts to order test cases so that those which are more important, according to some criterion or measurement, are executed earlier. TCP has been applied in many testing situations, including regression testing. An abstract test case (also called a model input) is an important type of test case that has been widely used in practice, for example in configurable systems and software product lines. Similarity-based test case prioritization (STCP) has been shown to be cost-effective for abstract test cases (ATCs). However, because many similarity measures could be used to evaluate ATCs and to support STCP, we face the following question: how can we choose the similarity measure(s) for prioritizing ATCs that will deliver the most effective results? To address this, we studied fourteen measures and two popular STCP algorithms: local STCP (LSTCP) and global STCP (GSTCP). We also conducted an empirical study of five real-world programs, and investigated the efficacy of each similarity measure according to the interaction coverage rate and the fault detection rate. The results of these studies show that GSTCP outperforms LSTCP in 61% to 84% of the cases in terms of interaction coverage rates, and in 76% to 78% of the cases with respect to fault detection rates. Our studies also show that Overlap, the simplest similarity measure examined in this study, could obtain the overall best performance for LSTCP, and that Goodall3 has the best performance for GSTCP.
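To make the idea of similarity-based prioritization concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes that an abstract test case is a tuple of parameter values and that the Overlap measure is the fraction of parameters on which two test cases agree (a common definition; the abstract itself does not spell it out). The greedy ordering in `prioritize` is likewise only an illustrative assumption: it starts with the most dissimilar pair and then repeatedly appends the remaining test case with the largest total distance to those already chosen. The names `overlap_similarity` and `prioritize` are hypothetical.

```python
from itertools import combinations


def overlap_similarity(a, b):
    """Fraction of parameter positions on which two abstract test cases agree.

    Assumed definition of the Overlap measure for categorical values.
    """
    if len(a) != len(b):
        raise ValueError("test cases must have the same number of parameters")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def prioritize(tests):
    """Illustrative greedy ordering that favours dissimilar test cases.

    This is a sketch under stated assumptions, not the LSTCP or GSTCP
    algorithm as defined in the paper.
    """
    if len(tests) < 2:
        return list(tests)

    def dist(i, j):
        # Distance is taken as 1 minus similarity.
        return 1.0 - overlap_similarity(tests[i], tests[j])

    # Seed the ordering with the most dissimilar pair.
    first, second = max(combinations(range(len(tests)), 2),
                        key=lambda pair: dist(*pair))
    order = [first, second]
    remaining = set(range(len(tests))) - {first, second}

    # Repeatedly pick the candidate farthest (in total) from the chosen ones.
    while remaining:
        nxt = max(sorted(remaining),
                  key=lambda i: sum(dist(i, j) for j in order))
        order.append(nxt)
        remaining.remove(nxt)
    return [tests[i] for i in order]


if __name__ == "__main__":
    suite = [("a", "x", "1"), ("a", "x", "2"), ("b", "y", "2"), ("a", "y", "1")]
    for test_case in prioritize(suite):
        print(test_case)
```

Swapping `overlap_similarity` for another measure leaves the ordering loop unchanged, which is what makes a side-by-side comparison of measures of the kind described above straightforward to set up.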


