A large-scale empirical comparison of static and dynamic test case prioritization techniques

Proceedings of the 2016 24th ACM SIGSOFT International Symposium on Foundations of Software Engineering
Pub Date: 2016-11-01
DOI: 10.1145/2950290.2950344

Qi Luo, Kevin Moran, D. Poshyvanyk


Citations: 62

Abstract

The large body of existing research in Test Case Prioritization (TCP) techniques can be broadly classified into two categories: dynamic techniques (which rely on run-time execution information) and static techniques (which operate directly on source and test code). Absent from this body of work is a comprehensive study aimed at understanding and evaluating the static approaches and comparing them to dynamic approaches on a large set of projects. In this work, we perform the first extensive study aimed at empirically evaluating four static TCP techniques and comparing them with state-of-research dynamic TCP techniques at different test-case granularities (e.g., method and class level) in terms of effectiveness, efficiency, and similarity of faults detected. This study was performed on 30 real-world Java programs encompassing 431 KLoC. In terms of effectiveness, we find that the static call-graph-based technique outperforms the other static techniques at the test-class level, but the topic-model-based technique performs better at the test-method level. In terms of efficiency, the static call-graph-based technique is also the most efficient of the static techniques. When examining the similarity of faults detected by the four static techniques compared to the four dynamic ones, we find that, on average, the faults uncovered by these two groups of techniques are quite dissimilar, with the top 10% of test cases agreeing on only 25% to 30% of detected faults. This prompts further research into the severity and importance of faults uncovered by these techniques, and into the potential for combining static and dynamic information for more effective approaches.
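The abstract does not say how the agreement between two techniques at the top 10% of test cases is computed. As a rough illustration only, the sketch below assumes agreement is measured as the Jaccard similarity of the fault sets detected by the first N% of each prioritized order. The function names, the `detects` mapping, and the sample data are all hypothetical and are not taken from the paper.

```python
def faults_in_prefix(order, detects, percent=10):
    """Return the faults detected by the first `percent`% of a prioritized order.

    `order` is a list of test names, highest priority first.
    `detects` maps a test name to the set of faults it reveals (hypothetical data).
    """
    # Integer ceiling of len(order) * percent / 100, with at least one test kept.
    cutoff = max(1, (len(order) * percent + 99) // 100)
    found = set()
    for test in order[:cutoff]:
        found |= detects.get(test, set())
    return found


def jaccard(a, b):
    """Jaccard similarity of two sets; treated as 1.0 when both are empty."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Hypothetical fault matrix for ten tests; tests not listed reveal no fault.
detects = {
    "t1": {"f1"},
    "t3": {"f1", "f2"},
    "t5": {"f2", "f3"},
    "t7": {"f4"},
}

# Two hypothetical prioritized orders, e.g. one static and one dynamic technique.
static_order = ["t3", "t1", "t2", "t4", "t5", "t6", "t7", "t8", "t9", "t10"]
dynamic_order = ["t5", "t3", "t7", "t1", "t2", "t4", "t6", "t8", "t9", "t10"]

static_faults = faults_in_prefix(static_order, detects, percent=20)
dynamic_faults = faults_in_prefix(dynamic_order, detects, percent=20)
similarity = jaccard(static_faults, dynamic_faults)
```

With this toy data the top 20% of each order is two tests, so the static prefix finds `f1` and `f2` while the dynamic prefix finds `f1`, `f2`, and `f3`. A low similarity value under such a measure would mean the two techniques surface largely different faults early in the run.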
