Dominik Holling, Sebastian Banescu, Marco Probst, A. Petrovska, A. Pretschner
{"title":"Nequivack: Assessing Mutation Score Confidence","authors":"Dominik Holling, Sebastian Banescu, Marco Probst, A. Petrovska, A. Pretschner","doi":"10.1109/ICSTW.2016.29","DOIUrl":null,"url":null,"abstract":"The mutation score is defined as the number of killed mutants divided by the number of non-equivalent mutants. However, whether a mutant is equivalent to the original program is undecidable in general. Thus, even when improving a test suite, a mutant score assessing this test suite may become worse during the development of a system, because of equivalent mutants introduced during mutant creation. This is a fundamental problem. Using static analysis and symbolic execution, we show how to establish non-equivalence or \"don't know\" among mutants. If the number of don't knows is small, this is a good indicator that a computed mutation score actually reflects its above definition. We can therefore have an increased confidence that mutation score trends correspond to actual improvements of a test suite's quality, and are not overly polluted by equivalent mutants. Using a set of 14 representative unit size programs, we show that for some, but not all, of these programs, the above confidence can indeed be established. We also evaluate the reproducibility, efficiency and effectiveness of our Nequivack tool. Our findings are that reproducibility is completely given. A single mutant analysis can be performed within 3 seconds on average, which is efficient for practical and industrial applications.","PeriodicalId":335145,"journal":{"name":"2016 IEEE Ninth International Conference on Software Testing, Verification and Validation Workshops (ICSTW)","volume":"52 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2016-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"11","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2016 IEEE Ninth International Conference on Software Testing, Verification and Validation Workshops (ICSTW)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICSTW.2016.29","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 11
Abstract
The mutation score is defined as the number of killed mutants divided by the number of non-equivalent mutants. However, whether a mutant is equivalent to the original program is undecidable in general. Thus, even when improving a test suite, a mutant score assessing this test suite may become worse during the development of a system, because of equivalent mutants introduced during mutant creation. This is a fundamental problem. Using static analysis and symbolic execution, we show how to establish non-equivalence or "don't know" among mutants. If the number of don't knows is small, this is a good indicator that a computed mutation score actually reflects its above definition. We can therefore have an increased confidence that mutation score trends correspond to actual improvements of a test suite's quality, and are not overly polluted by equivalent mutants. Using a set of 14 representative unit size programs, we show that for some, but not all, of these programs, the above confidence can indeed be established. We also evaluate the reproducibility, efficiency and effectiveness of our Nequivack tool. Our findings are that reproducibility is completely given. A single mutant analysis can be performed within 3 seconds on average, which is efficient for practical and industrial applications.