{"title":"On the efficiency of automated testing","authors":"Marcel Böhme, Soumya Paul","doi":"10.1145/2635868.2635923","DOIUrl":null,"url":null,"abstract":"The aim of automated program testing is to gain confidence about a program's correctness by sampling its input space. The sampling process can be either systematic or random. For every systematic testing technique the sampling is informed by the analysis of some program artefacts, like the specification, the source code (e.g., to achieve coverage), or even faulty versions of the program (e.g., mutation testing). This analysis incurs some cost. In contrast, random testing is unsystematic and does not sustain any analysis cost. In this paper, we investigate the theoretical efficiency of systematic versus random testing. First, we mathematically model the most effective systematic testing technique S_0 in which every sampled test input strictly increases the \"degree of confidence\" and is subject to the analysis cost c. Note that the efficiency of S_0 depends on c. Specifically, if we increase c, we also increase the time it takes S_0 to establish the same degree of confidence. So, there exists a maximum analysis cost beyond which R is generally more efficient than S_0. Given that we require the confidence that the program works correctly for x% of its input, we prove an upper bound on c of S_0, beyond which R is more efficient on the average. We also show that this bound depends asymptotically only on x. For instance, let R take 10ms time to sample one test input; to establish that the program works correctly for 90% of its input, S_0 must take less than 41ms to sample one test input. Otherwise, R is expected to establish the 90%-degree of confidence earlier. We prove similar bounds on the cost if the software tester is interested in revealing as many errors as possible in a given time span.","PeriodicalId":250543,"journal":{"name":"Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering","volume":"35 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2014-11-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"23","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2635868.2635923","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 23
Abstract
The aim of automated program testing is to gain confidence about a program's correctness by sampling its input space. The sampling process can be either systematic or random. For every systematic testing technique, the sampling is informed by the analysis of some program artefacts, like the specification, the source code (e.g., to achieve coverage), or even faulty versions of the program (e.g., mutation testing). This analysis incurs some cost. In contrast, random testing (R) is unsystematic and does not incur any analysis cost. In this paper, we investigate the theoretical efficiency of systematic versus random testing. First, we mathematically model the most effective systematic testing technique S_0, in which every sampled test input strictly increases the "degree of confidence" and is subject to the analysis cost c. Note that the efficiency of S_0 depends on c: if we increase c, we also increase the time it takes S_0 to establish the same degree of confidence. So there exists a maximum analysis cost beyond which R is generally more efficient than S_0. Given that we require confidence that the program works correctly for x% of its input, we prove an upper bound on the cost c of S_0 beyond which R is more efficient on average. We also show that this bound depends asymptotically only on x. For instance, suppose R takes 10ms to sample one test input; to establish that the program works correctly for 90% of its input, S_0 must take less than 41ms to sample one test input. Otherwise, R is expected to establish the 90% degree of confidence earlier. We prove similar bounds on the cost when the software tester is interested in revealing as many errors as possible in a given time span.
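To make the trade-off concrete, here is a minimal simulation sketch in Python. It assumes a toy model, not the paper's actual formalization: the input space splits into N equally likely, homogeneous partitions; the x% degree of confidence is reached once x% of the partitions have been exercised; S_0 covers one new partition per test at an extra analysis cost c; and R draws partitions uniformly at random (a coupon-collector process). The values of N, the trial count, and the break-even cost the script reports are illustrative assumptions and will not reproduce the paper's exact 41ms bound; the sketch only shows the qualitative effect that a growing c eventually makes R the more efficient choice.

```python
import math
import random

# Toy model (illustrative assumption, not the paper's exact formalization):
# the input space is split into N equally likely, homogeneous partitions, and
# the x% degree of confidence is reached once x% of the partitions were sampled.
N = 1_000    # number of input partitions (hypothetical)
x = 0.90     # required degree of confidence
t_R = 10.0   # ms per test input for random testing R (figure from the abstract)

def time_systematic(c, n=N, target=x, t_sample=t_R):
    """S_0 covers one new partition per test, paying analysis cost c each time."""
    tests_needed = math.ceil(target * n)
    return tests_needed * (t_sample + c)

def time_random(n=N, target=x, t_sample=t_R, trials=200):
    """Average time for R to hit x% of the partitions (simulated coupon collector)."""
    total = 0.0
    for _ in range(trials):
        seen, tests = set(), 0
        while len(seen) < target * n:
            seen.add(random.randrange(n))
            tests += 1
        total += tests * t_sample
    return total / trials

if __name__ == "__main__":
    t_r = time_random()
    for c in (5, 15, 30, 60):   # candidate per-test analysis costs in ms
        t_s = time_systematic(c)
        winner = "S_0" if t_s < t_r else "R"
        print(f"c = {c:3d} ms: S_0 needs {t_s:8.0f} ms, R needs ~{t_r:8.0f} ms -> {winner} wins")
```

Running the script with the listed values of c shows S_0 winning for small analysis costs and R winning once c exceeds the model's break-even point, mirroring the qualitative claim in the abstract.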