Title: An Empirical Analysis of Blind Tests
Authors: Kesina Baral, Jeff Offutt
DOI: 10.1109/icst46399.2020.00034
Venue: 2020 IEEE 13th International Conference on Software Testing, Validation and Verification (ICST)
Published: 2020-10-01
Citations: 5
Abstract
Modern software engineers automate as many tests as possible. Test automation allows tests to be run hundreds or thousands of times: hourly, daily, and sometimes continuously. This saves time and money, ensures reproducibility, and ultimately leads to software that is better and cheaper. Automated tests must include code to check that the output of the program on the test matches expected behavior. This code is called the test oracle and is typically implemented in assertions that flag the test as passing if the assertion evaluates to true and failing if not. Since automated tests require programming, many problems can occur. Some lead to false positives, where incorrect behavior is marked as correct, and others to false negatives, where correct behavior is marked as incorrect. This paper identifies and studies a common problem where test assertions are written incorrectly, leading to incorrect behavior that is not recognized. We call these tests blind because the test does not see the incorrect behavior. Blind tests cause false positives, essentially wasting the tests. This paper presents results from several human-based studies to assess the frequency of blind tests with different software and different populations of users. In our studies, the percent of blind tests ranged from a low of 39% to a high of 95%.
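The abstract describes blind tests as tests whose assertions are written so that they cannot detect incorrect behavior. A minimal hypothetical sketch (not taken from the paper) of the contrast between a blind oracle and a correct one:

```python
# Hypothetical illustration of a "blind" test: the assertion is a
# tautology, so the test passes even though the code under test is wrong.

def add(a, b):
    # Buggy implementation under test: subtracts instead of adding.
    return a - b

def blind_test():
    # Blind oracle: compares the actual output to itself. The assertion
    # always evaluates to true, so the bug goes unseen (a false positive
    # in the paper's sense: incorrect behavior marked as correct).
    result = add(2, 3)
    assert result == result  # always true
    return "pass"

def correct_test():
    # Correct oracle: compares the actual output to the expected value.
    result = add(2, 3)
    return "pass" if result == 5 else "fail"

print(blind_test())    # the blind test passes despite the bug
print(correct_test())  # the correct test exposes the bug
```

Running this prints `pass` for the blind test and `fail` for the correct one: the blind assertion wastes the test exactly as the abstract describes.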