{"title":"An Empirical Study of Flaky Tests in Android Apps","authors":"S. Thorve, Chandani Sreshtha, Na Meng","doi":"10.1109/ICSME.2018.00062","DOIUrl":null,"url":null,"abstract":"A flaky test is a test that may fail or pass for the same code under testing (CUT). Flaky tests could be harmful to developers because the non-deterministic test outcome is not reliable and developers cannot easily debug the code. A prior study characterized the root causes and fixing strategies of flaky tests by analyzing commits of 51 Apache open source projects, without analyzing any Android app. Due to the popular usage of Android devices and the multitude of interactions of Android apps with third-party software libraries, hardware, network, and users, we were curious to find if the Android apps manifested unique flakiness patterns and called for any special resolution for flaky tests as compared to the existing literature. For this paper, we conducted an empirical study to characterize the flaky tests in Android apps. By classifying the root causes and fixing strategies of flakiness, we aimed to investigate how our proposed characterization for flakiness in Android apps varies from prior findings, and whether there are domain-specific flakiness patterns. After mining GitHub, we found 29 Android projects containing 77 commits that were relevant to flakiness. We identified five root causes of Android apps' flakiness. We revealed three novel causes - Dependency, Program Logic, and UI. Five types of resolution strategies were observed to address the flaky behavior. Many of the examined commits show developers' attempt to fix flakiness by changing software implementation in various ways. However, there are still 13% commits that simply skipped or removed the flaky tests. Our observations provide useful insights for future research on flaky tests of Android apps.","PeriodicalId":6572,"journal":{"name":"2018 IEEE International Conference on Software Maintenance and Evolution (ICSME)","volume":"51 1","pages":"534-538"},"PeriodicalIF":0.0000,"publicationDate":"2018-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"56","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2018 IEEE International Conference on Software Maintenance and Evolution (ICSME)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICSME.2018.00062","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 56
Abstract
A flaky test is a test that may fail or pass for the same code under testing (CUT). Flaky tests could be harmful to developers because the non-deterministic test outcome is not reliable and developers cannot easily debug the code. A prior study characterized the root causes and fixing strategies of flaky tests by analyzing commits of 51 Apache open source projects, without analyzing any Android app. Due to the popular usage of Android devices and the multitude of interactions of Android apps with third-party software libraries, hardware, network, and users, we were curious to find if the Android apps manifested unique flakiness patterns and called for any special resolution for flaky tests as compared to the existing literature. For this paper, we conducted an empirical study to characterize the flaky tests in Android apps. By classifying the root causes and fixing strategies of flakiness, we aimed to investigate how our proposed characterization for flakiness in Android apps varies from prior findings, and whether there are domain-specific flakiness patterns. After mining GitHub, we found 29 Android projects containing 77 commits that were relevant to flakiness. We identified five root causes of Android apps' flakiness. We revealed three novel causes - Dependency, Program Logic, and UI. Five types of resolution strategies were observed to address the flaky behavior. Many of the examined commits show developers' attempt to fix flakiness by changing software implementation in various ways. However, there are still 13% commits that simply skipped or removed the flaky tests. Our observations provide useful insights for future research on flaky tests of Android apps.