Enrique A. da Roza, Jackson A. Prado Lima, Rogério C. Silva, S. Vergilio
Machine Learning Regression Techniques for Test Case Prioritization in Continuous Integration Environment
2022 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER), published 2022-03-01
DOI: https://doi.org/10.1109/saner53432.2022.00034
Citations: 4
Abstract
Test Case Prioritization (TCP) techniques are a key factor in reducing regression testing costs, even more so when Continuous Integration (CI) practices are adopted. TCP approaches based on failure history have been adopted in this context because they better fit the constraints of CI environments: the test budget and test case volatility, that is, test cases may be added or removed over the CI cycles. Promising approaches are based on Reinforcement Learning (RL), which learns from past prioritizations, guided by a reward function. In this work, we introduce a TCP approach for CI environments based on the sliding window method, which can be instantiated with different Machine Learning (ML) algorithms. Unlike other ML approaches, it requires neither retraining the model to perform the prioritization nor any code analysis. As an alternative to the RL approaches, we apply the Random Forest (RF) algorithm and a Long Short-Term Memory (LSTM) deep learning network in our evaluation. We use three time budgets and eleven systems. The results show the applicability of the approach considering the prioritization time and the time between CI cycles. Both algorithms take just a few seconds to execute. Under the more restrictive budgets, the RF algorithm outperformed the RL approaches described in the literature. Considering all systems and budgets, RF reaches Normalized Average Percentage of Faults Detected (NAPFD) values that are the best, or statistically equivalent to the best, in around 72% of the cases, and the LSTM network in 55% of them. Moreover, we discuss some implications of our results for the usage of the algorithms evaluated.
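To make the NAPFD metric mentioned in the abstract concrete, the sketch below computes it for one CI cycle under a time budget. It is only an illustration, not the authors' implementation. It assumes the commonly used definition NAPFD = p - (sum of ranks of fault-revealing tests) / (n * m) + p / (2n), where n is the number of test cases, m the total number of faults, and p the fraction of faults detected within the budget. It further assumes that each failing test case counts as one distinct fault. The function name `napfd`, the `(duration, failed)` input layout, and the value returned when a cycle has no failures are hypothetical choices made for this example.

```python
from typing import List, Tuple


def napfd(schedule: List[Tuple[float, bool]], budget: float) -> float:
    """Compute NAPFD for one prioritized test schedule.

    schedule: test cases in execution order, as (duration, failed) pairs.
    budget:   total execution time available in this CI cycle.

    Assumption: every failing test case reveals exactly one distinct fault.
    """
    n = len(schedule)
    m = sum(1 for _, failed in schedule if failed)  # total faults in the cycle
    if n == 0 or m == 0:
        # The metric is undefined without failures; returning 1.0 is a
        # convention chosen for this sketch only.
        return 1.0

    elapsed = 0.0
    rank_sum = 0   # sum of positions of the failing tests that were executed
    detected = 0   # number of faults revealed within the budget
    for rank, (duration, failed) in enumerate(schedule, start=1):
        elapsed += duration
        if elapsed > budget:
            break  # tests beyond the budget are not executed
        if failed:
            rank_sum += rank
            detected += 1

    p = detected / m  # fraction of faults detected
    return p - rank_sum / (n * m) + p / (2 * n)


if __name__ == "__main__":
    # Four tests of one time unit each; two of them fail.
    interleaved = [(1.0, True), (1.0, False), (1.0, True), (1.0, False)]
    failing_first = [(1.0, True), (1.0, True), (1.0, False), (1.0, False)]
    print(napfd(interleaved, budget=4.0))
    print(napfd(failing_first, budget=4.0))
    print(napfd(interleaved, budget=2.0))
```

Placing failing tests earlier raises the score, and a tighter budget lowers it when failing tests fall outside the executed prefix, which is why the time budget matters when comparing prioritization approaches.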