An Empirical Study on the Cross-Project Predictability of Continuous Integration Outcomes
Jing Xia, Yanhui Li, Chuanqi Wang
2017 14th Web Information Systems and Applications Conference (WISA), 2017-11-01
DOI: 10.1109/WISA.2017.53 (https://doi.org/10.1109/WISA.2017.53)
Build prediction can reduce the latency between continuous integration outcomes and the decisions that depend on them, improving the efficiency of development teams. Current build prediction is generally within-project, which makes it unavailable for projects without enough build data. Cross-project prediction is a state-of-the-art technique that addresses the lack of training data in the studied projects by importing data from other projects. However, no previous study has focused on cross-project build prediction or checked its performance on real-world projects. This paper presents an empirical study of the performance of cross-project build prediction on 126 open-source projects using 6 common classifiers. To select the training sets for cross-project prediction, we introduce two widely used data selection methods: the Burak Filter, which selects at the build level, and the Bellwether Strategy, which selects at the project level. Our experiments yield the following observations. First, comparing the two methods, we find that project-level selection (Bellwether Strategy) performs better than build-level selection (Burak Filter). Second, the prediction results can be improved by clustering the 126 studied projects into several smaller communities of about 20-40 projects each. Third, among the 6 classifiers used, the decision tree classifier performs best. Finally, by computing the optimal prediction results, we conclude that current selection methods still need to be improved to approach the optimal prediction in cross-project build prediction.
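The difference between build-level and project-level selection may be easier to see in code. The following is a minimal sketch under stated assumptions, not the authors' implementation: the two-number feature vectors, the toy projects `a`, `b`, and `c`, the 1-nearest-neighbour placeholder classifier, and the use of accuracy as the score are all hypothetical choices made for illustration. The abstract does not specify features, metrics, or parameters.

```python
from math import dist

# Each build is (feature_vector, outcome). Features and values are hypothetical.
projects = {
    "a": [((1.0, 1.0), "passed"), ((9.0, 9.0), "failed"),
          ((2.0, 1.0), "passed"), ((8.0, 9.0), "failed")],
    "b": [((1.5, 1.0), "passed"), ((8.5, 9.5), "failed")],
    "c": [((1.0, 2.0), "failed"), ((9.0, 8.0), "passed")],  # labels inverted on purpose
}


def burak_filter(train, target, k=10):
    """Build-level selection: for every target build, keep its k nearest
    training builds (Euclidean distance) and return the de-duplicated union."""
    keep = set()
    for features in target:
        by_distance = sorted(range(len(train)),
                             key=lambda i: dist(train[i][0], features))
        keep.update(by_distance[:k])
    return [train[i] for i in sorted(keep)]


def predict(train, features):
    """Placeholder classifier (assumption): outcome of the nearest training build."""
    return min(train, key=lambda row: dist(row[0], features))[1]


def accuracy(train, test):
    """Fraction of test builds whose outcome is predicted correctly."""
    hits = sum(predict(train, features) == outcome for features, outcome in test)
    return hits / len(test)


def bellwether(projects):
    """Project-level selection: return the project whose builds, used alone as
    training data, give the best mean score on all other projects."""
    def mean_score(candidate):
        scores = [accuracy(projects[candidate], rows)
                  for name, rows in projects.items() if name != candidate]
        return sum(scores) / len(scores)
    return max(projects, key=mean_score)


# Build-level: pool builds from other projects, then filter for one target build.
pooled = projects["a"] + projects["c"]
selected = burak_filter(pooled, target=[(1.2, 1.1)], k=2)

# Project-level: pick a single source project for everyone.
best = bellwether(projects)
```

The build-level filter picks individual builds from a pooled set and must be rerun per target project. The project-level strategy picks one whole project as the training source, which is the coarser granularity the abstract reports as performing better.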