{"title":"An assessment of heterogenous ensemble classifiers for analyzing change-proneness in open-source software systems","authors":"Megha Khanna, Ankita Bansal","doi":"10.1002/smr.2660","DOIUrl":null,"url":null,"abstract":"<p>Software managers constantly look out for methods that ensure cost effective development of good quality software products. An important means of accomplishing this is by allocating more resources to weak classes of a software product, which are prone to changes. Therefore, correct prediction of these change-prone classes is critical. Though various researchers have investigated the performance of several algorithms for identifying them, the search for an optimum classifier still persists. To this end, this study critically investigates the use of six Heterogenous Ensemble Classifiers (HEC) for Software Change Prediction (SCP) by empirically validating datasets obtained from 12 open-source software systems. The results of the study are statistically assessed using three robust performance indicators (AUC, F-measure and Mathew Correlation Coefficient) in two different validation scenarios (within project and cross-project). They indicate the superiority of Average Probability Voting Ensemble, a heterogenous classifier for determining change-proneness in the investigated systems. The average AUC values of software change prediction models developed using this ensemble classifier exhibited an improvement of 3%-9% and 3%-11% respectively when compared with its base learners and homogeneous counter parts. Similar observations were inferred using other investigated performance measures. 
Furthermore, the evidence obtained from the results suggests that the change in number of base learners or type of meta-learner does not exhibit significant change in the performance of corresponding heterogenous ensemble classifiers.</p>","PeriodicalId":48898,"journal":{"name":"Journal of Software-Evolution and Process","volume":"36 8","pages":""},"PeriodicalIF":1.7000,"publicationDate":"2024-02-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Software-Evolution and Process","FirstCategoryId":"94","ListUrlMain":"https://onlinelibrary.wiley.com/doi/10.1002/smr.2660","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"COMPUTER SCIENCE, SOFTWARE ENGINEERING","Score":null,"Total":0}
Citations: 0
Abstract
Software managers constantly look for methods that ensure cost-effective development of good-quality software products. An important means of accomplishing this is to allocate more resources to the weak classes of a software product, that is, the classes that are prone to change. Correct prediction of these change-prone classes is therefore critical. Although various researchers have investigated the performance of several algorithms for identifying them, the search for an optimum classifier continues. To this end, this study critically investigates the use of six Heterogeneous Ensemble Classifiers (HEC) for Software Change Prediction (SCP) by empirically validating datasets obtained from 12 open-source software systems. The results are statistically assessed using three robust performance indicators (AUC, F-measure, and Matthews Correlation Coefficient) in two validation scenarios (within-project and cross-project). They indicate the superiority of the Average Probability Voting Ensemble, a heterogeneous classifier, for determining change-proneness in the investigated systems. The average AUC values of software change prediction models developed using this ensemble classifier improved by 3%-9% compared with its base learners and by 3%-11% compared with its homogeneous counterparts. Similar observations were made using the other investigated performance measures. Furthermore, the evidence suggests that changing the number of base learners or the type of meta-learner does not significantly change the performance of the corresponding heterogeneous ensemble classifiers.
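The abstract does not describe how the Average Probability Voting Ensemble is implemented, so the following is only a minimal sketch of the general idea: each base learner outputs a probability that a class is change-prone, the probabilities are averaged, and the average is thresholded. The function names, the 0.5 threshold, and the sample probabilities are all hypothetical and are not taken from the study. The Matthews Correlation Coefficient is computed from its standard confusion-matrix definition.

```python
from math import sqrt
from typing import Sequence


def average_probability_vote(base_probs: Sequence[Sequence[float]],
                             threshold: float = 0.5) -> list[int]:
    """Average the change-prone probability over all base learners.

    base_probs[k][i] is learner k's probability that class i is change-prone.
    The 0.5 threshold is an assumption made for this illustration.
    """
    n_learners = len(base_probs)
    n_items = len(base_probs[0])
    averaged = [sum(learner[i] for learner in base_probs) / n_learners
                for i in range(n_items)]
    return [1 if p >= threshold else 0 for p in averaged]


def matthews_corrcoef(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Standard MCC from the confusion matrix; returns 0.0 if undefined."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical outputs of three different base learners for four classes.
probs = [
    [0.90, 0.40, 0.20, 0.60],
    [0.70, 0.60, 0.10, 0.30],
    [0.80, 0.35, 0.30, 0.45],
]
y_true = [1, 0, 0, 1]  # made-up ground truth: 1 = change-prone

y_pred = average_probability_vote(probs)
mcc = matthews_corrcoef(y_true, y_pred)
print(y_pred, round(mcc, 3))
```

Averaging probabilities rather than counting hard votes lets a confident learner outweigh two hesitant ones, which is one common motivation for this style of voting; whether that explains the reported gains is not stated in the abstract.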