Jhansi Lakshmi Potharlanka, Maruthi Padmaja Turumella, R. Pote
International Journal of Open Source Software and Processes, pp. 20-43, published 2019-10-01. DOI: 10.4018/ijossp.2019100102 (https://doi.org/10.4018/ijossp.2019100102)
A Study on Class Imbalancing Feature Selection and Ensembles on Software Reliability Prediction
Software quality can be improved by early software defect prediction models. However, two major challenges hinder model performance: class imbalance caused by the underrepresentation of defects, and irrelevant metrics among those used to predict them. This article presents a new two-stage framework, Ensemble of Hybrid Feature selection (EHF) with Weighted Support Vector Machine Boosting (WSVMBoost), which further enhances model performance. EHF is an ensemble feature ranking that combines feature selection models, such as filters and embedded models, to select the relevant metrics. The classification ensembles Random Forest, RUSBoost, and WSVMBoost, and the base learners Decision Tree and SVM, are also explored in this study using five software reliability datasets. In the statistical tests, EHF with WSVMBoost attained the best mean performance rank among the feature selection hybrids in predicting software defects. Additionally, this study shows that McCabe and Halstead method-level metrics are equally important in improving model performance.
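The two ideas in the abstract, ensemble feature ranking and class weighting for imbalanced data, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify how EHF aggregates the individual rankings or how WSVMBoost sets its weights, so mean-rank aggregation and inverse-frequency class weights are assumptions here. The selector names, feature names, and scores are hypothetical.

```python
from statistics import mean


def ranks_from_scores(scores):
    """Convert relevance scores to ranks (1 = highest score)."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    ranks = [0] * len(scores)
    for position, feature_index in enumerate(order, start=1):
        ranks[feature_index] = position
    return ranks


def ensemble_mean_rank(score_table):
    """Average each feature's rank across all selectors (assumed aggregation rule)."""
    per_selector = [ranks_from_scores(s) for s in score_table.values()]
    return [mean(column) for column in zip(*per_selector)]


def select_top_k(feature_names, mean_ranks, k):
    """Keep the k features with the lowest (best) mean rank."""
    order = sorted(range(len(feature_names)), key=lambda i: mean_ranks[i])
    return [feature_names[i] for i in order[:k]]


def inverse_frequency_weights(labels):
    """Weight each class by n / (n_classes * class_count), so rare classes weigh more."""
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    n, n_classes = len(labels), len(counts)
    return {label: n / (n_classes * count) for label, count in counts.items()}


# Hypothetical method-level metrics and hypothetical selector scores.
features = ["loc", "cyclomatic_complexity", "halstead_volume", "comment_lines"]
scores = {
    "filter_a": [0.30, 0.50, 0.40, 0.05],
    "filter_b": [0.20, 0.25, 0.60, 0.10],
    "embedded": [0.35, 0.40, 0.20, 0.05],
}

mean_ranks = ensemble_mean_rank(scores)
selected = select_top_k(features, mean_ranks, k=2)

# Nine non-defective modules (0) and one defective module (1).
weights = inverse_frequency_weights([0] * 9 + [1])

print(selected)
print(weights)
```

In a two-stage setup of this kind, the selected features would be passed to a classifier that accepts per-class weights, so that misclassifying the rare defective class costs more than misclassifying the majority class.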
About the journal:
The International Journal of Open Source Software and Processes (IJOSSP) publishes high-quality, peer-reviewed, original research articles on the broad field of open source software and processes. This wide area entails many intriguing questions and facets, including the special development process performed by a large number of geographically dispersed programmers, community issues like coordination and communication, motivations of the participants, and also economic and legal issues. Beyond this topic, open source software is an example of a highly distributed innovation process led by the users. Therefore, many aspects have relevance beyond the realm of software and its development. In this tradition, IJOSSP also publishes papers on these topics. IJOSSP is a multi-disciplinary outlet and welcomes submissions from all relevant fields of research that apply a multitude of research approaches.