Feature selection in software defect prediction: A comparative study
Misha Kakkar, Sarika Jain
2016 6th International Conference - Cloud System and Big Data Engineering (Confluence)
DOI: 10.1109/CONFLUENCE.2016.7508200
Citations: 23
Abstract
Software has become a vital part of human life, so building defect-free software is essential. Various studies have addressed predicting defects, estimating the probability that a module is defect-prone, and applying defect prediction to real-world software. This paper builds a framework that uses attribute selection for defect prediction with five classifiers: IBk, KStar, LWL, Random Tree, and Random Forest. Performance is compared on the basis of accuracy and ROC values. The results show that the framework reduced the total number of attributes used for each dataset by a factor of six on average, and that LWL outperformed the other four classifiers when tested with 10-fold cross-validation (10CV) and a 66% percentage split.
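The evaluation protocol described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it uses scikit-learn rather than the toolkit the classifier names come from, a synthetic dataset in place of real defect data, and a univariate filter (`SelectKBest` with `f_classif`) as an assumed stand-in for the paper's attribute selection step. Only rough analogues of three of the five classifiers are shown (k-nearest neighbours for IBk, a randomized decision tree for Random Tree, and a random forest); KStar and LWL are omitted because scikit-learn has no direct equivalent. The choice of 6 attributes out of 36 merely mirrors the sixfold reduction reported above.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import (StratifiedKFold, cross_validate,
                                     train_test_split)
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.tree import DecisionTreeClassifier

# Synthetic, imbalanced stand-in for a defect dataset (hypothetical data).
X, y = make_classification(n_samples=400, n_features=36, n_informative=6,
                           weights=[0.8, 0.2], random_state=0)

# Rough analogues only; the names on the left are labels for this sketch.
classifiers = {
    "knn (IBk-like)": KNeighborsClassifier(n_neighbors=5),
    "random tree (analogue)": DecisionTreeClassifier(max_features="sqrt",
                                                     random_state=0),
    "random forest": RandomForestClassifier(n_estimators=100, random_state=0),
}


def build(classifier, k=6):
    """Chain attribute selection and a classifier.

    Keeping the selector inside the pipeline means it is refit on each
    training fold, so the held-out data never influences which
    attributes are kept.
    """
    return make_pipeline(SelectKBest(f_classif, k=k), classifier)


def evaluate(model, X, y):
    """Score a model with 10-fold CV and with a 66%/34% train/test split."""
    cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
    cv_scores = cross_validate(model, X, y, cv=cv,
                               scoring=("accuracy", "roc_auc"))

    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, train_size=0.66, stratify=y, random_state=0)
    model.fit(X_tr, y_tr)
    return {
        "cv_accuracy": cv_scores["test_accuracy"].mean(),
        "cv_roc_auc": cv_scores["test_roc_auc"].mean(),
        "split_accuracy": accuracy_score(y_te, model.predict(X_te)),
        "split_roc_auc": roc_auc_score(y_te, model.predict_proba(X_te)[:, 1]),
    }


results = {name: evaluate(build(clf), X, y)
           for name, clf in classifiers.items()}

for name, metrics in results.items():
    print(name, {key: round(value, 3) for key, value in metrics.items()})
```

Running the selection step inside each fold is a design choice of this sketch, made to avoid optimistic estimates. The abstract does not say whether the paper selects attributes once per dataset or once per fold.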