{"title":"Using SMOTE and Heterogeneous Stacking in Ensemble learning for Software Defect Prediction","authors":"S. El-Shorbagy, Wael El-Gammal, W. Abdelmoez","doi":"10.1145/3220267.3220286","DOIUrl":null,"url":null,"abstract":"Nowadays, there are a lot of classifications models used for predictions in the software engineering field such as effort estimation and defect prediction. One of these models is the ensemble learning machine that improves model performance by combining multiple models in different ways to get a more powerful model.\n One of the problems facing the prediction model is the misclassification of the minority samples. This problem mainly appears in the case of defect prediction. Our aim is the classification of defects which are considered minority samples during the training phase. This can be improved by implementing the Synthetic Minority Over-Sampling Technique (SMOTE) before the implementation of the ensemble model which leads to over-sample the minority class instances.\n In this paper, our work propose applying a new ensemble model by combining the SMOTE technique with the heterogeneous stacking ensemble to get the most benefit and performance in training a dataset that focus on the minority subset as in the software prediction study. Our proposed model shows better performance that overcomes other techniques results applied on the minority samples of the defect prediction.","PeriodicalId":177522,"journal":{"name":"International Conference on Software and Information Engineering","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2018-05-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"16","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Conference on Software and Information Engineering","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3220267.3220286","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 16
Abstract
Nowadays, there are a lot of classifications models used for predictions in the software engineering field such as effort estimation and defect prediction. One of these models is the ensemble learning machine that improves model performance by combining multiple models in different ways to get a more powerful model.
One of the problems facing the prediction model is the misclassification of the minority samples. This problem mainly appears in the case of defect prediction. Our aim is the classification of defects which are considered minority samples during the training phase. This can be improved by implementing the Synthetic Minority Over-Sampling Technique (SMOTE) before the implementation of the ensemble model which leads to over-sample the minority class instances.
In this paper, our work propose applying a new ensemble model by combining the SMOTE technique with the heterogeneous stacking ensemble to get the most benefit and performance in training a dataset that focus on the minority subset as in the software prediction study. Our proposed model shows better performance that overcomes other techniques results applied on the minority samples of the defect prediction.