A hybrid Machine Learning methodology for imbalanced datasets
A. Lipitakis, S. Kotsiantis
IISA 2014, The 5th International Conference on Information, Intelligence, Systems and Applications
Published: 2014-07-07
DOI: https://doi.org/10.1109/IISA.2014.6878762
Citations: 4
Abstract
In machine learning, imbalanced data sets exhibit skewed class distributions, in which most cases belong to one class and far fewer belong to a smaller one. A classifier induced from an imbalanced data set usually has a low error rate for the majority class and an unacceptably high error rate for the minority class. This paper gives a synoptic review of the related methodologies, introduces a new ensemble methodology, and presents an experimental comparison with other ensembles. The proposed method, which combines the power of the OverBagging and Rotation Forest algorithms, improves the identification of a difficult small class while keeping the classification accuracy for the other class at an acceptable level.
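
The OverBagging half of the proposed combination can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how the two algorithms are combined, and the Rotation Forest step (rotating the feature space before training each base learner) is omitted here. All function names are hypothetical, and the one-split decision stump is an assumed stand-in for whatever base classifier the paper uses. The sketch only shows the general idea of training each ensemble member on a bag in which the minority class has been oversampled, with replacement, to the size of the majority class, and of reporting the error rate per class rather than overall.

```python
import random
from collections import Counter


def balanced_bootstrap(X, y, rng):
    """Draw one OverBagging-style bag: sample every class with replacement
    up to the size of the largest class, so the bag is balanced."""
    by_class = {}
    for row, label in zip(X, y):
        by_class.setdefault(label, []).append(row)
    target = max(len(rows) for rows in by_class.values())
    bag_X, bag_y = [], []
    for label, rows in by_class.items():
        for _ in range(target):
            bag_X.append(rng.choice(rows))
            bag_y.append(label)
    return bag_X, bag_y


def fit_stump(X, y):
    """Assumed base learner: the single feature/threshold split with the
    fewest training errors. Returns (feature, threshold, left, right)."""
    best = None
    for f in range(len(X[0])):
        for t in sorted({row[f] for row in X}):
            left = [lab for row, lab in zip(X, y) if row[f] <= t]
            right = [lab for row, lab in zip(X, y) if row[f] > t]
            if not left or not right:
                continue
            l_lab = Counter(left).most_common(1)[0][0]
            r_lab = Counter(right).most_common(1)[0][0]
            errors = (sum(lab != l_lab for lab in left)
                      + sum(lab != r_lab for lab in right))
            if best is None or errors < best[0]:
                best = (errors, f, t, l_lab, r_lab)
    if best is None:  # no usable split: always predict the most common label
        label = Counter(y).most_common(1)[0][0]
        return (0, float("inf"), label, label)
    return best[1:]


def stump_predict(stump, row):
    f, t, l_lab, r_lab = stump
    return l_lab if row[f] <= t else r_lab


def fit_overbagging(X, y, n_estimators=15, seed=0):
    """Train one stump per balanced bag."""
    rng = random.Random(seed)
    return [fit_stump(*balanced_bootstrap(X, y, rng))
            for _ in range(n_estimators)]


def predict(ensemble, row):
    """Majority vote over the ensemble members."""
    votes = Counter(stump_predict(s, row) for s in ensemble)
    return votes.most_common(1)[0][0]


def per_class_error(y_true, y_pred):
    """Error rate computed separately for each true class."""
    totals, wrong = Counter(y_true), Counter()
    for t, p in zip(y_true, y_pred):
        if t != p:
            wrong[t] += 1
    return {c: wrong[c] / totals[c] for c in totals}


# Toy data (made up for illustration): 8 majority cases, 2 minority cases.
X = [[v] for v in range(8)] + [[20], [21]]
y = [0] * 8 + [1] * 2
ensemble = fit_overbagging(X, y)
print(per_class_error(y, [predict(ensemble, row) for row in X]))
```

Reporting the error per class matters because a single overall accuracy figure can hide the failure the abstract describes: on a heavily skewed data set, a classifier that always predicts the majority class scores well overall while misclassifying every minority case.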