Data level approach for imbalanced class handling on educational data mining multiclass classification
Authors: Yoga Pristyanto, Irfan Pratama, A. F. Nugraha
Published in: 2018 International Conference on Information and Communications Technology (ICOIACT), pp. 310-314
Publication date: 2018-03-01
DOI: 10.1109/ICOIACT.2018.8350792 (https://doi.org/10.1109/ICOIACT.2018.8350792)
Citations: 27
Abstract
In Educational Data Mining (EDM), researchers often overlook how evenly the classes in a dataset are distributed, which can seriously affect classification results. Most classifiers assume that the class distribution is relatively balanced, so their performance degrades on imbalanced data unless the imbalance is handled. This study describes a mechanism for handling class imbalance in a multiclass EDM dataset using a combination of SMOTE and OSS. Together, the two methods balance the dataset's class distribution, which improves classification performance. The results show that the combination of SMOTE and OSS enhances the performance of SVM, the classifier used in this study. The combined approach achieves accuracy, sensitivity, specificity, and g-mean scores of 88.637%, 92.292%, 95.554%, and 93.796%, respectively. Hence, the SMOTE and OSS combination can be a viable solution for class imbalance in multiclass EDM datasets.
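The four reported metrics can be illustrated with a minimal sketch that derives them from a multiclass confusion matrix. The abstract does not say how the per-class values were aggregated, so the one-vs-rest treatment, the macro-averaging, and the g-mean taken as the square root of averaged sensitivity times averaged specificity are all assumptions here. The confusion matrix and the function names are hypothetical, and the sketch is not expected to reproduce the paper's figures.

```python
import math


def per_class_rates(cm):
    """One-vs-rest sensitivity and specificity for each class.

    cm[i][j] is the number of samples of true class i predicted as class j.
    Assumes every class has at least one true sample and one negative sample.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    sensitivities, specificities = [], []
    for k in range(n):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp                       # class k samples missed
        fp = sum(cm[r][k] for r in range(n)) - tp  # other samples labelled k
        tn = total - tp - fn - fp
        sensitivities.append(tp / (tp + fn))
        specificities.append(tn / (tn + fp))
    return sensitivities, specificities


def summarize(cm):
    """Accuracy plus macro-averaged sensitivity, specificity, and g-mean.

    Macro-averaging is an assumption of this sketch, not the paper's
    stated procedure.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    accuracy = sum(cm[k][k] for k in range(n)) / total
    sens, spec = per_class_rates(cm)
    macro_sens = sum(sens) / n
    macro_spec = sum(spec) / n
    return {
        "accuracy": accuracy,
        "sensitivity": macro_sens,
        "specificity": macro_spec,
        "g_mean": math.sqrt(macro_sens * macro_spec),
    }


# Hypothetical 3-class confusion matrix (rows = true class, columns = predicted).
cm = [
    [50, 3, 2],
    [4, 20, 1],
    [1, 2, 7],
]
scores = summarize(cm)
for name, value in scores.items():
    print(f"{name}: {value:.3%}")
```

Accuracy alone can look strong on imbalanced data because the majority class dominates the count. Sensitivity, specificity, and g-mean expose how the minority classes fare, which is presumably why the study reports all four.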