Fajri Koto
2014 International Conference on Advanced Computer Science and Information System, October 2014
DOI: 10.1109/ICACSIS.2014.7065849
SMOTE-Out, SMOTE-Cosine, and Selected-SMOTE: An enhancement strategy to handle imbalance in data level
An imbalanced dataset often becomes an obstacle in the supervised learning process. Imbalance is the case in which the examples in the training data belonging to one class heavily outnumber the examples in the other class. Applying a classifier to such a dataset results in the classifier failing to learn the minority class. Synthetic Minority Oversampling Technique (SMOTE) is a well-known over-sampling method that tackles imbalance at the data level. SMOTE creates a synthetic example between two close vectors that lie together. Our study considers three improvements of SMOTE, which we call SMOTE-Out, SMOTE-Cosine, and Selected-SMOTE, in order to cover cases that are not already handled by SMOTE. To investigate the proposed methods, our experiments were conducted on eighteen different datasets. The results show that our proposed SMOTE variants give some improvements in B-ACC and F1-Score.
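The interpolation step the abstract refers to — creating a synthetic example between a minority-class vector and one of its close neighbours — can be sketched as follows. This is a minimal illustration of vanilla SMOTE, not the paper's implementation; the function name, the brute-force neighbour search, and the parameter choices are assumptions for clarity.

```python
import numpy as np

def smote(X_min, n_synth, k=5, seed=0):
    """Generate n_synth synthetic minority samples: for each, pick a
    minority sample, pick one of its k nearest minority neighbours,
    and interpolate at a random point on the segment between them."""
    rng = np.random.default_rng(seed)
    n = len(X_min)
    # brute-force pairwise Euclidean distances within the minority class
    d = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    # indices of the k nearest neighbours, excluding the point itself
    nn = np.argsort(d, axis=1)[:, 1:k + 1]
    out = np.empty((n_synth, X_min.shape[1]))
    for s in range(n_synth):
        i = rng.integers(n)        # a random minority sample
        j = rng.choice(nn[i])      # one of its nearest neighbours
        gap = rng.random()         # interpolation factor in [0, 1]
        out[s] = X_min[i] + gap * (X_min[j] - X_min[i])
    return out

# toy minority class: four collinear 2-D points
X_min = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [3.0, 3.0]])
synth = smote(X_min, n_synth=5, k=2)
```

Because every synthetic point lies on a segment between two existing minority points, the over-sampled class stays inside the region the minority data already occupies — which is precisely the limitation that a variant such as SMOTE-Out, generating samples slightly outside that region, is designed to address.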