Clustering-Based Prototype Generation for Imbalance Classification
Huajuan Ren, Bei Yang
2019 International Conference on Smart Grid and Electrical Automation (ICSGEA), published 2019-08-01
DOI: 10.1109/ICSGEA.2019.00102 (https://doi.org/10.1109/ICSGEA.2019.00102)
Classification under class imbalance has become a crucial problem in machine learning. Under-sampling is a widely adopted remedy that resamples the majority class either randomly or heuristically. Because these sample-based methods discard majority-class samples, they ignore part of the majority-class information during training. In this paper, we propose a clustering-based prototype generation technique that generates representative instances of both the majority and minority classes at a relatively balanced ratio. This reduces the imbalance ratio and the overlap between boundary samples, which in turn makes the classification task easier. We evaluate the algorithm on eight imbalanced datasets and show that it outperforms three other under-sampling approaches.
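The general idea of prototype generation by clustering can be illustrated with a minimal sketch. The abstract does not say which clustering algorithm the authors use or how many prototypes they generate per class, so everything below is an assumption made for illustration: plain k-means (Lloyd's algorithm) stands in for the clustering step, cluster centroids serve as the prototypes, and the names `kmeans_prototypes`, `generate_prototypes`, `imbalance_ratio`, and the `per_class` parameter are hypothetical. It is not the authors' method.

```python
import random
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def _mean(points: Sequence[Point]) -> Point:
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def _sq_dist(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_prototypes(points: Sequence[Point], k: int,
                      iters: int = 20, seed: int = 0) -> List[Point]:
    """Assumed clustering step: Lloyd's k-means, centroids act as prototypes."""
    rng = random.Random(seed)
    k = min(k, len(points))  # cannot ask for more clusters than points
    centroids = rng.sample(list(points), k)
    for _ in range(iters):
        clusters: List[List[Point]] = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: _sq_dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Keep the previous centroid if a cluster ends up empty.
        centroids = [_mean(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids


def generate_prototypes(X: Sequence[Point], y: Sequence[int],
                        per_class: int, seed: int = 0):
    """Replace every class by up to `per_class` centroids (hypothetical choice)."""
    X_out, y_out = [], []
    for label in sorted(set(y)):
        members = [x for x, lab in zip(X, y) if lab == label]
        for proto in kmeans_prototypes(members, per_class, seed=seed):
            X_out.append(proto)
            y_out.append(label)
    return X_out, y_out


def imbalance_ratio(y: Sequence[int]) -> float:
    """Size of the largest class divided by the size of the smallest."""
    labels = list(y)
    counts = [labels.count(label) for label in set(labels)]
    return max(counts) / min(counts)


# Synthetic toy data (not from the paper): 200 majority vs. 20 minority points.
rng = random.Random(42)
majority = [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(200)]
minority = [(rng.gauss(3.0, 1.0), rng.gauss(3.0, 1.0)) for _ in range(20)]
X = majority + minority
y = [0] * 200 + [1] * 20

X_bal, y_bal = generate_prototypes(X, y, per_class=10)
print(imbalance_ratio(y), imbalance_ratio(y_bal))
```

In this sketch every majority-class sample still influences some centroid, whereas random under-sampling would simply drop most of them. Requesting the same number of prototypes for both classes is only one way to reach a "relatively balanced" ratio; the ratio the paper actually targets is not stated in the abstract.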