A diversity-based method for class-imbalanced cost-sensitive learning
S. Dong, Yongcheng Wu
Proceedings of 2018 International Conference on Mathematics and Artificial Intelligence, 2018-04-20
https://doi.org/10.1145/3208788.3208792
Real-world datasets are often imbalanced. In this situation, the primary goal of a classification algorithm is to minimize misclassification costs rather than to maximize classification accuracy. Sampling is widely employed to tackle this problem and improve classifier performance. In this paper, we propose a new diversity-based under-sampling technique for class-imbalanced datasets. The key idea is to balance a dataset by choosing only the potentially informative samples of the majority class, selected according to the diversity of their class probability estimates. Experimental results on 5 class-imbalanced datasets show that our method achieves lower total misclassification costs than two existing sampling techniques.
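To make the two ideas in the abstract concrete, here is a minimal Python sketch. It is not the authors' algorithm: the abstract does not define the diversity measure, so the sketch assumes that each majority-class sample comes with several class probability estimates (for example, one per ensemble member) and that diversity is the variance of those estimates. The function names, the parameters `cost_fn` and `cost_fp`, and the label convention (1 for the minority class, 0 for the majority class) are all hypothetical.

```python
from statistics import pvariance


def total_misclassification_cost(y_true, y_pred, cost_fn, cost_fp):
    """Sum the cost of every error; label 1 is the minority class (assumed)."""
    pairs = list(zip(y_true, y_pred))
    false_negatives = sum(1 for t, p in pairs if t == 1 and p == 0)
    false_positives = sum(1 for t, p in pairs if t == 0 and p == 1)
    return false_negatives * cost_fn + false_positives * cost_fp


def diversity_undersample(majority_probs, n_keep):
    """Return the indices of the n_keep most "diverse" majority samples.

    majority_probs[i] is a list of estimates of P(minority | sample i).
    ASSUMPTION: diversity is the population variance of those estimates;
    the paper may use a different measure.
    """
    scores = [pvariance(estimates) for estimates in majority_probs]
    # Rank sample indices from highest to lowest diversity score.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    # Keep the top n_keep and return them in their original order.
    return sorted(ranked[:n_keep])
```

A typical use under these assumptions would set `n_keep` to the number of minority-class samples so that the reduced training set is balanced, train a classifier on it, and then compare sampling techniques by calling `total_misclassification_cost` on held-out predictions, with `cost_fn` larger than `cost_fp` when missing a minority sample is the more expensive error.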