A novel ensemble classifier by combining sampling and genetic algorithm to combat multiclass imbalanced problems

JCR: Q4 (Mathematics)
Archana Purwar, S. Singh
DOI: 10.1504/ijdats.2020.10026827 (https://doi.org/10.1504/ijdats.2020.10026827)
Journal: International Journal of Data Analysis Techniques and Strategies
Volume: 1 1
Pages: 30-42
Published: 2020-02-10
Publication type: Journal Article
Citations: 4

Abstract

Handling datasets with imbalanced classes is a pressing problem in machine learning and data mining. Although two-class imbalanced problems have been studied extensively in the literature, multiclass imbalanced problems still need to be explored. In this paper, we propose a sampling and genetic algorithm based ensemble classifier (SA-GABEC) to handle imbalanced classes. For a given sample, SA-GABEC tries to find the best subset of classifiers: one that is precise in its predictions and creates acceptable diversity in the feature subspace. These subsets of classifiers are fused to give better predictions than a single classifier. The paper also proposes a modified SA-GABEC, which performs feature selection before applying sampling and outperforms SA-GABEC. The performance of the proposed classifiers is evaluated and compared with GAB-EPA, AdaBoost, and bagging using minority class recall and extended G-mean.
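The two evaluation measures named in the abstract can be illustrated with a short sketch. The abstract does not give formulas, so this assumes the common multiclass definitions: minority class recall is the recall of the least frequent class, and extended G-mean is the geometric mean of the per-class recalls. The function names and the toy labels are hypothetical, chosen only for illustration, and the paper's exact definitions may differ.

```python
from collections import Counter
from math import prod  # requires Python 3.8+


def per_class_recall(y_true, y_pred):
    """Return {label: recall} for every class that occurs in y_true."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}


def minority_class_recall(y_true, y_pred):
    """Recall of the least frequent class in y_true (assumed definition)."""
    totals = Counter(y_true)
    minority = min(totals, key=totals.get)
    return per_class_recall(y_true, y_pred)[minority]


def extended_g_mean(y_true, y_pred):
    """Geometric mean of per-class recalls (assumed multiclass G-mean)."""
    recalls = list(per_class_recall(y_true, y_pred).values())
    return prod(recalls) ** (1 / len(recalls))


# Toy three-class example: class "c" is the minority class.
y_true = ["a", "a", "a", "a", "b", "b", "b", "b", "c", "c"]
y_pred = ["a", "a", "a", "a", "b", "a", "a", "a", "c", "a"]

print(per_class_recall(y_true, y_pred))
print(minority_class_recall(y_true, y_pred))
print(extended_g_mean(y_true, y_pred))
```

Because the G-mean is a product of recalls, a classifier that ignores any one class scores zero no matter how well it does on the majority classes, which is why this kind of measure suits imbalanced problems better than plain accuracy.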
Source journal
International Journal of Data Analysis Techniques and Strategies (Decision Sciences: Information Systems and Management)
CiteScore: 1.20
Self-citation rate: 0.00%
Articles published: 21