An Improved Imbalanced Data Classification Algorithm Based on SVM

Ming Yan, Jun Wang, Dan-jiao Li, Jin Meng
Published in: 2022 International Conference on Cyber-Physical Social Intelligence (ICCSI), 2022-11-18
DOI: 10.1109/ICCSI55536.2022.9970637 (https://doi.org/10.1109/ICCSI55536.2022.9970637)

Abstract

Most real-world classification problems involve imbalanced datasets. A dataset is imbalanced when one class has far more samples than the others, yet the minority class is often the main object of study. When classifying such data, the minority class samples, which carry higher misclassification costs, are easily misclassified. The classification of imbalanced datasets is therefore one of the main difficulties in data mining. In this paper, we propose a support vector machine (SVM) algorithm based on an improved whale optimization algorithm (WOA), called SWOA-SVM. The algorithm introduces the social group optimization (SGO) algorithm to address WOA's tendency toward premature convergence and to improve the WOA optimization process. SWOA-SVM is compared with SVM and other improved algorithms on multiple commonly used imbalanced datasets, using AUC, Accuracy and G-mean as performance evaluation criteria. The experimental results show that the algorithm effectively improves the recognition rate of positive samples across the different experimental datasets, which verifies its effectiveness.
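The abstract names Accuracy and G-mean among its evaluation criteria but does not define them. The sketch below is illustrative only and is not taken from the paper: it assumes the usual definition of G-mean as the geometric mean of sensitivity (recall on the positive, minority class) and specificity (recall on the negative class), and it uses a made-up toy label set of 95 negatives and 5 positives. The function names `confusion_counts`, `accuracy`, and `g_mean` are hypothetical helpers. AUC is omitted because it requires classifier scores rather than hard labels.

```python
import math


def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fn, tn, fp), treating `positive` as the minority class label."""
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1
    return tp, fn, tn, fp


def accuracy(y_true, y_pred):
    """Fraction of all samples that were labeled correctly."""
    tp, fn, tn, fp = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + fn + tn + fp)


def g_mean(y_true, y_pred):
    """Geometric mean of sensitivity and specificity (assumed definition)."""
    tp, fn, tn, fp = confusion_counts(y_true, y_pred)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(sensitivity * specificity)


# Toy data (invented for illustration): 95 negatives followed by 5 positives.
y_true = [0] * 95 + [1] * 5

# A classifier that always predicts the majority class.
majority_only = [0] * 100

# A classifier that finds 4 of 5 positives at the cost of 5 false alarms.
minority_aware = [0] * 90 + [1] * 5 + [1] * 4 + [0] * 1

for name, y_pred in [("majority_only", majority_only),
                     ("minority_aware", minority_aware)]:
    print(f"{name}: accuracy={accuracy(y_true, y_pred):.3f}, "
          f"g_mean={g_mean(y_true, y_pred):.3f}")
```

On this toy data the always-majority classifier scores slightly higher on accuracy while its G-mean is zero, because it never recognizes a positive sample. That gap is one reason a study of imbalanced data would report G-mean and AUC alongside Accuracy.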