A New Fuzzy Adaptive Algorithm to Classify Imbalanced Data

IF 2.0 · Zone 4 (Computer Science) · JCR Q3: COMPUTER SCIENCE, INFORMATION SYSTEMS
Harshita Patel, D. Rajput, O. Stan, L. Miclea
DOI: 10.32604/cmc.2022.017114 (https://doi.org/10.32604/cmc.2022.017114)
Journal: Cmc-computers Materials & Continua
Published: 2022-01-01
Publication type: Journal Article
Citations: 7

Abstract

Classification of imbalanced data, in which one class is heavily outnumbered by the others, is a well-explored problem in the data mining and machine learning community. Imbalanced distributions occur naturally in real-world datasets and must be handled carefully to extract meaningful insights. When a dataset is imbalanced, traditional classifiers lose performance, which leads to misclassifications. This paper proposes a weighted nearest neighbor approach, applied in a fuzzy manner, to deal with this issue. We adopt the 'existing algorithm modification solution' for learning from imbalanced datasets: it classifies data without manipulating the natural distribution of the data, unlike the other popular data balancing methods. K nearest neighbor is a non-parametric classification method that is widely used in machine learning problems. Fuzzy classification with nearest neighbors clarifies the degree to which an instance belongs to each class, and optimal weights combined with an improved nearest neighbor concept help to classify imbalanced data correctly. The proposed hybrid approach accounts for the imbalanced nature of the data and reduces the inaccuracies that arise when the original, traditional classifiers are applied. Results show that it outperforms the existing fuzzy nearest neighbor and weighted nearest neighbor strategies for imbalanced learning.
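The abstract does not give the algorithm itself, so the following is only a minimal sketch of the general idea it describes: a fuzzy k nearest neighbor vote in which each neighbor's contribution is scaled by a class weight. It is not the authors' method. The inverse-class-frequency weights, the fuzzifier m, the function names and the toy data are all assumptions made for illustration; the paper's optimal weights may be computed quite differently.

```python
import math
from collections import Counter


def fuzzy_weighted_knn(train_X, train_y, query, k=3, m=2.0, eps=1e-9):
    """Return {class: membership} for `query`; memberships sum to 1."""
    # ASSUMPTION: weight each class by inverse frequency so that minority
    # neighbors count for more. This stands in for the paper's optimal weights.
    counts = Counter(train_y)
    class_weight = {c: len(train_y) / n for c, n in counts.items()}

    # The k nearest training points by Euclidean distance.
    neighbors = sorted(
        (math.dist(x, query), y) for x, y in zip(train_X, train_y)
    )[:k]

    # Fuzzy vote: a neighbor's influence decays with distance; the
    # fuzzifier m > 1 controls how quickly (m = 2 gives 1 / d^2).
    exponent = 2.0 / (m - 1.0)
    scores = {c: 0.0 for c in counts}
    for d, y in neighbors:
        scores[y] += class_weight[y] / (d ** exponent + eps)

    total = sum(scores.values())
    return {c: s / total for c, s in scores.items()}


def predict(train_X, train_y, query, **kwargs):
    """Pick the class with the highest fuzzy membership."""
    memberships = fuzzy_weighted_knn(train_X, train_y, query, **kwargs)
    return max(memberships, key=memberships.get)


# Hypothetical toy data: six majority ("neg") points, two minority ("pos").
train_X = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5), (1, 2), (3, 3), (3, 4)]
train_y = ["neg"] * 6 + ["pos"] * 2
query = (2.2, 2.2)

memberships = fuzzy_weighted_knn(train_X, train_y, query, k=3)
label = predict(train_X, train_y, query, k=3)
```

On this toy data the three nearest neighbors are one "pos" point and two "neg" points, so an unweighted majority vote would pick "neg", while the weighted fuzzy vote picks "pos". The returned memberships also show how strongly the query belongs to each class rather than only a hard label, and the training set is never resampled.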
Source journal
Cmc-computers Materials & Continua (Engineering & Technology: Materials Science, Multidisciplinary)
CiteScore: 5.30
Self-citation rate: 19.40%
Articles published: 345
Review time: 1 month
Journal description: This journal publishes original research papers in the areas of computer networks, artificial intelligence, big data management, software engineering, multimedia, cyber security, internet of things, materials genome, integrated materials science, data analysis, modeling, and engineering of designing and manufacturing of modern functional and multifunctional materials. Novel high performance computing methods, big data analysis, and artificial intelligence that advance material technologies are especially welcome.