Applying Cost-Sensitive Learning Methods to Improve Extremely Unbalanced Big Data Problems Using Random Forest

K. V. Ramana, Yuvasri. B, Sultanuddin Sj, P. Ponsudha, Sowmya Pd, A. V. Sangeetha
Published in: 2023 International Conference on Advances in Computing, Communication and Applied Informatics (ACCAI)
Publication date: 2023-05-25
DOI: 10.1109/ACCAI58221.2023.10199250 (https://doi.org/10.1109/ACCAI58221.2023.10199250)

Abstract

In a majority-minority classification problem, class imbalance in the dataset(s) can severely distort classifier performance, introducing a prediction bias toward the majority class. A bias toward the negative (majority) class can have harmful consequences when the positive (minority) class is the class of interest and the application domain dictates that a false negative is much more costly than a false positive. Big data makes class imbalance harder to mitigate because of the varied and complex structure of the comparatively larger datasets. This study presents a broad survey of works published within the past 8 years that focus on high class imbalance (i.e., a majority-to-minority class ratio between 100:1 and 10,000:1) in big data, in order to assess the state of the art in addressing the adverse effects of class imbalance. In this paper we propose two methods for handling the imbalanced data classification problem using random forest: the first is based on cost-sensitive learning, and the second is based on a sampling technique. Performance is evaluated with metrics such as recall and precision and the false-positive and false-negative rates.
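The abstract does not say how misclassification costs enter the forest, so the following is only a minimal sketch of one common cost-sensitive device, not the authors' method. It assumes each sample already has a "vote fraction" (the share of trees voting for the positive class) and shifts the decision threshold to C_FP / (C_FP + C_FN), the point at which predicting positive has the lower expected cost. The vote fractions, the labels, and the 1:3 cost ratio are made-up illustrative values, and all function names are hypothetical. The sketch then computes the metrics the abstract names: recall, precision, false-positive rate, and false-negative rate.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    recall: float
    precision: float
    false_positive_rate: float
    false_negative_rate: float


def cost_sensitive_threshold(cost_fp, cost_fn):
    """Vote fraction at or above which predicting positive has lower expected cost.

    Predicting negative costs p * cost_fn in expectation, predicting positive
    costs (1 - p) * cost_fp, so positive wins when p >= cost_fp / (cost_fp + cost_fn).
    """
    return cost_fp / (cost_fp + cost_fn)


def predict(vote_fractions, threshold):
    """Label a sample positive (1) when its vote fraction reaches the threshold."""
    return [1 if p >= threshold else 0 for p in vote_fractions]


def _ratio(numerator, denominator):
    # Guard against empty classes so the sketch never divides by zero.
    return numerator / denominator if denominator else 0.0


def evaluate(y_true, y_pred):
    """Compute recall, precision, FPR, and FNR from true and predicted labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return Metrics(
        recall=_ratio(tp, tp + fn),
        precision=_ratio(tp, tp + fp),
        false_positive_rate=_ratio(fp, fp + tn),
        false_negative_rate=_ratio(fn, fn + tp),
    )


# Illustrative toy data (assumed, not from the paper): 2 positives, 8 negatives.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
vote_fractions = [0.60, 0.30, 0.40, 0.20, 0.10, 0.10, 0.05, 0.05, 0.0, 0.0]

# Plain majority vote versus a threshold that treats a false negative
# as three times as costly as a false positive (hypothetical ratio).
baseline = evaluate(y_true, predict(vote_fractions, 0.5))
weighted = evaluate(y_true, predict(vote_fractions, cost_sensitive_threshold(1.0, 3.0)))

print("majority vote: ", baseline)
print("cost-sensitive:", weighted)
```

On this toy data the lower threshold recovers the minority sample that the plain majority vote misses, at the price of one extra false positive. That is the trade-off the abstract describes for applications where false negatives are the more expensive error.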