Leveraging Sampling Schemes on Skewed Class Distribution to Enhance Male Fertility Detection with Ensemble AI Learners

IF 1.1 4区 计算机科学 Q4 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE
Debasmita GhoshRoy, P. A. Alvi, KC Santosh
{"title":"Leveraging Sampling Schemes on Skewed Class Distribution to Enhance Male Fertility Detection with Ensemble AI Learners","authors":"Debasmita GhoshRoy, P. A. Alvi, KC Santosh","doi":"10.1142/s0218001424510030","DOIUrl":null,"url":null,"abstract":"<p>Designing effective AI models becomes a challenge when dealing with imbalanced/skewed class distributions in datasets. Addressing this, re-sampling techniques often come into play as potential solutions. In this investigation, we delve into the male fertility dataset, exploring 14 re-sampling approaches to understand their impact on enhancing predictive model performance. The research employs conventional AI learners to gauge male fertility potential. Notably, five ensemble AI learners are studied, their performances are compared, and their results are evaluated using four measurement indices. Through comprehensive comparative analysis, we identify substantial enhancement in model effectiveness. Our findings showcase that the LightGBM model with SMOTE-ENN re-sampling stands out, achieving an efficacy of 96.66% and an F1-Score of 95.60% through 5-fold cross-validation. Interestingly, the CatBoost model, without re-sampling, exhibits strong performance, achieving an efficacy of 86.99% and an F1-Score of 93.02%. Furthermore, we benchmark our approach against state-of-the-art methods in male fertility prediction, particularly highlighting the use of re-sampling techniques like SMOTE and ESLSMOTE. Consequently, our proposed model emerges as a robust and efficient computational framework, promising accurate male fertility prediction.</p>","PeriodicalId":54949,"journal":{"name":"International Journal of Pattern Recognition and Artificial Intelligence","volume":"32 1","pages":""},"PeriodicalIF":1.1000,"publicationDate":"2024-03-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Journal of Pattern Recognition and Artificial Intelligence","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1142/s0218001424510030","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q4","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 0

Abstract

Designing effective AI models becomes a challenge when dealing with imbalanced/skewed class distributions in datasets. Addressing this, re-sampling techniques often come into play as potential solutions. In this investigation, we delve into the male fertility dataset, exploring 14 re-sampling approaches to understand their impact on enhancing predictive model performance. The research employs conventional AI learners to gauge male fertility potential. Notably, five ensemble AI learners are studied, their performances are compared, and their results are evaluated using four measurement indices. Through comprehensive comparative analysis, we identify substantial enhancement in model effectiveness. Our findings showcase that the LightGBM model with SMOTE-ENN re-sampling stands out, achieving an efficacy of 96.66% and an F1-Score of 95.60% through 5-fold cross-validation. Interestingly, the CatBoost model, without re-sampling, exhibits strong performance, achieving an efficacy of 86.99% and an F1-Score of 93.02%. Furthermore, we benchmark our approach against state-of-the-art methods in male fertility prediction, particularly highlighting the use of re-sampling techniques like SMOTE and ESLSMOTE. Consequently, our proposed model emerges as a robust and efficient computational framework, promising accurate male fertility prediction.

利用偏斜类分布的采样方案,增强人工智能学习者的男性生育力检测能力
在处理数据集中不平衡/倾斜的类别分布时,设计有效的人工智能模型成为一项挑战。为了解决这个问题,重新采样技术往往成为潜在的解决方案。在这项研究中,我们深入研究了男性生育率数据集,探索了 14 种再采样方法,以了解它们对提高预测模型性能的影响。研究采用传统的人工智能学习器来评估男性生育力潜力。值得注意的是,我们研究了五种集合人工智能学习器,比较了它们的性能,并使用四种测量指数评估了它们的结果。通过综合比较分析,我们发现模型的有效性得到了大幅提升。我们的研究结果表明,采用 SMOTE-ENN 重新采样的 LightGBM 模型表现突出,通过 5 倍交叉验证,其有效性达到 96.66%,F1 分数达到 95.60%。有趣的是,CatBoost 模型在没有重新采样的情况下也表现出了强劲的性能,达到了 86.99% 的效率和 93.02% 的 F1 分数。此外,我们还将我们的方法与最先进的男性生育力预测方法进行了比较,特别强调了 SMOTE 和 ESLSMOTE 等再采样技术的使用。因此,我们提出的模型是一个稳健、高效的计算框架,有望准确预测男性生育能力。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
CiteScore
2.90
自引率
13.30%
发文量
201
审稿时长
15.8 months
期刊介绍: The International Journal of Pattern Recognition and Artificial Intelligence (IJPRAI) welcomes both theory-oriented and innovative applications articles on new developments and is of interest to both researchers in academia and industry. The current scope of this journal includes: • Pattern Recognition • Machine Learning • Deep Learning • Document Analysis • Image Processing • Signal Processing • Computer Vision • Biometrics • Biomedical Image Analysis • Artificial Intelligence In addition to regular papers describing original research work, survey articles on timely and important research topics are highly welcome. Special issues with focused topics within the scope of this journal are also published.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术官方微信