Prediction of polycystic ovary syndrome using machine learning with SFS and Boruta feature selection: an explainable AI approach.

IF 2.2 | CAS Zone 4, Medicine | JCR Q3, Andrology
Monali Ramteke, Shital Raut
Systems Biology in Reproductive Medicine, 71(1), pp. 439-460. Published 2025-12-01 (Epub 2025-09-21). Journal Article.
DOI: 10.1080/19396368.2025.2560839 (https://doi.org/10.1080/19396368.2025.2560839)
Citations: 0

Abstract

Polycystic Ovary Syndrome (PCOS) is a complex endocrine disorder that affects many women of reproductive age and is characterized by a variety of clinical and biochemical features. Accurate classification and diagnosis of PCOS remain challenging because its manifestations are heterogeneous. This study introduces a machine learning framework that combines a voting ensemble model with two distinct feature selection techniques, Sequential Forward Selection (SFS) and Boruta, to improve the accuracy of PCOS classification. We also used Explainable Artificial Intelligence (XAI) techniques, namely Shapley Additive Explanations (SHAP), Local Interpretable Model-agnostic Explanations (LIME), Partial Dependence Plots (PDP), AnchorTabular, and Permutation Importance, to interpret the ensemble model. These methods provide insight into how much each key feature contributes to predicting PCOS. Results show that the proposed ensemble learning model achieved its best performance when combined with feature selection. Specifically, the proposed voting ensemble classifier trained on the features picked by SFS had the highest accuracy among all models. This method can help in PCOS diagnosis and support early intervention.
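
The abstract names the building blocks but not their configuration, so the following is only a minimal sketch of how greedy Sequential Forward Selection, a hard-voting ensemble, and permutation importance fit together. It is not the authors' implementation. Everything concrete here is an assumption made for illustration: the synthetic five-feature dataset, the three k-nearest-neighbour ensemble members, the single train/validation split, and all function names. Boruta, SHAP, LIME, PDP, and AnchorTabular are not shown.

```python
import random
from collections import Counter


def make_data(n, rng):
    """Synthetic stand-in data: features 0 and 1 drive the label, 2-4 are noise."""
    X, y = [], []
    for _ in range(n):
        row = [rng.gauss(0, 1) for _ in range(5)]
        X.append(row)
        y.append(1 if row[0] + row[1] > 0 else 0)
    return X, y


def knn_predict(train_X, train_y, x, k, cols):
    """Label x by majority vote among its k nearest training rows (on cols only)."""
    dists = sorted(
        (sum((row[c] - x[c]) ** 2 for c in cols), label)
        for row, label in zip(train_X, train_y)
    )
    return Counter(label for _, label in dists[:k]).most_common(1)[0][0]


def vote_predict(train_X, train_y, x, cols, ks=(1, 3, 5)):
    """Hard-voting ensemble: each k-NN member casts one vote."""
    votes = [knn_predict(train_X, train_y, x, k, cols) for k in ks]
    return Counter(votes).most_common(1)[0][0]


def accuracy(train_X, train_y, test_X, test_y, cols):
    """Fraction of test rows the voting ensemble labels correctly."""
    hits = sum(
        vote_predict(train_X, train_y, x, cols) == y
        for x, y in zip(test_X, test_y)
    )
    return hits / len(test_y)


def sequential_forward_selection(train_X, train_y, val_X, val_y, n_select):
    """Greedy SFS: repeatedly add the feature that most improves validation accuracy."""
    selected = []
    remaining = list(range(len(train_X[0])))
    while len(selected) < n_select:
        best = max(
            remaining,
            key=lambda c: accuracy(train_X, train_y, val_X, val_y, selected + [c]),
        )
        selected.append(best)
        remaining.remove(best)
    return selected


def permutation_importance(train_X, train_y, val_X, val_y, cols, col, rng):
    """Accuracy drop after shuffling one column of the validation set."""
    base = accuracy(train_X, train_y, val_X, val_y, cols)
    shuffled = [row[col] for row in val_X]
    rng.shuffle(shuffled)
    permuted = [row[:col] + [v] + row[col + 1:] for row, v in zip(val_X, shuffled)]
    return base - accuracy(train_X, train_y, permuted, val_y, cols)


rng = random.Random(0)
train_X, train_y = make_data(200, rng)
val_X, val_y = make_data(100, rng)

selected = sequential_forward_selection(train_X, train_y, val_X, val_y, n_select=2)
score = accuracy(train_X, train_y, val_X, val_y, selected)
drops = {
    c: permutation_importance(train_X, train_y, val_X, val_y, selected, c, rng)
    for c in selected
}

print("selected features:", selected)
print("validation accuracy:", score)
print("permutation importance (accuracy drop):", drops)
```

Because the same validation split is used both to choose features and to report accuracy, the printed score is optimistic. A real study would evaluate on held-out data or with cross-validation.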

Source journal
CiteScore: 4.30
Self-citation rate: 4.20%
Articles published: 27
Review time: >12 weeks
About the journal: Systems Biology in Reproductive Medicine (SBiRM) publishes Research Articles, Communications, Applications Notes that include protocols, a Clinical Corner that includes case reports, Review Articles and Hypotheses, and Letters to the Editor on human and animal reproduction. The journal highlights the use of systems approaches, including genomic, cellular, proteomic, metabolomic, bioinformatic, molecular, and biochemical approaches, to address fundamental questions in reproductive biology, reproductive medicine, and translational research. The journal publishes research involving human and animal gametes, stem cells, developmental biology and toxicology, and clinical care in reproductive medicine. Specific areas of interest to the journal include male factor infertility and germ cell biology, reproductive technologies (gamete micro-manipulation and cryopreservation, in vitro fertilization/embryo transfer (IVF/ET)), and contraception. Research directed towards developing new or enhanced technologies for clinical medicine or scientific research in reproduction is of significant interest to the journal.