Title: Prediction of polycystic ovary syndrome using machine learning with SFS and Boruta feature selection: an explainable AI approach
Authors: Monali Ramteke, Shital Raut
Journal: Systems Biology in Reproductive Medicine, vol. 71, no. 1, pp. 439-460
DOI: https://doi.org/10.1080/19396368.2025.2560839
Published: 2025-12-01 (Epub 2025-09-21)
Publication type: Journal Article
Impact factor: 2.2; JCR: Q3 (Andrology); Category: Medicine (region 4)
Open access: No
Citations: 0
Abstract
Polycystic Ovary Syndrome (PCOS) is a complex endocrine disorder that affects many women of reproductive age and is characterized by a variety of clinical and biochemical features. Accurate classification and diagnosis of PCOS remain challenging because its manifestations are heterogeneous. This study introduces a robust machine learning framework that combines a voting ensemble model with two distinct feature selection techniques, Sequential Forward Selection (SFS) and Boruta, to improve the accuracy of PCOS classification. We also used Explainable Artificial Intelligence (XAI) techniques, such as SHapley Additive exPlanations (SHAP), Local Interpretable Model-agnostic Explanations (LIME), Partial Dependence Plots (PDP), AnchorTabular, and Permutation Importance, to interpret the ensemble model. These methods provide insight into which features matter most for predicting PCOS. The results show that the proposed ensemble model achieved its best performance when paired with feature selection. Specifically, the voting ensemble classifier trained on the features selected by SFS achieved the highest accuracy among all models. This method can aid PCOS diagnosis and support early intervention.
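The two core ideas named in the abstract, sequential forward selection and hard voting, can be illustrated with a minimal pure-Python sketch. Everything concrete here is an assumption made for illustration: the toy data, the nearest-centroid scorer, and the function names are hypothetical, and the abstract does not state the study's actual base learners, dataset, or hyperparameters. Boruta and the XAI methods are not shown.

```python
from collections import Counter

def nearest_centroid_accuracy(X, y, features):
    """Training accuracy of a nearest-centroid classifier restricted to `features`.
    A stand-in scorer for illustration; not the model used in the study."""
    centroids = {}
    for label in set(y):
        rows = [[row[f] for f in features] for row, t in zip(X, y) if t == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]

    def predict(row):
        point = [row[f] for f in features]
        # Pick the class whose centroid is closest in squared Euclidean distance.
        return min(centroids,
                   key=lambda c: sum((p - q) ** 2 for p, q in zip(point, centroids[c])))

    return sum(predict(row) == t for row, t in zip(X, y)) / len(y)

def sequential_forward_selection(n_features, k, score):
    """Greedy SFS: start from an empty set and repeatedly add the single
    feature whose inclusion gives the highest score, until k are chosen."""
    selected, remaining = [], list(range(n_features))
    while len(selected) < k and remaining:
        best = max(remaining, key=lambda f: score(selected + [f]))
        selected.append(best)
        remaining.remove(best)
    return selected

def majority_vote(predictions):
    """Hard voting: `predictions` holds one list of labels per model;
    the most common label is returned for each sample."""
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]

# Hypothetical toy data: feature 0 separates the classes, feature 1 is noisy.
X = [[0.0, 5.0, 0.2], [0.1, 1.0, 0.9], [0.2, 4.0, 0.1],
     [0.9, 1.5, 0.8], [1.0, 4.5, 0.3], [1.1, 0.5, 0.7]]
y = [0, 0, 0, 1, 1, 1]

selected = sequential_forward_selection(
    n_features=3, k=2, score=lambda feats: nearest_centroid_accuracy(X, y, feats))
print(selected)  # [0, 2] on this toy data

# Three hypothetical base models voting on three samples.
print(majority_vote([[0, 1, 1], [0, 0, 1], [1, 0, 1]]))  # [0, 0, 1]
```

In practice the score passed to the selector would normally be a cross-validated metric rather than training accuracy, so that the selected features are not tuned to the training set.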
About the journal:
Systems Biology in Reproductive Medicine (SBiRM) publishes Research Articles, Communications, Application Notes that include protocols, a Clinical Corner that includes case reports, Review Articles, Hypotheses, and Letters to the Editor on human and animal reproduction. The journal highlights the use of systems approaches, including genomic, cellular, proteomic, metabolomic, bioinformatic, molecular, and biochemical approaches, to address fundamental questions in reproductive biology, reproductive medicine, and translational research. It publishes research involving human and animal gametes, stem cells, developmental biology and toxicology, and clinical care in reproductive medicine. Specific areas of interest include male factor infertility and germ cell biology, reproductive technologies (gamete micro-manipulation and cryopreservation, in vitro fertilization/embryo transfer (IVF/ET)), and contraception. Research directed towards developing new or enhanced technologies for clinical medicine or scientific research in reproduction is of significant interest to the journal.