SLB - SMOTE logistic blending hybrid machine learning model for chronic polycystic ovary syndrome prediction with correlated feature selection
S Vairachilai, Devarakonda Anuhya, Anjeleen Tirkey, S P Raja
Informatics for Health & Social Care, pp. 190-211. Published 2024-10-01 (Epub 2024-10-27). DOI: 10.1080/17538157.2024.2405868
Abstract
Objective: In this study, we aimed to develop a machine learning (ML) model for predicting Polycystic Ovary Syndrome (PCOS) based on demographic, clinical, and biochemical parameters.
Methodology: We obtained a dataset from Kaggle that includes age, body mass index, menstrual cycle length, follicle-stimulating hormone, hair growth, and other features. Using these data, we trained several traditional ML and ensemble algorithms to predict PCOS.
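The title names SMOTE (Synthetic Minority Over-sampling Technique) as part of the pipeline, but the abstract does not say how it was configured. The following is a minimal sketch of the standard SMOTE idea only: each synthetic row is an interpolation between a minority-class sample and one of its k nearest minority-class neighbours. The function name `smote`, the parameters `n_new`, `k`, and `seed`, and the toy rows are assumptions made for illustration, not the authors' code or data.

```python
import random

def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic rows by interpolating between a minority
    sample and one of its k nearest minority-class neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)

    def sq_dist(a, b):
        # Squared Euclidean distance between two feature rows.
        return sum((x - y) ** 2 for x, y in zip(a, b))

    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest neighbours of `base` among the other rows.
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: sq_dist(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbours)]
        gap = rng.random()  # position along the segment base -> other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic

# Made-up, already-scaled feature rows (hypothetical; not the Kaggle data).
minority = [[1.0, 2.0], [1.2, 1.9], [0.9, 2.2], [1.1, 2.1]]
new_rows = smote(minority, n_new=6, k=2)
```

Oversampling of this kind is normally applied to the training split only, so that synthetic rows never leak into the data used for evaluation.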
Results: Among the traditional ML algorithms, Logistic Regression performed best, with an accuracy of 0.91 and an AUC of 0.90. Among the ensemble algorithms, Blending outperformed the other methods, also achieving an accuracy of 0.91 and an AUC of 0.90, with precision and recall balanced at 0.88.
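For readers less familiar with the four reported metrics, the sketch below computes accuracy, precision, recall, and AUC from scratch on a small made-up example. The labels and scores are invented for illustration and do not reproduce the paper's numbers; the helper names are likewise assumptions. AUC is computed here as the fraction of (positive, negative) pairs in which the positive case receives the higher score, with ties counted as half.

```python
def confusion(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 is the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn

def accuracy(y_true, y_pred):
    tp, fp, fn, tn = confusion(y_true, y_pred)
    return (tp + tn) / (tp + fp + fn + tn)

def precision(y_true, y_pred):
    tp, fp, _, _ = confusion(y_true, y_pred)
    return tp / (tp + fp)

def recall(y_true, y_pred):
    tp, _, fn, _ = confusion(y_true, y_pred)
    return tp / (tp + fn)

def auc(y_true, scores):
    """Pairwise (Mann-Whitney) form of the area under the ROC curve."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Invented labels and model scores (hypothetical; not from the paper).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

print(accuracy(y_true, y_pred))   # 0.75
print(precision(y_true, y_pred))  # 0.75
print(recall(y_true, y_pred))     # 0.75
print(auc(y_true, scores))        # 0.9375
```

In this toy case there is one false positive and one false negative, which is why precision and recall coincide: the two are equal exactly when the false-positive and false-negative counts match.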
Significance of the research: These results indicate that Logistic Regression and the Blending algorithm were the strongest of the models evaluated for PCOS prediction, showing strong discriminative power and the ability to correctly classify PCOS cases.
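The abstract does not describe how the Blending ensemble was built, so the following is only a generic sketch of the technique: base models are fitted on one part of the training data, their predictions on a separate hold-out part become the input features of a meta-learner, and the meta-learner produces the final prediction. Everything here is an assumption for illustration: the synthetic data, the choice of one single-feature logistic model per base learner, the logistic meta-learner, and all names and hyperparameters.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(X, y, lr=0.5, epochs=500):
    """Batch gradient descent on log-loss; returns (weights, bias)."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            grad_w = [g + err * xj for g, xj in zip(grad_w, xi)]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict_proba(model, X):
    w, b = model
    return [sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) for xi in X]

def column(X, j):
    # Keep only feature j, as a one-column matrix.
    return [[row[j]] for row in X]

# Synthetic two-feature data (made up; not the Kaggle PCOS data).
rng = random.Random(42)
rows = [([rng.gauss(3.0 * label, 1.0), rng.gauss(3.0 * label, 1.0)], label)
        for label in [0, 1] * 150]
rng.shuffle(rows)
X = [r[0] for r in rows]
y = [r[1] for r in rows]

# Three disjoint parts: base-model training, blending hold-out, final test.
X_base, y_base = X[:150], y[:150]
X_hold, y_hold = X[150:225], y[150:225]
X_test, y_test = X[225:], y[225:]

# Base learners: one logistic model per feature, standing in for diverse models.
base_models = [fit_logistic(column(X_base, j), y_base) for j in range(2)]

def base_outputs(X_part):
    # One row per sample, one column per base model's predicted probability.
    per_model = [predict_proba(m, column(X_part, j))
                 for j, m in enumerate(base_models)]
    return [list(p) for p in zip(*per_model)]

# Meta-learner: fitted only on the base models' hold-out predictions.
meta_model = fit_logistic(base_outputs(X_hold), y_hold)

test_proba = predict_proba(meta_model, base_outputs(X_test))
test_pred = [1 if p >= 0.5 else 0 for p in test_proba]
test_accuracy = sum(p == t for p, t in zip(test_pred, y_test)) / len(y_test)
```

Fitting the meta-learner on hold-out predictions, rather than on predictions for the rows the base models were trained on, is what separates blending from simply re-using training outputs, and it keeps the meta-learner from rewarding base models that have merely memorised their training data.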