Using machine learning to predict patients with polycystic ovary disease in Chinese women

IF 2.0 | Zone 4, Medicine | JCR Q2, Obstetrics & Gynecology
Chen-Yu Wang, Dee Pei, Chun-Kai Wang, Jyun-Cheng Ke, Siou-Ting Lee, Ta-Wei Chu, Yao-Jen Liang
Taiwanese Journal of Obstetrics & Gynecology, vol. 64, no. 1, pp. 68-75. Published 2025-01-01. Journal article. DOI: 10.1016/j.tjog.2024.09.019. URL: https://www.sciencedirect.com/science/article/pii/S1028455924002791

Abstract

Objective

With an estimated global frequency ranging from 5% to 21%, polycystic ovary syndrome (PCOS) is one of the most prevalent hormonal disorders. Many factors have been found to be related to PCOS. However, most of the studies that identified them used traditional methods such as multiple logistic regression (LR). Machine learning (Mach-L) has recently emerged as a new method that can be used in medical research. The present study had two goals: (1) to compare the accuracy of five alternative Mach-L techniques with that of conventional LR, and (2) to use Mach-L to predict PCOS and rank its risk factors.

Materials and methods

In total, 170 PCOS patients and 950 control participants were included. We collected information on demographics, biochemistry, and lifestyle. PCOS was diagnosed using the Rotterdam criteria. Five Mach-L algorithms were used: random forest (RF), stochastic gradient boosting (SGB), multivariate adaptive regression splines (MARS), extreme gradient boosting (XGBoost), and gradient boosting with categorical features support (CatBoost). Models with lower estimation errors were considered better.

Results

Using the t-test, we found that subjects with PCOS were younger and had higher glutamic oxaloacetic transaminase (GOT), glutamic pyruvic transaminase (GPT), γ-glutamyl transferase (γ-GT), and triglyceride (TG) levels, as well as higher educational levels. All five Mach-L methods had lower estimation errors than LR. The mean AUC of the Mach-L methods was 0.6669, higher than that of LR (0.5908). Finally, age, TG, GPT, white blood cell count (WBC), uric acid (UA), and platelet count (Plt) were the six most important risk factors selected by Mach-L.
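
As an illustration of the mean AUC reported above, the following minimal sketch computes an AUC per model and then averages across models. It is not the authors' code. The abstract does not give per-model AUCs or say how AUC was computed, so the pairwise (Mann-Whitney) formulation, the function name `auc`, the model names, and all labels and scores are assumptions made up for the example.

```python
from statistics import mean


def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney U).

    labels: 1 for a positive case (here, PCOS), 0 for a control.
    scores: predicted risk, higher meaning more likely positive.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    # A (positive, negative) pair counts 1 if the positive case scores
    # higher, and 0.5 if the two scores are tied.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data, NOT taken from the study.
labels = [1, 1, 1, 0, 0, 0, 0, 0]
toy_scores = {
    "model_a": [0.9, 0.6, 0.4, 0.5, 0.3, 0.2, 0.1, 0.7],
    "model_b": [0.8, 0.7, 0.2, 0.6, 0.3, 0.1, 0.4, 0.5],
}

# One AUC per model, then the mean across models.
per_model_auc = {name: auc(labels, s) for name, s in toy_scores.items()}
mean_auc = mean(per_model_auc.values())
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of cases from controls, which is the scale on which 0.6669 and 0.5908 should be read.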

Conclusion

Mach-L methods outperformed conventional LR. In this cohort of Chinese women, age was the most important factor, followed by TG, GPT, WBC, UA, and Plt.
Source journal
CiteScore: 3.60
Self-citation rate: 23.80%
Articles published: 207
Review time: 4-8 weeks
About the journal: Taiwanese Journal of Obstetrics and Gynecology is a peer-reviewed, open access journal publishing editorials, reviews, original articles, short communications, case reports, research letters, correspondence and letters to the editor in the field of obstetrics and gynecology. The aims of the journal are to: (1) publish cutting-edge, innovative and topical research that addresses screening, diagnosis, management and care in women's health; (2) deliver evidence-based information; (3) promote the sharing of clinical experience; (4) address women-related health promotion. The journal provides comprehensive coverage of topics in obstetrics & gynecology and women's health, including maternal-fetal medicine, reproductive endocrinology/infertility, and gynecologic oncology. Taiwan Association of Obstetrics and Gynecology.