Machine-learning algorithms in screening for type 2 diabetes mellitus: Data from Fasa Adults Cohort Study

Impact factor: 2.7; JCR Q3; Endocrinology & Metabolism
Hanieh Karmand, Aref Andishgar, Reza Tabrizi, Alireza Sadeghi, Babak Pezeshki, Mahdi Ravankhah, Erfan Taherifard, Fariba Ahmadizar
Endocrinology, Diabetes and Metabolism. Published 2024-02-27. DOI: 10.1002/edm2.472
Publisher page: https://onlinelibrary.wiley.com/doi/10.1002/edm2.472
Citations: 0

Abstract

Machine-learning algorithms in screening for type 2 diabetes mellitus: Data from Fasa Adults Cohort Study

Introduction

Machine learning (ML) is increasingly applied in the biomedical sciences. This study aimed to evaluate factors associated with type 2 diabetes mellitus (T2DM) and to compare the performance of ML methods in identifying individuals with the disease in an Iranian setting.

Methods

Using baseline data from the Fasa Adults Cohort Study (FACS), we studied factors associated with T2DM in a sex-stratified manner by applying seven ML methods: Logistic Regression (LR), Support Vector Machine (SVM), Random Forest (RF), K-Nearest Neighbours (KNN), Gradient Boosting Machine (GBM), Extreme Gradient Boosting (XGB), and Bagging classifier (BAG). We then compared the performance of these methods by calculating accuracy, precision, sensitivity, specificity, F1 score, and area under the curve (AUC) for each algorithm.
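The six performance measures named above can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration, and AUC is computed here through its rank interpretation (the probability that a randomly chosen case scores higher than a randomly chosen non-case).

```python
from dataclasses import dataclass


@dataclass
class ScreeningMetrics:
    accuracy: float
    precision: float
    sensitivity: float
    specificity: float
    f1: float


def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels (1 = T2DM, 0 = no T2DM)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def screening_metrics(y_true, y_pred):
    """Derive the threshold-dependent metrics from the confusion matrix."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if tp + fp else 0.0
    sensitivity = tp / (tp + fn) if tp + fn else 0.0  # also called recall
    specificity = tn / (tn + fp) if tn + fp else 0.0
    f1 = (2 * precision * sensitivity / (precision + sensitivity)
          if precision + sensitivity else 0.0)
    accuracy = (tp + tn) / len(y_true)
    return ScreeningMetrics(accuracy, precision, sensitivity, specificity, f1)


def auc(y_true, scores):
    """AUC as P(score of a positive > score of a negative); ties count 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up toy data (not from FACS): 3 cases and 5 non-cases.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.4, 0.35, 0.6, 0.3, 0.2, 0.1, 0.05]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

m = screening_metrics(y_true, y_pred)
toy_auc = auc(y_true, scores)
print(m, toy_auc)
```

In this sketch AUC depends only on how the scores rank cases against non-cases, whereas the other five measures change with the chosen threshold. This is one reason a model can have a moderate AUC and still show low sensitivity at a particular cut-off.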

Results

A total of 10,112 participants were recruited between 2014 and 2016, of whom 1,246 had T2DM at baseline. Participants were aged between 35 and 70 years, and 4,566 (45%) were male. In the GBM model, the most heavily weighted variables for T2DM screening were, in order, age, sugar consumption, and history of hospitalization for males, and sugar consumption, blood in urine, and age for females. GBM outperformed the other models in both sexes, with an AUC of 0.75 (0.69–0.82) in males and 0.76 (0.71–0.80) in females, and an F1 score of 0.33 (0.27–0.39) and 0.42 (0.38–0.46), respectively. GBM also showed a sensitivity of 0.24 (0.19–0.29) and a specificity of 0.98 (0.96–1.00) in males, and a sensitivity of 0.38 (0.34–0.42) and a specificity of 0.92 (0.89–0.95) in females. Notably, the other ML models performed similarly.
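The abstract reports F1 and sensitivity for GBM but not precision. Because F1 is the harmonic mean of precision and sensitivity, a rough implied precision can be backed out from the reported point estimates. The sketch below is a back-of-envelope calculation, not a result reported by the authors: the function name is hypothetical, and since the inputs are rounded to two decimals the outputs are only approximate.

```python
def implied_precision(f1, sensitivity):
    """Solve F1 = 2*P*R / (P + R) for P, given F1 and R (sensitivity)."""
    denom = 2 * sensitivity - f1
    if denom <= 0:
        raise ValueError("F1 must be smaller than twice the sensitivity")
    return f1 * sensitivity / denom


# Point estimates for GBM as reported in the abstract: (F1, sensitivity).
reported = {"males": (0.33, 0.24), "females": (0.42, 0.38)}

implied = {sex: implied_precision(f1, sens)
           for sex, (f1, sens) in reported.items()}

# Overall baseline prevalence of T2DM from the reported counts.
prevalence = 1246 / 10112

for sex, p in implied.items():
    print(f"{sex}: implied precision ~ {p:.2f}")
print(f"baseline prevalence ~ {prevalence:.3f}")
```

Under these assumptions, the implied precision is about 0.53 for males and about 0.47 for females, against an overall baseline prevalence of roughly 12%. Small rounding differences in the inputs shift these figures noticeably, so they should be read as indicative only.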

Conclusions

The GBM model might achieve better performance than the other models in screening for T2DM in a south Iranian population.

Source journal: Endocrinology, Diabetes and Metabolism (Medicine: Endocrinology, Diabetes and Metabolism)
CiteScore: 5.00; self-citation rate: 0.00%; articles published: 66; review time: 6 weeks