Type 2 diabetes risk prediction using glycemic control Metrics: A machine learning approach

Impact factor: 1.8 · JCR Q3 (Endocrinology & Metabolism)
Radwan Qasrawi, Suliman Thwib, Ghada Issa, Razan Abu Ghoush, Malak Amro
Human Nutrition and Metabolism, volume 42, Article 200341. Published 2025-08-14.
DOI: 10.1016/j.hnm.2025.200341
https://www.sciencedirect.com/science/article/pii/S2666149725000453
Citations: 0

Abstract

Background

Type 2 Diabetes Mellitus (T2DM) remains a significant global health burden, particularly in low- and middle-income settings. Conventional prevention strategies often lack personalization, overlooking individual variability in lifestyle, nutrition, and health status. This study aimed to develop a personalized T2DM risk prediction model using machine learning (ML), integrating clinical, behavioral, and dietary data, including glycemic index (GI) and glycemic load (GL) derived from actual food and recipe intake.
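
As an illustration of how GL can be derived from recorded intake, the sketch below applies the standard definition GL = GI × available carbohydrate (g) / 100 per portion and sums it over a day. The abstract does not state how the authors aggregated GI and GL, so the carbohydrate-weighted dietary GI, the `FoodItem` structure, and the food values here are assumptions for illustration only.

```python
from dataclasses import dataclass


@dataclass
class FoodItem:
    name: str
    gi: float                # glycemic index on the glucose = 100 scale
    available_carb_g: float  # available carbohydrate in the portion eaten, grams


def glycemic_load(item: FoodItem) -> float:
    """GL of one portion: GI x available carbohydrate (g) / 100."""
    return item.gi * item.available_carb_g / 100.0


def daily_glycemic_load(items: list[FoodItem]) -> float:
    """Daily GL is the sum of the per-portion GL values."""
    return sum(glycemic_load(i) for i in items)


def weighted_dietary_gi(items: list[FoodItem]) -> float:
    """Carbohydrate-weighted mean GI of the recorded intake."""
    total_carb = sum(i.available_carb_g for i in items)
    if total_carb == 0:
        raise ValueError("no carbohydrate recorded")
    return 100.0 * daily_glycemic_load(items) / total_carb


# Hypothetical intake record; the GI and carbohydrate values are made up.
day = [
    FoodItem("white bread", gi=75, available_carb_g=30),
    FoodItem("lentil soup", gi=40, available_carb_g=20),
]
print(daily_glycemic_load(day), weighted_dietary_gi(day))
```

For a recipe, the same functions could be applied to each ingredient's contribution to the portion actually eaten.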

Methods

Data from 3145 Palestinian adults (aged 18–60) were analyzed using statistical and ML techniques. Variables included age, sex, education, income, physical activity, smoking status, perceived health, and detailed nutritional intake, specifically GI and GL. Nine ML models were developed using the AutoGluon-Tabular framework. Model performance was assessed via accuracy, area under the curve (AUC), and log loss. Feature importance analysis identified key predictors of T2DM risk.
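
The three evaluation metrics named above can be sketched in plain Python for a binary outcome. This is a minimal illustration of the standard definitions, not the authors' evaluation code; the 0.5 decision threshold, the function names, and the toy labels and probabilities are assumptions.

```python
import math


def accuracy(y_true, y_prob, threshold=0.5):
    """Fraction of cases where the thresholded prediction matches the label."""
    hits = sum((p >= threshold) == bool(y) for y, p in zip(y_true, y_prob))
    return hits / len(y_true)


def log_loss(y_true, y_prob, eps=1e-15):
    """Mean negative log-likelihood of the true labels (lower is better)."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)


def roc_auc(y_true, y_prob):
    """Probability that a random positive is scored above a random negative.

    Ties count as half; this pairwise form costs O(n_pos * n_neg).
    """
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum((a > b) + 0.5 * (a == b) for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


# Toy labels (1 = diabetic) and predicted probabilities, invented for illustration.
y_true = [1, 0, 1, 0]
y_prob = [0.9, 0.2, 0.4, 0.6]
print(accuracy(y_true, y_prob), roc_auc(y_true, y_prob), log_loss(y_true, y_prob))
```

Accuracy depends on the chosen threshold, whereas AUC and log loss use the predicted probabilities directly, which is one reason to report all three together.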

Results

Women had significantly higher odds of diabetes than men, while rural residents had a lower risk than urban dwellers. People aged 50–59 were over six times more likely to be diabetic than those aged 18–29. Lower education and poor perceived health were also strong predictors. Diabetic participants consumed diets with significantly lower GI (87.7 ± 36.1) and GL (241 ± 180.5) than non-diabetic participants (GI = 98.8 ± 35.5; GL = 303.3 ± 202.7; p = 0.001). Among the ML models, XGBoost and CatBoost performed best, with over 93% accuracy and excellent prediction scores. Glycemic load, age, BMI, waist-to-hip ratio, and self-reported health status were the most important risk indicators.
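
Group comparisons such as "over six times more likely" are commonly expressed as odds ratios. The sketch below computes an unadjusted odds ratio with a Wald (Woolf) 95% confidence interval from a 2×2 table. The abstract does not say whether the reported estimates were adjusted for covariates, and the counts used here are hypothetical, not study data.

```python
import math


def odds_ratio(exposed_cases, exposed_noncases, ref_cases, ref_noncases, z=1.96):
    """Odds ratio of exposed vs. reference group with a Wald (Woolf) CI."""
    cells = (exposed_cases, exposed_noncases, ref_cases, ref_noncases)
    if min(cells) <= 0:
        raise ValueError("all four cells must be positive")
    or_ = (exposed_cases * ref_noncases) / (exposed_noncases * ref_cases)
    # Standard error of log(OR) is the root of the summed reciprocal counts.
    se_log = math.sqrt(sum(1.0 / c for c in cells))
    lower = math.exp(math.log(or_) - z * se_log)
    upper = math.exp(math.log(or_) + z * se_log)
    return or_, lower, upper


# Hypothetical counts, NOT taken from the study:
# "exposed" = older age group, "ref" = youngest age group.
or_, lo, hi = odds_ratio(exposed_cases=30, exposed_noncases=70,
                         ref_cases=10, ref_noncases=140)
print(f"OR = {or_:.2f}, 95% CI {lo:.2f} to {hi:.2f}")
```

An adjusted estimate would instead come from a logistic regression that includes the other covariates.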

Conclusion

This study showed the effectiveness of integrating machine learning with glycemic control metrics and lifestyle data for personalized T2DM prediction. Incorporating glycemic values from real food and recipe intake improved model accuracy and interpretability. These findings support the development of precision prevention strategies tailored to individual risk profiles, particularly in underserved populations.
Source journal: Human Nutrition and Metabolism (Agricultural and Biological Sciences: Food Science)
CiteScore: 1.50
Self-citation rate: 0.00%
Articles published: 30
Review time: 188 days