Predicting diabetic retinopathy based on routine laboratory tests by machine learning algorithms.

Impact factor: 2.8; Zone 3 (Medicine); JCR Q2 (Medicine, Research & Experimental)
Xiaohua Wan, Ruihuan Zhang, Yanan Wang, Wei Wei, Biao Song, Lin Zhang, Yanwei Hu
European Journal of Medical Research 30(1):183. Journal article, published 2025-03-18. DOI: 10.1186/s40001-025-02442-5 (https://doi.org/10.1186/s40001-025-02442-5). Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11921716/pdf/

Abstract

Objectives: This study aimed to identify risk factors for diabetic retinopathy (DR) and develop machine learning (ML)-based predictive models using routine laboratory data in patients with type 2 diabetes mellitus (T2DM).

Methods: Clinical data from 4259 T2DM inpatients at Beijing Tongren Hospital were analyzed and divided into a model construction data set (N = 3936) and an external validation data set (N = 323). Using 39 optimal variables, a prediction model was built with the eXtreme Gradient Boosting (XGBoost) algorithm and compared with four other algorithms: support vector machine (SVM), gradient boosting decision tree (GBDT), neural network (NN), and logistic regression (LR). The Shapley Additive exPlanation (SHAP) method was used to interpret the XGBoost model, and external validation was performed to assess model performance.
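
As a rough illustration of the idea behind SHAP, the sketch below computes exact Shapley values for a tiny hand-written scoring function. It is not the authors' pipeline and does not use the SHAP library: the function toy_risk, its three feature names, its coefficients, and the patient and reference values are all hypothetical, chosen only to show how a prediction is split into per-feature contributions relative to a baseline.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley attribution of predict(x) - predict(baseline) to each feature."""
    n = len(x)
    features = range(n)

    def value(subset):
        # Features in `subset` take the patient's values; the rest stay at baseline.
        mixed = [x[i] if i in subset else baseline[i] for i in features]
        return predict(mixed)

    phi = [0.0] * n
    for i in features:
        others = [j for j in features if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                # Marginal contribution of feature i when added to coalition s.
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_risk(v):
    # Hypothetical score over three made-up laboratory features (not from the paper).
    hba1c, creatinine, albumin = v
    return 0.5 * hba1c + 0.2 * creatinine - 0.1 * albumin + 0.05 * hba1c * creatinine


patient = [9.0, 1.4, 3.5]    # hypothetical patient values
reference = [6.0, 1.0, 4.0]  # hypothetical baseline values
phi = shapley_values(toy_risk, patient, reference)

for name, contribution in zip(["hba1c", "creatinine", "albumin"], phi):
    print(f"{name}: {contribution:+.3f}")
```

The contributions sum to the difference between the patient's score and the baseline score, which is the property that makes Shapley-based explanations additive. Exact enumeration grows exponentially with the number of features, so for a model with 39 variables an approximation or a tree-specific algorithm would be needed in practice.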

Results: DR was present in 47.69% (N = 1877) of T2DM patients in the model construction data set. Among the models tested, the XGBoost model performed best, with an AUC of 0.831, accuracy of 0.757, sensitivity of 0.754, specificity of 0.759, and F1-score of 0.752. SHAP analysis explained feature importance for the XGBoost model and identified key risk factors for DR. In external validation, the XGBoost model achieved an accuracy of 0.650.
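
For readers less familiar with the reported metrics, the following minimal sketch shows how accuracy, sensitivity, specificity, F1-score, and AUC are conventionally computed from labels and predicted scores. The labels, scores, and the 0.5 decision threshold are invented for illustration and are unrelated to the study data; the abstract does not state which threshold the authors used.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = DR, 0 = no DR)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def classification_metrics(tp, fp, tn, fn):
    """Standard threshold-based metrics derived from the confusion matrix."""
    sensitivity = tp / (tp + fn)  # recall on the positive class
    specificity = tn / (tn + fp)  # recall on the negative class
    precision = tp / (tp + fp)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": 2 * precision * sensitivity / (precision + sensitivity),
    }


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    # Ties between a positive and a negative score count as half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Invented toy data, not taken from the study.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.7, 0.2]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

metrics = classification_metrics(*confusion_counts(y_true, y_pred))
auc = roc_auc(y_true, scores)
print(metrics, auc)
```

Accuracy, sensitivity, specificity, and F1 depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is why the two kinds of figures are reported separately.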

Conclusions: The XGBoost-based prediction model effectively assesses DR risk in T2DM patients using routine laboratory data, aiding clinicians in identifying high-risk individuals and guiding personalized management strategies, especially in medically underserved areas.

Source journal
European Journal of Medical Research (Medicine: Research & Experimental)
CiteScore: 3.20
Self-citation rate: 0.00%
Articles published: 247
Review time: >12 weeks
About the journal: European Journal of Medical Research publishes translational and clinical research of international interest across all medical disciplines, enabling clinicians and other researchers to learn about developments and innovations within these disciplines and across the boundaries between disciplines. The journal publishes high-quality research and reviews and aims to ensure that the results of all well-conducted research are published, regardless of their outcome.