Diabetic peripheral neuropathy detection of type 2 diabetes using machine learning from TCM features: a cross-sectional study.

IF 3.3 · CAS Zone 3 (Medicine) · JCR Q2 (Medical Informatics)
Zhikui Tian, JiZhong Zhang, Yadong Fan, Xuan Sun, Dongjun Wang, XiaoFei Liu, GuoHui Lu, Hongwu Wang
BMC Medical Informatics and Decision Making, 25(1):90. Published 2025-02-18. Journal Article.
DOI: 10.1186/s12911-025-02932-w (https://doi.org/10.1186/s12911-025-02932-w)
Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11837659/pdf/
Citations: 0

Diabetic peripheral neuropathy detection of type 2 diabetes using machine learning from TCM features: a cross-sectional study.

Aims: Diabetic peripheral neuropathy (DPN) is the most common complication of diabetes mellitus. Early identification of individuals at high risk of DPN is essential for successful early intervention. Tongue diagnosis, one of the four diagnostic methods of traditional Chinese medicine (TCM), lacks dedicated algorithms for analyzing TCM symptoms and tongue features. This study aims to develop machine learning (ML) models based on TCM features to predict the risk of DPN in patients with type 2 diabetes mellitus (T2DM).

Methods: A total of 4723 patients were included in the analysis (4430 with T2DM and 293 with DPN). TFDA-1 was used to obtain tongue images during a questionnaire survey. A LASSO (least absolute shrinkage and selection operator) logistic regression model with fivefold cross-validation was used to select imaging features, which were then screened using best subset selection. The synthetic minority oversampling technique (SMOTE) algorithm was applied to address the class imbalance and eliminate possible bias. The area under the receiver operating characteristic curve (AUC) was used to evaluate model performance. Four ML algorithms, namely logistic regression (LR), random forest (RF), support vector classifier (SVC), and light gradient boosting machine (LGBM), were used to build predictive models for DPN. The importance of covariates in DPN was ranked using the better-performing classifiers.
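
To make the oversampling step concrete, the following is a minimal pure-Python sketch of the core SMOTE idea: each synthetic minority sample is placed at a random point on the segment between a minority sample and one of its nearest minority neighbours. It is an illustration under stated assumptions, not the authors' implementation; the function name `smote`, its parameters, and the toy data are all hypothetical.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic rows interpolated between minority neighbours.

    Simplified illustration of SMOTE; `minority` is a list of numeric rows.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other rows
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority rows by Euclidean distance to the base row.
        others = [row for j, row in enumerate(minority) if j != i]
        others.sort(key=lambda row: math.dist(base, row))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy, made-up data (not from the study): 3 minority rows, 2 features each.
minority_rows = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.2]]
new_rows = smote(minority_rows, n_new=4, k=2)
```

In practice, oversampling is usually applied only to the training folds so that evaluation data stay untouched; the abstract does not state how this was handled in the study.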

Results: The RF model performed the best, with an accuracy of 0.767, precision of 0.718, recall of 0.874, F1 score of 0.789, and AUC of 0.77. With a value of 0.879, the LGBM model appeared to be the best regarding recall. Age, sweating, dark red tongue, insomnia, and smoking were the five most significant RF features. Age, yellow coating, loose teeth, smoking, and insomnia were the five most significant features of the LGBM model.
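
The metrics reported above follow standard definitions, sketched below in plain Python. The helper names (`classification_metrics`, `f1_from`, `auc`) and the toy inputs are hypothetical and only illustrate how such figures are computed; the study's confusion matrix is not given in the abstract.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall, and F1 from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "f1": f1_from(precision, recall),
    }


def f1_from(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def auc(labels, scores):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# F1 recomputed from the reported RF precision and recall: about 0.788,
# which agrees with the reported 0.789 up to rounding of the inputs.
rf_f1 = f1_from(0.718, 0.874)

# Toy, made-up examples (not study data).
toy_metrics = classification_metrics(tp=8, fp=2, fn=2, tn=8)
toy_auc = auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```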

Conclusions: This cross-sectional study demonstrates that the RF and LGBM models can screen for high-risk DPN in T2DM patients using TCM symptoms and tongue features. The identified key TCM-related features, such as age, tongue coating, and other symptoms, may be advantageous in developing preventative measures for T2DM patients.

Source journal
CiteScore: 7.20
Self-citation rate: 5.70%
Articles published: 297
Review time: 1 month
Journal description: BMC Medical Informatics and Decision Making is an open access journal publishing original peer-reviewed research articles in relation to the design, development, implementation, use, and evaluation of health information technologies and decision-making for human health.