Application of machine learning algorithms to identify risk factors for depression in type 2 diabetes mellitus patients: A Taiwan diabetes registry study.
Yu-Wen Su, Wayne Huey-Herng Sheu, Chii-Min Hwu, Yu-Cheng Chen, Jung-Fu Chen, Yun-Shing Peng, Chien-Ning Huang, Yi-Jen Hung, Harn-Shen Chen
Journal of the Chinese Medical Association : JCMA, pages 513-519. Journal Article. Published 2025-07-01 (Epub 2025-05-30). DOI: 10.1097/JCMA.0000000000001250 (https://doi.org/10.1097/JCMA.0000000000001250)
Abstract
Background: We analyzed variables recorded during routine clinical practice in a registry database to identify risk factors for depression in people with type 2 diabetes mellitus.
Methods: A Patient Health Questionnaire (PHQ-9) score of 15 was selected as the cut-off for clinically meaningful depression. Missing data were handled by imputing the median value, by k-nearest neighbors imputation, or by removing the entire variable. Logistic regression, random forest, and decision tree machine learning models were used to determine which factors were most relevant to depression. The accuracy of each algorithm was evaluated on a testing set.
Results: When all variables were included in the logistic regression model, the area under the receiver operating characteristic curve was 0.81. In the random forest model, the most important factor was quality of life (QoL). After QoL-related variables were removed, bloating and autoimmune disease became the greatest contributing factors. Model accuracy was 83.1%. In the decision tree model, QoL was also the most decisive factor. After QoL variables were removed, bloating was the first node. Model accuracy was 82.5%.
Conclusion: QoL, bloating, and autoimmune disease were the most important factors associated with depression in type 2 diabetes mellitus patients.
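The preprocessing steps named in the Methods (a PHQ-9 cut-off of 15, median imputation, removal of a variable, and accuracy on a testing set) can be illustrated with a minimal Python sketch. This is not the authors' code. It assumes that a score at or above 15 counts as depression, and the `max_missing` threshold, the function names, the variable names, and the toy values are all hypothetical, since the abstract gives none of them. The k-nearest neighbors imputation and the three models are not shown.

```python
from statistics import median

PHQ9_CUTOFF = 15  # cut-off from the abstract; treating ">=" as the rule is an assumption


def label_depression(phq9_scores):
    """Return 1 for scores at or above the cut-off, else 0."""
    return [1 if s >= PHQ9_CUTOFF else 0 for s in phq9_scores]


def impute_median(column):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in column if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in column]


def prepare(table, max_missing=0.5):
    """Drop columns that are mostly missing; median-impute the rest.

    `max_missing` is a hypothetical threshold; the abstract does not state one.
    """
    cleaned = {}
    for name, column in table.items():
        missing = sum(v is None for v in column) / len(column)
        if missing > max_missing:
            continue  # remove the entire variable
        cleaned[name] = impute_median(column)
    return cleaned


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy, made-up values for illustration only (not registry data).
toy = {
    "hba1c": [7.1, None, 8.4, 6.9],
    "bmi": [None, None, None, 27.0],
}
features = prepare(toy)
labels = label_depression([4, 15, 22, 9])
```

In this toy table, `bmi` is dropped because three of its four values are missing, while the single gap in `hba1c` is filled with the median of the observed values. In practice, the fill value should be computed on the training set only and then reused on the testing set, so that no information leaks into the evaluation.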