[Prediction of depression symptoms in seniors and analysis of influencing factors based on explainable machine learning].
Authors: W Y Su, S H Dong, H J Ge, Q Yu, G F Ma
Journal: 中华流行病学杂志 (Chinese Journal of Epidemiology), volume 46, issue 2, pages 316-324
Publication date: 2025-02-10
Publication type: Journal Article
DOI: 10.3760/cma.j.cn112338-20240809-00488 (https://doi.org/10.3760/cma.j.cn112338-20240809-00488)
Objective: To construct a machine learning model that predicts depressive symptoms in older adults and to analyze the key influencing factors using the SHapley Additive exPlanations (SHAP) method. Methods: A sample of 5,954 older adults was selected from the 2018 China Health and Retirement Longitudinal Study (CHARLS) database. Features were selected with three methods: Support Vector Machine Recursive Feature Elimination (SVM-RFE), Extreme Gradient Boosting Recursive Feature Elimination (XGBoost-RFE), and the Lasso algorithm. Each method was combined with five classifiers (logistic regression, decision tree, random forest, support vector machine, and XGBoost) to compare classification performance for depressive symptoms in older adults. Finally, the model with the highest area under the receiver operating characteristic curve (AUC) was interpreted with the SHAP method. Results: Across the 15 prediction models, accuracy ranged from 0.702 to 0.743, AUC from 0.730 to 0.795, sensitivity from 0.546 to 0.588, and specificity from 0.783 to 0.865. The XGBoost-RFE-XGBoost model had the highest AUC. Based on SHAP values, the top four factors influencing depressive symptoms in older adults were life satisfaction, duration of nighttime sleep, disability status, and self-rated health. Conclusion: This study developed a highly efficient and interpretable risk prediction model for depressive symptoms in older adults, which could help identify high-risk individuals and support personalized interventions.
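To make the reported quantities concrete, the following is a minimal sketch, not the authors' code. It enumerates the 15 models as three feature-selection methods crossed with five classifiers (a pairing inferred from the abstract), and computes accuracy, sensitivity, specificity, and AUC. The labels, scores, and the 0.5 decision threshold are hypothetical toy values, and the function names `confusion_metrics` and `roc_auc` are illustrative only.

```python
from itertools import product

# Method names follow the abstract; crossing them into 3 x 5 = 15
# pipelines is an inference, not something the abstract spells out.
SELECTORS = ["SVM-RFE", "XGBoost-RFE", "Lasso"]
CLASSIFIERS = ["logistic regression", "decision tree", "random forest",
               "support vector machine", "XGBoost"]
PIPELINES = list(product(SELECTORS, CLASSIFIERS))


def confusion_metrics(y_true, y_pred):
    """Return (accuracy, sensitivity, specificity) for binary 0/1 labels.

    Assumes both classes occur in y_true, so no denominator is zero.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    sensitivity = tp / (tp + fn)   # share of true cases that are detected
    specificity = tn / (tn + fp)   # share of non-cases correctly cleared
    return accuracy, sensitivity, specificity


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random
    negative, with ties counted as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = depressive symptoms, 0 = none.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.3, 0.7, 0.4, 0.2, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

accuracy, sensitivity, specificity = confusion_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(len(PIPELINES), accuracy, sensitivity, specificity, auc)
```

On this toy data, specificity is higher than sensitivity, the same direction as in the reported results (specificity 0.783 to 0.865, sensitivity 0.546 to 0.588). A pattern like this means a model clears non-cases more reliably than it detects true cases.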
About the journal:
Chinese Journal of Epidemiology, established in 1981, is an advanced academic periodical in epidemiology and related disciplines in China. Following the principle of integrating theory with practice, it mainly reports major progress in epidemiological research. Its columns include commentary, expert forum, original article, field investigation, disease surveillance, laboratory research, clinical epidemiology, basic theory or method, and review, among others.
The journal is indexed in more than ten major biomedical databases and index systems worldwide, including Scopus, PubMed/MEDLINE, PubMed Central (PMC), Europe PubMed Central, Embase, Chemical Abstracts, Chinese Science and Technology Paper and Citation Database (CSTPCD), Chinese core journal essentials overview, Chinese Science Citation Database (CSCD) core database, Chinese Biological Medical Disc (CBMdisc), and Chinese Medical Citation Index (CMCI). It is one of the core academic journals and carefully selected core journals in preventive and basic medicine in China.