Predicting the risk of depression in older adults with disability using machine learning: an analysis based on CHARLS data

Impact factor (IF): 4.7; JCR quartile: Q2; category: Computer Science, Artificial Intelligence

Frontiers in Artificial Intelligence. Pub Date: 2025-07-02. eCollection Date: 2025-01-01. DOI: 10.3389/frai.2025.1624171

Tongtong Jin, Ayitijiang Halili

Volume 8, article 1624171. Publication type: Journal Article. Full-text PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12263909/pdf/

Citations: 0

Abstract



Predicting the risk of depression in older adults with disability using machine learning: an analysis based on CHARLS data.

Background: The advancement of artificial intelligence technologies has opened new avenues for depression prevention and management in older adults with disability (defined by basic or instrumental activities of daily living, BADL/IADL). This study systematically developed machine learning (ML) models to predict depression risk in disabled elderly individuals using longitudinal data from the China Health and Retirement Longitudinal Study (CHARLS), providing a potentially generalizable tool for early screening.

Methods: This study used longitudinal data from the CHARLS 2011-2015 cohort. A three-stage serial consensus feature selection framework (LASSO, Elastic Net, and Boruta) was employed to identify 21 robust predictors from 74 candidate variables. Ten ML algorithms were evaluated: logistic regression (LR), histogram-based gradient boosting (HistGBM), multilayer perceptron (MLP), XGBoost, bagging, decision tree (DT), LightGBM, random forest (RF), support vector machine (SVM), and CatBoost. Temporal external validation was performed using an independent 2018-2020 cohort to assess model generalizability. Performance was evaluated using accuracy, AUC, F1-score, precision, and recall. The SHAP framework was employed to interpret feature contributions.
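The five evaluation metrics named in the Methods can be computed directly from true labels and predicted scores. The sketch below is a minimal, dependency-free Python illustration and not the authors' code: the function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for this example, and the abstract does not state how the study thresholded its predictions.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 for binary labels coded 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)

    # Guard against empty denominators (e.g. no positive predictions).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    accuracy = (tp + tn) / len(pairs)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive case is scored above
    a random negative case, counting ties as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy data invented for illustration only; not CHARLS data.
    toy_labels = [1, 1, 1, 0, 0, 0, 1, 0]
    toy_scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2, 0.7, 0.5]
    threshold = 0.5  # assumed cut-off, not taken from the paper
    toy_preds = [1 if s >= threshold else 0 for s in toy_scores]

    print(classification_metrics(toy_labels, toy_preds))
    print("AUC:", roc_auc(toy_labels, toy_scores))
```

AUC is computed from the raw scores and so does not depend on the threshold, whereas accuracy, precision, recall, and F1 all change when the cut-off moves.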

Results: The HistGBM model achieved the best overall performance on the testing sets (AUC = 0.779, F1-score = 0.735, accuracy = 0.713), with only an 8.5% AUC difference between training and testing sets and a 10% difference between external validation and testing sets, indicating temporal stability. SHAP interpretability analysis revealed that sleep time (mean SHAP value = 0.344) in the health behavior domain, together with life satisfaction (0.339) and episodic memory (0.220) in the subjective perception domain, contributed more to prediction than traditional biomedical indicators.
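To make the "mean SHAP value" figures concrete, the following sketch computes exact Shapley values for a tiny model by enumerating feature coalitions, then averages their absolute values across rows to get a global importance per feature. It is a sketch under stated assumptions, not the study's pipeline: the abstract does not say whether its reported values are mean absolute SHAP values (assumed here), the study presumably used the SHAP library rather than brute-force enumeration, and the model, weights, and data below are invented for illustration.

```python
from itertools import combinations
from math import factorial
from statistics import fmean
from typing import Callable, List, Sequence, Set

Model = Callable[[List[float]], float]


def coalition_value(model: Model, x: Sequence[float],
                    background: Sequence[Sequence[float]], subset: Set[int]) -> float:
    """Average model output when features in `subset` are fixed to x and the
    remaining features are taken from each background row in turn."""
    outputs = []
    for b in background:
        z = [x[j] if j in subset else b[j] for j in range(len(x))]
        outputs.append(model(z))
    return fmean(outputs)


def shapley_values(model: Model, x: Sequence[float],
                   background: Sequence[Sequence[float]]) -> List[float]:
    """Exact Shapley values for one row; cost grows as 2**n, so toy sizes only."""
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            # Shapley weight for coalitions of size k that exclude feature i.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for combo in combinations(others, k):
                s = set(combo)
                gain = (coalition_value(model, x, background, s | {i})
                        - coalition_value(model, x, background, s))
                phi[i] += weight * gain
    return phi


def mean_abs_shap(model: Model, rows: Sequence[Sequence[float]],
                  background: Sequence[Sequence[float]]) -> List[float]:
    """Global importance: mean absolute Shapley value of each feature over rows."""
    per_row = [shapley_values(model, x, background) for x in rows]
    return [fmean(abs(p[i]) for p in per_row) for i in range(len(rows[0]))]


def toy_model(z: List[float]) -> float:
    # Hypothetical linear score with made-up weights; not a fitted model.
    return 0.5 * z[0] - 0.3 * z[1] + 0.1 * z[2]


if __name__ == "__main__":
    toy_rows = [[1.0, 2.0, 3.0], [3.0, 0.0, 1.0], [2.0, 4.0, 5.0]]  # invented data
    print(mean_abs_shap(toy_model, toy_rows, background=toy_rows))
```

Ranking features by this average is what allows statements such as one predictor contributing more than another, although the magnitudes depend on the model's output scale and on the background data chosen.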

Conclusion: This study developed an AI-based tool for depression risk assessment in older adults with disability through a multi-stage feature selection process and a temporal external validation framework. These findings provide a practical screening instrument and a methodological reference for implementing AI technologies in geriatric mental health applications, thereby facilitating clinical translation of predictive analytics in this field.
