Development and validation of a novel predictive model for dementia risk in middle-aged and elderly depression individuals: a large and longitudinal machine learning cohort study.

IF 7.9 | Zone 1, Medicine | JCR Q1, Clinical Neurology
Xuan Xiao, Yihui Li, Qiaoboyang Wu, Xinting Liu, Xu Cao, Maiping Li, Jianjing Liu, Lianggeng Gong, Xi-Jian Dai
Alzheimer's Research & Therapy, vol. 17, no. 1, article 103. Published 2025-05-13.
DOI: https://doi.org/10.1186/s13195-025-01750-6
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12070709/pdf/
Citations: 0

Abstract

Background: Depression serves as a prodromal symptom of dementia, and individuals with depression exhibit a significantly higher risk of developing dementia. The aim of this study is to develop and validate a novel dementia risk prediction tool among middle-aged and elderly individuals with depression based on machine learning algorithms.

Methods: This study included 31,587 middle-aged and elderly individuals with depression, drawn from a large UK population-based prospective cohort, who did not have a diagnosis of dementia at baseline. A rigorous variable selection strategy was employed to identify risk and protective factors for dementia from an initial pool of 190 candidate variables, ultimately retaining 27 variables. Eight distinct data analysis strategies were used to develop and validate the dementia risk prediction model. DeLong's test was applied to assess whether differences in performance between models were statistically significant.

Results: During a median follow-up of 7.98 years, 896 incident dementia cases were identified among study participants. In model development employing an 8:2 data split (fivefold cross-validation for training), the AdaBoost classifier achieved the best performance (AUC 0.861 ± 0.003), followed by the XGBoost (AUC 0.839 ± 0.005) and CatBoost (AUC 0.828 ± 0.007) classifiers. To facilitate community generalization and clinical applicability, we developed a simplified model through a forward feature subset selection algorithm, retaining 12 variables. The simplified model maintained robust performance, with AdaBoost achieving the highest discriminative ability (AUC 0.859 ± 0.002), followed by XGBoost (AUC 0.835 ± 0.001) and CatBoost (AUC 0.821 ± 0.005). DeLong's test revealed no statistically significant difference in AUC values between the models using 12 and 27 variables (p = 0.278). For practical implementation, we deployed the optimal model to a web application for visualization and dementia risk assessment, named DRP-Depression.
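
The workflow described in the Results (an 8:2 split, fivefold cross-validation on the training portion, and forward feature subset selection to obtain a smaller model) can be sketched as follows. This is only an illustration under stated assumptions, not the authors' pipeline: it uses scikit-learn, synthetic data from `make_classification` in place of the cohort, and arbitrary hyperparameters and feature counts (10 candidate features reduced to 4, rather than 27 reduced to 12).

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold, cross_val_score, train_test_split

# Synthetic, imbalanced stand-in data (assumption: NOT the study cohort).
X, y = make_classification(
    n_samples=600, n_features=10, n_informative=4,
    weights=[0.9, 0.1], random_state=0,
)

# 8:2 split, stratified so both parts keep the same event rate.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)


def make_model():
    # Hyperparameters are illustrative assumptions, not taken from the paper.
    return AdaBoostClassifier(n_estimators=25, random_state=0)


# Fivefold cross-validated AUC on the training portion only.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
cv_auc = cross_val_score(make_model(), X_train, y_train, cv=cv, scoring="roc_auc")

# Forward selection: start from no features and greedily add the one that
# most improves cross-validated AUC until 4 features are kept.
selector = SequentialFeatureSelector(
    make_model(), n_features_to_select=4, direction="forward",
    scoring="roc_auc", cv=3,
)
selector.fit(X_train, y_train)
selected = np.flatnonzero(selector.get_support())

# Compare the full and simplified models on the held-out 20%.
full_model = make_model().fit(X_train, y_train)
small_model = make_model().fit(X_train[:, selected], y_train)
auc_full = roc_auc_score(y_test, full_model.predict_proba(X_test)[:, 1])
auc_small = roc_auc_score(
    y_test, small_model.predict_proba(X_test[:, selected])[:, 1]
)

print(f"CV AUC: {cv_auc.mean():.3f} ± {cv_auc.std():.3f}")
print(f"Selected feature indices: {selected.tolist()}")
print(f"Held-out AUC, all features: {auc_full:.3f}; selected features: {auc_small:.3f}")
```

Feature selection is fitted on the training portion only, so the held-out 20% stays untouched until the final comparison.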

Conclusions: We developed a practical and easy-to-promote risk prediction model based on machine learning algorithms and deployed it to a web application, providing a new and convenient tool for dementia risk prediction in middle-aged and elderly individuals with depression.
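
The abstract reports a DeLong's test comparing the AUCs of the 12-variable and 27-variable models. As a rough sketch of how such a test works for two models scored on the same participants, the NumPy code below computes the placement values for each model, estimates the covariance of the two AUCs, and derives a two-sided p-value from a normal approximation. The function names and the synthetic scores are assumptions made for illustration; this is not the authors' code.

```python
import math

import numpy as np


def placement_values(scores, labels):
    """Return (V10, V01): per-case and per-control components of the AUC."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels).astype(bool)
    pos, neg = scores[labels], scores[~labels]
    # psi[i, j] = 1 if case i outranks control j, 0.5 on ties, else 0.
    diff = pos[:, None] - neg[None, :]
    psi = (diff > 0) + 0.5 * (diff == 0)
    return psi.mean(axis=1), psi.mean(axis=0)


def delong_test(labels, scores_a, scores_b):
    """Two-sided DeLong test for two correlated AUCs on the same samples.

    Needs at least two cases and two controls.
    """
    v10_a, v01_a = placement_values(scores_a, labels)
    v10_b, v01_b = placement_values(scores_b, labels)
    auc_a, auc_b = v10_a.mean(), v10_b.mean()
    m, n = len(v10_a), len(v01_a)  # number of cases, number of controls
    # 2x2 covariance of the two AUC estimates.
    s10 = np.cov(np.vstack([v10_a, v10_b]))
    s01 = np.cov(np.vstack([v01_a, v01_b]))
    s = s10 / m + s01 / n
    var_diff = s[0, 0] + s[1, 1] - 2 * s[0, 1]
    if var_diff <= 0:  # e.g. identical score vectors
        return auc_a, auc_b, 0.0, 1.0
    z = (auc_a - auc_b) / math.sqrt(var_diff)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    return auc_a, auc_b, z, p


# Synthetic example: two noisy scores built from the same underlying signal.
rng = np.random.default_rng(0)
labels = rng.integers(0, 2, size=200)
signal = labels + rng.normal(size=200)
scores_a = signal + rng.normal(scale=0.5, size=200)
scores_b = signal + rng.normal(scale=0.5, size=200)

auc_a, auc_b, z, p = delong_test(labels, scores_a, scores_b)
print(f"AUC A = {auc_a:.3f}, AUC B = {auc_b:.3f}, z = {z:.2f}, p = {p:.3f}")
```

This version builds the full case-by-control comparison matrix, which is simple to read but uses memory proportional to cases times controls; for a cohort the size of the one in the study, a rank-based formulation would be the more practical choice.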

Source journal: Alzheimer's Research & Therapy (Medicine: Neurology)
CiteScore: 13.10
Self-citation rate: 3.30%
Articles published: 172
Review time: >12 weeks
Journal description: Alzheimer's Research & Therapy is an international peer-reviewed journal that focuses on translational research into Alzheimer's disease and other neurodegenerative diseases. It publishes open-access basic research, clinical trials, drug discovery and development studies, and epidemiologic studies. The journal also includes reviews, viewpoints, commentaries, debates, and reports. All articles published in Alzheimer's Research & Therapy are included in several reputable databases such as CAS, Current Contents, DOAJ, Embase, Journal Citation Reports/Science Edition, MEDLINE, PubMed, PubMed Central, Science Citation Index Expanded (Web of Science) and Scopus.