Predicting incident dementia in community-dwelling older adults using primary and secondary care data from electronic health records.

Impact factor 4.1; JCR Q1, Clinical Neurology
Brain Communications. Pub Date: 2024-12-24; eCollection Date: 2025-01-01; DOI: 10.1093/braincomms/fcae469
Konstantin Georgiev, Yiqing Wang, Andrew Conkie, Annie Sinclair, Vyron Christodoulou, Saleh Seyedzadeh, Malcolm Price, Ann Wales, Nicholas L Mills, Susan D Shenkin, Joanne McPeake, Jacques D Fleuriot, Atul Anand
Volume 7, issue 1, article fcae469. Publication type: Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11697165/pdf/

Abstract

Predicting the risk of future dementia is essential for primary prevention strategies, particularly in the era of novel immunotherapies. However, few studies have developed population-level prediction models using existing routine healthcare data. In this longitudinal retrospective cohort study, we predicted incident dementia at 5, 10 and 13 years using primary and secondary care health records in 144 113 Scottish older adults who were dementia-free prior to 1 April 2009. Gradient-boosting (XGBoost) prediction models were trained on two feature subsets: data-driven (using all 171 extracted variables) and clinically supervised (22 curated variables). We used a random-stratified internal validation set to rank top predictors in each model, assessing performance stratified by age and socioeconomic deprivation. Predictions were stratified into 10 equally sized risk deciles and ranked by response rate. Over 13 years of follow-up, 11 143 (8%) patients developed dementia. The data-driven models achieved marginally better precision-recall area-under-the-curve scores of 0.18, 0.26 and 0.30, compared with 0.17, 0.27 and 0.29 for the clinically supervised models, for incident dementia at 5, 10 and 13 years, respectively. The clinically supervised model achieved comparable specificity (0.88, 95% confidence interval [CI] 0.87-0.88) and sensitivity (0.55, 95% CI 0.53-0.57) to the data-driven model for prediction at 13 years. The most important model features were age, deprivation and frailty, measured by a modified electronic frailty index excluding known cognitive deficits. Model precision was consistent across socioeconomic deprivation quintiles but lower in younger-onset (<70 years) dementia cases. At 13 years, dementia was diagnosed in 32% of the population classified as highest risk, with 40% of individuals in this group below the age of 80. Personalized estimates of future dementia risk from routinely collected healthcare data could influence risk factor modification and help to target brain imaging and novel immunotherapies in selected individuals with pre-symptomatic disease.

Source journal metrics: CiteScore 7.00; self-citation rate 0.00%; articles published 0; review time 6 weeks.