Machine learning to predict dementia for American Indian and Alaska Native peoples: a retrospective cohort study

Impact factor: 7.0, JCR Q1 (Health Care Sciences & Services)

Lancet Regional Health-Americas, Pub Date: 2025-02-13, DOI: 10.1016/j.lana.2025.101013

Kayleen Ports, Jiahui Dai, Kyle Conniff, Maria M. Corrada, Spero M. Manson, Joan O’Connell, Luohua Jiang

Volume 43, Article 101013. Journal Article. Full text: https://www.sciencedirect.com/science/article/pii/S2667193X25000237


Abstract

Background

Dementia is an increasing concern among American Indian and Alaska Native (AI/AN) communities, yet machine learning models utilizing electronic health record (EHR) data have not been developed or validated for this population. This study aimed to develop a two-year dementia risk prediction model for AI/AN individuals actively using Indian Health Service (IHS) and Tribal health services.

Methods

Seven years of data were obtained from the IHS National Data Warehouse and related EHR databases and divided into a five-year baseline period (FY2007–2011) and a two-year dementia prediction period (FY2012–2013). Four algorithms were assessed: logistic regression, Least Absolute Shrinkage and Selection Operator (LASSO), random forest, and eXtreme Gradient Boosting (XGBoost). Dementia Risk Score (DRS)-based and extended models were developed for each algorithm, with performance evaluated by the area under the receiver operating characteristic curve (AUC).
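
As a rough illustration of the evaluation metric named above, the sketch below computes an AUC and a 95% confidence interval on synthetic scores. The abstract does not say how the intervals were obtained, so the percentile bootstrap used here is an assumption, as are the function names `auc` and `bootstrap_ci`, the sample size, and the simulated risk scores. None of it reproduces the study's data or models.

```python
import random


def auc(labels, scores):
    """AUC via the rank-sum (Mann-Whitney) identity; tied scores share an average rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current group of tied scores.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tied group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def bootstrap_ci(labels, scores, n_boot=200, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (an assumed method, not the paper's)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # a resample needs both classes for AUC to be defined
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Synthetic cohort: rare outcome, with cases drawn from a shifted score distribution.
rng = random.Random(42)
labels = [1 if rng.random() < 0.035 else 0 for _ in range(2000)]
scores = [rng.gauss(1.3 if y else 0.0, 1.0) for y in labels]

point = auc(labels, scores)
lower, upper = bootstrap_ci(labels, scores)
print(f"AUC = {point:.2f} (95% CI {lower:.2f}, {upper:.2f})")
```

With an outcome this rare, each bootstrap resample contains few cases, so the interval width is driven mainly by the number of positives rather than the total cohort size.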

Findings

The study cohort included 17,398 AI/AN adults aged ≥ 65 years who were dementia-free at baseline, of whom 59.8% were female. Over the two-year follow-up, 611 individuals (3.5%) were diagnosed with incident dementia. Extended models for logistic regression, LASSO, and XGBoost performed comparably: AUCs (95% CI) of 0.83 (0.79, 0.86), 0.83 (0.79, 0.86), and 0.82 (0.79, 0.86). These top-performing models shared 12 of the 15 highest-ranked predictors, with novel predictors including service utilization.

Interpretation

Machine learning algorithms utilizing EHR data can effectively predict two-year dementia risk among AI/AN older adults. These models could aid IHS and Tribal health clinicians in identifying high-risk individuals, facilitating timely interventions and improved care coordination.

Funding

NIH.


Source journal

Lancet Regional Health-Americas Multiple-

CiteScore

8.00

Self-citation rate

0.00%


Journal description: The Lancet Regional Health – Americas, an open-access journal, contributes to The Lancet's global initiative by focusing on health-care quality and access in the Americas. It aims to advance clinical practice and health policy in the region, promoting better health outcomes. The journal publishes high-quality original research advocating change or shedding light on clinical practice and health policy. It welcomes submissions on various regional health topics, including infectious diseases, non-communicable diseases, child and adolescent health, maternal and reproductive health, emergency care, health policy, and health equity.