Karen C Schliep, Jeffrey Thornhill, JoAnn T Tschanz, Julio C Facelli, Truls Østbye, Michelle K Sorweid, Ken R Smith, Michael Varner, Richard D Boyce, Christine J Cliatt Brown, Huong Meeks, Samir Abdelrahman
DOI: 10.1186/s12911-024-02728-4 (https://doi.org/10.1186/s12911-024-02728-4). Journal article, published 2024-10-28. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11520673/pdf/
Predicting the onset of Alzheimer's disease and related dementia using electronic health records: findings from the Cache County Study on Memory in Aging (1995-2008).
Introduction: Clinical notes, biomarkers, and neuroimaging have proven valuable in dementia prediction models. Whether commonly available structured clinical data can predict dementia is an emerging area of research. We aimed to predict gold-standard, research-based diagnoses of dementia including Alzheimer's disease (AD) and/or Alzheimer's disease related dementias (ADRD), in addition to ICD-based AD and/or ADRD diagnoses, in a well-phenotyped, population-based cohort using a machine learning approach.
Methods: Administrative healthcare data (k = 163 diagnostic features), in addition to census/vital record sociodemographic data (k = 6 features), were linked to the Cache County Study (CCS, 1995-2008).
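The linkage step described in the Methods can be pictured as a join on participant ID that yields one feature row per linked participant. The sketch below is only an illustration under stated assumptions: the feature names, the participant IDs, and the `link_features` helper are hypothetical and stand in for the study's 163 diagnostic and 6 sociodemographic features; it is not the authors' pipeline.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical stand-ins: the study used 163 diagnostic and 6 sociodemographic features.
DIAGNOSTIC_FEATURES = ["hypertension", "diabetes", "depression"]
SOCIODEMOGRAPHIC_FEATURES = ["age", "female"]


def link_features(
    cohort_ids: List[str],
    diagnoses: Dict[str, Set[str]],
    demographics: Dict[str, Dict[str, float]],
) -> Tuple[List[str], List[List[float]]]:
    """Join both data sources on participant ID.

    Participants without a demographic record are treated as unlinked and skipped.
    """
    linked_ids: List[str] = []
    rows: List[List[float]] = []
    for pid in cohort_ids:
        if pid not in demographics:
            continue  # not successfully linked
        seen = diagnoses.get(pid, set())  # no diagnoses on file: all indicators are 0
        row = [1.0 if name in seen else 0.0 for name in DIAGNOSTIC_FEATURES]
        row += [float(demographics[pid][name]) for name in SOCIODEMOGRAPHIC_FEATURES]
        linked_ids.append(pid)
        rows.append(row)
    return linked_ids, rows


# Toy data, invented for illustration only.
cohort = ["p1", "p2", "p3"]
dx = {"p1": {"hypertension", "depression"}, "p3": {"diabetes"}}
demo = {"p1": {"age": 78, "female": 1}, "p2": {"age": 81, "female": 0}}
ids, X = link_features(cohort, dx, demo)
print(ids)
print(X)
```

In this toy run, `p3` is dropped because it has no demographic record, which mirrors restricting the analysis to successfully linked participants.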
Results: Among successfully linked Utah Population Database (UPDB)-CCS participants (n = 4206), 522 (12.4%) had incident dementia (AD alone, AD comorbid with ADRD, or ADRD alone) as per the CCS "gold standard" assessments. Random Forest models, with a 1-year prediction window, achieved the best performance, with an Area Under the Curve (AUC) of 0.67. Performance declined for dementia subtypes: AD/ADRD (AUC = 0.65); ADRD (AUC = 0.49). Performance improved when using ICD-based dementia diagnoses (AUC = 0.77).
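To make the reported figures easier to read, the sketch below shows one common way to compute an AUC: the fraction of case/non-case pairs in which the case receives the higher predicted score, with ties counted as half. An AUC of 0.5 corresponds to chance-level ranking, which is why 0.49 for ADRD alone indicates essentially no discrimination. The `roc_auc` helper and the toy labels and scores are assumptions made for illustration; they are not the study's data or code.

```python
from typing import List


def roc_auc(labels: List[int], scores: List[float]) -> float:
    """AUC as the probability that a random case outranks a random non-case."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one case and one non-case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


# Toy example, invented for illustration: 2 cases, 3 non-cases.
toy_labels = [1, 1, 0, 0, 0]
toy_scores = [0.9, 0.4, 0.5, 0.3, 0.2]
toy_auc = roc_auc(toy_labels, toy_scores)

# A model that gives everyone the same score ranks at chance level.
chance_auc = roc_auc(toy_labels, [0.5] * 5)

# Incidence reported in the Results: 522 of 4206 linked participants.
incidence_pct = round(100 * 522 / 4206, 1)
print(toy_auc, chance_auc, incidence_pct)
```

The pairwise loop is quadratic in cohort size, so it suits small illustrations; rank-based formulas give the same value more efficiently on large cohorts.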
Discussion: Commonly available structured clinical data (without labs, notes, or prescription information) demonstrate modest ability to predict "gold-standard" research-based AD/ADRD diagnoses, a finding corroborated by prior research. Using ICD diagnostic codes to identify dementia, as done in the majority of machine learning dementia prediction models, rather than "gold-standard" dementia diagnoses can result in higher accuracy, but whether these models are predicting true dementia warrants further research.
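One way to probe the question raised in the Discussion is to check how well ICD-based labels agree with the research-based diagnoses before using them as a prediction target. The sketch below computes sensitivity and positive predictive value of ICD labels against gold-standard labels. The `label_agreement` helper and the toy labels are hypothetical and purely illustrative; the abstract reports no such agreement figures.

```python
from typing import Dict, List


def label_agreement(gold: List[int], icd: List[int]) -> Dict[str, float]:
    """Compare ICD-based labels (1 = dementia) against gold-standard labels."""
    tp = sum(1 for g, i in zip(gold, icd) if g == 1 and i == 1)
    fp = sum(1 for g, i in zip(gold, icd) if g == 0 and i == 1)
    fn = sum(1 for g, i in zip(gold, icd) if g == 1 and i == 0)
    tn = sum(1 for g, i in zip(gold, icd) if g == 0 and i == 0)
    return {
        "tp": tp,
        "fp": fp,
        "fn": fn,
        "tn": tn,
        # Share of gold-standard cases that also carry an ICD dementia code.
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        # Share of ICD-coded cases confirmed by the gold-standard assessment.
        "ppv": tp / (tp + fp) if (tp + fp) else float("nan"),
    }


# Toy labels, invented for illustration only.
gold_labels = [1, 1, 1, 0, 0, 0, 0, 0]
icd_labels = [1, 0, 0, 1, 0, 0, 0, 0]
agreement = label_agreement(gold_labels, icd_labels)
print(agreement)
```

If agreement is low, a model can score a high AUC against ICD labels while largely learning coding patterns rather than the research-based diagnosis.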