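Develop and validate a computable phenotype for the identification of Alzheimer's disease patients using electronic health record data.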

Impact factor: 4.0 | JCR: Q1 | Category: Clinical Neurology
Xing He, Ruoqi Wei, Yu Huang, Zhaoyi Chen, Tianchen Lyu, Sarah Bost, Jiayi Tong, Lu Li, Yujia Zhou, Zhao Li, Jingchuan Guo, Huilin Tang, Fei Wang, Steven DeKosky, Hua Xu, Yong Chen, Rui Zhang, Jie Xu, Yi Guo, Yonghui Wu, Jiang Bian
Journal: Alzheimer's and Dementia: Diagnosis, Assessment and Disease Monitoring, 16(3), e12613
Published: 2024-07-03 (eCollection 2024-07-01) | Publication type: Journal Article
DOI: 10.1002/dad2.12613 (https://doi.org/10.1002/dad2.12613)
Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11220631/pdf/
Citations: 0

Abstract

Develop and validate a computable phenotype for the identification of Alzheimer's disease patients using electronic health record data.

Introduction: Alzheimer's disease (AD) is often misclassified in electronic health records (EHRs) when relying solely on diagnosis codes. This study aimed to develop a more accurate, computable phenotype (CP) for identifying AD patients using structured and unstructured EHR data.

Methods: We used EHRs from the University of Florida Health (UFHealth) system and created rule-based CPs iteratively through manual chart reviews. The CPs were then validated using data from the University of Texas Health Science Center at Houston (UTHealth) and the University of Minnesota (UMN).

Results: Our best-performing CP was "patient has at least 2 AD diagnoses and AD-related keywords in AD encounters," with an F1-score of 0.817 at UFHealth, 0.961 at UTHealth, and 0.623 at UMN.

Discussion: We developed and validated rule-based CPs for AD identification with good performance, which will be crucial for studies that aim to use real-world data like EHRs.

Highlights: Developed a computable phenotype (CP) to identify Alzheimer's disease (AD) patients using EHR data. Utilized both structured and unstructured EHR data to enhance CP accuracy. Achieved a high F1-score of 0.817 at UFHealth, and 0.961 and 0.623 at UTHealth and UMN, respectively. Validated the CP across different demographics, ensuring robustness and fairness.
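
The best-performing rule quoted in the Results ("at least 2 AD diagnoses and AD-related keywords in AD encounters") and the F1-score used to evaluate it can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not give the diagnosis code list, the keyword list, or the exact definition of an "AD encounter". The sketch assumes that ICD-10-CM G30.* and ICD-9-CM 331.0 identify AD diagnoses, that an AD encounter is any encounter carrying such a code, that "2 AD diagnoses" means two such encounters, and that a single hypothetical keyword ("alzheimer") stands in for the keyword list. The names `Encounter`, `meets_phenotype`, and `f1_score` are illustrative only.

```python
from dataclasses import dataclass

# Assumption: ICD-10-CM G30.* and ICD-9-CM 331.0 mark an AD diagnosis.
# The study's actual code list is not stated in the abstract.
AD_CODE_PREFIXES = ("G30", "331.0")

# Hypothetical keyword list; the study's real keywords are not given here.
AD_KEYWORDS = ("alzheimer",)


@dataclass
class Encounter:
    """One visit: structured diagnosis codes plus unstructured note text."""
    diagnosis_codes: list[str]
    note_text: str = ""


def is_ad_encounter(enc: Encounter) -> bool:
    """Assumed definition: the encounter carries at least one AD code."""
    return any(code.upper().startswith(AD_CODE_PREFIXES)
               for code in enc.diagnosis_codes)


def meets_phenotype(encounters: list[Encounter], min_diagnoses: int = 2) -> bool:
    """Rule: >= min_diagnoses AD encounters AND a keyword in an AD encounter note."""
    ad_encounters = [e for e in encounters if is_ad_encounter(e)]
    has_keyword = any(keyword in e.note_text.lower()
                      for e in ad_encounters
                      for keyword in AD_KEYWORDS)
    return len(ad_encounters) >= min_diagnoses and has_keyword


def f1_score(predicted: list[bool], actual: list[bool]) -> float:
    """F1 = 2*TP / (2*TP + FP + FN); returns 0.0 when undefined."""
    tp = sum(p and a for p, a in zip(predicted, actual))
    fp = sum(p and not a for p, a in zip(predicted, actual))
    fn = sum(a and not p for p, a in zip(predicted, actual))
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0
```

In this sketch, `predicted` would hold the rule's output per patient and `actual` the chart-review label, so that `f1_score` summarizes agreement at one site. Requiring the keyword to appear in an AD encounter, rather than anywhere in the record, is one reading of the quoted rule and would need to be checked against the full paper.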

Source journal
CiteScore: 7.80
Self-citation rate: 7.50%
Articles published: 101
Review time: 8 weeks
About the journal: Alzheimer's & Dementia: Diagnosis, Assessment & Disease Monitoring (DADM) is an open access, peer-reviewed journal from the Alzheimer's Association® that will publish new research reporting the discovery, development, and validation of instruments, technologies, algorithms, and innovative processes. Papers will cover a range of topics concerning the early and accurate detection of individuals with memory complaints and/or asymptomatic individuals at elevated risk for various forms of memory disorders. Published papers are expected to translate fundamental knowledge about the neurobiology of the disease into practical reports that describe both the conceptual and methodological aspects of the submitted scientific inquiry. Published topics will explore the development of biomarkers, surrogate markers, and conceptual/methodological challenges. Publication priority will be given to papers that describe 1) putative surrogate markers that accurately track disease progression, 2) biomarkers that fulfill international regulatory requirements, 3) reports from large, well-characterized population-based cohorts that comprise the heterogeneity and diversity of asymptomatic individuals, and 4) algorithmic development that considers multi-marker arrays (e.g., integrated-omics, genetics, biofluids, imaging) and advanced computational analytics and technologies.