Extracting Cognitive Impairment Assessment Information From Unstructured Notes in Electronic Health Records Using Natural Language Processing Tools: Validation with Clinical Assessment Data.

IF 3.4 | Zone 2 (Medicine) | JCR Q1: PUBLIC, ENVIRONMENTAL & OCCUPATIONAL HEALTH
Clinical Epidemiology Pub Date : 2025-04-15 eCollection Date: 2025-01-01 DOI:10.2147/CLEP.S504259
Kuan-Yuan Wang, Mufaddal Mahesri, John Novoa-Laurentiev, Lily G Bessette, Cassandra York, Heidi Zakoul, Su Been Lee, Kerry Ngan, Li Zhou, Dae Hyun Kim, Kueiyu Joshua Lin
Clinical Epidemiology, volume 17, pages 353-365. Publication type: Journal Article. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12009745/pdf/
Citations: 0

Abstract

Purpose: We aimed to develop a Natural Language Processing (NLP) algorithm to extract cognitive scores from electronic health records (EHR) data and compare them with cognitive function recorded by Centers for Medicare & Medicaid Services (CMS)-mandated clinical assessments in nursing homes and home health visits.

Patients and methods: We identified a cohort of Medicare beneficiaries who had either the Minimum Data Set (MDS) or Outcome and Assessment Information Set (OASIS) linked to EHR data from the Research Patient Data Registry (Mass General Brigham, Boston, MA) from 2010 to 2019. We applied an NLP approach to identify the Montreal Cognitive Assessment (MoCA) and the Mini-Mental State Examination (MMSE) scores from unstructured clinician notes in EHR. Using the NLP-extracted MoCA or MMSE scores from EHR, we compared mean differences of extracted MoCA or MMSE by cognition status determined by MDS (impaired vs intact cognition) and OASIS (severe impairment vs intact cognition) data, respectively.
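
The abstract does not describe how the NLP algorithm works internally. As an illustration only, the sketch below uses a simple regular expression (a swapped-in technique, not the authors' method) to pull MoCA and MMSE scores out of free-text notes. The pattern, the function name `extract_scores`, and the 30-character gap limit are all assumptions made for this example; the 0-30 range check reflects the scoring range of both instruments.

```python
import re

# Hypothetical pattern: instrument name, a short non-numeric gap, then a score.
SCORE_PATTERN = re.compile(
    r"\b(?P<test>MoCA|MMSE)\b"   # instrument name (case-insensitive)
    r"[^\d\n]{0,30}?"            # short gap such as " score of " or ": "
    r"(?P<score>\d{1,2})(?!\d)"  # one- or two-digit score, not part of a longer number
    r"(?:\s*/\s*30)?",           # optional "/30" denominator
    re.IGNORECASE,
)

# Map any capitalisation found in the note to a canonical instrument label.
CANONICAL = {"MOCA": "MoCA", "MMSE": "MMSE"}


def extract_scores(note):
    """Return (instrument, score) pairs found in one note, in order of appearance."""
    results = []
    for match in SCORE_PATTERN.finditer(note):
        score = int(match.group("score"))
        # Both instruments are scored from 0 to 30; drop anything outside that range.
        if 0 <= score <= 30:
            results.append((CANONICAL[match.group("test").upper()], score))
    return results


if __name__ == "__main__":
    # Made-up note text, not taken from the study data.
    print(extract_scores("Pt seen today. MoCA score of 22/30. Prior MMSE: 27."))
```

A pattern this simple would still misread some notes (for example, a date written right after the instrument name), which is one reason extracted scores need to be checked against manual chart review, as the accuracy figures in the Results indicate was done.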

Results: Our study cohort had 7419 patients who had MDS (19.7%) or OASIS (80.3%) assessments, with a mean age of 80 (SD=7) years and 60% female. In EHR, the NLP algorithm extracted cognitive test scores with 97% accuracy (95% CI: 92-99%) for MoCA and 100% accuracy (95% CI: 84-100%) for MMSE. In MDS, the mean difference in extracted MoCA was -5.6 (95% CI: -8.7, -2.4, p=0.0008), and the mean difference in extracted MMSE was -7.9 (95% CI: -12.4, -3.5, p=0.0012). In OASIS, the mean difference in extracted MoCA and extracted MMSE was -4.8 (95% CI: -9.1, -0.6, p=0.0006) and -4.5 (95% CI: -9.5, -0.5, p=0.0182), respectively.
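
The accuracy figures above are proportions reported with 95% confidence intervals. The abstract does not say which interval method was used; the sketch below assumes an exact (Clopper-Pearson) binomial interval, computed with the standard library only. The function names and the counts in the usage example are hypothetical and are not taken from the study.

```python
from math import comb


def binom_tail_ge(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def solve_increasing(func, target, iters=100):
    """Bisection on [0, 1]: find p with func(p) = target, for func increasing in p."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if func(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided (1 - alpha) interval for k successes out of n trials."""
    if n <= 0 or not 0 <= k <= n:
        raise ValueError("need n > 0 and 0 <= k <= n")
    # Lower bound: the p at which observing k or more successes has probability alpha/2.
    lower = 0.0 if k == 0 else solve_increasing(
        lambda p: binom_tail_ge(k, n, p), alpha / 2
    )
    # Upper bound: the p at which observing k or fewer successes has probability
    # alpha/2, which is the same as P(X >= k + 1) = 1 - alpha/2.
    upper = 1.0 if k == n else solve_increasing(
        lambda p: binom_tail_ge(k + 1, n, p), 1 - alpha / 2
    )
    return lower, upper


if __name__ == "__main__":
    # Hypothetical validation sample: 20 of 20 extracted scores judged correct.
    low, high = clopper_pearson(20, 20)
    print(f"accuracy 100%, 95% CI: {low:.1%} to {high:.1%}")
```

When every reviewed score is correct, the upper bound is 100% and the lower bound depends only on the number of notes reviewed, which is why a perfect observed accuracy can still carry a wide interval.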

Conclusion: We developed an NLP algorithm to accurately extract cognitive scores from unstructured EHR notes, and these extracted cognitive scores were well correlated with cognitive function recorded in CMS-mandated clinical assessments. This could help researchers identify patients with various degrees of cognitive impairment in EHR-based research.

Source journal
Clinical Epidemiology (Medicine: Epidemiology)
CiteScore: 6.30
Self-citation rate: 5.10%
Articles published: 169
Review time: 16 weeks
About the journal: Clinical Epidemiology is an international, peer-reviewed, open access journal. Clinical Epidemiology focuses on the application of epidemiological principles and questions relating to patients and clinical care in terms of prevention, diagnosis, prognosis, and treatment. Clinical Epidemiology welcomes papers covering these topics in the form of original research and systematic reviews. Clinical Epidemiology has a special interest in international electronic medical patient records and other routine health care data, especially as applied to safety of medical interventions, clinical utility of diagnostic procedures, understanding short- and long-term clinical course of diseases, clinical epidemiological and biostatistical methods, and systematic reviews. When considering submission of a paper utilizing publicly available data, authors should ensure that such studies add significantly to the body of knowledge and that they use appropriate validated methods for identifying health outcomes. The journal has launched special series describing existing data sources for clinical epidemiology, international health care systems, and validation studies of algorithms based on databases and registries.