Identification of Patients With Congestive Heart Failure From the Electronic Health Records of Two Hospitals: Retrospective Study.

IF 3.8, Zone 3 (Medicine), Q2 MEDICAL INFORMATICS

JMIR Medical Informatics Pub Date : 2025-04-10 DOI:10.2196/64113

Daniel Sumsion, Elijah Davis, Marta Fernandes, Ruoqi Wei, Rebecca Milde, Jet Malou Veltink, Wan-Yee Kong, Yiwen Xiong, Samvrit Rao, Tara Westover, Lydia Petersen, Niels Turley, Arjun Singh, Stephanie Buss, Shibani Mukerji, Sahar Zafar, Sudeshna Das, Valdery Moura Junior, Manohar Ghanta, Aditya Gupta, Jennifer Kim, Katie Stone, Emmanuel Mignot, Dennis Hwang, Lynn Marie Trotti, Gari D Clifford, Umakanth Katwa, Robert Thomas, M Brandon Westover, Haoqi Sun

JMIR Medical Informatics, volume 13, e64113. Publication type: Journal Article. Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12022513/pdf/

Citations: 0

Abstract

Background: Congestive heart failure (CHF) is a common cause of hospital admissions. Medical records contain valuable information about CHF, but manual chart review is time-consuming. Claims databases (using International Classification of Diseases [ICD] codes) provide a scalable alternative but are less accurate. Automated analysis of medical records through natural language processing (NLP) enables more efficient adjudication but has not yet been validated across multiple sites.

Objective: We seek to accurately classify the diagnosis of CHF based on structured and unstructured data from each patient, including medications, ICD codes, and information extracted through NLP of notes left by providers, by comparing the effectiveness of several machine learning models.

Methods: We developed an NLP model to identify CHF from electronic health records (EHRs) at two hospitals (Mass General Hospital and Beth Israel Deaconess Medical Center; 2010 to 2023), using 2800 clinical visit notes from 1821 patients. We trained logistic regression, random forest, and RoBERTa models and compared their performance, measured as area under the receiver operating characteristic curve (AUROC) and area under the precision-recall curve (AUPRC). The models were also externally validated by training on data from one hospital and testing on data from the other, and an overall estimated error rate was calculated using a completely random sample from both hospitals.
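The two metrics named in the Methods can be illustrated with a minimal pure-Python sketch. This is not the authors' code: the function names `auroc` and `average_precision` are hypothetical, and average precision is used here as one common step-wise estimate of AUPRC, assuming scores are not tied.

```python
def auroc(y_true, scores):
    """AUROC as the probability that a randomly chosen positive
    is scored above a randomly chosen negative (ties count as 1/2)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(y_true, scores):
    """Step-wise area under the precision-recall curve.

    Walks down the ranking and averages the precision observed at each
    true positive. Assumes no tied scores (ties are broken by input order).
    """
    n_pos = sum(1 for y in y_true if y == 1)
    if n_pos == 0:
        raise ValueError("need at least one positive label")
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    true_pos = 0
    total = 0.0
    for rank, i in enumerate(order, start=1):
        if y_true[i] == 1:
            true_pos += 1
            total += true_pos / rank  # precision at this recall step
    return total / n_pos


if __name__ == "__main__":
    labels = [1, 0, 1, 0]            # toy labels, 1 = CHF
    preds = [0.9, 0.8, 0.7, 0.1]     # toy model scores
    print(auroc(labels, preds), average_precision(labels, preds))
```

For the cross-site validation described above, the same two functions would be applied to predictions on the held-out hospital's notes after fitting on the other hospital's data.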

Results: The average age of the patients was 66.7 (SD 17.2) years; 978 (54.3%) out of 1821 patients were female. The logistic regression model achieved the best performance using a combination of ICD codes, medications, and notes, with an AUROC of 0.968 (95% CI 0.940-0.982) and an AUPRC of 0.921 (95% CI 0.835-0.969). The models that only used ICD codes or medications had lower performance. The estimated overall error rate in a random EHR sample was 1.6%. The model also showed high external validity from training on Mass General Hospital data and testing on Beth Israel Deaconess Medical Center data (AUROC 0.927, 95% CI 0.908-0.944) and vice versa (AUROC 0.968, 95% CI 0.957-0.976).
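The abstract reports 95% confidence intervals but does not say how they were obtained. One common approach is a percentile bootstrap over patients or notes, sketched below under that assumption. The names `bootstrap_ci` and `auroc`, the resample count, and the seed are all illustrative choices, not details from the study.

```python
import random


def auroc(y_true, scores):
    """AUROC via pairwise comparison of positives and negatives."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_ci(y_true, scores, metric, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for metric(y_true, scores).

    Resamples (label, score) pairs with replacement and skips resamples
    that contain only one class, since the metric is undefined there.
    """
    if len(set(y_true)) < 2:
        raise ValueError("both classes must be present")
    rng = random.Random(seed)  # fixed seed so the interval is reproducible
    n = len(y_true)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [y_true[i] for i in idx]
        if len(set(ys)) < 2:
            continue
        stats.append(metric(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper


if __name__ == "__main__":
    labels = [0, 0, 1, 0, 1, 1, 0, 1]
    preds = [0.1, 0.4, 0.35, 0.2, 0.8, 0.7, 0.6, 0.9]
    print(auroc(labels, preds), bootstrap_ci(labels, preds, auroc))
```

With only a few hundred positive cases, intervals from such a procedure can be noticeably asymmetric around the point estimate, which is consistent with the shape of the AUPRC interval reported above.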

Conclusions: The proposed EHR-based phenotyping model for CHF achieved excellent performance, external validity, and generalization across two institutions. The model enables multiple downstream uses, paving the way for large-scale studies of CHF treatment effectiveness, comorbidities, outcomes, and mechanisms.


Source journal

JMIR Medical Informatics (Medicine-Health Informatics)

CiteScore: 7.90

Self-citation rate: 3.10%

Articles published: 173

Review time: 12 weeks

About the journal: JMIR Medical Informatics (JMI, ISSN 2291-9694) is a top-rated, tier A journal that focuses on clinical informatics, big data in health and health care, decision support for health professionals, electronic health records, and eHealth infrastructures and implementation. It emphasizes applied, translational research and has a broad readership including clinicians, CIOs, engineers, and industry and health informatics professionals. Published by JMIR Publications, publisher of the Journal of Medical Internet Research (JMIR), the leading eHealth/mHealth journal (Impact Factor 2016: 5.175), JMIR Med Inform has a slightly different scope (placing more emphasis on applications for clinicians and health professionals than on consumers/citizens, which is the focus of JMIR), publishes even faster, and also allows papers that are more technical or more formative than those published in the Journal of Medical Internet Research.