ChatGPT-4 extraction of heart failure symptoms and signs from electronic health records

IF 5.6 2区医学 Q1 CARDIAC & CARDIOVASCULAR SYSTEMS

Progress in cardiovascular diseases Pub Date : 2024-11-01 DOI:10.1016/j.pcad.2024.10.010

T. Elizabeth Workman , Ali Ahmed , Helen M. Sheriff , Venkatesh K. Raman , Sijian Zhang , Yijun Shao , Charles Faselis , Gregg C. Fonarow , Qing Zeng-Treitler

{"title":"ChatGPT-4 extraction of heart failure symptoms and signs from electronic health records","authors":"T. Elizabeth Workman , Ali Ahmed , Helen M. Sheriff , Venkatesh K. Raman , Sijian Zhang , Yijun Shao , Charles Faselis , Gregg C. Fonarow , Qing Zeng-Treitler","doi":"10.1016/j.pcad.2024.10.010","DOIUrl":null,"url":null,"abstract":"<div><h3>Background</h3><div>Natural language processing (NLP) can facilitate research utilizing data from electronic health records (EHRs). Large language models can potentially improve NLP applications leveraging EHR notes. The objective of this study was to assess the performance of zero-shot learning using Chat Generative Pre-trained Transformer 4 (ChatGPT-4) for extraction of symptoms and signs, and compare its performance to baseline machine learning and rule-based methods developed using annotated data.</div></div><div><h3>Methods and results</h3><div>From unstructured clinical notes of the national EHR data of the Veterans healthcare system, we extracted 1999 text snippets containing relevant keywords for heart failure symptoms and signs, which were then annotated by two clinicians. We also created 102 synthetic snippets that were semantically similar to snippets randomly selected from the original 1999 snippets. The authors applied zero-shot learning, using two different forms of prompt engineering in a symptom and sign extraction task with ChatGPT-4, utilizing the synthetic snippets. For comparison, baseline models using machine learning and rule-based methods were trained using the original 1999 annotated text snippets, and then used to classify the 102 synthetic snippets.</div><div>The best zero-shot learning application achieved 90.6 % precision, 100 % recall, and 95 % F1 score, outperforming the best baseline method, which achieved 54.9 % precision, 82.4 % recall, and 65.5 % F1 score. Prompt style and temperature settings influenced zero-shot learning performance.</div></div><div><h3>Conclusions</h3><div>Zero-shot learning utilizing ChatGPT-4 significantly outperformed traditional machine learning and rule-based NLP. Prompt type and temperature settings affected zero-shot learning performance. These findings suggest a more efficient means of symptoms and signs extraction than traditional machine learning and rule-based methods.</div></div>","PeriodicalId":21156,"journal":{"name":"Progress in cardiovascular diseases","volume":"87 ","pages":"Pages 44-49"},"PeriodicalIF":5.6000,"publicationDate":"2024-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Progress in cardiovascular diseases","FirstCategoryId":"3","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0033062024001476","RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"CARDIAC & CARDIOVASCULAR SYSTEMS","Score":null,"Total":0}

引用次数: 0

Abstract

Background

Natural language processing (NLP) can facilitate research utilizing data from electronic health records (EHRs). Large language models can potentially improve NLP applications leveraging EHR notes. The objective of this study was to assess the performance of zero-shot learning using Chat Generative Pre-trained Transformer 4 (ChatGPT-4) for extraction of symptoms and signs, and compare its performance to baseline machine learning and rule-based methods developed using annotated data.

Methods and results

From unstructured clinical notes of the national EHR data of the Veterans healthcare system, we extracted 1999 text snippets containing relevant keywords for heart failure symptoms and signs, which were then annotated by two clinicians. We also created 102 synthetic snippets that were semantically similar to snippets randomly selected from the original 1999 snippets. The authors applied zero-shot learning, using two different forms of prompt engineering in a symptom and sign extraction task with ChatGPT-4, utilizing the synthetic snippets. For comparison, baseline models using machine learning and rule-based methods were trained using the original 1999 annotated text snippets, and then used to classify the 102 synthetic snippets.

The best zero-shot learning application achieved 90.6 % precision, 100 % recall, and 95 % F1 score, outperforming the best baseline method, which achieved 54.9 % precision, 82.4 % recall, and 65.5 % F1 score. Prompt style and temperature settings influenced zero-shot learning performance.

Conclusions

Zero-shot learning utilizing ChatGPT-4 significantly outperformed traditional machine learning and rule-based NLP. Prompt type and temperature settings affected zero-shot learning performance. These findings suggest a more efficient means of symptoms and signs extraction than traditional machine learning and rule-based methods.

查看原文本刊更多论文

ChatGPT-4 从电子健康记录中提取心衰症状和体征。

背景：自然语言处理（NLP）可以促进利用电子健康记录（EHR）数据的研究。大型语言模型有可能改善利用电子健康记录笔记的 NLP 应用。本研究的目的是评估使用 Chat Generative Pre-trained Transformer 4 (ChatGPT-4) 进行零镜头学习提取症状和体征的性能，并将其性能与使用注释数据开发的基线机器学习和基于规则的方法进行比较：我们从退伍军人医疗保健系统的国家电子病历数据的非结构化临床笔记中提取了 1999 个包含心衰症状和体征相关关键词的文本片段，然后由两名临床医生对这些片段进行了注释。我们还创建了 102 个合成片段，这些片段在语义上与从 1999 年原始片段中随机选取的片段相似。作者在 ChatGPT-4 的症状和体征提取任务中使用了两种不同形式的提示工程，并利用合成片段进行了零点学习。为了进行比较，使用机器学习和基于规则的方法对 1999 年原始注释文本片段进行了基线模型训练，然后用于对 102 个合成片段进行分类。最佳零点学习应用的精确度为 90.6%，召回率为 100%，F1 分数为 95%，优于最佳基线方法，后者的精确度为 54.9%，召回率为 82.4%，F1 分数为 65.5%。提示风格和温度设置影响了零点学习的性能：结论：利用 ChatGPT-4 进行的零点学习明显优于传统的机器学习和基于规则的 NLP。提示类型和温度设置影响了零点学习性能。这些研究结果表明，与传统的机器学习和基于规则的方法相比，零点学习是一种更有效的症状和体征提取方法。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文求助全文

来源期刊

Progress in cardiovascular diseases 医学-心血管系统

CiteScore

10.90

自引率

6.60%

发文量

审稿时长

7 days

期刊介绍： Progress in Cardiovascular Diseases provides comprehensive coverage of a single topic related to heart and circulatory disorders in each issue. Some issues include special articles, definitive reviews that capture the state of the art in the management of particular clinical problems in cardiology.