Dependability of Large Language Models in Cardiovascular Medicine: A Scoping Review.

Impact factor 2.1; Zone 4 (Medicine); JCR Q2 (Anesthesiology)
Ying Ying Jia, Lin Yan Pang, Ming Ming Bi, Xiang Lu Yang, Jian Ping Song
Journal of Cardiothoracic and Vascular Anesthesia. Journal Article. Published 2025-07-22.
DOI: 10.1053/j.jvca.2025.07.026 (https://doi.org/10.1053/j.jvca.2025.07.026)
Citations: 0

Abstract

Background: The adoption of large language models (LLMs) in both clinical and consumer healthcare settings has grown rapidly. However, evidence on their reliability and impact in cardiovascular practice remains limited.

Objectives: This scoping review was designed to consolidate the existing biomedical literature on applicability, reliability, and quality improvement strategies for the integration of LLMs into the cardiovascular domain. Following Cochrane methodology and the Preferred Reporting Items for Systematic Reviews and Meta-analyses (PRISMA) guidelines, three electronic databases (PubMed, Web of Science, and Embase) were systematically searched to identify pertinent studies published between August 2020 and February 2025. Articles addressing the development, implementation, and assessment of LLMs in cardiovascular medicine were selected for comprehensive analysis.

Results: Twenty-five eligible publications evaluated the performance of LLMs in responding to cardiology-related questions, using parameters such as accuracy, response latency, indirectness, and completeness. The assessment methodology varied considerably across studies. LLMs demonstrated potential utility in cardiovascular decision-making, myocarditis management, cardiac arrest diagnosis and treatment, and image differentiation.

Conclusions: Although some LLM-generated responses to cardiovascular-related questions exhibit acceptable levels of quality, significant drawbacks persist. These include verbosity, inaccuracies, occasional misinformation, inconsistent outputs to identical questions, bias, and poor reproducibility. Overall, this work highlights the urgent need for continued refinement and validation.

Source journal
CiteScore: 4.80
Self-citation rate: 17.90%
Articles published: 606
Review time: 37 days
Journal description: The Journal of Cardiothoracic and Vascular Anesthesia is primarily aimed at anesthesiologists who deal with patients undergoing cardiac, thoracic or vascular surgical procedures. JCVA features a multidisciplinary approach, with contributions from cardiac, vascular and thoracic surgeons, cardiologists, and other related specialists. Emphasis is placed on rapid publication of clinically relevant material.