Dependability of Large Language Models in Cardiovascular Medicine: A Scoping Review.

IF 2.1 | Zone 4 (Medicine) | Q2 Anesthesiology

Journal of Cardiothoracic and Vascular Anesthesia | Pub Date: 2025-07-22 | DOI: 10.1053/j.jvca.2025.07.026

Ying Ying Jia, Lin Yan Pang, Ming Ming Bi, Xiang Lu Yang, Jian Ping Song


Citations: 0

Abstract

Background: The adoption of large language models (LLMs) in both clinical and consumer healthcare settings has surged exponentially. However, there remains limited evidence on their reliability and impact in cardiovascular practice.

Objectives: This scoping review was designed to consolidate the existing biomedical literature on applicability, reliability, and quality improvement strategies for the integration of LLMs into the cardiovascular domain. Following Cochrane methodology and Preferred Reporting Items for Systematic Reviews and Meta-analyses guidelines, three electronic databases (PubMed, Web of Science, and Embase) were systematically searched to identify pertinent studies published between August 2020 and February 2025. Articles addressing the development, implementation, and assessment of LLMs in cardiovascular medicine were selected for comprehensive analysis.

Results: Twenty-five eligible publications evaluated the performance of LLMs in responding to cardiology-related questions, using parameters such as accuracy, response latency, indirectness, and completeness. The assessment methodology varied considerably across studies. LLMs demonstrated potential utility in cardiovascular decision-making, myocarditis management, cardiac arrest diagnosis and treatment, and image differentiation.

Conclusions: Although some LLM-generated responses to cardiovascular-related questions exhibit acceptable levels of quality, significant drawbacks persist. These include verbosity, inaccuracies, occasional misinformation, inconsistent outputs to identical questions, bias, and poor reproducibility. Overall, this work highlights the urgent need for continued refinement and validation.


Source journal

Journal of Cardiothoracic and Vascular Anesthesia (category: Medicine, Respiratory System)

CiteScore: 4.80

Self-citation rate: 17.90%

Articles published: 606

Review time: 37 days

About the journal: The Journal of Cardiothoracic and Vascular Anesthesia is primarily aimed at anesthesiologists who deal with patients undergoing cardiac, thoracic or vascular surgical procedures. JCVA features a multidisciplinary approach, with contributions from cardiac, vascular and thoracic surgeons, cardiologists, and other related specialists. Emphasis is placed on rapid publication of clinically relevant material.