Large Language Models in Intracardiac Electrogram Interpretation: A New Frontier in Cardiac Diagnostics for Pacemaker Patients.

IF 1.5 · Zone 4 (Medicine) · JCR Q3 · CARDIAC & CARDIOVASCULAR SYSTEMS
Serdar Bozyel, Ahmet Berk Duman, Şadiye Nur Dalgıç, Abdülcebar Şipal, Faysal Şaylık, Şükriye Ebru Gölcük Önder, Metin Çağdaş, Tümer Erdem Güler, Tolga Aksu, Ulas Bağcı, Nurgül Keser
Journal: Anatolian Journal of Cardiology
Publication date: 2025-07-09
Publication type: Journal Article
DOI: 10.14744/AnatolJCardiol.2025.5238 (https://doi.org/10.14744/AnatolJCardiol.2025.5238)
Citation count: 0

Abstract

Background: Interpreting intracardiac electrograms (EGMs) requires expertise that many cardiologists lack. Artificial intelligence models like ChatGPT-4o may improve diagnostic accuracy. This study evaluates ChatGPT-4o's performance in EGM interpretation across 4 scenarios (A-D) with increasing contextual information.

Methods: Twenty EGM cases from The EHRA Book of Pacemaker, ICD, and CRT Troubleshooting were analyzed using ChatGPT-4o. Ten predefined features were assessed in Scenarios A and B, while Scenarios C and D required 20 correct responses per scenario across all cases. Performance was evaluated over 2 months using McNemar's test, Cohen's Kappa, and Prevalence- and Bias-Adjusted Kappa (PABAK).
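
As a rough illustration of the paired comparison named in the Methods, the following is a minimal sketch of an exact (binomial) McNemar test on per-item correctness flags. The function name `mcnemar_exact`, the pairing of items between two scenarios, and the example data are assumptions made for illustration; the abstract does not state how the study paired its responses or which variant of the test it used.

```python
from math import comb


def mcnemar_exact(first, second):
    """Two-sided exact McNemar p-value for paired binary outcomes (1 = correct)."""
    # Only discordant pairs carry information about a change between conditions.
    gained = sum(1 for x, y in zip(first, second) if x == 0 and y == 1)
    lost = sum(1 for x, y in zip(first, second) if x == 1 and y == 0)
    n = gained + lost
    if n == 0:
        return 1.0  # no discordant pairs, so no evidence of a difference
    # Under the null hypothesis each discordant pair is equally likely to go
    # either way, so the smaller count follows Binomial(n, 0.5).
    k = min(gained, lost)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical correctness flags for 10 items (not data from the study).
scenario_a = [1, 0, 0, 1, 0, 1, 0, 0, 1, 0]
scenario_b = [1, 1, 1, 1, 0, 1, 1, 0, 0, 1]
p_value = mcnemar_exact(scenario_a, scenario_b)
print(p_value)  # 0.375
```

With 4 items gained and 1 lost, the exact two-sided p-value is 0.375, which shows how little a small number of discordant pairs can support a claim of improvement.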

Results: Providing clinical context improved ChatGPT-4o's accuracy from 57% (Scenario A) to 66% (Scenario B). "No Answer" rates decreased from 19.5% to 8%, while false responses increased from 8.5% to 11%, suggesting occasional misinterpretation. In Scenario A, agreement was high for atrial activity (κ = 0.7) and synchronization (κ = 0.7) but poor for chamber (κ = -0.26). In Scenario B, understanding achieved near-perfect agreement (PABAK = 1), while ventricular activity remained unreliable (κ = -0.11). In Scenarios C and D, accuracy was lower (30% and 25%, respectively), and agreement between baseline and second-month responses was only fair (κ = 0.285 and 0.3, respectively), indicating limited consistency in complex decision-making tasks.
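
The Results report both Cohen's κ and PABAK, and some κ values are negative. A minimal sketch of the two statistics for binary ratings follows, using the standard definitions κ = (p_o - p_e) / (1 - p_e) and, for two categories, PABAK = 2·p_o - 1. The function names and the example ratings are hypothetical and are not taken from the study. The example only shows that κ can fall below zero when responses concentrate in one category even though raw agreement is high; whether that explains the values above is not stated in the abstract.

```python
def observed_agreement(a, b):
    """Fraction of items on which the two rating rounds agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohen_kappa_binary(a, b):
    """Cohen's kappa for two equal-length sequences of 0/1 ratings."""
    n = len(a)
    p_o = observed_agreement(a, b)
    pa1, pb1 = sum(a) / n, sum(b) / n
    # Chance agreement from each round's marginal rate of 1s and 0s.
    p_e = pa1 * pb1 + (1 - pa1) * (1 - pb1)
    # Undefined when p_e == 1 (both rounds used a single identical label).
    return (p_o - p_e) / (1 - p_e)


def pabak_binary(a, b):
    """Prevalence- and bias-adjusted kappa for two categories."""
    return 2 * observed_agreement(a, b) - 1


# Hypothetical ratings for 20 cases: each round has one "0", on different cases.
baseline = [1] * 18 + [0, 1]
month_two = [1] * 18 + [1, 0]

kappa = cohen_kappa_binary(baseline, month_two)
pabak = pabak_binary(baseline, month_two)
print(round(kappa, 3), round(pabak, 3))
```

Here the rounds agree on 18 of 20 cases, so PABAK is 0.8, yet κ is slightly negative because the expected chance agreement (0.905) exceeds the observed agreement (0.9).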

Conclusion: This study provides the first systematic evaluation of ChatGPT-4o in EGM interpretation, demonstrating promising accuracy and reliability in structured tasks. While the model integrated contextual data well, its adaptability to complex cases was limited. Further optimization and validation are needed before clinical use.

Source journal: Anatolian Journal of Cardiology (CARDIAC & CARDIOVASCULAR SYSTEMS)
CiteScore: 2.30
Self-citation rate: 7.70%
Articles published: 270
Review time: 12 weeks
About the journal: The Anatolian Journal of Cardiology is an international monthly periodical on cardiology published on the principles of independent, unbiased, double-blinded peer review. The journal's publication language is English. It aims to publish qualified and original clinical, experimental, and basic research on cardiology at the international level. Its scope also covers editorial comments, reviews of innovations in medical education and practice, case reports, original images, scientific letters, educational articles, letters to the editor, articles on publication ethics, diagnostic puzzles, and issues in social cardiology. The target readership includes academic members, specialists, residents, and general practitioners working in the fields of adult cardiology, pediatric cardiology, cardiovascular surgery, and internal medicine.