Evaluating the Ability of Artificial Intelligence to Address Nuanced Cardiology Subspecialty Questions: ChatGPT and CathSAP

Saumya Nanda MBBS, Khaled Abaza MD, Pyae Hein Kyaw MBBS, Robert Frankel MD, Partha Sardar MD, Sahil A. Parikh MD, Tharun Shyam MBBS, Saurav Chatterjee MD
{"title":"Evaluating the Ability of Artificial Intelligence to Address Nuanced Cardiology Subspecialty Questions: ChatGPT and CathSAP","authors":"Saumya Nanda MBBS ,&nbsp;Khaled Abaza MD ,&nbsp;Pyae Hein Kyaw MBBS ,&nbsp;Robert Frankel MD ,&nbsp;Partha Sardar MD ,&nbsp;Sahil A. Parikh MD ,&nbsp;Tharun Shyam MBBS ,&nbsp;Saurav Chatterjee MD","doi":"10.1016/j.jscai.2025.102563","DOIUrl":null,"url":null,"abstract":"<div><h3>Background</h3><div>Recent developments in artificial intelligence (AI), particularly in large language models, have shown promise in various fields, including health care. However, their performance on specialized medical board examinations, such as interventional cardiology assessments, remains relatively unexplored.</div></div><div><h3>Methods</h3><div>A cross-sectional study was conducted using a data set comprising 360 questions from the Cath Self Assessment Program (CathSAP) question bank. This study aimed to assess the overall performance of Chat Generative Pre-trained Transformer (ChatGPT) and compare it to that of average test takers. Additionally, the study evaluated the impact of pertinent educational materials on ChatGPT’s responses, both before and after exposure. The primary outcome measures included ChatGPT’s overall percentage score on the CathSAP examination and its performance across various subsections. Statistical significance was determined using the Kruskal-Wallis equality-of-populations rank test.</div></div><div><h3>Results</h3><div>Initially, ChatGPT achieved an overall score of 54.44% on the CathSAP exam, which improved significantly to 79.16% after exposure to relevant textual content. The improvement was statistically significant (<em>P</em> = .0003). Notably, the improved score was comparable with the average score achieved by typical test takers (as reported by CathSAP). ChatGPT demonstrated proficiency in sections covering basic science, pharmacology, and miscellaneous topics, although it struggled with anatomy, anatomic variants, and anatomic pathology questions.</div></div><div><h3>Conclusions</h3><div>The study demonstrates ChatGPT’s potential for learning and adapting to medical examination scenarios, with a notable enhancement in performance after exposure to educational materials. However, limitations such as the model’s inability to process certain visual materials and potential biases in AI models warrant further consideration. These findings underscore the need for continued research to optimize the use of AI in medical education and assessment.</div></div>","PeriodicalId":73990,"journal":{"name":"Journal of the Society for Cardiovascular Angiography & Interventions","volume":"4 3","pages":"Article 102563"},"PeriodicalIF":0.0000,"publicationDate":"2025-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of the Society for Cardiovascular Angiography & Interventions","FirstCategoryId":"1085","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S2772930325000043","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

Background

Recent developments in artificial intelligence (AI), particularly in large language models, have shown promise in various fields, including health care. However, their performance on specialized medical board examinations, such as interventional cardiology assessments, remains relatively unexplored.

Methods

A cross-sectional study was conducted using a data set comprising 360 questions from the Cath Self Assessment Program (CathSAP) question bank. The study assessed the overall performance of Chat Generative Pre-trained Transformer (ChatGPT) and compared it with that of average test takers. Additionally, ChatGPT's responses were evaluated before and after exposure to pertinent educational materials. The primary outcome measures included ChatGPT's overall percentage score on the CathSAP examination and its performance across the examination's subsections. Statistical significance was determined using the Kruskal-Wallis equality-of-populations rank test.
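For readers unfamiliar with the Kruskal-Wallis test, the following is a minimal sketch of how such a comparison could be run in Python with SciPy. The per-section scores shown are hypothetical placeholders for illustration, not the study's data.

```python
# Minimal sketch of a Kruskal-Wallis comparison, assuming per-section
# percentage scores before and after exposure. Values are hypothetical.
from scipy.stats import kruskal

before_exposure = [50.0, 48.5, 60.2, 55.1, 52.0]  # illustrative only
after_exposure = [78.0, 75.5, 82.1, 80.3, 79.9]   # illustrative only

# kruskal() tests whether the groups come from the same population.
statistic, p_value = kruskal(before_exposure, after_exposure)
print(f"H = {statistic:.3f}, P = {p_value:.4f}")
```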

Results

ChatGPT initially achieved an overall score of 54.44% on the CathSAP examination, which improved to 79.16% after exposure to relevant textual content; this improvement was statistically significant (P = .0003). Notably, the improved score was comparable to the average score achieved by typical test takers (as reported by CathSAP). ChatGPT demonstrated proficiency in sections covering basic science, pharmacology, and miscellaneous topics, but it struggled with questions on anatomy, anatomic variants, and anatomic pathology.
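As a back-of-envelope check, the reported percentages can be mapped onto the 360-question bank. The sketch below assumes all 360 questions were scored and is illustrative only, not part of the authors' analysis.

```python
# Illustrative arithmetic: implied correct-answer counts for the
# reported scores, assuming all 360 CathSAP questions were scored.
TOTAL_QUESTIONS = 360

initial_correct = round(0.5444 * TOTAL_QUESTIONS)   # ~196 of 360
improved_correct = round(0.7916 * TOTAL_QUESTIONS)  # ~285 of 360

print(f"Initial:  {initial_correct}/{TOTAL_QUESTIONS} "
      f"({initial_correct / TOTAL_QUESTIONS:.2%})")
print(f"Improved: {improved_correct}/{TOTAL_QUESTIONS} "
      f"({improved_correct / TOTAL_QUESTIONS:.2%})")
print(f"Absolute gain: {improved_correct - initial_correct} questions")
```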

Conclusions

The study demonstrates ChatGPT’s potential for learning and adapting to medical examination scenarios, with a notable enhancement in performance after exposure to educational materials. However, limitations such as the model’s inability to process certain visual materials and potential biases in AI models warrant further consideration. These findings underscore the need for continued research to optimize the use of AI in medical education and assessment.