Artificial intelligence versus physical medicine and rehabilitation residents: Can ChatGPT compete in clinical exam performance?

Impact factor: 2.8; Zone 4 (Medicine); JCR Q1 (Rehabilitation)
PM&R Pub Date : 2025-10-03 DOI:10.1002/pmrj.70032
Aylin Ayyıldız, Selda Çiftci İnceoğlu, Banu Kuran, Kadriye Öneş
Citations: 0

Abstract

Background: Artificial intelligence has begun to replace human labor in many fields.

Objective: To assess the performance of Chat Generative Pretrained Transformer (ChatGPT) on examinations administered to physical medicine and rehabilitation (PM&R) residents.

Design: Cross-sectional study.

Setting: Tertiary-care training and research hospital, department of physical medicine and rehabilitation.

Participants: ChatGPT-4o and PM&R residents.

Intervention: ChatGPT was presented with questions from the nationwide in-training exams administered to PM&R residents in each postgraduate year. The exam is a national requirement for the majority of PM&R residents in Turkey and is administered annually.

Main outcome measures: The responses to these multiple-choice questions were scored as correct or incorrect, and ChatGPT's performance was then compared with that of the residents in each postgraduate year (PGY). The time taken by ChatGPT to answer each question was also recorded. Additionally, its learning ability was assessed by re-asking the questions it had initially answered incorrectly, this time supplying the correct answers, and evaluating the improvement.
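The comparison described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' analysis: the function names, the 10-point band width, the strict-less-than ranking rule, and the definition of learning rate as the share of initially incorrect questions answered correctly on re-asking are all assumptions, and any resident scores passed in would be hypothetical.

```python
from bisect import bisect_left


def percentile_band(score, resident_scores, width=10):
    """Return the (low, high) percentile band that `score` falls into.

    Assumption: the percentile is the share of residents who scored
    strictly below `score`, grouped into bands of `width` points.
    """
    if not resident_scores:
        raise ValueError("resident_scores must not be empty")
    ranked = sorted(resident_scores)
    below = bisect_left(ranked, score)  # count of residents scoring lower
    pct = 100.0 * below / len(ranked)
    # Clamp so that a top score lands in the highest band, not above it.
    low = min(int(pct // width) * width, 100 - width)
    return low, low + width


def learning_rate(initially_wrong, corrected_on_retry):
    """Percentage of initially incorrect questions answered correctly
    when re-asked (an assumed definition, not taken from the abstract)."""
    if initially_wrong <= 0:
        raise ValueError("initially_wrong must be positive")
    return 100.0 * corrected_on_retry / initially_wrong
```

Under these assumptions, 13 corrected answers out of 20 initial errors would give `learning_rate(20, 13) == 65.0`. The counts are invented for illustration and are not reported in the abstract.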

Results: ChatGPT scored 88 out of 100 points on the PGY1 exam, 84 on the PGY2 exam, 78 on the PGY3 exam, and 80 on the PGY4 exam. Compared with the performance distribution of residents, ChatGPT ranked in the 40th-50th percentile for PGY1, the 70th-80th percentile for PGY2, the 30th-40th percentile for PGY3, and the 40th-50th percentile for PGY4. When the questions it had initially answered incorrectly were re-asked, ChatGPT achieved a learning rate of 65%.

Conclusion: Although ChatGPT has the potential to surpass PM&R physicians in learning capability and breadth of knowledge, several functional limitations remain. In its current form, it cannot replace a physician, especially in PM&R, where clinical examination and patient interaction play a critical role.

Source journal
PM&R (Rehabilitation; Sport Sciences)
CiteScore: 4.30
Self-citation rate: 4.80%
Articles published: 187
Review time: 4-8 weeks
Journal description: Topics covered include acute and chronic musculoskeletal disorders and pain, neurologic conditions involving the central and peripheral nervous systems, rehabilitation of impairments associated with disabilities in adults and children, and neurophysiology and electrodiagnosis. PM&R emphasizes principles of injury, function, and rehabilitation, and is designed to be relevant to practitioners and researchers in a variety of medical and surgical specialties and rehabilitation disciplines, including allied health.