Assessing knowledge about medical physics in language-generative AI with large language model: using the medical physicist exam.

IF 1.7 Q3 RADIOLOGY, NUCLEAR MEDICINE & MEDICAL IMAGING
Radiological Physics and Technology Pub Date : 2024-12-01 Epub Date: 2024-09-10 DOI:10.1007/s12194-024-00838-2
Noriyuki Kadoya, Kazuhiro Arai, Shohei Tanaka, Yuto Kimura, Ryota Tozuka, Keisuke Yasui, Naoki Hayashi, Yoshiyuki Katsuta, Haruna Takahashi, Koki Inoue, Keiichi Jingu
Pages: 929-937. Publication type: Journal Article.

Abstract

This study evaluated how well generative AI based on a large language model answers the Japanese medical physicist examination, with the aim of providing a benchmark of its knowledge of medical physics. We used questions from Japan's 2018, 2019, 2020, 2021, and 2022 medical physicist board examinations, which covered various question types, including multiple-choice questions, and mainly focused on general medicine and medical physics. ChatGPT-3.5 and ChatGPT-4 (OpenAI) were used, and their answers were compared with the correct ones. The average accuracy rates were 42.2 ± 2.5% (ChatGPT-3.5) and 72.7 ± 2.6% (ChatGPT-4), showing that ChatGPT-4 was more accurate than ChatGPT-3.5 (p < 0.05 in all categories except radiation-related laws and recommendations/medical ethics). Even with the more accurate model, accuracy was below 60% in two categories: radiation metrology (55.6%) and radiation-related laws and recommendations/medical ethics (40.0%). These data provide a benchmark for ChatGPT's knowledge of medical physics and can serve as basic data for developing medical physics tools built on ChatGPT (e.g., radiation therapy support tools with Japanese input).
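The scoring described in the abstract (comparing model answers with the answer key, then reporting an average accuracy with a spread across exam years) can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, the toy answers are made up and are not exam data, and reading the "±" value as a sample standard deviation across years is an assumption, since the abstract does not say how it was computed.

```python
from statistics import mean, stdev


def accuracy_percent(model_answers, answer_key):
    """Percentage of questions where the model's choice matches the key."""
    if len(model_answers) != len(answer_key):
        raise ValueError("answer lists must have the same length")
    if not answer_key:
        raise ValueError("at least one question is required")
    correct = sum(a == k for a, k in zip(model_answers, answer_key))
    return 100.0 * correct / len(answer_key)


def summarize(per_year_accuracy):
    """Mean and sample standard deviation of accuracy across exam years."""
    values = list(per_year_accuracy.values())
    return mean(values), stdev(values)


# Toy data, NOT from the study: three made-up "years" of four questions each.
toy_key = {
    "year_a": ["1", "3", "2", "5"],
    "year_b": ["2", "2", "1", "4"],
    "year_c": ["5", "1", "3", "2"],
}
toy_model = {
    "year_a": ["1", "3", "4", "5"],  # 3 of 4 correct
    "year_b": ["2", "1", "1", "3"],  # 2 of 4 correct
    "year_c": ["5", "1", "3", "2"],  # 4 of 4 correct
}

per_year = {
    year: accuracy_percent(toy_model[year], toy_key[year]) for year in toy_key
}
avg, sd = summarize(per_year)
print(f"accuracy: {avg:.1f} ± {sd:.1f}%")
```

Scoring each year separately before averaging keeps exam years with different question counts from dominating the summary, which is one plausible reason to report a per-year mean with a spread.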

Source journal: Radiological Physics and Technology (RADIOLOGY, NUCLEAR MEDICINE & MEDICAL IMAGING)
CiteScore: 3.00
Self-citation rate: 12.50%
Articles published: 40
Journal description: The purpose of the journal Radiological Physics and Technology is to provide a forum for sharing new knowledge related to research and development in radiological science and technology, including medical physics and radiological technology in diagnostic radiology, nuclear medicine, and radiation therapy among many other radiological disciplines, as well as to contribute to progress and improvement in medical practice and patient health care.