GPT-4 V 对日本全国临床工程师执业资格考试的反应分析

IF 3.5 3区医学 Q1 HEALTH CARE SCIENCES & SERVICES

Journal of Medical Systems Pub Date : 2024-09-11 DOI:10.1007/s10916-024-02103-w

Kai Ishida, Naoya Arisaka, Kiyotaka Fujii

{"title":"GPT-4 V 对日本全国临床工程师执业资格考试的反应分析","authors":"Kai Ishida, Naoya Arisaka, Kiyotaka Fujii","doi":"10.1007/s10916-024-02103-w","DOIUrl":null,"url":null,"abstract":"<p>Chat Generative Pretrained Transformer (ChatGPT; OpenAI) is a state-of-the-art large language model that can simulate human-like conversations based on user input. We evaluated the performance of GPT-4 V in the Japanese National Clinical Engineer Licensing Examination using 2,155 questions from 2012 to 2023. The average correct answer rate for all questions was 86.0%. In particular, clinical medicine, basic medicine, medical materials, biological properties, and mechanical engineering achieved a correct response rate of ≥ 90%. Conversely, medical device safety management, electrical and electronic engineering, and extracorporeal circulation obtained low correct answer rates ranging from 64.8% to 76.5%. The correct answer rates for questions that included figures/tables, required numerical calculation, figure/table ∩ calculation, and knowledge of Japanese Industrial Standards were 55.2%, 85.8%, 64.2% and 31.0%, respectively. The reason for the low correct answer rates is that ChatGPT lacked recognition of the images and knowledge of standards and laws. This study concludes that careful attention is required when using ChatGPT because several of its explanations lack the correct description.</p>","PeriodicalId":16338,"journal":{"name":"Journal of Medical Systems","volume":"12 1","pages":""},"PeriodicalIF":3.5000,"publicationDate":"2024-09-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Analysis of Responses of GPT-4 V to the Japanese National Clinical Engineer Licensing Examination\",\"authors\":\"Kai Ishida, Naoya Arisaka, Kiyotaka Fujii\",\"doi\":\"10.1007/s10916-024-02103-w\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p>Chat Generative Pretrained Transformer (ChatGPT; OpenAI) is a state-of-the-art large language model that can simulate human-like conversations based on user input. We evaluated the performance of GPT-4 V in the Japanese National Clinical Engineer Licensing Examination using 2,155 questions from 2012 to 2023. The average correct answer rate for all questions was 86.0%. In particular, clinical medicine, basic medicine, medical materials, biological properties, and mechanical engineering achieved a correct response rate of ≥ 90%. Conversely, medical device safety management, electrical and electronic engineering, and extracorporeal circulation obtained low correct answer rates ranging from 64.8% to 76.5%. The correct answer rates for questions that included figures/tables, required numerical calculation, figure/table ∩ calculation, and knowledge of Japanese Industrial Standards were 55.2%, 85.8%, 64.2% and 31.0%, respectively. The reason for the low correct answer rates is that ChatGPT lacked recognition of the images and knowledge of standards and laws. This study concludes that careful attention is required when using ChatGPT because several of its explanations lack the correct description.</p>\",\"PeriodicalId\":16338,\"journal\":{\"name\":\"Journal of Medical Systems\",\"volume\":\"12 1\",\"pages\":\"\"},\"PeriodicalIF\":3.5000,\"publicationDate\":\"2024-09-11\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Journal of Medical Systems\",\"FirstCategoryId\":\"3\",\"ListUrlMain\":\"https://doi.org/10.1007/s10916-024-02103-w\",\"RegionNum\":3,\"RegionCategory\":\"医学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"HEALTH CARE SCIENCES & SERVICES\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Medical Systems","FirstCategoryId":"3","ListUrlMain":"https://doi.org/10.1007/s10916-024-02103-w","RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"HEALTH CARE SCIENCES & SERVICES","Score":null,"Total":0}

引用次数: 0

摘要

聊天生成预训练转换器（ChatGPT；OpenAI）是一种先进的大型语言模型，可根据用户输入模拟类似人类的对话。在日本全国临床工程师执业资格考试中，我们使用 2012 年至 2023 年的 2,155 道题目对 GPT-4 V 的性能进行了评估。所有问题的平均正确率为 86.0%。其中，临床医学、基础医学、医用材料、生物特性和机械工程的正确答题率≥90%。相反，医疗设备安全管理、电子电气工程和体外循环的正确答题率较低，从 64.8% 到 76.5%不等。包含数字/表格、需要数字计算、数字/表格∩计算和日本工业标准知识的题目的正确答对率分别为 55.2%、85.8%、64.2% 和 31.0%。答对率低的原因是 ChatGPT 缺乏对图像的识别以及标准和法律知识。本研究的结论是，在使用 ChatGPT 时需要小心谨慎，因为它的一些解释缺乏正确的描述。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

Analysis of Responses of GPT-4 V to the Japanese National Clinical Engineer Licensing Examination

查看原文本刊更多论文

Analysis of Responses of GPT-4 V to the Japanese National Clinical Engineer Licensing Examination

Chat Generative Pretrained Transformer (ChatGPT; OpenAI) is a state-of-the-art large language model that can simulate human-like conversations based on user input. We evaluated the performance of GPT-4 V in the Japanese National Clinical Engineer Licensing Examination using 2,155 questions from 2012 to 2023. The average correct answer rate for all questions was 86.0%. In particular, clinical medicine, basic medicine, medical materials, biological properties, and mechanical engineering achieved a correct response rate of ≥ 90%. Conversely, medical device safety management, electrical and electronic engineering, and extracorporeal circulation obtained low correct answer rates ranging from 64.8% to 76.5%. The correct answer rates for questions that included figures/tables, required numerical calculation, figure/table ∩ calculation, and knowledge of Japanese Industrial Standards were 55.2%, 85.8%, 64.2% and 31.0%, respectively. The reason for the low correct answer rates is that ChatGPT lacked recognition of the images and knowledge of standards and laws. This study concludes that careful attention is required when using ChatGPT because several of its explanations lack the correct description.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Journal of Medical Systems 医学-卫生保健

CiteScore

11.60

自引率

1.90%

发文量

审稿时长

4.8 months

期刊介绍： Journal of Medical Systems provides a forum for the presentation and discussion of the increasingly extensive applications of new systems techniques and methods in hospital clinic and physician''s office administration; pathology radiology and pharmaceutical delivery systems; medical records storage and retrieval; and ancillary patient-support systems. The journal publishes informative articles essays and studies across the entire scale of medical systems from large hospital programs to novel small-scale medical services. Education is an integral part of this amalgamation of sciences and selected articles are published in this area. Since existing medical systems are constantly being modified to fit particular circumstances and to solve specific problems the journal includes a special section devoted to status reports on current installations.