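Diagnostic Accuracy of Vision-Language Models on Japanese Diagnostic Radiology, Nuclear Medicine, and Interventional Radiology Specialty Board Examinations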

medRxiv - Radiology and Imaging. Pub Date: 2024-05-31. DOI: 10.1101/2024.05.31.24308072

Tatsushi Oura, Hiroyuki Tatekawa, Daisuke Horiuchi, Shu Matsushita, Hirotaka Takita, Natsuko Atsukawa, Yasuhito Mitsuyama, Atsushi Yoshida, Kazuki Murai, Rikako Tanaka, Taro Shimono, Akira Yamamoto, Yukio Miki, Daiju Ueda



Abstract

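Purpose: The performance of vision-language models (VLMs) with image interpretation capabilities, such as GPT-4 omni (GPT-4o), GPT-4 vision (GPT-4V), and Claude-3, has not been compared and remains unexplored in specialized radiological fields, including nuclear medicine and interventional radiology. This study aimed to evaluate and compare the diagnostic accuracy of various VLMs, including GPT-4 + GPT-4V, GPT-4o, Claude-3 Sonnet, and Claude-3 Opus, using Japanese diagnostic radiology, nuclear medicine, and interventional radiology (JDR, JNM, and JIR, respectively) board certification tests.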




