Evaluating the interpretation of neck CT/MRI scans by ChatGPT-4V for detecting primary oropharyngeal squamous cell carcinoma: An exploratory study.

IF 2.4 | Zone 4, Medicine | Q2, Otorhinolaryngology

European Annals of Otorhinolaryngology-Head and Neck Diseases Pub Date : 2026-04-29 DOI:10.1016/j.anorl.2026.02.012

B Schmidl, R Walter, C C Hoch, T Huetten, S Pigorsch, F Stögbauer, T Hussain, B Wollenberg, M Wirth


Citations: 0

Abstract

Objectives: Early and accurate detection of head and neck squamous cell carcinoma, including the subset of oropharyngeal squamous cell carcinoma (OPSCC), is essential for patients' therapy and prognosis. Computed tomography (CT) is the primary imaging modality and is currently evaluated manually by radiologists and head and neck oncologists. Image recognition was recently introduced to the large language model (LLM) ChatGPT with ChatGPT-4V. This exploratory study is the first to evaluate ChatGPT's image recognition in interpreting neck CT and MRI scans for OPSCC detection, using scans with OPSCC and corresponding scans without any oropharyngeal lesion.

Materials and methods: ChatGPT-4V was asked for the most likely diagnosis based on the images of 100 CT cases (50 OPSCC, 50 without a lesion) and of the 62 cases for which corresponding MRI was available (31 OPSCC, 31 without an oropharyngeal lesion). Two independent reviewers rated its answers, and overall performance was evaluated in terms of accuracy, sensitivity, and specificity.

Results: In this study, ChatGPT-4V reached a sensitivity of 72% and a specificity of 78% in identifying OPSCC from CT images. For MRI scans, sensitivity was 80.6% and specificity 83.9%. Human papillomavirus-positive and more advanced lesions were detected more reliably.

Discussion: In this exploratory study of CT and MRI neck scans of the oropharynx, ChatGPT-4V showed mediocre performance in detecting OPSCC. Continued research and advances in AI are essential to improve the reliability and clinical utility of LLMs for the interpretation of neck scans.
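The sensitivity and specificity figures in the Results can be reproduced from a 2x2 confusion table. The sketch below is illustrative only: the case counts are not stated in the abstract but are back-calculated here from the reported percentages and group sizes (for example, 72% of 50 OPSCC CT cases gives 36 detected), assuming every case received a rateable answer. The class name Confusion and the derived accuracy values are likewise assumptions of this sketch, not figures quoted from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """2x2 confusion table for a binary detection task."""
    tp: int  # OPSCC cases in which the tumor was detected
    fn: int  # OPSCC cases in which the tumor was missed
    tn: int  # lesion-free cases correctly rated as negative
    fp: int  # lesion-free cases wrongly flagged as OPSCC

    @property
    def sensitivity(self) -> float:
        # Share of true OPSCC cases that were detected
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self) -> float:
        # Share of lesion-free cases that were correctly cleared
        return self.tn / (self.tn + self.fp)

    @property
    def accuracy(self) -> float:
        # Share of all cases that were classified correctly
        total = self.tp + self.fn + self.tn + self.fp
        return (self.tp + self.tn) / total


# Counts inferred from the reported rates and group sizes (assumption,
# not data taken from the paper).
ct = Confusion(tp=36, fn=14, tn=39, fp=11)   # 50 OPSCC, 50 without lesion
mri = Confusion(tp=25, fn=6, tn=26, fp=5)    # 31 OPSCC, 31 without lesion

for name, c in (("CT", ct), ("MRI", mri)):
    print(
        f"{name}: sensitivity={c.sensitivity:.1%} "
        f"specificity={c.specificity:.1%} accuracy={c.accuracy:.1%}"
    )
```

Under these inferred counts the script prints 72.0% and 78.0% for CT and 80.6% and 83.9% for MRI, matching the reported rates. The accuracy values it prints are derived from the same counts and should be read as a consistency check rather than as reported results.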


Source journal

European Annals of Otorhinolaryngology-Head and Neck Diseases (Otorhinolaryngology)

CiteScore: 3.70

Self-citation rate: 28.00%

Review time: 12 days

About the journal: European Annals of Oto-rhino-laryngology, Head and Neck Diseases, heir to one of the oldest otorhinolaryngology journals in Europe, is the official organ of the French Society of Otorhinolaryngology (SFORL) and the International Francophone Society of Otorhinolaryngology (SIFORL). Six issues a year provide original peer-reviewed clinical and research articles, epidemiological studies, new methodological clinical approaches, and review articles giving up-to-date insights into all areas of otology, laryngology, rhinology, and head and neck surgery. The European Annals also publishes the SFORL guidelines and recommendations. The journal is a unique two-armed publication: the European Annals (ANORL) is a well-referenced, English-language online journal (e-only), whereas the Annales Françaises d’ORL (AFORL), a French-language edition available by mail-order print and online, is aimed at the French-speaking community. French-language teams must submit their articles in French to the AFORL site. A federating journal in its field, the European Annals has an editorial board of internationally reputed experts, which allows it to make an important contribution to communication on new research data and clinical practice by publishing high-quality articles.