Assessing ChatGPT for Clinical Decision-Making in Radiation Oncology, With Open-Ended Questions and Images.

IF 3.5, CAS Tier 3 (Medicine), Q2 ONCOLOGY
Wei-Kai Chuang MD, Yung-Shuo Kao MD, Yen-Ting Liu MD, Cho-Yin Lee MD, PhD
Citations: 0

Abstract

Assessing ChatGPT for Clinical Decision-Making in Radiation Oncology, With Open-Ended Questions and Images

Purpose

This study assesses the practicality and correctness of Chat Generative Pre-trained Transformer (ChatGPT)-4 and 4O’s answers to clinical inquiries in radiation oncology, and evaluates ChatGPT-4O for staging nasopharyngeal carcinoma (NPC) cases with magnetic resonance (MR) images.

Methods and Materials

A total of 164 open-ended questions covering representative professional domains (Clinical_G: knowledge on standardized guidelines; Clinical_C: complex clinical scenarios; Nursing: nursing and health education; and Technology: radiation technology and dosimetry) were prospectively formulated by experts and presented to ChatGPT-4 and 4O. Each ChatGPT’s answer was graded as 1 (Directly practical for clinical decision-making), 2 (Correct but inadequate), 3 (Mixed with correct and incorrect information), or 4 (Completely incorrect). ChatGPT-4O was presented with the representative diagnostic MR images of 20 patients with NPC across different T stages, and asked to determine the T stage of each case.
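
As an illustration of how such graded responses could be analyzed, the following is a minimal sketch only: the per-domain counts are hypothetical placeholders rather than the study data, and the abstract does not state which statistical test the authors applied, so a chi-square test is shown here as one common choice for comparing proportions across domains.

```python
# Minimal sketch: tally hypothetical grade counts per domain and test whether
# the proportion of grade-1 (directly practical) answers differs across domains.
# The numbers are illustrative placeholders, not the study's data.
from scipy.stats import chi2_contingency

domains = ["Clinical_G", "Clinical_C", "Nursing", "Technology"]
grade1 = [33, 22, 37, 26]   # hypothetical counts of answers graded 1
other = [7, 18, 3, 14]      # hypothetical counts of answers graded 2-4

for name, g1, rest in zip(domains, grade1, other):
    print(f"{name}: {g1 / (g1 + rest):.1%} of answers directly practical")

# 2 x 4 contingency table: rows = grade-1 vs other grades, columns = domains
chi2, p_value, dof, expected = chi2_contingency([grade1, other])
print(f"chi-square = {chi2:.2f}, p = {p_value:.4f}")
```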

Results

The proportions of ChatGPT’s answers that were practical (grade 1) varied across professional domains (P < .01), higher in Nursing (GPT-4: 91.9%; GPT-4O: 94.6%) and Clinical_G (GPT-4: 82.2%; GPT-4O: 88.9%) domains than in Clinical_C (GPT-4: 54.1%; GPT-4O: 62.2%) and Technology (GPT-4: 64.4%; GPT-4O: 77.8%) domains. The proportions of correct (grade 1+2) answers (GPT-4: 89.6%; GPT-4O: 98.8%; P < .01) were universally high across all professional domains. However, ChatGPT-4O failed to stage NPC cases via MR images, indiscriminately assigning T4 to all actually non-T4 cases (κ = 0; 95% CI, −0.253 to 0.253).
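
To make the reported agreement statistic concrete, the sketch below uses assumed reference stages, not the authors' code or the study's 20 cases. It illustrates why assigning the same stage to every case yields Cohen's kappa of exactly 0: the observed agreement then equals the agreement expected by chance.

```python
# Minimal sketch: Cohen's kappa when every case is labeled T4.
# The ground-truth stages below are hypothetical, not the study data.
from sklearn.metrics import cohen_kappa_score

truth = ["T1"] * 5 + ["T2"] * 5 + ["T3"] * 5 + ["T4"] * 5  # hypothetical reference stages
predicted = ["T4"] * 20                                     # model assigns T4 to every case

kappa = cohen_kappa_score(truth, predicted)
print(f"Cohen's kappa = {kappa:.3f}")  # 0.000: no agreement beyond chance
```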

Conclusions

ChatGPT could be a safe clinical decision-support tool in radiation oncology, because it correctly answered the vast majority of clinical inquiries across professional domains. However, its clinical practicality should be cautiously weighed, particularly in the Clinical_C and Technology domains. ChatGPT-4O is not yet mature enough to interpret diagnostic images for cancer staging.
Source Journal
Practical Radiation Oncology
Field: Medicine-Radiology, Nuclear Medicine and Imaging
CiteScore: 5.20
Self-citation rate: 6.10%
Articles per year: 177
Review time: 34 days
About the journal: The overarching mission of Practical Radiation Oncology is to improve the quality of radiation oncology practice. PRO's purpose is to document the state of current practice, providing background for those in training and continuing education for practitioners, through discussion and illustration of new techniques, evaluation of current practices, and publication of case reports. PRO strives to provide its readers content that emphasizes knowledge "with a purpose." The content of PRO includes:
Original articles focusing on patient safety, quality measurement, or quality improvement initiatives
Original articles focusing on imaging, contouring, target delineation, simulation, treatment planning, immobilization, organ motion, and other practical issues
ASTRO guidelines, position papers, and consensus statements
Essays that highlight enriching personal experiences in caring for cancer patients and their families.