Use of ChatGPT for patient education involving HPV-associated oropharyngeal cancer
Terral A. Patel, Gillian Michaelson, Zoey Morton, Alexandria Harris, Brandon Smith, Richard Bourguillon, Eric Wu, Arturo Eguia, Jessica H. Maxwell
American Journal of Otolaryngology, 46(4), Article 104642 (published 2025-04-21). DOI: 10.1016/j.amjoto.2025.104642
Abstract
Objective
This study aims to investigate the ability of ChatGPT to generate reliably accurate responses to patient-based queries specifically regarding oropharyngeal squamous cell carcinoma (OPSCC) of the head and neck.
Study design
Retrospective review of published abstracts.
Setting
Publicly available generative artificial intelligence.
Methods
ChatGPT 3.5 (May 2024) was queried with a set of 30 questions pertaining to HPV-associated oropharyngeal cancer that the average patient may ask. This set of questions was queried a total of four times, each time preceded by a different prompt. The responses to each question set were reviewed and graded on a four-point Likert scale. A Flesch-Kincaid reading grade level was also calculated for each response.
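To make this methodology concrete, the sketch below shows one way such an evaluation could be scripted: a patient-style question is sent to a chat model under different prompt framings, and a Flesch-Kincaid grade level is computed for each response. This is an illustrative assumption, not the authors' actual workflow; the model name, the prompt wording, and the use of the openai and textstat packages are hypothetical.

```python
# Hypothetical sketch (not the study's actual pipeline): query a chat model with a
# patient-style question under different prompt framings and score each response
# with the Flesch-Kincaid reading grade level. Assumes the `openai` and `textstat`
# packages are installed and OPENAI_API_KEY is set in the environment.
from openai import OpenAI
import textstat

client = OpenAI()

# Illustrative framings; the study's exact prompts are not reproduced here.
PROMPT_FRAMINGS = {
    "no prompt": None,
    "physician friend": "Explain this to me as a physician talking to a friend.",
}

def grade_response(question: str, framing: str | None) -> tuple[str, float]:
    """Return the model's answer and its Flesch-Kincaid reading grade level."""
    messages = []
    if framing:
        messages.append({"role": "system", "content": framing})
    messages.append({"role": "user", "content": question})
    reply = client.chat.completions.create(model="gpt-3.5-turbo", messages=messages)
    answer = reply.choices[0].message.content
    # Flesch-Kincaid grade level:
    #   0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
    return answer, textstat.flesch_kincaid_grade(answer)

question = "What is HPV-associated oropharyngeal cancer?"
for name, framing in PROMPT_FRAMINGS.items():
    answer, grade = grade_response(question, framing)
    print(f"{name}: Flesch-Kincaid grade level {grade:.2f}")
```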
Results
Across all responses (n = 120), 6.6 % were graded as mostly inaccurate, 7.5 % as minorly inaccurate, 41.7 % as accurate, and 44.2 % as accurate and helpful. The average Flesch-Kincaid reading grade level was lowest for responses generated without any prompt (11.77); as expected, the highest grade level was found with the physician-friend prompt (12.97). Of the 30 references provided, 25 (83.3 %) were authentic published studies, and for 14 of those 25 (56 %), the responses accurately cited information found within the original source.
Conclusion
ChatGPT was able to produce relatively accurate responses to example patient questions, but there was a high rate of false references. In addition, the reading level of the responses was well above the Centers for Disease Control and Prevention (CDC) recommendations for the average patient.
About the journal:
Be fully informed about developments in otology, neurotology, audiology, rhinology, allergy, laryngology, speech science, bronchoesophagology, facial plastic surgery, and head and neck surgery. Featured sections include original contributions, grand rounds, current reviews, case reports and socioeconomics.