Readability of custom chatbot vs. GPT-4 responses to otolaryngology-related patient questions.

Impact Factor: 1.7 · JCR Q2 (Otorhinolaryngology) · CAS Tier 4 (Medicine)
American Journal of Otolaryngology · Pub Date: 2025-09-01 (Epub: 2025-08-06) · DOI: 10.1016/j.amjoto.2025.104717
Yossef Alsabawi, Pompeyo R Quesada, David T Rouse
{"title":"自定义聊天机器人的可读性与GPT-4对耳鼻喉科相关患者问题的反应。","authors":"Yossef Alsabawi, Pompeyo R Quesada, David T Rouse","doi":"10.1016/j.amjoto.2025.104717","DOIUrl":null,"url":null,"abstract":"<p><strong>Background: </strong>Low health literacy among patients hinders comprehension of care instructions and worsens outcomes, yet most otolaryngology patient materials and chatbot responses to medical inquiries exceed the recommended reading level of sixth- to eighth-grade. Whether chatbots can be pre-programmed to provide accurate, plain-language responses has yet to be studied. This study aims to compare response readability of a GPT model customized for plain language with GPT-4 when answering common otolaryngology patient questions.</p><p><strong>Methods: </strong>A custom GPT was created and provided thirty-three questions from Polat et al. (Int J Pediatr Otorhinolaryngol., 2024), and their GPT-4 answers were reused with permission. Questions were grouped by theme. Readability was calculated with Flesch-Kincaid Grade Level (FKGL) and Flesch Reading Ease (FRE) via online calculator. A board-certified, practicing otolaryngologist assessed content similarity and accuracy. The primary outcome was readability, measured by FKGL (0-18; equivalent to United States grade level) and FRE (0-100; higher scores indicate greater readability).</p><p><strong>Results: </strong>The custom GPT reduced FKGL by an average of 4.2 grade levels (95 % confidence interval [CI]: 3.2, 5.1; p < 0.001) and increased FRE by an average of 17.3 points (95 % CI: 12.5, 21.7; p < 0.001). Improvements remained significant in three of four theme subgroups (p < 0.05). Readability was consistent across question types, and variances were equal between models. Expert review confirmed overall accuracy and content similarity.</p><p><strong>Conclusion: </strong>Preprogramming a custom GPT to generate plain-language instructions yields outputs that meet Centers for Medicare & Medicaid Services readability targets without significantly compromising content quality. Tailored chatbots could enhance patient communication in otolaryngology clinics and other medical settings.</p>","PeriodicalId":7591,"journal":{"name":"American Journal of Otolaryngology","volume":"46 5","pages":"104717"},"PeriodicalIF":1.7000,"publicationDate":"2025-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Readability of custom chatbot vs. GPT-4 responses to otolaryngology-related patient questions.\",\"authors\":\"Yossef Alsabawi, Pompeyo R Quesada, David T Rouse\",\"doi\":\"10.1016/j.amjoto.2025.104717\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><strong>Background: </strong>Low health literacy among patients hinders comprehension of care instructions and worsens outcomes, yet most otolaryngology patient materials and chatbot responses to medical inquiries exceed the recommended reading level of sixth- to eighth-grade. Whether chatbots can be pre-programmed to provide accurate, plain-language responses has yet to be studied. This study aims to compare response readability of a GPT model customized for plain language with GPT-4 when answering common otolaryngology patient questions.</p><p><strong>Methods: </strong>A custom GPT was created and provided thirty-three questions from Polat et al. (Int J Pediatr Otorhinolaryngol., 2024), and their GPT-4 answers were reused with permission. Questions were grouped by theme. 
Readability was calculated with Flesch-Kincaid Grade Level (FKGL) and Flesch Reading Ease (FRE) via online calculator. A board-certified, practicing otolaryngologist assessed content similarity and accuracy. The primary outcome was readability, measured by FKGL (0-18; equivalent to United States grade level) and FRE (0-100; higher scores indicate greater readability).</p><p><strong>Results: </strong>The custom GPT reduced FKGL by an average of 4.2 grade levels (95 % confidence interval [CI]: 3.2, 5.1; p < 0.001) and increased FRE by an average of 17.3 points (95 % CI: 12.5, 21.7; p < 0.001). Improvements remained significant in three of four theme subgroups (p < 0.05). Readability was consistent across question types, and variances were equal between models. Expert review confirmed overall accuracy and content similarity.</p><p><strong>Conclusion: </strong>Preprogramming a custom GPT to generate plain-language instructions yields outputs that meet Centers for Medicare & Medicaid Services readability targets without significantly compromising content quality. Tailored chatbots could enhance patient communication in otolaryngology clinics and other medical settings.</p>\",\"PeriodicalId\":7591,\"journal\":{\"name\":\"American Journal of Otolaryngology\",\"volume\":\"46 5\",\"pages\":\"104717\"},\"PeriodicalIF\":1.7000,\"publicationDate\":\"2025-09-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"American Journal of Otolaryngology\",\"FirstCategoryId\":\"3\",\"ListUrlMain\":\"https://doi.org/10.1016/j.amjoto.2025.104717\",\"RegionNum\":4,\"RegionCategory\":\"医学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"2025/8/6 0:00:00\",\"PubModel\":\"Epub\",\"JCR\":\"Q2\",\"JCRName\":\"OTORHINOLARYNGOLOGY\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"American Journal of Otolaryngology","FirstCategoryId":"3","ListUrlMain":"https://doi.org/10.1016/j.amjoto.2025.104717","RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2025/8/6 0:00:00","PubModel":"Epub","JCR":"Q2","JCRName":"OTORHINOLARYNGOLOGY","Score":null,"Total":0}
Citations: 0

Abstract


Background: Low health literacy among patients hinders comprehension of care instructions and worsens outcomes, yet most otolaryngology patient materials and chatbot responses to medical inquiries exceed the recommended sixth- to eighth-grade reading level. Whether chatbots can be pre-programmed to provide accurate, plain-language responses has yet to be studied. This study aims to compare the response readability of a GPT model customized for plain language with that of GPT-4 when answering common otolaryngology patient questions.
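As context for what "pre-programming" can mean in practice, here is a minimal sketch of attaching plain-language instructions to a chatbot through the OpenAI chat completions API. This is an assumption for illustration only: the study built a custom GPT (presumably via ChatGPT's GPT builder rather than the API), and the system prompt below is hypothetical, not the authors' actual configuration.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Hypothetical plain-language instructions, for illustration only.
PLAIN_LANGUAGE_PROMPT = (
    "Answer patient questions at a sixth-grade reading level. "
    "Use short sentences and everyday words, and avoid medical jargon."
)

def ask(question: str) -> str:
    # Send one patient question with the plain-language instructions attached.
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": PLAIN_LANGUAGE_PROMPT},
            {"role": "user", "content": question},
        ],
    )
    return response.choices[0].message.content

print(ask("What should I expect after a tonsillectomy?"))
```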

Methods: A custom GPT was created and given thirty-three questions from Polat et al. (Int J Pediatr Otorhinolaryngol., 2024); that study's GPT-4 answers were reused with permission. Questions were grouped by theme. Readability was calculated with the Flesch-Kincaid Grade Level (FKGL) and Flesch Reading Ease (FRE) via an online calculator. A board-certified, practicing otolaryngologist assessed content similarity and accuracy. The primary outcome was readability, measured by FKGL (0-18; equivalent to United States grade level) and FRE (0-100; higher scores indicate greater readability).
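Both metrics have closed-form formulas, so the calculation is easy to reproduce. Below is a minimal sketch of the standard Flesch formulas in Python; the study used an online calculator, and the crude vowel-run syllable counter here is an assumption that will differ slightly from any production tool.

```python
import re

def count_syllables(word: str) -> int:
    # Approximate syllables as runs of vowels, a crude but common heuristic.
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def readability(text: str) -> tuple[float, float]:
    # Return (FKGL, FRE) using the standard Flesch formulas.
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z']+", text)
    n_words = max(1, len(words))
    syllables = sum(count_syllables(w) for w in words)
    wps = n_words / sentences      # average words per sentence
    spw = syllables / n_words      # average syllables per word
    fkgl = 0.39 * wps + 11.8 * spw - 15.59      # Flesch-Kincaid Grade Level
    fre = 206.835 - 1.015 * wps - 84.6 * spw    # Flesch Reading Ease
    return fkgl, fre

fkgl, fre = readability("Take one pill each morning. Call us if you feel dizzy.")
print(f"FKGL {fkgl:.1f}, FRE {fre:.1f}")
```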

Results: The custom GPT reduced FKGL by an average of 4.2 grade levels (95% confidence interval [CI]: 3.2, 5.1; p < 0.001) and increased FRE by an average of 17.3 points (95% CI: 12.5, 21.7; p < 0.001). Improvements remained significant in three of four theme subgroups (p < 0.05). Readability was consistent across question types, and variances were equal between models. Expert review confirmed overall accuracy and content similarity.
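For readers who want to reproduce this kind of comparison, here is a minimal sketch of a paired analysis in Python with SciPy. The abstract does not name the exact tests used, so the paired t-test and Levene's test below are assumptions, and the per-question scores are made-up placeholders, not study data.

```python
import numpy as np
from scipy import stats

# Hypothetical per-question FKGL scores; placeholders, not study data.
custom_fkgl = np.array([5.1, 6.0, 4.8, 5.5, 6.2])
gpt4_fkgl = np.array([9.4, 10.1, 9.0, 9.8, 10.6])

diff = gpt4_fkgl - custom_fkgl  # grade-level reduction per question
t_stat, p_val = stats.ttest_rel(gpt4_fkgl, custom_fkgl)  # paired t-test

# 95% confidence interval for the mean reduction in grade level
ci_low, ci_high = stats.t.interval(0.95, len(diff) - 1,
                                   loc=diff.mean(), scale=stats.sem(diff))

# Check the "variances were equal between models" claim with Levene's test
lev_stat, lev_p = stats.levene(custom_fkgl, gpt4_fkgl)

print(f"mean reduction: {diff.mean():.1f} grade levels "
      f"(95% CI {ci_low:.1f}, {ci_high:.1f}; p = {p_val:.4g})")
print(f"Levene's test p = {lev_p:.3f} (a large p is consistent with equal variances)")
```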

Conclusion: Preprogramming a custom GPT to generate plain-language instructions yields outputs that meet Centers for Medicare & Medicaid Services readability targets without significantly compromising content quality. Tailored chatbots could enhance patient communication in otolaryngology clinics and other medical settings.

Source journal: American Journal of Otolaryngology (Medicine / Otorhinolaryngology)
CiteScore: 4.40 · Self-citation rate: 4.00% · Annual publications: 378 · Review time: 41 days
Journal introduction: Be fully informed about developments in otology, neurotology, audiology, rhinology, allergy, laryngology, speech science, bronchoesophagology, facial plastic surgery, and head and neck surgery. Featured sections include original contributions, grand rounds, current reviews, case reports and socioeconomics.