Elliott M Sina, Daniel J Campbell, Alexander Duffy, Shreya Mandloi, Peter Benedict, Douglas Farquhar, Aykut Unsal, Gurston Nyquist
{"title":"评估将 ChatGPT 作为 COVID-19 引起的嗅觉功能障碍的患者教育工具。","authors":"Elliott M Sina, Daniel J Campbell, Alexander Duffy, Shreya Mandloi, Peter Benedict, Douglas Farquhar, Aykut Unsal, Gurston Nyquist","doi":"10.1002/oto2.70011","DOIUrl":null,"url":null,"abstract":"<p><strong>Objective: </strong>While most patients with COVID-19-induced olfactory dysfunction (OD) recover spontaneously, those with persistent OD face significant physical and psychological sequelae. ChatGPT, an artificial intelligence chatbot, has grown as a tool for patient education. This study seeks to evaluate the quality of ChatGPT-generated responses for COVID-19 OD.</p><p><strong>Study design: </strong>Quantitative observational study.</p><p><strong>Setting: </strong>Publicly available online website.</p><p><strong>Methods: </strong>ChatGPT (GPT-4) was queried 4 times with 30 identical questions. Prior to questioning, Chat-GPT was \"prompted\" to respond (1) to a patient, (2) to an eighth grader, (3) with references, and (4) no prompt. Answer accuracy was independently scored by 4 rhinologists using the Global Quality Score (GCS, range: 1-5). Proportions of responses at incremental score thresholds were compared using <i>χ</i> <sup>2</sup> analysis. Flesch-Kincaid grade level was calculated for each answer. Relationship between prompt type and grade level was assessed via analysis of variance.</p><p><strong>Results: </strong>Across all graded responses (n = 480), 364 responses (75.8%) were \"at least good\" (GCS ≥ 4). Proportions of responses that were \"at least good\" (<i>P</i> < .0001) or \"excellent\" (GCS = 5) (<i>P</i> < .0001) differed by prompt; \"at least moderate\" (GCS ≥ 3) responses did not (<i>P</i> = .687). Eighth-grade level (14.06 ± 2.3) and patient-friendly (14.33 ± 2.0) responses were significantly lower mean grade level than no prompting (<i>P</i> < .0001).</p><p><strong>Conclusion: </strong>ChatGPT provides appropriate answers to most questions on COVID-19 OD regardless of prompting. However, prompting influences response quality and grade level. ChatGPT responds at grade levels above accepted recommendations for presenting medical information to patients. Currently, ChatGPT offers significant potential for patient education as an adjunct to the conventional patient-physician relationship.</p>","PeriodicalId":19697,"journal":{"name":"OTO Open","volume":"8 3","pages":"e70011"},"PeriodicalIF":1.8000,"publicationDate":"2024-09-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11403001/pdf/","citationCount":"0","resultStr":"{\"title\":\"Evaluating ChatGPT as a Patient Education Tool for COVID-19-Induced Olfactory Dysfunction.\",\"authors\":\"Elliott M Sina, Daniel J Campbell, Alexander Duffy, Shreya Mandloi, Peter Benedict, Douglas Farquhar, Aykut Unsal, Gurston Nyquist\",\"doi\":\"10.1002/oto2.70011\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><strong>Objective: </strong>While most patients with COVID-19-induced olfactory dysfunction (OD) recover spontaneously, those with persistent OD face significant physical and psychological sequelae. ChatGPT, an artificial intelligence chatbot, has grown as a tool for patient education. 
This study seeks to evaluate the quality of ChatGPT-generated responses for COVID-19 OD.</p><p><strong>Study design: </strong>Quantitative observational study.</p><p><strong>Setting: </strong>Publicly available online website.</p><p><strong>Methods: </strong>ChatGPT (GPT-4) was queried 4 times with 30 identical questions. Prior to questioning, Chat-GPT was \\\"prompted\\\" to respond (1) to a patient, (2) to an eighth grader, (3) with references, and (4) no prompt. Answer accuracy was independently scored by 4 rhinologists using the Global Quality Score (GCS, range: 1-5). Proportions of responses at incremental score thresholds were compared using <i>χ</i> <sup>2</sup> analysis. Flesch-Kincaid grade level was calculated for each answer. Relationship between prompt type and grade level was assessed via analysis of variance.</p><p><strong>Results: </strong>Across all graded responses (n = 480), 364 responses (75.8%) were \\\"at least good\\\" (GCS ≥ 4). Proportions of responses that were \\\"at least good\\\" (<i>P</i> < .0001) or \\\"excellent\\\" (GCS = 5) (<i>P</i> < .0001) differed by prompt; \\\"at least moderate\\\" (GCS ≥ 3) responses did not (<i>P</i> = .687). Eighth-grade level (14.06 ± 2.3) and patient-friendly (14.33 ± 2.0) responses were significantly lower mean grade level than no prompting (<i>P</i> < .0001).</p><p><strong>Conclusion: </strong>ChatGPT provides appropriate answers to most questions on COVID-19 OD regardless of prompting. However, prompting influences response quality and grade level. ChatGPT responds at grade levels above accepted recommendations for presenting medical information to patients. Currently, ChatGPT offers significant potential for patient education as an adjunct to the conventional patient-physician relationship.</p>\",\"PeriodicalId\":19697,\"journal\":{\"name\":\"OTO Open\",\"volume\":\"8 3\",\"pages\":\"e70011\"},\"PeriodicalIF\":1.8000,\"publicationDate\":\"2024-09-15\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11403001/pdf/\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"OTO Open\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1002/oto2.70011\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"2024/7/1 0:00:00\",\"PubModel\":\"eCollection\",\"JCR\":\"Q2\",\"JCRName\":\"OTORHINOLARYNGOLOGY\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"OTO Open","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1002/oto2.70011","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2024/7/1 0:00:00","PubModel":"eCollection","JCR":"Q2","JCRName":"OTORHINOLARYNGOLOGY","Score":null,"Total":0}
Evaluating ChatGPT as a Patient Education Tool for COVID-19-Induced Olfactory Dysfunction.
Objective: While most patients with COVID-19-induced olfactory dysfunction (OD) recover spontaneously, those with persistent OD face significant physical and psychological sequelae. ChatGPT, an artificial intelligence chatbot, has grown in popularity as a tool for patient education. This study evaluates the quality of ChatGPT-generated responses to questions about COVID-19-induced OD.
Study design: Quantitative observational study.
Setting: Publicly available website.
Methods: ChatGPT (GPT-4) was queried 4 times with 30 identical questions. Prior to questioning, ChatGPT was "prompted" to respond (1) as if to a patient, (2) as if to an eighth grader, (3) with references, and (4) with no prompt. Answer accuracy was independently scored by 4 rhinologists using the Global Quality Score (GQS, range: 1-5). Proportions of responses at incremental score thresholds were compared using χ2 analysis. The Flesch-Kincaid grade level was calculated for each answer. The relationship between prompt type and grade level was assessed via analysis of variance.
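A minimal sketch of the readability and prompt-effect analyses described above is shown below. This is not the study's code: the textstat and scipy libraries are assumed, and the sample text and grade-level values are hypothetical placeholders (in the study, each prompt condition produced 30 graded responses).

```python
# Minimal sketch, not the study's code. The sample text and grade-level values
# are hypothetical placeholders; in the study each prompt condition yielded
# 30 graded responses.
import textstat          # pip install textstat
from scipy import stats  # pip install scipy

# Flesch-Kincaid grade level for a single (hypothetical) response:
# 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
sample = ("Most people who lose their sense of smell after COVID-19 "
          "recover it on their own within a few weeks.")
print(textstat.flesch_kincaid_grade(sample))

# One-way ANOVA across prompt types on (hypothetical) per-response grade levels
grade_levels = {
    "patient_friendly": [14.1, 13.8, 14.9, 15.0],
    "eighth_grade":     [13.5, 14.2, 14.8, 13.9],
    "references":       [16.0, 15.4, 16.8, 15.9],
    "no_prompt":        [16.2, 15.8, 17.1, 16.4],
}
f_stat, p_value = stats.f_oneway(*grade_levels.values())
print(f"ANOVA across prompts: F = {f_stat:.2f}, P = {p_value:.4f}")
```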
Results: Across all graded responses (n = 480), 364 responses (75.8%) were "at least good" (GQS ≥ 4). Proportions of responses that were "at least good" (P < .0001) or "excellent" (GQS = 5) (P < .0001) differed by prompt; proportions that were "at least moderate" (GQS ≥ 3) did not (P = .687). Responses prompted at an eighth-grade level (mean grade level 14.06 ± 2.3) and in patient-friendly language (14.33 ± 2.0) had significantly lower mean grade levels than unprompted responses (P < .0001).
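As a rough illustration of the threshold comparison reported above, the sketch below runs a χ2 test on a contingency table of "at least good" counts per prompt type. The per-prompt split is invented for illustration (only the pooled 364/480 figure is reported above), so the printed statistics are not the study's results.

```python
# Minimal sketch of the chi-square comparison of "at least good" (GQS >= 4)
# proportions across the 4 prompt types. The per-prompt counts are hypothetical
# placeholders (only the pooled 364/480 figure is reported); 120 responses per prompt.
from scipy.stats import chi2_contingency

# Rows: prompt types; columns: [GQS >= 4, GQS < 4]
contingency = [
    [105, 15],  # patient-friendly prompt
    [100, 20],  # eighth-grade prompt
    [95, 25],   # references prompt
    [64, 56],   # no prompt
]
chi2, p_value, dof, expected = chi2_contingency(contingency)
print(f"chi2 = {chi2:.2f}, dof = {dof}, P = {p_value:.4f}")
```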
Conclusion: ChatGPT provides appropriate answers to most questions on COVID-19 OD regardless of prompting. However, prompting influences response quality and grade level. ChatGPT responds at grade levels above accepted recommendations for presenting medical information to patients. Currently, ChatGPT offers significant potential for patient education as an adjunct to the conventional patient-physician relationship.