Elliott M Sina, Daniel J Campbell, Alexander Duffy, Shreya Mandloi, Peter Benedict, Douglas Farquhar, Aykut Unsal, Gurston Nyquist
{"title":"评估将 ChatGPT 作为 COVID-19 引起的嗅觉功能障碍的患者教育工具。","authors":"Elliott M Sina, Daniel J Campbell, Alexander Duffy, Shreya Mandloi, Peter Benedict, Douglas Farquhar, Aykut Unsal, Gurston Nyquist","doi":"10.1002/oto2.70011","DOIUrl":null,"url":null,"abstract":"<p><strong>Objective: </strong>While most patients with COVID-19-induced olfactory dysfunction (OD) recover spontaneously, those with persistent OD face significant physical and psychological sequelae. ChatGPT, an artificial intelligence chatbot, has grown as a tool for patient education. This study seeks to evaluate the quality of ChatGPT-generated responses for COVID-19 OD.</p><p><strong>Study design: </strong>Quantitative observational study.</p><p><strong>Setting: </strong>Publicly available online website.</p><p><strong>Methods: </strong>ChatGPT (GPT-4) was queried 4 times with 30 identical questions. Prior to questioning, Chat-GPT was \"prompted\" to respond (1) to a patient, (2) to an eighth grader, (3) with references, and (4) no prompt. Answer accuracy was independently scored by 4 rhinologists using the Global Quality Score (GCS, range: 1-5). Proportions of responses at incremental score thresholds were compared using <i>χ</i> <sup>2</sup> analysis. Flesch-Kincaid grade level was calculated for each answer. Relationship between prompt type and grade level was assessed via analysis of variance.</p><p><strong>Results: </strong>Across all graded responses (n = 480), 364 responses (75.8%) were \"at least good\" (GCS ≥ 4). Proportions of responses that were \"at least good\" (<i>P</i> < .0001) or \"excellent\" (GCS = 5) (<i>P</i> < .0001) differed by prompt; \"at least moderate\" (GCS ≥ 3) responses did not (<i>P</i> = .687). Eighth-grade level (14.06 ± 2.3) and patient-friendly (14.33 ± 2.0) responses were significantly lower mean grade level than no prompting (<i>P</i> < .0001).</p><p><strong>Conclusion: </strong>ChatGPT provides appropriate answers to most questions on COVID-19 OD regardless of prompting. However, prompting influences response quality and grade level. ChatGPT responds at grade levels above accepted recommendations for presenting medical information to patients. Currently, ChatGPT offers significant potential for patient education as an adjunct to the conventional patient-physician relationship.</p>","PeriodicalId":19697,"journal":{"name":"OTO Open","volume":"8 3","pages":"e70011"},"PeriodicalIF":1.8000,"publicationDate":"2024-09-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11403001/pdf/","citationCount":"0","resultStr":"{\"title\":\"Evaluating ChatGPT as a Patient Education Tool for COVID-19-Induced Olfactory Dysfunction.\",\"authors\":\"Elliott M Sina, Daniel J Campbell, Alexander Duffy, Shreya Mandloi, Peter Benedict, Douglas Farquhar, Aykut Unsal, Gurston Nyquist\",\"doi\":\"10.1002/oto2.70011\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><strong>Objective: </strong>While most patients with COVID-19-induced olfactory dysfunction (OD) recover spontaneously, those with persistent OD face significant physical and psychological sequelae. ChatGPT, an artificial intelligence chatbot, has grown as a tool for patient education. 
This study seeks to evaluate the quality of ChatGPT-generated responses for COVID-19 OD.</p><p><strong>Study design: </strong>Quantitative observational study.</p><p><strong>Setting: </strong>Publicly available online website.</p><p><strong>Methods: </strong>ChatGPT (GPT-4) was queried 4 times with 30 identical questions. Prior to questioning, Chat-GPT was \\\"prompted\\\" to respond (1) to a patient, (2) to an eighth grader, (3) with references, and (4) no prompt. Answer accuracy was independently scored by 4 rhinologists using the Global Quality Score (GCS, range: 1-5). Proportions of responses at incremental score thresholds were compared using <i>χ</i> <sup>2</sup> analysis. Flesch-Kincaid grade level was calculated for each answer. Relationship between prompt type and grade level was assessed via analysis of variance.</p><p><strong>Results: </strong>Across all graded responses (n = 480), 364 responses (75.8%) were \\\"at least good\\\" (GCS ≥ 4). Proportions of responses that were \\\"at least good\\\" (<i>P</i> < .0001) or \\\"excellent\\\" (GCS = 5) (<i>P</i> < .0001) differed by prompt; \\\"at least moderate\\\" (GCS ≥ 3) responses did not (<i>P</i> = .687). Eighth-grade level (14.06 ± 2.3) and patient-friendly (14.33 ± 2.0) responses were significantly lower mean grade level than no prompting (<i>P</i> < .0001).</p><p><strong>Conclusion: </strong>ChatGPT provides appropriate answers to most questions on COVID-19 OD regardless of prompting. However, prompting influences response quality and grade level. ChatGPT responds at grade levels above accepted recommendations for presenting medical information to patients. Currently, ChatGPT offers significant potential for patient education as an adjunct to the conventional patient-physician relationship.</p>\",\"PeriodicalId\":19697,\"journal\":{\"name\":\"OTO Open\",\"volume\":\"8 3\",\"pages\":\"e70011\"},\"PeriodicalIF\":1.8000,\"publicationDate\":\"2024-09-15\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11403001/pdf/\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"OTO Open\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1002/oto2.70011\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"2024/7/1 0:00:00\",\"PubModel\":\"eCollection\",\"JCR\":\"Q2\",\"JCRName\":\"OTORHINOLARYNGOLOGY\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"OTO Open","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1002/oto2.70011","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2024/7/1 0:00:00","PubModel":"eCollection","JCR":"Q2","JCRName":"OTORHINOLARYNGOLOGY","Score":null,"Total":0}
Evaluating ChatGPT as a Patient Education Tool for COVID-19-Induced Olfactory Dysfunction.
Objective: While most patients with COVID-19-induced olfactory dysfunction (OD) recover spontaneously, those with persistent OD face significant physical and psychological sequelae. ChatGPT, an artificial intelligence chatbot, has grown in popularity as a tool for patient education. This study evaluates the quality of ChatGPT-generated responses to questions about COVID-19-induced OD.
Study design: Quantitative observational study.
Setting: Publicly available website.
Methods: ChatGPT (GPT-4) was queried 4 times with 30 identical questions. Prior to questioning, ChatGPT was "prompted" to respond (1) as if to a patient, (2) as if to an eighth grader, (3) with references, and (4) with no prompt. Answer accuracy was independently scored by 4 rhinologists using the Global Quality Score (GQS, range: 1-5). Proportions of responses at incremental score thresholds were compared using χ2 analysis. The Flesch-Kincaid grade level was calculated for each answer. The relationship between prompt type and grade level was assessed via analysis of variance.
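A minimal sketch of the readability and prompt-effect analyses described above is shown below. This is not the study's code: the textstat and scipy libraries are assumed, and the sample text and grade-level values are hypothetical placeholders (in the study, each prompt condition produced 30 graded responses).

```python
# Minimal sketch, not the study's code. The sample text and grade-level values
# are hypothetical placeholders; in the study each prompt condition yielded
# 30 graded responses.
import textstat          # pip install textstat
from scipy import stats  # pip install scipy

# Flesch-Kincaid grade level for a single (hypothetical) response:
# 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
sample = ("Most people who lose their sense of smell after COVID-19 "
          "recover it on their own within a few weeks.")
print(textstat.flesch_kincaid_grade(sample))

# One-way ANOVA across prompt types on (hypothetical) per-response grade levels
grade_levels = {
    "patient_friendly": [14.1, 13.8, 14.9, 15.0],
    "eighth_grade":     [13.5, 14.2, 14.8, 13.9],
    "references":       [16.0, 15.4, 16.8, 15.9],
    "no_prompt":        [16.2, 15.8, 17.1, 16.4],
}
f_stat, p_value = stats.f_oneway(*grade_levels.values())
print(f"ANOVA across prompts: F = {f_stat:.2f}, P = {p_value:.4f}")
```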
Results: Across all graded responses (n = 480), 364 responses (75.8%) were "at least good" (GQS ≥ 4). Proportions of responses that were "at least good" (P < .0001) or "excellent" (GQS = 5) (P < .0001) differed by prompt; proportions that were "at least moderate" (GQS ≥ 3) did not (P = .687). Responses prompted at an eighth-grade level (mean grade level 14.06 ± 2.3) and in patient-friendly language (14.33 ± 2.0) had significantly lower mean grade levels than unprompted responses (P < .0001).
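As a rough illustration of the threshold comparison reported above, the sketch below runs a χ2 test on a contingency table of "at least good" counts per prompt type. The per-prompt split is invented for illustration (only the pooled 364/480 figure is reported above), so the printed statistics are not the study's results.

```python
# Minimal sketch of the chi-square comparison of "at least good" (GQS >= 4)
# proportions across the 4 prompt types. The per-prompt counts are hypothetical
# placeholders (only the pooled 364/480 figure is reported); 120 responses per prompt.
from scipy.stats import chi2_contingency

# Rows: prompt types; columns: [GQS >= 4, GQS < 4]
contingency = [
    [105, 15],  # patient-friendly prompt
    [100, 20],  # eighth-grade prompt
    [95, 25],   # references prompt
    [64, 56],   # no prompt
]
chi2, p_value, dof, expected = chi2_contingency(contingency)
print(f"chi2 = {chi2:.2f}, dof = {dof}, P = {p_value:.4f}")
```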
Conclusion: ChatGPT provides appropriate answers to most questions on COVID-19 OD regardless of prompting. However, prompting influences response quality and grade level. ChatGPT responds at grade levels above accepted recommendations for presenting medical information to patients. Currently, ChatGPT offers significant potential for patient education as an adjunct to the conventional patient-physician relationship.