Dr. Google vs. Dr. ChatGPT: Exploring the Use of Artificial Intelligence in Ophthalmology by Comparing the Accuracy, Safety, and Readability of Responses to Frequently Asked Patient Questions Regarding Cataracts and Cataract Surgery.

Impact factor 1.9 · Zone 4 (Medicine) · JCR Q2 (Ophthalmology)

Seminars in Ophthalmology Pub Date : 2024-08-01 Epub Date: 2024-03-22 DOI:10.1080/08820538.2024.2326058

Samuel A Cohen, Arthur Brant, Ann Caroline Fisher, Suzann Pershing, Diana Do, Carolyn Pan

Seminars in Ophthalmology, pp. 472-479. Publication type: Journal Article. Open access: no.

Citations: 0

Abstract

Purpose: Patients are using online search modalities to learn about their eye health. While Google remains the most popular search engine, the use of large language models (LLMs) like ChatGPT has increased. Cataract surgery is the most common surgical procedure in the US, and there is limited data on the quality of online information that populates after searches related to cataract surgery on search engines such as Google and LLM platforms such as ChatGPT. We identified the most common patient frequently asked questions (FAQs) about cataracts and cataract surgery and evaluated the accuracy, safety, and readability of the answers to these questions provided by both Google and ChatGPT. We demonstrated the utility of ChatGPT in writing notes and creating patient education materials.

Methods: The top 20 FAQs related to cataracts and cataract surgery were recorded from Google. Responses to the questions provided by Google and ChatGPT were evaluated by a panel of ophthalmologists for accuracy and safety. Evaluators were also asked to distinguish between Google and LLM chatbot answers. Five validated readability indices were used to assess the readability of responses. ChatGPT was instructed to generate operative notes, post-operative instructions, and customizable patient education materials according to specific readability criteria.
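As an illustration of what a readability index measures, the sketch below computes the Flesch-Kincaid Grade Level, one widely used index. The abstract does not name the five indices the authors used, so the choice of this index is an assumption, and the function names `count_syllables` and `flesch_kincaid_grade` are hypothetical. The syllable counter is a rough vowel-group heuristic, not the method of any validated tool, so scores will differ somewhat from published calculators.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption, not a validated method):
    count runs of vowels and drop one for a trailing silent 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_kincaid_grade(text: str) -> float:
    """Flesch-Kincaid Grade Level:
    0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
    """
    # Split on sentence-ending punctuation and ignore empty fragments.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Treat runs of letters (and apostrophes) as words.
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return (
        0.39 * len(words) / len(sentences)
        + 11.8 * syllables / len(words)
        - 15.59
    )


if __name__ == "__main__":
    plain = "The eye is red. It hurts."
    dense = (
        "Phacoemulsification necessitates postoperative "
        "anti-inflammatory pharmacotherapy administration."
    )
    print(f"plain: {flesch_kincaid_grade(plain):.2f}")
    print(f"dense: {flesch_kincaid_grade(dense):.2f}")
```

A higher score corresponds to a higher US school grade level, so short sentences built from short words score lower than long, polysyllabic ones.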

Results: Responses to 20 patient FAQs generated by ChatGPT were significantly longer and written at a higher reading level than responses provided by Google (p < .001), with an average grade level of 14.8 (college level). Expert reviewers were able to correctly distinguish between a human-reviewed and a chatbot-generated response an average of 31% of the time. Google answers contained incorrect or inappropriate material 27% of the time, compared with 6% of LLM-generated answers (p < .001). When expert reviewers were asked to compare the responses directly, chatbot responses were favored (66%).

Conclusions: When comparing the responses to patients' cataract FAQs provided by ChatGPT and Google, practicing ophthalmologists overwhelmingly preferred ChatGPT responses. LLM chatbot responses were less likely to contain inaccurate information. ChatGPT represents a viable information source for eye health for patients with higher health literacy. ChatGPT may also be used by ophthalmologists to create customizable patient education materials for patients with varying health literacy.


Source journal

Seminars in Ophthalmology (Ophthalmology)

CiteScore

3.20

Self-citation rate

0.00%

Articles published

Review time

>12 weeks

Journal description: Seminars in Ophthalmology offers current, clinically oriented reviews on the diagnosis and treatment of ophthalmic disorders. Each issue focuses on a single topic, with a primary emphasis on appropriate surgical techniques.