The Performance of Chatbots and the AAPOS Website as a Tool for Amblyopia Education

Levent Doğan, MD, Gazi Bekir Özçakmakcı, MD, İbrahim Edhem Yılmaz, MD
Journal of Pediatric Ophthalmology and Strabismus
DOI: 10.3928/01913913-20240409-01
Published online: 2024-04-25 · Citations: 0

Abstract


Purpose:

To evaluate the understandability, actionability, and readability of responses provided by the website of the American Association for Pediatric Ophthalmology and Strabismus (AAPOS), ChatGPT-3.5, Bard, and Bing Chat about amblyopia and the appropriateness of the responses generated by the chatbots.

Method:

Twenty-five questions provided by the AAPOS website were posed three times each to fresh ChatGPT-3.5, Bard, and Bing Chat sessions. Two experienced pediatric ophthalmologists categorized the chatbots' responses according to their appropriateness. The Flesch Reading Ease (FRE), Flesch-Kincaid Grade Level (FKGL), and Coleman-Liau Index (CLI) were used to evaluate the readability of the responses from the AAPOS website and the chatbots. In addition, understandability and actionability scores were evaluated using the Patient Education Materials Assessment Tool (PEMAT).
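The three readability indices named above are simple functions of sentence, word, syllable, and letter counts. As an illustration only (the abstract does not specify the study's actual tooling), here is a minimal Python sketch using the published formulas and a heuristic syllable counter:

```python
import re

def count_syllables(word: str) -> int:
    """Naive syllable estimate: count vowel groups, drop a trailing silent 'e'."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    n = len(groups)
    if word.lower().endswith("e") and n > 1:
        n -= 1
    return max(n, 1)

def readability(text: str) -> dict:
    """Return FRE, FKGL, and CLI scores for an English text."""
    sentences = max(len(re.findall(r"[.!?]+", text)), 1)
    words = re.findall(r"[A-Za-z]+", text)
    w = len(words)
    syllables = sum(count_syllables(word) for word in words)
    letters = sum(len(word) for word in words)
    # Flesch Reading Ease: higher = easier (60-70 is "plain English").
    fre = 206.835 - 1.015 * (w / sentences) - 84.6 * (syllables / w)
    # Flesch-Kincaid Grade Level: US school grade needed to understand the text.
    fkgl = 0.39 * (w / sentences) + 11.8 * (syllables / w) - 15.59
    # Coleman-Liau Index: grade level from letters and sentences per 100 words.
    cli = 0.0588 * (letters / w * 100) - 0.296 * (sentences / w * 100) - 15.8
    return {"FRE": round(fre, 1), "FKGL": round(fkgl, 1), "CLI": round(cli, 1)}
```

Production tools use dictionary-based syllable counts, so exact scores will differ, but the relative ranking of texts (the comparison the study reports) is driven by the same sentence-length and word-complexity terms.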

Results:

The appropriateness of the chatbots' responses was 84.0% for ChatGPT-3.5 and Bard and 80.0% for Bing Chat (P > .05). For understandability (mean PEMAT-U score: AAPOS website 81.5%, Bard 77.6%, ChatGPT-3.5 76.1%, and Bing Chat 71.5%; P < .05) and actionability (mean PEMAT-A score: AAPOS website 74.6%, Bard 69.2%, ChatGPT-3.5 67.8%, and Bing Chat 64.8%; P < .05), the AAPOS website scored better than the chatbots. All three readability analyses showed that Bard had the highest mean score, followed by the AAPOS website, Bing Chat, and ChatGPT-3.5, and all scores indicated reading levels more difficult than the recommended level.

Conclusions:

Chatbots have the potential to provide detailed and appropriate responses at acceptable levels. The AAPOS website has the advantage of providing information that is more understandable and actionable. However, both the AAPOS website and the chatbots, especially ChatGPT-3.5, provided difficult-to-read material for patient education regarding amblyopia.

[J Pediatr Ophthalmol Strabismus. 20XX;X(X):XXX–XXX.]
