Quality and Readability of Patient Educational Materials Generated by ChatGPT-4o for Pediatric Ophthalmologic Surgeries.

IF 0.9, Medicine Zone 4, Q4 OPHTHALMOLOGY

Journal of Pediatric Ophthalmology & Strabismus Pub Date : 2025-05-27 DOI:10.3928/01913913-20250404-01

Albert Yang, Mark Reid, Angeline M Nguyen, Sudha Nallasamy, Federico G Velez, Alejandra G de Alba Campomanes, Melinda Y Chang

Citations: 0

Abstract

Purpose: To evaluate the quality and readability of ChatGPT-4o-generated (ChatGPT) (OpenAI) patient education materials (PEMs) about pediatric ophthalmologic surgical procedures and compare these to PEMs from the American Association for Pediatric Ophthalmology and Strabismus (AAPOS) website.

Methods: The authors prompted ChatGPT-4o to provide PEMs on four procedures: strabismus surgery without adjustable sutures, strabismus surgery with adjustable sutures, pediatric cataract surgery, and nasolacrimal duct probing. The prompt requested responses at a 6th grade level in both Spanish and English. English ChatGPT responses were compared to AAPOS PEMs on quality (using the Quality of Generated Language Outputs for Patients [QGLOP] scale) and readability. English and Spanish ChatGPT responses were also compared on quality and readability.

Results: Based on average scores from the four procedures, AAPOS PEMs were superior to English ChatGPT responses on the accuracy, currency, and tone subscales of the QGLOP score (4.0 ± 0 vs 2.79 ± 0.79, P = .0021; 3.79 ± 0.26 vs 3.38 ± 0.71, P = .033; 4.0 ± 0 vs 3.42 ± 0.69, P = .042, respectively). There was no significant difference in readability between AAPOS PEMs and English ChatGPT responses. English and Spanish ChatGPT responses did not significantly differ on quality or readability.

Conclusions: ChatGPT-4o-generated PEMs on pediatric ophthalmologic surgical conditions are currently inferior in quality to PEMs on the AAPOS website. However, because ChatGPT is continually being updated and trained, this study should be repeated in the future to determine whether metrics improve over time. [J Pediatr Ophthalmol Strabismus. 20XX;X(X):XXX-XXX.].
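The abstract does not say which readability formula the authors used. Purely as an illustration of how a "6th grade level" target can be checked, the sketch below computes the standard Flesch-Kincaid Grade Level for English text. The function names and the vowel-run syllable counter are assumptions made for this example, not the study's method, and the syllable heuristic is crude.

```python
import re


def count_syllables(word: str) -> int:
    """Rough syllable estimate (assumed heuristic): count runs of vowels."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # Treat a trailing 'e' as silent ("made"), except in "-le" endings ("table").
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_kincaid_grade(text: str) -> float:
    """Flesch-Kincaid Grade Level:
    0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
    """
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59


if __name__ == "__main__":
    # Hypothetical sample sentence, not taken from any PEM in the study.
    sample = "The doctor will fix the eye muscle. Your child will sleep during the surgery."
    print(round(flesch_kincaid_grade(sample), 2))
```

A score near 6 corresponds roughly to a 6th grade reading level. This formula is calibrated for English only, so the Spanish responses would need a Spanish-specific readability measure.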

Source journal

Journal of Pediatric Ophthalmology & Strabismus (Medicine: Pediatrics)

CiteScore: 1.80

Self-citation rate: 8.30%

Articles published: 115

Review time: >12 weeks

About the journal: The Journal of Pediatric Ophthalmology & Strabismus is a bimonthly peer-reviewed publication for pediatric ophthalmologists. The Journal has published original articles on the diagnosis, treatment, and prevention of eye disorders in the pediatric age group and the treatment of strabismus in all age groups for over 50 years.