Quality and Readability of Patient Educational Materials Generated by ChatGPT-4o for Pediatric Ophthalmologic Surgeries
Albert Yang, Mark Reid, Angeline M Nguyen, Sudha Nallasamy, Federico G Velez, Alejandra G de Alba Campomanes, Melinda Y Chang
Journal of Pediatric Ophthalmology & Strabismus, pages 1-7. Published 2025-05-27. DOI: 10.3928/01913913-20250404-01 (https://doi.org/10.3928/01913913-20250404-01)
Abstract
Purpose: To evaluate the quality and readability of ChatGPT-4o-generated (ChatGPT) (OpenAI) patient education materials (PEMs) about pediatric ophthalmologic surgical procedures and compare these to PEMs from the American Association for Pediatric Ophthalmology and Strabismus (AAPOS) website.
Methods: The authors prompted ChatGPT-4o to provide PEMs on four procedures: strabismus surgery without adjustable sutures, strabismus surgery with adjustable sutures, pediatric cataract surgery, and nasolacrimal duct probing. The prompt requested responses at a sixth-grade level in both Spanish and English. English ChatGPT responses were compared to AAPOS PEMs on quality (using the Quality of Generated Language Outputs for Patients [QGLOP] scale) and readability. English and Spanish ChatGPT responses were also compared on quality and readability.
Results: Based on average scores from the four procedures, AAPOS PEMs were superior to English ChatGPT responses on the accuracy, currency, and tone subscales of the QGLOP score (4.0 ± 0 vs 2.79 ± 0.79, P = .0021; 3.79 ± 0.26 vs 3.38 ± 0.71, P = .033; 4.0 ± 0 vs 3.42 ± 0.69, P = .042, respectively). There was no significant difference in readability between AAPOS PEMs and English ChatGPT responses. English and Spanish ChatGPT responses did not significantly differ on quality or readability.
Conclusions: ChatGPT-4o-generated PEMs on pediatric ophthalmologic surgical conditions are currently inferior in quality to PEMs on the AAPOS website. However, because ChatGPT is continually being updated and trained, this study should be repeated in the future to determine whether metrics improve over time. [J Pediatr Ophthalmol Strabismus. 20XX;X(X):XXX-XXX.]
About the journal:
The Journal of Pediatric Ophthalmology & Strabismus is a bimonthly peer-reviewed publication for pediatric ophthalmologists. The Journal has published original articles on the diagnosis, treatment, and prevention of eye disorders in the pediatric age group and the treatment of strabismus in all age groups for over 50 years.