Evaluation of AI Summaries on Interdisciplinary Understanding of Ophthalmology Notes
Prashant D Tailor, Haley S D'Souza, Clara M Castillejo Becerra, Heidi M Dahl, Neil R Patel, Tyler M Kaplan, Darrell Kohli, Erick D Bothun, Brian G Mohney, Andrea A Tooley, Keith H Baratz, Raymond Iezzi, Andrew J Barkmeier, Sophie J Bakri, Gavin W Roddy, David Hodge, Arthur J Sit, Matthew R Starr, John J Chen
JAMA Ophthalmology, published April 3, 2025. DOI: 10.1001/jamaophthalmol.2025.0351
Abstract
Importance: Specialized ophthalmology terminology limits comprehension for nonophthalmology clinicians and professionals, hindering interdisciplinary communication and patient care. The implementation of large language models (LLMs) in clinical practice has to date been relatively unexplored.
Objective: To evaluate whether LLM-generated plain language summaries (PLSs) integrated into standard ophthalmology notes (SONs) improve diagnostic understanding, satisfaction, and clarity.
Design, setting, and participants: Randomized quality improvement study conducted from February 1, 2024, to May 31, 2024, including data from inpatient and outpatient encounters in a single tertiary academic center. Participants were nonophthalmology clinicians and professionals and ophthalmologists. The single inclusion criterion was any encounter note generated by an ophthalmologist during the study dates. Exclusion criteria were (1) lack of established nonophthalmology clinicians and professionals for outpatient encounters and (2) procedure-only patient encounters.
Intervention: Addition of LLM-generated plain language summaries to ophthalmology notes.
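The abstract does not specify which model or prompt was used. The sketch below only illustrates, under assumed names, how an LLM-generated PLS might be produced from an encounter note via an OpenAI-style chat API; the model name, prompt wording, and the generate_pls helper are illustrative assumptions, not details from the study.

```python
# Minimal sketch (not the study's actual pipeline): generate a plain language
# summary (PLS) from a standard ophthalmology note (SON) with an LLM.
# Assumptions: OpenAI Python SDK (v1+); model name and prompt are hypothetical.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

SYSTEM_PROMPT = (
    "You are assisting non-ophthalmology clinicians. Rewrite the assessment and "
    "plan of the following ophthalmology note in plain language, avoiding "
    "specialty abbreviations, and do not add information that is not in the note."
)

def generate_pls(son_text: str, model: str = "gpt-4o") -> str:
    """Return a plain language summary for one ophthalmology encounter note."""
    response = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": son_text},
        ],
        temperature=0.2,  # keep the summary conservative and close to the source
    )
    return response.choices[0].message.content.strip()
```

In a workflow like the one described, the returned summary would be appended to the note for the ophthalmologist to review before it reaches other clinicians.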
Main outcomes and measures: The primary outcome was survey responses from nonophthalmology clinicians and professionals assessing understanding, satisfaction, and clarity of ophthalmology notes. Secondary outcomes were survey responses from ophthalmologists evaluating PLSs in terms of clinical workflow and accuracy, objective measures of semantic quality, and a safety analysis.
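The survey comparisons below are reported as percentage-point differences with 95% CIs. The authors' statistical code is not given; the following is only a minimal sketch of one standard way such a comparison could be computed (Wald interval plus pooled two-proportion z-test), with placeholder counts and a hypothetical proportion_diff helper.

```python
# Sketch only: percentage-point difference in favorable survey responses
# between notes with a PLS and SON-only notes. Counts are placeholders.
from math import sqrt
from statistics import NormalDist

def proportion_diff(favorable_pls: int, n_pls: int,
                    favorable_son: int, n_son: int, alpha: float = 0.05):
    """Return (difference in percentage points, 95% CI, two-sided p-value)."""
    p1, p2 = favorable_pls / n_pls, favorable_son / n_son
    diff = p1 - p2
    # Wald standard error for the difference of two independent proportions
    se = sqrt(p1 * (1 - p1) / n_pls + p2 * (1 - p2) / n_son)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    ci = ((diff - z_crit * se) * 100, (diff + z_crit * se) * 100)
    # Pooled z-test of H0: p1 == p2
    pooled = (favorable_pls + favorable_son) / (n_pls + n_son)
    se0 = sqrt(pooled * (1 - pooled) * (1 / n_pls + 1 / n_son))
    p_value = 2 * (1 - NormalDist().cdf(abs(diff / se0)))
    return diff * 100, ci, p_value

# Example with placeholder counts, not study data:
# diff_pp, ci_pp, p = proportion_diff(150, 180, 110, 182)
```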
Results: A total of 362 (85%) nonophthalmology clinicians and professionals (33.0% response rate) preferred the PLS to the SON. Demographic data on age, race and ethnicity, and sex were not collected. Nonophthalmology clinicians and professionals reported enhanced diagnostic understanding (percentage point increase, 9.0; 95% CI, 0.3-18.2; P = .01), increased note detail satisfaction (percentage point increase, 21.5; 95% CI, 11.4-31.5; P < .001), and improved explanation clarity (percentage point increase, 23.0; 95% CI, 12.0-33.1; P < .001) for notes containing a PLS. The addition of a PLS was associated with reduced comprehension gaps between clinicians who were comfortable and uncomfortable with ophthalmology terminology (from 26.1% [95% CI, 13.7%-38.6%; P < .001] to 14.4% [95% CI, 4.3%-24.6%; P > .06]). PLS semantic analysis found high meaning preservation (bidirectional encoder representations from transformers [BERT] score, mean F1: 0.85) with greater readability than SONs (Flesch Reading Ease: 51.8 vs 43.6; Flesch-Kincaid Grade Level: 10.7 vs 11.9). Ophthalmologists (n = 489; 84% response rate) reported high PLS accuracy (90% [320 of 355] rated accuracy "a great deal") with minimal review time burden (94.9% [464 of 489] ≤1 minute). The PLS error rate on ophthalmologist review was 26% (126 of 489). A total of 83.9% (104 of 126) of errors were deemed low risk for harm, and none had a risk of severe harm or death.
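For context on the readability figures above, here is a rough, self-contained sketch of the Flesch Reading Ease and Flesch-Kincaid Grade Level formulas. The syllable counter is a crude heuristic, so values will differ slightly from dedicated tools (e.g., the textstat package), and the semantic-similarity BERT score F1 would come from a separate package such as bert_score rather than this code.

```python
# Approximate readability metrics: Flesch Reading Ease (higher = easier) and
# Flesch-Kincaid Grade Level (lower = easier). Syllable counting is heuristic.
import re

def count_syllables(word: str) -> int:
    """Heuristic syllable count: runs of vowels, with a silent-e adjustment."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and count > 1:
        count -= 1
    return max(count, 1)

def flesch_metrics(text: str) -> tuple[float, float]:
    """Return (Flesch Reading Ease, Flesch-Kincaid Grade Level) for a text."""
    sentences = max(len(re.findall(r"[.!?]+", text)), 1)
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    wps = len(words) / sentences              # words per sentence
    spw = syllables / max(len(words), 1)      # syllables per word
    reading_ease = 206.835 - 1.015 * wps - 84.6 * spw
    grade_level = 0.39 * wps + 11.8 * spw - 15.59
    return reading_ease, grade_level

# ease, grade = flesch_metrics(pls_text)  # compare PLS vs SON readability
```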
Conclusions and relevance: In this study, use of LLM-generated PLSs was associated with enhanced comprehension and satisfaction among nonophthalmology clinicians and professionals, which might aid interdisciplinary communication. Careful implementation and safety monitoring are recommended for clinical integration given the persistence of errors despite physician review.
Journal Introduction:
JAMA Ophthalmology, in continuous publication since 1869, is an international, peer-reviewed journal dedicated to ophthalmology and visual science; in 2019 it marked 150 years of uninterrupted publication. As a member of the JAMA Network, a consortium of peer-reviewed general medical and specialty publications, JAMA Ophthalmology disseminates cutting-edge research and insights in the field.