Evaluation of AI Summaries on Interdisciplinary Understanding of Ophthalmology Notes
Prashant D Tailor, Haley S D'Souza, Clara M Castillejo Becerra, Heidi M Dahl, Neil R Patel, Tyler M Kaplan, Darrell Kohli, Erick D Bothun, Brian G Mohney, Andrea A Tooley, Keith H Baratz, Raymond Iezzi, Andrew J Barkmeier, Sophie J Bakri, Gavin W Roddy, David Hodge, Arthur J Sit, Matthew R Starr, John J Chen
JAMA Ophthalmology, published April 3, 2025. DOI: 10.1001/jamaophthalmol.2025.0351
Abstract
Importance: Specialized ophthalmology terminology limits comprehension for nonophthalmology clinicians and professionals, hindering interdisciplinary communication and patient care. The implementation of large language models (LLMs) in clinical practice has to date been relatively unexplored.
Objective: To evaluate whether LLM-generated plain language summaries (PLSs) integrated into standard ophthalmology notes (SONs) improve diagnostic understanding, satisfaction, and clarity.
Design, setting, and participants: Randomized quality improvement study conducted from February 1, 2024, to May 31, 2024, including data from inpatient and outpatient encounters in a single tertiary academic center. Participants were nonophthalmology clinicians and professionals and ophthalmologists. The single inclusion criterion was any encounter note generated by an ophthalmologist during the study dates. Exclusion criteria were (1) lack of established nonophthalmology clinicians and professionals for outpatient encounters and (2) procedure-only patient encounters.
Intervention: Addition of LLM-generated plain language summaries to ophthalmology notes.
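The abstract does not specify which model or prompt was used. The sketch below only illustrates, under assumed names, how an LLM-generated PLS might be produced from an encounter note via an OpenAI-style chat API; the model name, prompt wording, and the generate_pls helper are illustrative assumptions, not details from the study.

```python
# Minimal sketch (not the study's actual pipeline): generate a plain language
# summary (PLS) from a standard ophthalmology note (SON) with an LLM.
# Assumptions: OpenAI Python SDK (v1+); model name and prompt are hypothetical.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

SYSTEM_PROMPT = (
    "You are assisting non-ophthalmology clinicians. Rewrite the assessment and "
    "plan of the following ophthalmology note in plain language, avoiding "
    "specialty abbreviations, and do not add information that is not in the note."
)

def generate_pls(son_text: str, model: str = "gpt-4o") -> str:
    """Return a plain language summary for one ophthalmology encounter note."""
    response = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": son_text},
        ],
        temperature=0.2,  # keep the summary conservative and close to the source
    )
    return response.choices[0].message.content.strip()
```

In a workflow like the one described, the returned summary would be appended to the note for the ophthalmologist to review before it reaches other clinicians.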
Main outcomes and measures: The primary outcome was survey responses from nonophthalmology clinicians and professionals assessing understanding, satisfaction, and clarity of ophthalmology notes. Secondary outcomes were survey responses from ophthalmologists evaluating PLSs in terms of clinical workflow and accuracy, objective measures of semantic quality, and a safety analysis.
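The survey comparisons below are reported as percentage-point differences with 95% CIs. The authors' statistical code is not given; the following is only a minimal sketch of one standard way such a comparison could be computed (Wald interval plus pooled two-proportion z-test), with placeholder counts and a hypothetical proportion_diff helper.

```python
# Sketch only: percentage-point difference in favorable survey responses
# between notes with a PLS and SON-only notes. Counts are placeholders.
from math import sqrt
from statistics import NormalDist

def proportion_diff(favorable_pls: int, n_pls: int,
                    favorable_son: int, n_son: int, alpha: float = 0.05):
    """Return (difference in percentage points, 95% CI, two-sided p-value)."""
    p1, p2 = favorable_pls / n_pls, favorable_son / n_son
    diff = p1 - p2
    # Wald standard error for the difference of two independent proportions
    se = sqrt(p1 * (1 - p1) / n_pls + p2 * (1 - p2) / n_son)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    ci = ((diff - z_crit * se) * 100, (diff + z_crit * se) * 100)
    # Pooled z-test of H0: p1 == p2
    pooled = (favorable_pls + favorable_son) / (n_pls + n_son)
    se0 = sqrt(pooled * (1 - pooled) * (1 / n_pls + 1 / n_son))
    p_value = 2 * (1 - NormalDist().cdf(abs(diff / se0)))
    return diff * 100, ci, p_value

# Example with placeholder counts, not study data:
# diff_pp, ci_pp, p = proportion_diff(150, 180, 110, 182)
```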
Results: A total of 362 (85%) nonophthalmology clinicians and professionals (33.0% response rate) preferred the PLS to the SON. Demographic data on age, race and ethnicity, and sex were not collected. Nonophthalmology clinicians and professionals reported enhanced diagnostic understanding (percentage point increase, 9.0; 95% CI, 0.3-18.2; P = .01), increased note detail satisfaction (percentage point increase, 21.5; 95% CI, 11.4-31.5; P < .001), and improved explanation clarity (percentage point increase, 23.0; 95% CI, 12.0-33.1; P < .001) for notes containing a PLS. The addition of a PLS was associated with reduced comprehension gaps between clinicians who were comfortable and uncomfortable with ophthalmology terminology (from 26.1% [95% CI, 13.7%-38.6%; P < .001] to 14.4% [95% CI, 4.3%-24.6%; P > .06]). PLS semantic analysis found high meaning preservation (bidirectional encoder representations from transformers [BERT] score, mean F1: 0.85) with greater readability than SONs (Flesch Reading Ease: 51.8 vs 43.6; Flesch-Kincaid Grade Level: 10.7 vs 11.9). Ophthalmologists (n = 489; 84% response rate) reported high PLS accuracy (90% [320 of 355] rated accuracy "a great deal") with minimal review time burden (94.9% [464 of 489] ≤1 minute). The PLS error rate on ophthalmologist review was 26% (126 of 489). A total of 83.9% (104 of 126) of errors were deemed low risk for harm, and none had a risk of severe harm or death.
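For context on the readability figures above, here is a rough, self-contained sketch of the Flesch Reading Ease and Flesch-Kincaid Grade Level formulas. The syllable counter is a crude heuristic, so values will differ slightly from dedicated tools (e.g., the textstat package), and the semantic-similarity BERT score F1 would come from a separate package such as bert_score rather than this code.

```python
# Approximate readability metrics: Flesch Reading Ease (higher = easier) and
# Flesch-Kincaid Grade Level (lower = easier). Syllable counting is heuristic.
import re

def count_syllables(word: str) -> int:
    """Heuristic syllable count: runs of vowels, with a silent-e adjustment."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and count > 1:
        count -= 1
    return max(count, 1)

def flesch_metrics(text: str) -> tuple[float, float]:
    """Return (Flesch Reading Ease, Flesch-Kincaid Grade Level) for a text."""
    sentences = max(len(re.findall(r"[.!?]+", text)), 1)
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    wps = len(words) / sentences              # words per sentence
    spw = syllables / max(len(words), 1)      # syllables per word
    reading_ease = 206.835 - 1.015 * wps - 84.6 * spw
    grade_level = 0.39 * wps + 11.8 * spw - 15.59
    return reading_ease, grade_level

# ease, grade = flesch_metrics(pls_text)  # compare PLS vs SON readability
```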
Conclusions and relevance: In this study, use of LLM-generated PLSs was associated with enhanced comprehension and satisfaction among nonophthalmology clinicians and professionals, which might aid interdisciplinary communication. Careful implementation and safety monitoring are recommended for clinical integration given the persistence of errors despite physician review.
Journal Introduction:
JAMA Ophthalmology, in continuous publication since 1869, is an international, peer-reviewed journal dedicated to ophthalmology and visual science; in 2019 it marked 150 years of uninterrupted publication. As a member of the JAMA Network, a consortium of peer-reviewed general medical and specialty publications, JAMA Ophthalmology disseminates cutting-edge research and insights in the field.