Better understanding: can a large language model safely improve readability of patient information leaflets in interventional radiology?

Impact factor: 4.4. JCR quartile: Q1 (Health Care Sciences & Services)

BMJ Health & Care Informatics. Pub Date: 2025-10-05. DOI: 10.1136/bmjhci-2025-101512

William Clackett, Ian A Zealley, Zelei Yang, Ghali Salahia, Richard D White

Volume 32, issue 1. Publication type: Journal Article.

Citations: 0

Abstract

Objectives: This study aimed to evaluate the feasibility of using a large language model (LLM) to generate more readable versions of existing patient information leaflets (PILs) in the field of interventional radiology.

Methods: PILs were acquired from the Cardiovascular and Interventional Radiology Society of Europe website, reformatted, and uploaded to the GPT-4 user interface with a prompt designed to simplify the language. Automated readability metrics were used to evaluate the readability of the original and LLM-modified PILs. Factual accuracy was assessed by three consultant interventional radiologists using an agreed marking scheme.
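
The abstract does not say which automated readability metrics were used. As an illustration only, the sketch below computes one widely used metric, the Flesch-Kincaid Grade Level, which maps average sentence length and average syllables per word to a US school grade. The choice of metric, the function names `count_syllables` and `flesch_kincaid_grade`, and the vowel-group syllable heuristic are all assumptions made for this example and are not taken from the study.

```python
import re


def count_syllables(word: str) -> int:
    """Rough syllable estimate (assumed heuristic): count vowel groups."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # Treat a trailing 'e' as silent ("make"), except in "-le" endings ("table").
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_kincaid_grade(text: str) -> float:
    """Flesch-Kincaid Grade Level:
    0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
    """
    # Naive splitting: sentences end at '.', '!' or '?'; words are letter runs.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59


if __name__ == "__main__":
    # Hypothetical sample sentences, not taken from any real leaflet.
    original = ("The interventional radiologist percutaneously introduces "
                "a catheter into the femoral artery under fluoroscopic guidance.")
    simplified = "The doctor puts a thin tube in your leg. You may feel a small push."
    print(f"original:   {flesch_kincaid_grade(original):.1f}")
    print(f"simplified: {flesch_kincaid_grade(simplified):.1f}")
```

A lower score indicates text that should be readable at a lower school grade. Because the syllable counter is a heuristic, scores from this sketch may differ from those of established readability tools.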

Results: LLM-modified PILs had a significantly lower mean reading grade (9.5±0.5) than the original PILs (11.1±0.1) (p<0.01). However, the recommended reading grade of 6 (expected to be understood by 11- to 12-year-old children) was not achieved. Human evaluation revealed that most LLM-modified PILs raised minor concerns regarding factual accuracy, but no errors that could result in serious patient harm were detected.

Discussion: LLMs appear to be a powerful tool for improving the readability of PILs within the field of interventional radiology. However, clinical experts are still required in PIL development to ensure that the factual accuracy of these LLM-modified documents is not compromised.

Conclusion: LLMs should be considered a useful tool to assist with the development and revision of PILs in the field of interventional radiology.
