{"title":"使用大型语言模型生成患者教育材料:范围审查。","authors":"Alhasan AlSammarraie, Mowafa Househ","doi":"10.5455/aim.2024.33.4-10","DOIUrl":null,"url":null,"abstract":"<p><strong>Background: </strong>Patient Education is a healthcare concept that involves educating the public with evidence-based medical information. This information surges their capabilities to promote a healthier life and better manage their conditions. LLM platforms have recently been introduced as powerful NLPs capable of producing human-sounding text and by extension patient education materials.</p><p><strong>Objective: </strong>This study aims to conduct a scoping review to systematically map the existing literature on the use of LLMs for generating patient education materials.</p><p><strong>Methods: </strong>The study followed JBI guidelines, searching five databases using set inclusion/exclusion criteria. A RAG-inspired framework was employed to extract the variables followed by a manual check to verify accuracy of extractions. In total, 21 variables were identified and grouped into five themes: Study Demographics, LLM Characteristics, Prompt-Related Variables, PEM Assessment, and Comparative Outcomes.</p><p><strong>Results: </strong>Results were reported from 69 studies. The United States contributed the largest number of studies. LLM models such as ChatGPT-4, ChatGPT-3.5, and Bard were the most investigated. Most studies evaluated the accuracy of LLM responses and the readability of LLM responses. Only 3 studies implemented external knowledge bases leveraging a RAG architecture. All studies except 3 conducted prompting in English. ChatGPT-4 was found to provide the most accurate responses in comparison with other models.</p><p><strong>Conclusion: </strong>This review examined studies comparing large language models for generating patient education materials. ChatGPT-3.5 and ChatGPT-4 were the most evaluated. Accuracy and readability of responses were the main metrics of evaluation, while few studies used assessment frameworks, retrieval-augmented methods, or explored non-English cases.</p>","PeriodicalId":7074,"journal":{"name":"Acta Informatica Medica","volume":"33 1","pages":"4-10"},"PeriodicalIF":0.0000,"publicationDate":"2025-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11986337/pdf/","citationCount":"0","resultStr":"{\"title\":\"The Use of Large Language Models in Generating Patient Education Materials: a Scoping Review.\",\"authors\":\"Alhasan AlSammarraie, Mowafa Househ\",\"doi\":\"10.5455/aim.2024.33.4-10\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><strong>Background: </strong>Patient Education is a healthcare concept that involves educating the public with evidence-based medical information. This information surges their capabilities to promote a healthier life and better manage their conditions. LLM platforms have recently been introduced as powerful NLPs capable of producing human-sounding text and by extension patient education materials.</p><p><strong>Objective: </strong>This study aims to conduct a scoping review to systematically map the existing literature on the use of LLMs for generating patient education materials.</p><p><strong>Methods: </strong>The study followed JBI guidelines, searching five databases using set inclusion/exclusion criteria. A RAG-inspired framework was employed to extract the variables followed by a manual check to verify accuracy of extractions. 
In total, 21 variables were identified and grouped into five themes: Study Demographics, LLM Characteristics, Prompt-Related Variables, PEM Assessment, and Comparative Outcomes.</p><p><strong>Results: </strong>Results were reported from 69 studies. The United States contributed the largest number of studies. LLM models such as ChatGPT-4, ChatGPT-3.5, and Bard were the most investigated. Most studies evaluated the accuracy of LLM responses and the readability of LLM responses. Only 3 studies implemented external knowledge bases leveraging a RAG architecture. All studies except 3 conducted prompting in English. ChatGPT-4 was found to provide the most accurate responses in comparison with other models.</p><p><strong>Conclusion: </strong>This review examined studies comparing large language models for generating patient education materials. ChatGPT-3.5 and ChatGPT-4 were the most evaluated. Accuracy and readability of responses were the main metrics of evaluation, while few studies used assessment frameworks, retrieval-augmented methods, or explored non-English cases.</p>\",\"PeriodicalId\":7074,\"journal\":{\"name\":\"Acta Informatica Medica\",\"volume\":\"33 1\",\"pages\":\"4-10\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2025-01-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11986337/pdf/\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Acta Informatica Medica\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.5455/aim.2024.33.4-10\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"Medicine\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Acta Informatica Medica","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.5455/aim.2024.33.4-10","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"Medicine","Score":null,"Total":0}
The Use of Large Language Models in Generating Patient Education Materials: a Scoping Review.
Background: Patient education is a healthcare concept that involves providing the public with evidence-based medical information. Such information strengthens people's ability to lead healthier lives and manage their conditions more effectively. Large language model (LLM) platforms have recently been introduced as powerful natural language processing (NLP) systems capable of producing human-like text and, by extension, patient education materials.
Objective: This study aims to conduct a scoping review to systematically map the existing literature on the use of LLMs for generating patient education materials.
Methods: The study followed the Joanna Briggs Institute (JBI) guidelines, searching five databases with predefined inclusion and exclusion criteria. A retrieval-augmented generation (RAG)-inspired framework was employed to extract the variables, followed by a manual check to verify the accuracy of the extractions. In total, 21 variables were identified and grouped into five themes: Study Demographics, LLM Characteristics, Prompt-Related Variables, Patient Education Material (PEM) Assessment, and Comparative Outcomes.
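As a rough illustration of what such an extraction step could look like (the abstract does not describe the framework's implementation, so the TF-IDF retrieval, prompt wording, and the ask_llm callable below are assumptions, not the authors' method), a RAG-inspired pipeline retrieves the passages of a study most relevant to one review variable and asks an LLM to answer from those passages only, with the output then checked manually:

# Illustrative sketch only; retrieval method and prompt are assumed, not from the paper.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def retrieve_passages(chunks, query, top_k=3):
    # Rank a study's text chunks against the extraction query by TF-IDF similarity.
    vectorizer = TfidfVectorizer(stop_words="english")
    matrix = vectorizer.fit_transform(chunks + [query])
    scores = cosine_similarity(matrix[-1], matrix[:-1]).ravel()
    ranked = sorted(zip(scores, chunks), reverse=True)
    return [chunk for _, chunk in ranked[:top_k]]

def extract_variable(chunks, variable, ask_llm):
    # Build a grounded prompt and let an LLM return the value of one review variable,
    # e.g. "readability metric used". ask_llm is any completion function (hypothetical here).
    context = "\n".join(retrieve_passages(chunks, variable))
    prompt = (
        f"Using only the excerpts below, report the study's '{variable}'. "
        f"Answer 'not reported' if absent.\n\nExcerpts:\n{context}"
    )
    return ask_llm(prompt)

The extracted value would then be compared against the source text by a human reviewer, mirroring the manual verification step described above.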
Results: Results were reported from 69 studies, with the United States contributing the largest number. LLMs such as ChatGPT-4, ChatGPT-3.5, and Bard were the most frequently investigated. Most studies evaluated the accuracy and readability of LLM responses. Only three studies implemented external knowledge bases through a RAG architecture, and all but three studies conducted prompting in English. ChatGPT-4 was found to provide the most accurate responses compared with other models.
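Readability in such evaluations is typically reported with standard formulas; as one hedged example (the abstract does not state which metric the included studies applied), the widely used Flesch-Kincaid Grade Level can be computed as follows:

# Minimal sketch of the standard Flesch-Kincaid Grade Level formula; the syllable
# counter is a rough heuristic, and the choice of metric is an assumption here.
import re

def count_syllables(word):
    # Approximate syllables as runs of consecutive vowels.
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def flesch_kincaid_grade(text):
    # FKGL = 0.39*(words/sentences) + 11.8*(syllables/words) - 15.59
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * (len(words) / len(sentences)) + 11.8 * (syllables / len(words)) - 15.59

print(round(flesch_kincaid_grade(
    "Take one tablet by mouth every morning. Do not skip doses."), 1))

A lower grade level indicates text that is easier to read, which is why such formulas are common proxies for the accessibility of patient education materials.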
Conclusion: This review examined studies comparing large language models for generating patient education materials. ChatGPT-3.5 and ChatGPT-4 were the most frequently evaluated models. Accuracy and readability of responses were the main evaluation metrics, while few studies applied structured assessment frameworks, used retrieval-augmented methods, or explored non-English use cases.