Large Language Model Architectures in Health Care: Scoping Review of Research Perspectives.

IF 5.8 · Zone 2 (Medicine) · Q1 Health Care Sciences & Services

Journal of Medical Internet Research. Pub Date: 2025-06-19. DOI: 10.2196/70315

Florian Leiser, Richard Guse, Ali Sunyaev

Volume 27, article e70315. Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12226782/pdf/

Citation count: 0

Abstract

Background: Large language models (LLMs) can support health care professionals in their daily work, for example, when writing and filing reports or communicating diagnoses. With the rise of LLMs, current research investigates how LLMs could be applied in medical practice and their benefits for physicians in clinical workflows. However, most studies neglect the importance of selecting suitable LLM architectures.

Objective: In this literature review, we aim to provide insights into the different LLM architecture families (ie, Bidirectional Encoder Representations from Transformers [BERT]-based or generative pretrained transformer [GPT]-based models) used in previous research. We report on the suitability and benefits of these architecture families for various research foci.

Methods: To this end, we conducted a scoping review to identify which LLMs are used in health care. Our search included manuscripts from PubMed, arXiv, and medRxiv. We used open and selective coding to assess the 114 identified manuscripts on 11 dimensions related to usage, technical facets, and research focus.

Results: We identified 4 research foci that emerged in prior manuscripts, with LLM performance being the main focus. We found that GPT-based models are used for communicative purposes such as examination preparation or patient interaction. In contrast, BERT-based models are used for medical tasks such as knowledge discovery and model improvements.

Conclusions: Our study suggests that GPT-based models are better suited for communicative purposes such as report generation or patient interaction. BERT-based models seem to be better suited for innovative applications such as classification or knowledge discovery. This could be due to architectural differences: GPT processes language unidirectionally, whereas BERT processes it bidirectionally, which allows a more in-depth understanding of the text. In addition, BERT-based models seem to be more straightforward to extend for domain-specific tasks, which generally leads to better results. In summary, health care professionals should consider the benefits and differences of the LLM architecture families when selecting a suitable model for their intended purpose.
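
The unidirectional versus bidirectional distinction mentioned in the conclusions can be illustrated with attention masks. The following is a minimal sketch and is not taken from the reviewed studies: the function names `attention_mask` and `visible_context` and the example sentence are hypothetical, and real GPT-style or BERT-style models apply such masks inside multi-head attention layers rather than on plain token lists.

```python
def attention_mask(seq_len, causal):
    """Build a seq_len x seq_len mask; mask[i][j] == 1 means position i may attend to position j.

    causal=True  -> each position sees only itself and earlier positions (GPT-style, unidirectional).
    causal=False -> each position sees the whole sequence (BERT-style, bidirectional).
    """
    return [
        [1 if (not causal or j <= i) else 0 for j in range(seq_len)]
        for i in range(seq_len)
    ]


def visible_context(tokens, mask, position):
    """Return the tokens that the token at `position` is allowed to attend to."""
    return [tok for tok, allowed in zip(tokens, mask[position]) if allowed]


# Hypothetical example sentence, chosen only for illustration.
tokens = ["patient", "denies", "chest", "pain"]

gpt_style = attention_mask(len(tokens), causal=True)
bert_style = attention_mask(len(tokens), causal=False)

# Context available when encoding the word "denies" (index 1).
print(visible_context(tokens, gpt_style, 1))   # ['patient', 'denies']
print(visible_context(tokens, bert_style, 1))  # ['patient', 'denies', 'chest', 'pain']
```

In this toy case, the bidirectional mask lets the representation of "denies" draw on "chest pain", which comes later in the sentence, while the causal mask does not. This is one plausible reading of why the authors associate bidirectional processing with a more in-depth understanding of the text.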

Source journal: Journal of Medical Internet Research (Medicine: Health Care)

CiteScore: 14.40
Self-citation rate: 5.40%
Articles published: 654
Review time: 1 month

Journal description: The Journal of Medical Internet Research (JMIR) is a highly respected publication in the field of health informatics and health services. Founded in 1999, JMIR has been a pioneer in the field for over two decades. The journal focuses on digital health, data science, health informatics, and emerging technologies for health, medicine, and biomedical research. It is recognized as a top publication in these disciplines, ranking in the first quartile (Q1) by Impact Factor. Notably, JMIR is ranked #1 on Google Scholar within the "Medical Informatics" discipline.