Bidirectional Encoder Representations from Transformers-like large language models in patient safety and pharmacovigilance: A comprehensive assessment of causal inference implications.
{"title":"Bidirectional Encoder Representations from Transformers-like large language models in patient safety and pharmacovigilance: A comprehensive assessment of causal inference implications.","authors":"Xingqiao Wang, Xiaowei Xu, Zhichao Liu, Weida Tong","doi":"10.1177/15353702231215895","DOIUrl":null,"url":null,"abstract":"<p><p>Causality assessment is vital in patient safety and pharmacovigilance (PSPV) for safety signal detection, adverse reaction management, and regulatory submission. Large language models (LLMs), especially those designed with transformer architecture, are revolutionizing various fields, including PSPV. While attempts to utilize Bidirectional Encoder Representations from Transformers (BERT)-like LLMs for causal inference in PSPV are underway, a detailed evaluation of \"fit-for-purpose\" BERT-like model selection to enhance causal inference performance within PSPV applications remains absent. This study conducts an in-depth exploration of BERT-like LLMs, including generic pre-trained BERT LLMs, domain-specific pre-trained LLMs, and domain-specific pre-trained LLMs with safety knowledge-specific fine-tuning, for causal inference in PSPV. Our investigation centers around (1) the influence of data complexity and model architecture, (2) the correlation between the BERT size and its impact, and (3) the role of domain-specific training and fine-tuning on three publicly accessible PSPV data sets. The findings suggest that (1) BERT-like LLMs deliver consistent predictive power across varied data complexity levels, (2) the predictive performance and causal inference results do not directly correspond to the BERT-like model size, and (3) domain-specific pre-trained LLMs, with or without safety knowledge-specific fine-tuning, surpass generic pre-trained BERT models in causal inference. The findings are valuable to guide the future application of LLMs in a broad range of application.</p>","PeriodicalId":2,"journal":{"name":"ACS Applied Bio Materials","volume":null,"pages":null},"PeriodicalIF":4.6000,"publicationDate":"2023-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10798182/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACS Applied Bio Materials","FirstCategoryId":"3","ListUrlMain":"https://doi.org/10.1177/15353702231215895","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2023/12/12 0:00:00","PubModel":"Epub","JCR":"Q2","JCRName":"MATERIALS SCIENCE, BIOMATERIALS","Score":null,"Total":0}
Abstract
Causality assessment is vital in patient safety and pharmacovigilance (PSPV) for safety signal detection, adverse reaction management, and regulatory submission. Large language models (LLMs), especially those built on the transformer architecture, are revolutionizing many fields, including PSPV. While attempts to apply Bidirectional Encoder Representations from Transformers (BERT)-like LLMs to causal inference in PSPV are underway, a detailed evaluation of "fit-for-purpose" BERT-like model selection for improving causal inference performance in PSPV applications is still missing. This study conducts an in-depth exploration of BERT-like LLMs for causal inference in PSPV, including generic pre-trained BERT models, domain-specific pre-trained models, and domain-specific pre-trained models with safety-knowledge-specific fine-tuning. Our investigation centers on (1) the influence of data complexity and model architecture, (2) the relationship between BERT model size and performance, and (3) the role of domain-specific pre-training and fine-tuning, evaluated on three publicly accessible PSPV data sets. The findings suggest that (1) BERT-like LLMs deliver consistent predictive power across varied levels of data complexity, (2) predictive performance and causal inference results do not scale directly with BERT-like model size, and (3) domain-specific pre-trained LLMs, with or without safety-knowledge-specific fine-tuning, surpass generic pre-trained BERT models in causal inference. These findings can guide the future application of LLMs across a broad range of applications.
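To make the model-selection setup concrete, the following is a minimal sketch, not the study's actual pipeline, of how a domain-specific pre-trained BERT-like model might be fine-tuned for binary drug-event causality classification. The backbone checkpoint (dmis-lab/biobert-base-cased-v1.1), the toy case narratives, the labels, and the hyperparameters are illustrative assumptions; the abstract does not specify the paper's own models, data sets, or training settings.

```python
# Hedged sketch: fine-tuning a domain-specific BERT-like model for
# drug-event causality classification (1 = drug-related, 0 = not related).
# Model ID, data, and hyperparameters are illustrative placeholders only.
import torch
from torch.utils.data import DataLoader, Dataset
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_ID = "dmis-lab/biobert-base-cased-v1.1"  # assumed domain-specific backbone


class CausalityDataset(Dataset):
    """Pairs a case narrative with a binary causality label."""

    def __init__(self, texts, labels, tokenizer, max_len=256):
        self.enc = tokenizer(texts, truncation=True, padding="max_length",
                             max_length=max_len, return_tensors="pt")
        self.labels = torch.tensor(labels)

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, i):
        item = {k: v[i] for k, v in self.enc.items()}
        item["labels"] = self.labels[i]
        return item


# Toy narratives standing in for a real PSPV causality data set.
texts = [
    "Patient developed hepatotoxicity two weeks after starting drug X; "
    "liver enzymes normalized on dechallenge.",
    "Headache reported; patient had a long history of migraines "
    "predating drug exposure.",
]
labels = [1, 0]

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_ID, num_labels=2)

loader = DataLoader(CausalityDataset(texts, labels, tokenizer), batch_size=2)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

model.train()
for epoch in range(3):  # short demo loop; a real run would use many more steps
    for batch in loader:
        optimizer.zero_grad()
        out = model(**batch)   # cross-entropy loss is computed internally
        out.loss.backward()
        optimizer.step()
```

Swapping MODEL_ID for a generic checkpoint such as bert-base-uncased, or for a larger variant, would reproduce the generic-versus-domain-specific and model-size comparisons the abstract describes.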