MALADE: Orchestration of LLM-powered Agents with Retrieval Augmented Generation for Pharmacovigilance

arXiv - QuanBio - Quantitative Methods, Pub Date: 2024-08-03, DOI: arxiv-2408.01869

Jihye Choi, Nils Palumbo, Prasad Chalasani, Matthew M. Engelhard, Somesh Jha, Anivarya Kumar, David Page



Abstract

Large Language Models (LLMs), with their remarkable text understanding and generation abilities, offer an unprecedented opportunity to develop new methods for trustworthy medical knowledge synthesis, extraction, and summarization. This paper focuses on Pharmacovigilance (PhV), where the central challenge is identifying Adverse Drug Events (ADEs) from diverse text sources such as medical literature, clinical notes, and drug labels. The task is hindered by variations in the terminology used for drugs and outcomes, and by ADE descriptions that are often buried in large amounts of narrative text. We present MALADE, the first effective collaborative multi-agent system powered by LLMs with Retrieval Augmented Generation (RAG) for ADE extraction from drug label data. RAG augments a query to an LLM with relevant information extracted from text resources and instructs the LLM to compose a response consistent with the augmented data. MALADE is a general, LLM-agnostic architecture whose distinguishing capabilities are: (1) leveraging a variety of external sources, such as medical literature, drug labels, and FDA tools (e.g., the OpenFDA drug information API), (2) extracting drug-outcome associations in a structured format along with the strength of each association, and (3) providing explanations for established associations. Instantiated with GPT-4 Turbo or GPT-4o and FDA drug label data, MALADE achieves an Area Under the ROC Curve of 0.90 against the OMOP Ground Truth table of ADEs. Our implementation uses the Langroid multi-agent LLM framework and is available at https://github.com/jihyechoi77/malade.
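The query-augmentation step described in the abstract can be illustrated with a minimal sketch. This is not the MALADE implementation and does not use Langroid or the OpenFDA API. The snippets, the token-overlap retriever, and the names `retrieve` and `build_augmented_prompt` are all hypothetical, chosen only to show the general pattern of retrieving relevant text and prepending it to the question before it is sent to an LLM.

```python
import re
from collections import Counter

# Hypothetical toy snippets standing in for drug label text (not real label data).
LABEL_SNIPPETS = [
    "Drug A: reports of angioedema have occurred in patients during treatment.",
    "Drug A: store at room temperature away from moisture.",
    "Drug B: gastrointestinal bleeding has been reported with long-term use.",
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def retrieve(query, passages, k=2):
    """Toy retriever: rank passages by the number of tokens shared with the query.

    A real system would more likely use embeddings or a search index;
    token overlap is an assumption made to keep the sketch self-contained.
    """
    query_counts = Counter(tokenize(query))
    scored = []
    for passage in passages:
        overlap = sum((query_counts & Counter(tokenize(passage))).values())
        scored.append((overlap, passage))
    # Python's sort is stable, so ties keep their original order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [passage for score, passage in scored[:k] if score > 0]


def build_augmented_prompt(query, passages):
    """Prepend the retrieved passages and instruct the model to stay consistent with them."""
    context = "\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    return (
        "Answer using only the context below. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )


query = "Does Drug A cause angioedema?"
top_passages = retrieve(query, LABEL_SNIPPETS, k=2)
prompt = build_augmented_prompt(query, top_passages)
print(prompt)
```

The resulting `prompt` string is what would be passed to whichever LLM is in use. Keeping retrieval and prompt construction separate from the model call is one way to keep such a pipeline LLM-agnostic.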

