Evaluating knowledge fusion models on detecting adverse drug events in text.

PLOS digital health Pub Date : 2025-03-18 eCollection Date: 2025-03-01 DOI:10.1371/journal.pdig.0000468
Philipp Wegner, Holger Fröhlich, Sumit Madan
{"title":"Evaluating knowledge fusion models on detecting adverse drug events in text.","authors":"Philipp Wegner, Holger Fröhlich, Sumit Madan","doi":"10.1371/journal.pdig.0000468","DOIUrl":null,"url":null,"abstract":"<p><p>Detecting adverse drug events (ADE) of drugs that are already available on the market is an essential part of the pharmacovigilance work conducted by both medical regulatory bodies and the pharmaceutical industry. Concerns regarding drug safety and economic interests serve as motivating factors for the efforts to identify ADEs. Hereby, social media platforms play an important role as a valuable source of reports on ADEs, particularly through collecting posts discussing adverse events associated with specific drugs. We aim with our study to assess the effectiveness of knowledge fusion approaches in combination with transformer-based NLP models to extract ADE mentions from diverse datasets, for instance, texts from Twitter, websites like askapatient.com, and drug labels. The extraction task is formulated as a named entity recognition (NER) problem. The proposed methodology involves applying fusion learning methods to enhance the performance of transformer-based language models with additional contextual knowledge from ontologies or knowledge graphs. Additionally, the study introduces a multi-modal architecture that combines transformer-based language models with graph attention networks (GAT) to identify ADE spans in textual data. A multi-modality model consisting of the ERNIE model with knowledge on drugs reached an F1-score of 71.84% on CADEC corpus. Additionally, a combination of a graph attention network with BERT resulted in an F1-score of 65.16% on SMM4H corpus. Impressively, the same model achieved an F1-score of 72.50% on the PsyTAR corpus, 79.54% on the ADE corpus, and 94.15% on the TAC corpus. Except for the CADEC corpus, the knowledge fusion models consistently outperformed the baseline model, BERT. Our study demonstrates the significance of context knowledge in improving the performance of knowledge fusion models for detecting ADEs from various types of textual data.</p>","PeriodicalId":74465,"journal":{"name":"PLOS digital health","volume":"4 3","pages":"e0000468"},"PeriodicalIF":0.0000,"publicationDate":"2025-03-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11918363/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"PLOS digital health","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1371/journal.pdig.0000468","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2025/3/1 0:00:00","PubModel":"eCollection","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

Detecting adverse drug events (ADE) of drugs that are already available on the market is an essential part of the pharmacovigilance work conducted by both medical regulatory bodies and the pharmaceutical industry. Concerns regarding drug safety and economic interests serve as motivating factors for the efforts to identify ADEs. Hereby, social media platforms play an important role as a valuable source of reports on ADEs, particularly through collecting posts discussing adverse events associated with specific drugs. We aim with our study to assess the effectiveness of knowledge fusion approaches in combination with transformer-based NLP models to extract ADE mentions from diverse datasets, for instance, texts from Twitter, websites like askapatient.com, and drug labels. The extraction task is formulated as a named entity recognition (NER) problem. The proposed methodology involves applying fusion learning methods to enhance the performance of transformer-based language models with additional contextual knowledge from ontologies or knowledge graphs. Additionally, the study introduces a multi-modal architecture that combines transformer-based language models with graph attention networks (GAT) to identify ADE spans in textual data. A multi-modality model consisting of the ERNIE model with knowledge on drugs reached an F1-score of 71.84% on CADEC corpus. Additionally, a combination of a graph attention network with BERT resulted in an F1-score of 65.16% on SMM4H corpus. Impressively, the same model achieved an F1-score of 72.50% on the PsyTAR corpus, 79.54% on the ADE corpus, and 94.15% on the TAC corpus. Except for the CADEC corpus, the knowledge fusion models consistently outperformed the baseline model, BERT. Our study demonstrates the significance of context knowledge in improving the performance of knowledge fusion models for detecting ADEs from various types of textual data.

评估文本中药物不良事件检测的知识融合模型。
检测市场上已有药物的药物不良事件(ADE)是医疗监管机构和制药行业开展的药物警戒工作的重要组成部分。对药物安全和经济利益的关注是努力确定ade的激励因素。因此,社交媒体平台作为ade报告的宝贵来源发挥了重要作用,特别是通过收集讨论与特定药物相关的不良事件的帖子。我们的研究目的是评估知识融合方法与基于转换器的NLP模型相结合的有效性,以从不同的数据集中提取ADE提及,例如,来自Twitter的文本,askapatient.com等网站和药品标签。抽取任务被表述为一个命名实体识别(NER)问题。所提出的方法包括应用融合学习方法,通过来自本体或知识图的额外上下文知识来增强基于转换器的语言模型的性能。此外,该研究还引入了一种多模态架构,该架构将基于转换器的语言模型与图注意网络(GAT)相结合,以识别文本数据中的ADE跨度。基于药物知识的ERNIE模型组成的多模态模型在CADEC语料库上的f1得分为71.84%。此外,图注意网络与BERT的组合在SMM4H语料上的f1得分为65.16%。令人印象深刻的是,同样的模型在PsyTAR语料库上的f1得分为72.50%,在ADE语料库上为79.54%,在TAC语料库上为94.15%。除CADEC语料库外,知识融合模型均优于基线模型BERT。我们的研究证明了上下文知识在提高知识融合模型从各种类型的文本数据中检测ade的性能方面的重要性。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信