Relevance-aware prompt-tuning method for multimodal social entity and relation extraction

IF 6.5 · CAS Zone 2 (Computer Science) · JCR Q1 (Computer Science, Artificial Intelligence)
Zhenbin Chen, Zhixin Li, Mingqi Liu, Canlong Zhang, Huifang Ma
{"title":"基于关联感知的多模态社会实体及关系提取方法","authors":"Zhenbin Chen ,&nbsp;Zhixin Li ,&nbsp;Mingqi Liu ,&nbsp;Canlong Zhang ,&nbsp;Huifang Ma","doi":"10.1016/j.neucom.2025.130316","DOIUrl":null,"url":null,"abstract":"<div><div>Multimodal Named Entity Recognition (MNER) and Multimodal Relation Extraction (MRE) aim to identify specific entities from given text–image pairs and classify the semantic relationships between them, and they have significant applications in social media platform analysis. However, the images and text in social media data are not always aligned, which makes the existing multimodal entity and relation extraction methods still mainly rely on text information. And those mismatched images can even introduce modality noise, leading to negative impacts on the model and preventing them from achieving better performance. To solve this issue, we propose a Relevance-Aware Prompt-tuning (RAP) method with dynamic router mechanism for multi-modal entity and relation extraction. Our method can adaptively learn effective multimodal features from various types of information as prompt vectors and utilize prompt-tuning for entity and relation extraction. Additionally, when integrating information from different modalities, we take into account the intermodal relevance to reduce the negative impact of mismatched visual information on the model, which allows our model to overcome modality noise and achieve better performance. Extensive experiments on three benchmark datasets of tweets demonstrated the effectiveness and superiority of our proposed approach, and achieved approximately 2% increase in F1 values on the three datasets, respectively.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"640 ","pages":"Article 130316"},"PeriodicalIF":6.5000,"publicationDate":"2025-05-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Relevance-aware prompt-tuning method for multimodal social entity and relation extraction\",\"authors\":\"Zhenbin Chen ,&nbsp;Zhixin Li ,&nbsp;Mingqi Liu ,&nbsp;Canlong Zhang ,&nbsp;Huifang Ma\",\"doi\":\"10.1016/j.neucom.2025.130316\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>Multimodal Named Entity Recognition (MNER) and Multimodal Relation Extraction (MRE) aim to identify specific entities from given text–image pairs and classify the semantic relationships between them, and they have significant applications in social media platform analysis. However, the images and text in social media data are not always aligned, which makes the existing multimodal entity and relation extraction methods still mainly rely on text information. And those mismatched images can even introduce modality noise, leading to negative impacts on the model and preventing them from achieving better performance. To solve this issue, we propose a Relevance-Aware Prompt-tuning (RAP) method with dynamic router mechanism for multi-modal entity and relation extraction. Our method can adaptively learn effective multimodal features from various types of information as prompt vectors and utilize prompt-tuning for entity and relation extraction. Additionally, when integrating information from different modalities, we take into account the intermodal relevance to reduce the negative impact of mismatched visual information on the model, which allows our model to overcome modality noise and achieve better performance. 
Extensive experiments on three benchmark datasets of tweets demonstrated the effectiveness and superiority of our proposed approach, and achieved approximately 2% increase in F1 values on the three datasets, respectively.</div></div>\",\"PeriodicalId\":19268,\"journal\":{\"name\":\"Neurocomputing\",\"volume\":\"640 \",\"pages\":\"Article 130316\"},\"PeriodicalIF\":6.5000,\"publicationDate\":\"2025-05-07\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Neurocomputing\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0925231225009889\",\"RegionNum\":2,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Neurocomputing","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0925231225009889","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Citations: 0

Abstract

Multimodal Named Entity Recognition (MNER) and Multimodal Relation Extraction (MRE) aim to identify specific entities in given text–image pairs and classify the semantic relationships between them, with important applications in social media analysis. However, the images and text in social media data are not always aligned, so existing multimodal entity and relation extraction methods still rely mainly on textual information. Mismatched images can even introduce modality noise that harms the model and keeps it from reaching better performance. To address this issue, we propose a Relevance-Aware Prompt-tuning (RAP) method with a dynamic router mechanism for multimodal entity and relation extraction. Our method adaptively learns effective multimodal features from various types of information as prompt vectors and uses prompt-tuning for entity and relation extraction. In addition, when integrating information from different modalities, we take inter-modal relevance into account to reduce the negative impact of mismatched visual information, which allows the model to overcome modality noise and achieve better performance. Extensive experiments on three benchmark tweet datasets demonstrate the effectiveness and superiority of the proposed approach, which improves the F1 score by approximately 2% on each of the three datasets.
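The abstract only outlines the relevance-gating idea, so the sketch below illustrates one way such a mechanism could look: visual features are projected into prompt vectors, scaled by an image–text relevance score, and prepended to the text sequence so that mismatched images contribute little. This is a minimal illustrative sketch, not the authors' implementation (which is not reproduced on this page); the module name RelevanceGatedPrompt, the feature dimensions, and the cosine-similarity gate are assumptions made for illustration.

# Minimal sketch of relevance-gated visual prompts for prompt-tuning.
# All names, dimensions, and the gating formula are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class RelevanceGatedPrompt(nn.Module):
    def __init__(self, text_dim: int = 768, image_dim: int = 2048, n_prompts: int = 4):
        super().__init__()
        # Project global image features into the text embedding space as prompt vectors.
        self.to_prompt = nn.Linear(image_dim, n_prompts * text_dim)
        # Learned text-only prompts used as a fallback when the image looks irrelevant.
        self.text_only_prompt = nn.Parameter(torch.randn(n_prompts, text_dim) * 0.02)
        self.image_proj = nn.Linear(image_dim, text_dim)
        self.n_prompts = n_prompts
        self.text_dim = text_dim

    def forward(self, text_tokens: torch.Tensor, image_feat: torch.Tensor) -> torch.Tensor:
        # text_tokens: (batch, seq_len, text_dim); image_feat: (batch, image_dim)
        pooled_text = text_tokens.mean(dim=1)                      # (batch, text_dim)
        proj_image = self.image_proj(image_feat)                   # (batch, text_dim)
        # Image-text relevance in [0, 1], derived from cosine similarity.
        relevance = 0.5 * (1 + F.cosine_similarity(pooled_text, proj_image, dim=-1))
        relevance = relevance.view(-1, 1, 1)                       # (batch, 1, 1)
        visual_prompt = self.to_prompt(image_feat).view(-1, self.n_prompts, self.text_dim)
        # Gate: relevant images contribute visual prompts; mismatched images
        # fall back toward the learned text-only prompts, limiting modality noise.
        prompt = relevance * visual_prompt + (1 - relevance) * self.text_only_prompt
        # Prepend the prompt vectors to the token sequence for downstream prompt-tuning.
        return torch.cat([prompt, text_tokens], dim=1)


if __name__ == "__main__":
    module = RelevanceGatedPrompt()
    tokens = torch.randn(2, 16, 768)    # dummy BERT-style token embeddings
    image = torch.randn(2, 2048)        # dummy ResNet-style global image feature
    print(module(tokens, image).shape)  # torch.Size([2, 20, 768])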
Source journal

Neurocomputing (Engineering & Technology, Computer Science: Artificial Intelligence)
CiteScore: 13.10
Self-citation rate: 10.00%
Articles per year: 1382
Review time: 70 days
Journal description: Neurocomputing publishes articles describing recent fundamental contributions in the field of neurocomputing. Neurocomputing theory, practice and applications are the essential topics being covered.