Contrastive learning of cross-modal information enhancement for multimodal fake news detection

IF 4.6 | Zone 2 (Computer Science) | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Complex & Intelligent Systems Pub Date : 2025-05-22 DOI:10.1007/s40747-025-01919-4

Weijie Chen, Fei Cai, Yupu Guo, Zhiqiang Pan, Wanyu Chen, Yijia Zhang

Citations: 0

Abstract

With the rapid development of the Internet, fake news and its rapid spread have had many negative effects on society. Consequently, fake news detection has become increasingly important over the past few years. Existing methods are predominantly unimodal, or build a multimodal representation by fusing unimodal features. However, their large number of model parameters and the interference of noisy data increase the risk of overfitting. We therefore construct an information enhancement and contrastive learning framework built on an Improved Low-rank Multimodal Fusion approach for Fake News Detection (ILMF-FND), which aims to reduce noise interference and fuse multimodal feature vectors efficiently with fewer parameters. Specifically, an encoder extracts feature vectors from text and images, which are subsequently refined using a Multi-gate Mixture-of-Experts. The refined features are mapped into the same space for semantic sharing. Cross-modal fusion is then performed, yielding an efficient, high-precision fusion of text and image features with fewer parameters. In addition, we design an adaptive mechanism that adjusts the weights of the final components according to modal fitness before they are fed into the classifier, to achieve the best detection results in the current state. We evaluate ILMF-FND and competitive baselines on two public datasets, Twitter and Weibo. The results indicate that ILMF-FND greatly reduces the number of parameters while outperforming the best baseline in accuracy by 0.2% and 1.1% on the Weibo and Twitter datasets, respectively.
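
To make the parameter-saving claim concrete, the following is a minimal sketch of generic low-rank bimodal fusion, not the authors' ILMF implementation (the abstract does not specify what the "improved" variant changes). All names, dimensions, and the random factors are hypothetical. It shows that fusing text and image vectors through rank-r factors gives the same output as contracting their outer product with the full weight tensor those factors define, while storing far fewer weights.

```python
import numpy as np


def low_rank_fusion(text_feat, image_feat, text_factors, image_factors):
    """Fuse two modality vectors using rank-r factors (illustrative sketch).

    text_feat: (d_t,), image_feat: (d_v,)
    text_factors: (r, d_t + 1, d_out), image_factors: (r, d_v + 1, d_out)
    """
    # Append a constant 1 so unimodal terms survive the multiplicative fusion.
    z_t = np.append(text_feat, 1.0)
    z_v = np.append(image_feat, 1.0)
    # Project each modality with every rank component: shape (r, d_out).
    proj_t = np.einsum("i,rio->ro", z_t, text_factors)
    proj_v = np.einsum("j,rjo->ro", z_v, image_factors)
    # Multiply per rank component, then sum over the rank axis.
    return (proj_t * proj_v).sum(axis=0)


def full_tensor_fusion(text_feat, image_feat, text_factors, image_factors):
    """Reference: build the full 3-way weight tensor and contract it explicitly."""
    z_t = np.append(text_feat, 1.0)
    z_v = np.append(image_feat, 1.0)
    # Full weight tensor implied by the factors: shape (d_t + 1, d_v + 1, d_out).
    weight = np.einsum("rio,rjo->ijo", text_factors, image_factors)
    return np.einsum("i,j,ijo->o", z_t, z_v, weight)


# Hypothetical sizes chosen only for illustration.
d_t, d_v, d_out, rank = 32, 24, 8, 4
rng = np.random.default_rng(0)
text_feat = rng.normal(size=d_t)
image_feat = rng.normal(size=d_v)
text_factors = rng.normal(size=(rank, d_t + 1, d_out))
image_factors = rng.normal(size=(rank, d_v + 1, d_out))

fused = low_rank_fusion(text_feat, image_feat, text_factors, image_factors)
reference = full_tensor_fusion(text_feat, image_feat, text_factors, image_factors)

# Weights stored by each formulation.
low_rank_params = rank * ((d_t + 1) + (d_v + 1)) * d_out
full_params = (d_t + 1) * (d_v + 1) * d_out
```

With these illustrative sizes the factorized form stores 1,856 weights against 6,600 for the full tensor, and the gap widens as feature dimensions grow, because the factorized count scales with the sum of the modality dimensions rather than their product.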

Source journal

Complex & Intelligent Systems (COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE)

CiteScore

9.60

Self-citation rate

10.30%

Publication count

297

About the journal: Complex & Intelligent Systems aims to provide a forum for presenting and discussing novel approaches, tools and techniques meant for attaining a cross-fertilization between the broad fields of complex systems, computational simulation, and intelligent analytics and visualization. The transdisciplinary research that the journal focuses on will expand the boundaries of our understanding by investigating the principles and processes that underlie many of the most profound problems facing society today.