Consistency-Heterogeneity Balanced Fake News Detection via Cross-Modal Matching
Ying Guo;Bingxin Li;Kexin Zhen;Jie Liu;Gaolei Li;Qi Wang;Yong-Jin Liu
{"title":"基于跨模态匹配的一致性-异质性平衡假新闻检测","authors":"Ying Guo;Bingxin Li;Kexin Zhen;Jie Liu;Gaolei Li;Qi Wang;Yong-Jin Liu","doi":"10.1109/TAI.2025.3527921","DOIUrl":null,"url":null,"abstract":"Generating synthetic content through generative AI (GAI) presents considerable hurdles for current fake news detection methodologies. Many existing detection approaches concentrate on feature-based multimodal fusion, neglecting semantic relationships such as correlations and diversities. In this study, we introduce an innovative cross-modal matching-driven approach to reconcile semantic relevance (text–image consistency) and semantic gap (text–image heterogeneity) in multimodal fake news detection. Unlike the conventional paradigm of multimodal fusion followed by detection, our approach integrates textual modality, visual modality (images), and text embedded within images (auxiliary modality) to construct an end-to-end framework. This framework considers the relevance of contents across different modalities while simultaneously addressing the gap in structures, achieving a delicate balance between consistency and heterogeneity. Consistency is fostered by evaluating intermodality correlation via pairwise-similarity scores, while heterogeneity is addressed by employing cross-attention mechanisms to account for intermodality diversity. To achieve equilibrium between consistency and heterogeneity, we employ attention-guided enhanced modality interaction and similarity-based dynamic weight assignment to establish robust frameworks. Comparative experiments conducted on the Chinese Weibo dataset and the English Twitter dataset demonstrate the effectiveness of our approach, surpassing the state-of-the-art by 7% to 13%.","PeriodicalId":73305,"journal":{"name":"IEEE transactions on artificial intelligence","volume":"6 7","pages":"1787-1796"},"PeriodicalIF":0.0000,"publicationDate":"2025-01-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Consistency-Heterogenity Balanced Fake News Detection via Cross-Modal Matching\",\"authors\":\"Ying Guo;Bingxin Li;Kexin Zhen;Jie Liu;Gaolei Li;Qi Wang;Yong-Jin Liu\",\"doi\":\"10.1109/TAI.2025.3527921\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Generating synthetic content through generative AI (GAI) presents considerable hurdles for current fake news detection methodologies. Many existing detection approaches concentrate on feature-based multimodal fusion, neglecting semantic relationships such as correlations and diversities. In this study, we introduce an innovative cross-modal matching-driven approach to reconcile semantic relevance (text–image consistency) and semantic gap (text–image heterogeneity) in multimodal fake news detection. Unlike the conventional paradigm of multimodal fusion followed by detection, our approach integrates textual modality, visual modality (images), and text embedded within images (auxiliary modality) to construct an end-to-end framework. This framework considers the relevance of contents across different modalities while simultaneously addressing the gap in structures, achieving a delicate balance between consistency and heterogeneity. Consistency is fostered by evaluating intermodality correlation via pairwise-similarity scores, while heterogeneity is addressed by employing cross-attention mechanisms to account for intermodality diversity. 
To achieve equilibrium between consistency and heterogeneity, we employ attention-guided enhanced modality interaction and similarity-based dynamic weight assignment to establish robust frameworks. Comparative experiments conducted on the Chinese Weibo dataset and the English Twitter dataset demonstrate the effectiveness of our approach, surpassing the state-of-the-art by 7% to 13%.\",\"PeriodicalId\":73305,\"journal\":{\"name\":\"IEEE transactions on artificial intelligence\",\"volume\":\"6 7\",\"pages\":\"1787-1796\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2025-01-13\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE transactions on artificial intelligence\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10838616/\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE transactions on artificial intelligence","FirstCategoryId":"1085","ListUrlMain":"https://ieeexplore.ieee.org/document/10838616/","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Synthetic content produced by generative AI (GAI) poses considerable hurdles for current fake news detection methods. Many existing approaches concentrate on feature-based multimodal fusion and neglect semantic relationships between modalities, such as their correlations and differences. In this study, we introduce a cross-modal matching-driven approach that reconciles semantic relevance (text–image consistency) with the semantic gap (text–image heterogeneity) in multimodal fake news detection. Unlike the conventional paradigm of multimodal fusion followed by detection, our approach integrates the textual modality, the visual modality (images), and text embedded within images (an auxiliary modality) into an end-to-end framework. The framework considers the relevance of content across modalities while simultaneously addressing the structural gaps between them, striking a balance between consistency and heterogeneity. Consistency is captured by evaluating intermodality correlation through pairwise-similarity scores, while heterogeneity is handled by cross-attention mechanisms that account for intermodality diversity. To balance the two, we employ attention-guided enhanced modality interaction and similarity-based dynamic weight assignment. Comparative experiments on the Chinese Weibo dataset and the English Twitter dataset demonstrate the effectiveness of our approach, which surpasses the state of the art by 7% to 13%.
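
The abstract names three concrete mechanisms: pairwise-similarity scores for consistency, cross-attention for heterogeneity, and similarity-based dynamic weighting to balance the two. The PyTorch sketch below shows one plausible way these pieces could be wired together; the module layout, pooling scheme, feature dimensions, and exact fusion rule are illustrative assumptions, not the authors' published architecture.

# Minimal, self-contained sketch of a consistency/heterogeneity-balanced
# detector. All names and dimensions here are hypothetical placeholders.
import torch
import torch.nn as nn
import torch.nn.functional as F

class BalancedFakeNewsDetector(nn.Module):
    """Hypothetical sketch: consistency via pairwise cosine similarity,
    heterogeneity via cross-attention, fused by a similarity-driven weight."""

    def __init__(self, dim: int = 256, heads: int = 4):
        super().__init__()
        # Cross-attention blocks model inter-modality diversity (heterogeneity).
        self.text_to_image = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.image_to_text = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.het_proj = nn.Linear(2 * dim, dim)
        self.classifier = nn.Linear(dim, 2)  # real vs. fake logits

    def forward(self, text, image, ocr):
        # text/image/ocr: (batch, seq, dim) features from pretrained encoders;
        # ocr stands in for the text embedded within images (auxiliary modality).
        t, v, o = text.mean(1), image.mean(1), ocr.mean(1)  # pooled (batch, dim)

        # Consistency: pairwise-similarity scores across modality pairs.
        sim = (F.cosine_similarity(t, v, dim=-1)
               + F.cosine_similarity(t, o, dim=-1)
               + F.cosine_similarity(v, o, dim=-1)) / 3      # (batch,)

        # Consistency feature: element-wise text-image correlation.
        cons_feat = t * v                                     # (batch, dim)

        # Heterogeneity: each modality attends to the other (cross-attention).
        t2v, _ = self.text_to_image(text, image, image)       # text queries image
        v2t, _ = self.image_to_text(image, text, text)        # image queries text
        het_feat = self.het_proj(
            torch.cat([t2v.mean(1), v2t.mean(1)], dim=-1))    # (batch, dim)

        # Similarity-based dynamic weighting: well-matched pairs lean on the
        # consistency cue; mismatched pairs lean on the heterogeneity cue.
        w = torch.sigmoid(sim).unsqueeze(-1)                  # (batch, 1)
        fused = w * cons_feat + (1 - w) * het_feat
        return self.classifier(fused)                         # (batch, 2)

# Usage: model = BalancedFakeNewsDetector(); logits = model(
#     torch.randn(8, 32, 256), torch.randn(8, 49, 256), torch.randn(8, 16, 256))
# yields an (8, 2) logit tensor.

Weighting by the sigmoid of the mean similarity means that text-image pairs which agree semantically rely more on the correlation cue, while mismatched pairs, a common signature of fabricated posts, shift the decision toward the cross-attention features; this is one simple reading of "similarity-based dynamic weight assignment", not necessarily the paper's.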