A link prediction method for multi-modal knowledge graphs based on Adaptive Fusion and Modality Information Enhancement

Zenglong Wang, Xuan Liu, Zheng Liu, Yu Weng, Chaomurilige

Neural Networks, Volume 191, Article 107771 (published 2025-06-27). DOI: 10.1016/j.neunet.2025.107771
Multi-modal knowledge graphs (MMKGs) enrich the semantic expression capabilities of traditional knowledge graphs by incorporating diverse modal information, showing immense potential in various knowledge reasoning tasks. However, existing MMKGs face numerous challenges in the link prediction task (i.e., knowledge graph completion reasoning), primarily due to the complexity and diversity of modal information and the imbalanced quality across modalities. These challenges make efficient fusion and enhancement of multi-modal information difficult to achieve. Most existing methods rely on simple concatenation or weighted fusion of modal features, but such approaches fail to fully capture the deep semantic interactions between modalities and perform poorly when confronted with modal noise or missing information. To address these issues, this paper proposes a novel framework, Adaptive Fusion and Modality Information Enhancement (AFME). The framework consists of two parts: the Modal Information Fusion module (MoIFu) and the Modal Information Enhancement module (MoIEn). By introducing a relationship-driven denoising mechanism and a dynamic weight allocation mechanism, the framework achieves efficient adaptive fusion of multi-modal information. It employs a generative adversarial network (GAN) structure so that the structural modality provides global guidance for the feature modalities, and adopts a multi-layer self-attention mechanism to optimize both intra- and inter-modal features. Finally, it jointly optimizes the losses of the triple prediction task and the adversarial generation task. Experimental results demonstrate that the AFME framework significantly improves multi-modal feature utilization and knowledge reasoning capabilities on multiple benchmark datasets, validating its efficiency and superiority in complex multi-modal scenarios.
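The abstract describes its two central mechanisms only at a high level. As a concrete illustration, the minimal PyTorch sketch below shows what a dynamic-weight fusion step and a joint triple-prediction/adversarial loss could look like. Everything here is an assumption for illustration: the class AdaptiveFusion, the TransE-style placeholder scorer transe_score, and the weighting coefficient lam are hypothetical names and do not reflect the authors' actual implementation.

```python
# Illustrative sketch of two ideas from the abstract (assumed design, not the paper's code):
# (1) dynamic weight allocation for adaptive multi-modal fusion, and
# (2) a joint loss combining a triple-prediction term with a GAN-style adversarial term.
import torch
import torch.nn as nn
import torch.nn.functional as F


class AdaptiveFusion(nn.Module):
    """Fuse per-modality entity embeddings with learned, input-dependent weights."""

    def __init__(self, dim: int, num_modalities: int):
        super().__init__()
        # One gating score per modality, conditioned on the concatenated features.
        self.gate = nn.Linear(dim * num_modalities, num_modalities)

    def forward(self, feats: list[torch.Tensor]) -> torch.Tensor:
        # feats: list of (batch, dim) tensors, one per modality
        # (e.g. structure, image, text).
        stacked = torch.stack(feats, dim=1)                            # (batch, M, dim)
        weights = F.softmax(self.gate(torch.cat(feats, dim=-1)), -1)   # (batch, M)
        return (weights.unsqueeze(-1) * stacked).sum(dim=1)            # (batch, dim)


def transe_score(h: torch.Tensor, r: torch.Tensor, t: torch.Tensor) -> torch.Tensor:
    # A standard TransE-style triple score, used here only as a placeholder
    # for whichever scoring function the paper actually employs.
    return -torch.norm(h + r - t, p=2, dim=-1)


def joint_loss(h, r, t, labels, disc_real, disc_fake, lam: float = 0.1):
    """Triple-prediction loss plus an adversarial term; lam is an assumed weight."""
    pred_loss = F.binary_cross_entropy_with_logits(transe_score(h, r, t), labels)
    # Discriminator distinguishes structure-guided features from raw ones.
    adv_loss = (
        F.binary_cross_entropy_with_logits(disc_real, torch.ones_like(disc_real))
        + F.binary_cross_entropy_with_logits(disc_fake, torch.zeros_like(disc_fake))
    )
    return pred_loss + lam * adv_loss


if __name__ == "__main__":
    batch, dim = 4, 64
    fusion = AdaptiveFusion(dim, num_modalities=3)
    feats = [torch.randn(batch, dim) for _ in range(3)]  # structure, image, text
    fused = fusion(feats)
    print(fused.shape)  # torch.Size([4, 64])
```

The softmax gate makes the fusion weights a function of the input features themselves, so low-quality or noisy modalities can be down-weighted per entity; this is one plausible reading of the "dynamic weight allocation mechanism" the abstract names, offered only as a sketch.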
About the journal:
Neural Networks is a platform that aims to foster an international community of scholars and practitioners interested in neural networks, deep learning, and other approaches to artificial intelligence and machine learning. Our journal invites submissions covering various aspects of neural networks research, from computational neuroscience and cognitive modeling to mathematical analyses and engineering applications. By providing a forum for interdisciplinary discussions between biology and technology, we aim to encourage the development of biologically inspired artificial intelligence.