MCCI: A multi-channel collaborative interaction framework for multimodal knowledge graph completion
Xiquan Zhang, Jianwu Dang, Yangping Wang, Shuyang Li
Information Processing & Management, vol. 62, no. 4, Article 104156 (published 2025-03-31)
DOI: 10.1016/j.ipm.2025.104156
URL: https://www.sciencedirect.com/science/article/pii/S0306457325000974
Impact factor: 7.4; JCR: Q1 (Computer Science, Information Systems)
Citations: 0
Abstract
Multimodal knowledge graph completion (MKGC) aims to leverage multimodal information to predict missing fact triplets. However, existing MKGC approaches largely ignore the heterogeneity of the modalities and the complexity of their interactions, which leaves intra- and inter-modal expression unbalanced. To address these challenges, we propose a novel multi-channel collaborative interaction (MCCI) framework for MKGC, composed of feature encoding, dual-flow alignment, and decision fusion modules. Specifically, in the encoding stage, information filtering and visual enhancement-based methods are used to capture high-quality multimodal features. The dual-flow alignment module then expands the potential correlations between different modalities, thereby increasing the frequency of information interaction. In the fusion stage, modality weights are allocated dynamically and prediction outcomes are generated. Experimental results show that, compared with state-of-the-art approaches, the proposed MCCI framework improves Hits@10 and MR by 5.7% and 19.8%, respectively.
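The abstract reports gains in Hits@10 and MR (mean rank), two ranking metrics commonly used for link prediction. The minimal sketch below shows how they are usually computed from the rank of the correct entity for each test triplet. The function names and the toy ranks are illustrative assumptions, not taken from the paper, and the paper's exact evaluation protocol (for example, filtered versus raw ranking, or tie handling) is not stated in the abstract.

```python
from typing import Sequence


def rank_of_target(scores: Sequence[float], target: int) -> int:
    """1-based rank of the target candidate, assuming a higher score
    means a more plausible triplet (hypothetical scoring convention)."""
    target_score = scores[target]
    # Count candidates that score strictly higher than the target.
    return 1 + sum(1 for s in scores if s > target_score)


def mean_rank(ranks: Sequence[int]) -> float:
    """MR: average rank of the correct entity; lower is better."""
    return sum(ranks) / len(ranks)


def hits_at_k(ranks: Sequence[int], k: int = 10) -> float:
    """Hits@k: fraction of test triplets whose correct entity is
    ranked within the top k; higher is better."""
    return sum(1 for r in ranks if r <= k) / len(ranks)


# Toy ranks for five test triplets (made-up values for illustration).
toy_ranks = [1, 3, 12, 50, 7]
toy_mr = mean_rank(toy_ranks)
toy_hits10 = hits_at_k(toy_ranks, k=10)
```

Because MR improves when it decreases and Hits@10 improves when it increases, the two reported gains move in opposite numeric directions.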
About the journal:
Information Processing and Management is dedicated to publishing cutting-edge original research at the convergence of computing and information science. Our scope encompasses theory, methods, and applications across various domains, including advertising, business, health, information science, information technology marketing, and social computing.
We aim to cater to the interests of both primary researchers and practitioners by offering an effective platform for the timely dissemination of advanced and topical issues in this interdisciplinary field. The journal places particular emphasis on original research articles, research survey articles, research method articles, and articles addressing critical applications of research. Join us in advancing knowledge and innovation at the intersection of computing and information science.