Meng Gao , Yutao Xie , Wei Chen , Feng Zhang , Fei Ding , Tengjiao Wang , Jiahui Yao , Jiabin Zheng , Kam-Fai Wong
Neural Networks, Volume 188, Article 107467. DOI: 10.1016/j.neunet.2025.107467. Published 2025-04-12.
ReranKGC: A cooperative retrieve-and-rerank framework for multi-modal knowledge graph completion
Multi-modal knowledge graph completion (MMKGC) aims to predict missing links using entities' multi-modal attributes. Embedding-based methods excel at leveraging structural knowledge, making them robust to entity ambiguity, yet their performance is constrained by their underutilization of multi-modal knowledge. Conversely, fine-tune-based (FT-based) approaches excel at extracting multi-modal knowledge but are hindered by ambiguity issues. To harness the complementary strengths of both methods for MMKGC, this paper introduces an ensemble framework, ReranKGC, which decomposes KGC into a retrieve-and-rerank pipeline. The retriever employs embedding-based methods for initial retrieval. The re-ranker adopts our proposed KGC-CLIP, an FT-based method that uses CLIP to extract multi-modal knowledge from attributes for candidate re-ranking. By leveraging a more comprehensive knowledge source, the retriever generates a candidate pool containing entities that are not only semantically but also structurally related to the query entity. Within this higher-quality candidate pool, the re-ranker can better discern candidates' semantics and further refine the initial ranking, thereby enhancing precision. Through this cooperation, each method maximizes its strengths while partially mitigating the weaknesses of the other, leading to performance that surpasses either method's individual capability. Extensive experiments on link prediction tasks demonstrate that our framework ReranKGC consistently improves over its baselines, outperforming state-of-the-art models.
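The two-stage pipeline described in the abstract can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the toy entity embeddings, the `retrieve`/`rerank` helpers, and the stand-in semantic scores are all assumptions; in ReranKGC the first stage uses learned KG embeddings and the second stage uses the CLIP-based KGC-CLIP scorer.

```python
# Minimal retrieve-and-rerank sketch (illustrative only; names are hypothetical).
import math

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def retrieve(query_emb, entity_embs, k):
    """Stage 1: embedding-based retriever returns the top-k candidate entities,
    ranked by structural similarity to the query embedding."""
    ranked = sorted(entity_embs, key=lambda e: cosine(query_emb, entity_embs[e]),
                    reverse=True)
    return ranked[:k]

def rerank(candidates, semantic_score):
    """Stage 2: re-ranker reorders the candidate pool with a multi-modal
    semantic score (standing in for KGC-CLIP's CLIP-based matching)."""
    return sorted(candidates, key=semantic_score, reverse=True)

# Toy data: 2-d "structural" embeddings for three entities.
entity_embs = {
    "paris":  [0.9, 0.1],
    "london": [0.8, 0.2],
    "apple":  [0.1, 0.9],
}
query = [0.85, 0.15]

# Stage 1 keeps only structurally plausible candidates.
pool = retrieve(query, entity_embs, k=2)

# Stage 2 refines the ordering with pretend multi-modal scores.
sem = {"paris": 0.7, "london": 0.9, "apple": 0.1}
final = rerank(pool, lambda e: sem[e])
```

The retriever narrows the search to a small, structurally related pool, so the more expensive semantic scorer only has to discriminate among plausible candidates; that division of labor is the cooperation the abstract describes.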
Journal overview:
Neural Networks is a platform that aims to foster an international community of scholars and practitioners interested in neural networks, deep learning, and other approaches to artificial intelligence and machine learning. Our journal invites submissions covering various aspects of neural networks research, from computational neuroscience and cognitive modeling to mathematical analyses and engineering applications. By providing a forum for interdisciplinary discussions between biology and technology, we aim to encourage the development of biologically-inspired artificial intelligence.