A Survey of Multi-modal Knowledge Graphs: Technologies and Trends

IF 28 | Zone 1, Computer Science | JCR Q1, Computer Science, Theory & Methods

ACM Computing Surveys Pub Date : 2024-04-10 DOI:10.1145/3656579

Wanying Liang, Pasquale De Meo, Yong Tang, Jia Zhu


Citation count: 0

Abstract

In recent years, Knowledge Graphs (KGs) have played a crucial role in the development of advanced knowledge-intensive applications, such as recommender systems and semantic search. However, the human sensory system is inherently multi-modal, as objects around us are often represented by a combination of multiple signals, such as visual and textual. Consequently, Multi-modal Knowledge Graphs (MMKGs), which combine structured knowledge representation with multiple modalities, represent a powerful extension of KGs. Although MMKGs can handle certain types of tasks (e.g., visual query answering) or queries that standard KGs cannot process, and they can effectively tackle some standard problems (e.g., entity alignment), we lack a widely accepted definition of MMKG. In this survey, we provide a rigorous definition of MMKGs along with a classification scheme based on how existing approaches address four fundamental challenges: representation, fusion, alignment, and translation, which are crucial to improving an MMKG. Our classification scheme is flexible and allows for easy incorporation of new approaches, as well as a comparison of two approaches in terms of how they address one of the fundamental challenges mentioned above. As the first comprehensive survey of MMKG, this article aims to inspire and provide a reference for relevant researchers in the field of Artificial Intelligence.

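Source journal

ACM Computing Surveys (Engineering & Technology, Computer Science: Theory & Methods)

CiteScore: 33.20

Self-citation rate: 0.60%

Publication volume: 372

Review time: 12 months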

Journal description: ACM Computing Surveys is an academic journal that focuses on publishing surveys and tutorials on various areas of computing research and practice. The journal aims to provide comprehensive and easily understandable articles that guide readers through the literature and help them understand topics outside their specialties. In terms of impact, CSUR has a high reputation with a 2022 Impact Factor of 16.6. It is ranked 3rd out of 111 journals in the field of Computer Science Theory & Methods. ACM Computing Surveys is indexed and abstracted in various services, including AI2 Semantic Scholar, Baidu, Clarivate/ISI: JCR, CNKI, DeepDyve, DTU, EBSCO: EDS/HOST, and IET Inspec, among others.