MGKsite: Multi-Modal Knowledge-Driven Site Selection via Intra and Inter-Modal Graph Fusion

IF 8.4 | Zone 1 (Computer Science) | Q1, COMPUTER SCIENCE, INFORMATION SYSTEMS

IEEE Transactions on Multimedia Pub Date : 2024-12-25 DOI:10.1109/TMM.2024.3521742

Ke Liang;Lingyuan Meng;Hao Li;Meng Liu;Siwei Wang;Sihang Zhou;Xinwang Liu;Kunlun He

IEEE Transactions on Multimedia, vol. 27, pp. 1722-1735. Journal Article. https://ieeexplore.ieee.org/document/10814695/

Citations: 0

Abstract


MGKsite: Multi-Modal Knowledge-Driven Site Selection via Intra and Inter-Modal Graph Fusion

Site selection aims to select optimal locations for new stores, which is crucial in business management and urban computing. The early data-driven models heavily relied on feature engineering, which could not effectively model the complex relationships and diverse influences among different data. To alleviate such issues, the knowledge-driven paradigm is proposed based on urban knowledge graphs (KGs). However, the research on them is at an early stage. They omit extra multi-modal information corresponding to brands and stores due to two main challenges, i.e., (1) building available datasets, and (2) designing effective models. It constrains the expressive ability and practical value of previous models. To this end, we first construct new multi-modal urban KGs for site selection with three extra modal (i.e., visual, textual, and acoustic) attributes. Then, we propose a novel multi-modal knowledge-driven model (MGKsite). Concretely, a graph neural network (GNN) based fusion network is designed to fuse the features based on the attribute K-Nearest Neighbor (KNN) graph, which models both intra and inter-modal correlations among the features. The fused embeddings are further injected into the knowledge-driven backbones for learning and inference. Experiments prove promising capacities of MGKsite from five aspects, i.e., superiority, effectiveness, sensitivity, transferability and complexity.
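The fusion step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the architecture, so everything below is an assumption made for illustration. It assumes the three modal features have already been projected to a common dimension, treats each (entity, modality) pair as one graph node, builds a cosine-similarity KNN graph over all nodes (edges within one modality are intra-modal, edges across modalities are inter-modal), and applies a single GCN-style propagation with random, untrained weights. The names `knn_adjacency`, `gcn_layer`, and `fuse` are hypothetical.

```python
import numpy as np


def knn_adjacency(features: np.ndarray, k: int) -> np.ndarray:
    """Symmetric KNN adjacency from cosine similarity, with self-loops."""
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.clip(norms, 1e-12, None)
    sim = unit @ unit.T
    np.fill_diagonal(sim, -np.inf)            # a node is not its own neighbour
    nbrs = np.argsort(-sim, axis=1)[:, :k]    # k most similar nodes per row
    n = features.shape[0]
    adj = np.zeros((n, n))
    adj[np.repeat(np.arange(n), k), nbrs.ravel()] = 1.0
    adj = np.maximum(adj, adj.T)              # make the graph undirected
    np.fill_diagonal(adj, 1.0)                # add self-loops
    return adj


def gcn_layer(adj: np.ndarray, x: np.ndarray, weight: np.ndarray) -> np.ndarray:
    """One propagation step: ReLU(D^-1/2 A D^-1/2 X W)."""
    d_inv_sqrt = 1.0 / np.sqrt(adj.sum(axis=1))
    norm_adj = adj * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(norm_adj @ x @ weight, 0.0)


def fuse(visual, textual, acoustic, k=3, out_dim=4, seed=0):
    """Fuse three (n, d) modal feature matrices into one (n, out_dim) embedding."""
    rng = np.random.default_rng(seed)
    n, d = visual.shape
    # Stack modalities so every (entity, modality) pair is a node: shape (3n, d).
    nodes = np.vstack([visual, textual, acoustic])
    adj = knn_adjacency(nodes, k)
    # Assumption: untrained random weights stand in for learned parameters.
    weight = rng.normal(scale=0.1, size=(d, out_dim))
    hidden = gcn_layer(adj, nodes, weight)
    # Average the three modality nodes of each entity into one fused vector.
    return hidden.reshape(3, n, out_dim).mean(axis=0)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    v, t, a = (rng.normal(size=(5, 8)) for _ in range(3))
    print(fuse(v, t, a).shape)
```

In a trained model the weights would be learned jointly with the knowledge-graph backbone, and the fused vectors would replace or augment the entity embeddings fed to it; the mean pooling used here is only one of several plausible readout choices.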


Source journal

IEEE Transactions on Multimedia (Engineering and Technology: Telecommunications)

CiteScore: 11.70

Self-citation rate: 11.00%

Publication volume: 576

Review time: 5.5 months

Journal description: The IEEE Transactions on Multimedia delves into diverse aspects of multimedia technology and applications, covering circuits, networking, signal processing, systems, software, and systems integration. The scope aligns with the Fields of Interest of the sponsors, ensuring a comprehensive exploration of research in multimedia.