Ke Liang;Lingyuan Meng;Hao Li;Meng Liu;Siwei Wang;Sihang Zhou;Xinwang Liu;Kunlun He
{"title":"MGKsite:通过模态内和模态间图融合的多模态知识驱动的站点选择","authors":"Ke Liang;Lingyuan Meng;Hao Li;Meng Liu;Siwei Wang;Sihang Zhou;Xinwang Liu;Kunlun He","doi":"10.1109/TMM.2024.3521742","DOIUrl":null,"url":null,"abstract":"Site selection aims to select optimal locations for new stores, which is crucial in business management and urban computing. The early data-driven models heavily relied on feature engineering, which could not effectively model the complex relationships and diverse influences among different data. To alleviate such issues, the knowledge-driven paradigm is proposed based on urban knowledge graphs (KGs). However, the research on them is at an early stage. They omit extra multi-modal information corresponding to brands and stores due to two main challenges, i.e., (1) building available datasets, and (2) designing effective models. It constrains the expressive ability and practical value of previous models. To this end, we first construct new multi-modal urban KGs for site selection with three extra modal (i.e., visual, textual, and acoustic) attributes. Then, we propose a novel multi-modal knowledge-driven model (MGKsite). Concretely, a graph neural network (GNN) based fusion network is designed to fuse the features based on the attribute K-Nearest Neighbor (KNN) graph, which models both intra and inter-modal correlations among the features. The fused embeddings are further injected into the knowledge-driven backbones for learning and inference. Experiments prove promising capacities of MGKsite from five aspects, i.e., superiority, effectiveness, sensitivity, transferability and complexity.","PeriodicalId":13273,"journal":{"name":"IEEE Transactions on Multimedia","volume":"27 ","pages":"1722-1735"},"PeriodicalIF":8.4000,"publicationDate":"2024-12-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"MGKsite: Multi-Modal Knowledge-Driven Site Selection via Intra and Inter-Modal Graph Fusion\",\"authors\":\"Ke Liang;Lingyuan Meng;Hao Li;Meng Liu;Siwei Wang;Sihang Zhou;Xinwang Liu;Kunlun He\",\"doi\":\"10.1109/TMM.2024.3521742\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Site selection aims to select optimal locations for new stores, which is crucial in business management and urban computing. The early data-driven models heavily relied on feature engineering, which could not effectively model the complex relationships and diverse influences among different data. To alleviate such issues, the knowledge-driven paradigm is proposed based on urban knowledge graphs (KGs). However, the research on them is at an early stage. They omit extra multi-modal information corresponding to brands and stores due to two main challenges, i.e., (1) building available datasets, and (2) designing effective models. It constrains the expressive ability and practical value of previous models. To this end, we first construct new multi-modal urban KGs for site selection with three extra modal (i.e., visual, textual, and acoustic) attributes. Then, we propose a novel multi-modal knowledge-driven model (MGKsite). Concretely, a graph neural network (GNN) based fusion network is designed to fuse the features based on the attribute K-Nearest Neighbor (KNN) graph, which models both intra and inter-modal correlations among the features. The fused embeddings are further injected into the knowledge-driven backbones for learning and inference. Experiments prove promising capacities of MGKsite from five aspects, i.e., superiority, effectiveness, sensitivity, transferability and complexity.\",\"PeriodicalId\":13273,\"journal\":{\"name\":\"IEEE Transactions on Multimedia\",\"volume\":\"27 \",\"pages\":\"1722-1735\"},\"PeriodicalIF\":8.4000,\"publicationDate\":\"2024-12-25\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Transactions on Multimedia\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10814695/\",\"RegionNum\":1,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, INFORMATION SYSTEMS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Multimedia","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10814695/","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
MGKsite: Multi-Modal Knowledge-Driven Site Selection via Intra and Inter-Modal Graph Fusion
Site selection aims to select optimal locations for new stores, which is crucial in business management and urban computing. The early data-driven models heavily relied on feature engineering, which could not effectively model the complex relationships and diverse influences among different data. To alleviate such issues, the knowledge-driven paradigm is proposed based on urban knowledge graphs (KGs). However, the research on them is at an early stage. They omit extra multi-modal information corresponding to brands and stores due to two main challenges, i.e., (1) building available datasets, and (2) designing effective models. It constrains the expressive ability and practical value of previous models. To this end, we first construct new multi-modal urban KGs for site selection with three extra modal (i.e., visual, textual, and acoustic) attributes. Then, we propose a novel multi-modal knowledge-driven model (MGKsite). Concretely, a graph neural network (GNN) based fusion network is designed to fuse the features based on the attribute K-Nearest Neighbor (KNN) graph, which models both intra and inter-modal correlations among the features. The fused embeddings are further injected into the knowledge-driven backbones for learning and inference. Experiments prove promising capacities of MGKsite from five aspects, i.e., superiority, effectiveness, sensitivity, transferability and complexity.
期刊介绍:
The IEEE Transactions on Multimedia delves into diverse aspects of multimedia technology and applications, covering circuits, networking, signal processing, systems, software, and systems integration. The scope aligns with the Fields of Interest of the sponsors, ensuring a comprehensive exploration of research in multimedia.