Xuehao Zhai, Junqi Jiang, Adam Dejl, Antonio Rago, Fangce Guo, Francesca Toni, Aruna Sivakumar
Title: Heterogeneous graph neural networks with post-hoc explanations for multi-modal and explainable land use inference
Journal: Information Fusion, volume 120, Article 103057
DOI: 10.1016/j.inffus.2025.103057
Published: 2025-03-08
URL: https://www.sciencedirect.com/science/article/pii/S1566253525001307
Citations: 0
Abstract
Recently, the increased use of sensor and location technologies has facilitated the collection of multi-modal mobility data, offering valuable insights into daily activity patterns. Many studies have adopted advanced data-driven techniques to explore the potential of these multi-modal mobility data in land use inference. However, existing studies often process samples independently, ignoring the spatial correlations among neighbouring objects and the heterogeneity among different services. Furthermore, the inherently low interpretability of complex deep learning methods poses a significant barrier in urban planning, where transparency and extrapolability are crucial for making long-term policy decisions. To overcome these challenges, we introduce an explainable framework for inferring land use that synergises heterogeneous graph neural networks (HGNs) with Explainable AI techniques, enhancing both accuracy and explainability. We evaluate the proposed approach on three cities with different urban layouts and mobility combinations: London (with tube, bus and bike sharing data sources), San Francisco (parking and bike sharing), and New York City (metro and bike sharing). The empirical experiments demonstrate that the proposed HGNs significantly outperform baseline graph neural networks for all six land use indicators, especially in terms of ‘office’ and ‘sustenance’. We then deploy feature attribution (at both temporal and spatial levels) and counterfactual explanations, which shed light on several important findings. The node-based feature attribution explanations show that the symmetrical nature of the ‘residence’ and ‘work’ categories predicted by the framework aligns well with the commuters’ ‘work’ and ‘recreation’ activities. Meanwhile, the spatial feature attribution explanations indicate that the heightened central importance of commercial categories and the dominance of residential influences in outer zones align closely with typical urban structures. Finally, the counterfactual explanations reveal that variations in node features and types are primarily responsible for the differences observed between the predicted land use distribution and the ideal mixed state. These analyses demonstrate that the proposed HGNs can suitably support urban stakeholders in their urban planning and policy-making. The source code is available at https://github.com/xuehao0806/GNN-land-use.
About the journal:
Information Fusion serves as a central platform for showcasing advancements in multi-sensor, multi-source, multi-process information fusion, fostering collaboration among the diverse disciplines driving its progress. It is the leading outlet for sharing research and development in this field, focusing on architectures, algorithms, and applications. Papers dealing with fundamental theoretical analyses, as well as those demonstrating their application to real-world problems, are welcome.