Multi-modal contrastive learning of urban space representations from POI data

IF 7.1 | Zone 1 (Earth Sciences) | JCR Q1 (Environmental Studies)

Computers Environment and Urban Systems Pub Date : 2025-04-30 DOI:10.1016/j.compenvurbsys.2025.102299

Xinglei Wang, Tao Cheng, Stephen Law, Zichao Zeng, Lu Yin, Junyuan Liu

Volume 120, Article 102299. Full text: https://www.sciencedirect.com/science/article/pii/S0198971525000523

Citations: 0

Abstract

Understanding and characterising the urban environment is crucial for urban planning and geospatial analysis. One common approach is to use point of interest (POI) data, which offers rich information about the spatial-semantic characteristics of urban spaces. Existing methods for learning urban space representations from POIs face several limitations, including reliance on predefined spatial units, neglect of POI location information, underutilisation of POI semantic attributes, and computational inefficiency. To address these gaps, we propose CaLLiPer (Contrastive Language-Location Pre-training), a novel approach that directly embeds continuous urban spaces into vector representations that capture the spatial and semantic characteristics of the urban environment. The model leverages multimodal contrastive learning to align location embeddings with textual descriptions of POIs, bypassing the need for complex training corpus construction and negative sampling. Applying CaLLiPer to learn urban space representations in London, UK, we demonstrate a 5–15% improvement in predictive performance on land use classification and socioeconomic mapping tasks compared with state-of-the-art methods. Visualisations and correlation analysis of the learned representations further verify our model's ability to capture spatial variations in urban semantics with high accuracy and fine resolution. Moreover, CaLLiPer requires less training time, showcasing its efficiency and scalability. Additional experiments demonstrate the robustness of our model across different spatial scales and urban contexts. Notably, the experiment on Singapore showed an improvement of over 20%. This work also provides a promising pathway for scalable, semantically rich urban space representation learning that can support the development of geospatial foundation models. The implementation code is available at https://github.com/xlwang233/CaLLiPer.
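To make the idea of aligning location embeddings with POI text embeddings concrete, here is a minimal sketch of a symmetric contrastive (InfoNCE-style) loss. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the loss, so the CLIP-style formulation, the use of the other pairs in a batch as negatives, the temperature of 0.07, and all function names are assumptions. The actual method is in the repository linked above.

```python
import math
import random


def normalise(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def cross_entropy_matched(logits):
    """Mean cross-entropy where the correct column for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        row_max = max(row)  # subtract the max for a numerically stable log-sum-exp
        log_z = row_max + math.log(sum(math.exp(x - row_max) for x in row))
        total += log_z - row[i]
    return total / len(logits)


def symmetric_contrastive_loss(loc_emb, txt_emb, temperature=0.07):
    """Hypothetical CLIP-style loss over a batch of (location, text) pairs.

    loc_emb[i] and txt_emb[i] are assumed to describe the same POI; every
    other pairing in the batch acts as an implicit negative.
    """
    loc = [normalise(v) for v in loc_emb]
    txt = [normalise(v) for v in txt_emb]
    # Pairwise cosine similarities, sharpened by the (assumed) temperature.
    logits = [
        [sum(a * b for a, b in zip(l, t)) / temperature for t in txt]
        for l in loc
    ]
    logits_t = [list(col) for col in zip(*logits)]
    # Average the location-to-text and text-to-location directions.
    return 0.5 * (cross_entropy_matched(logits) + cross_entropy_matched(logits_t))


# Toy data: random "text" embeddings and location embeddings that nearly match them.
random.seed(0)
txt = [[random.gauss(0, 1) for _ in range(16)] for _ in range(8)]
aligned = [[x + 0.01 * random.gauss(0, 1) for x in row] for row in txt]
mismatched = aligned[::-1]  # reverse the order so no location meets its own text

loss_aligned = symmetric_contrastive_loss(aligned, txt)
loss_mismatched = symmetric_contrastive_loss(mismatched, txt)
print(loss_aligned < loss_mismatched)
```

In this toy setup the loss is lower when each location embedding sits close to its own text embedding than when the pairs are scrambled, which is the behaviour a contrastive alignment objective is meant to reward. In a real model the two embedding sets would come from a trainable location encoder and a text encoder rather than from random vectors.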


Source journal

Computers Environment and Urban Systems Multiple-

CiteScore: 13.30

Self-citation rate: 7.40%

Articles published: 111

Review time: 32 days

Journal description: Computers, Environment and Urban Systems is an interdisciplinary journal publishing cutting-edge and innovative computer-based research on environmental and urban systems that privileges the geospatial perspective. The journal welcomes original, high-quality scholarship of a theoretical, applied or technological nature, and provides a stimulating presentation of perspectives, research developments, overviews of important new technologies, and uses of major computational, information-based, and visualization innovations. Applied and theoretical contributions demonstrate the scope of computer-based analysis, fostering a better understanding of environmental and urban systems, their spatial scope and their dynamics.