Multi-source geo-localization in urban built environments for crowd-sourced images by contrastive learning
Authors: Qianbao Hou, Ce Hou, Fan Zhang, Qihao Weng
Journal: ISPRS Journal of Photogrammetry and Remote Sensing, volume 230, pages 616-629 (Journal Article)
Published: 2025-10-09
DOI: 10.1016/j.isprsjprs.2025.09.024
URL: https://www.sciencedirect.com/science/article/pii/S092427162500382X
Impact factor: 12.2; JCR: Q1 (Geography, Physical)
Crowd-sourced images (CSIs) offer an unprecedented opportunity for gaining deeper insights into urban built environments. However, the lack of precise geographic information limits their effectiveness in various urban applications. Traditional geo-localization methods, which rely on matching CSIs with geo-tagged street-view images (SVIs), face significant challenges due to sparse coverage and temporal misalignment of reference data, especially in developing countries. To overcome these limitations, this paper proposes a novel contrastive learning framework that integrates SVIs and satellite images (SIs), utilizing a multi-scale channel attention module and InfoNCE loss to enhance the geo-localization accuracy of CSIs. Additionally, we leverage SIs to generate synthetic SVIs in areas where actual SVIs are unavailable or outdated, ensuring comprehensive coverage across diverse urban environments. A simple yet efficient data preprocessing method is proposed to align multi-view images for enhanced feature fusion. As part of our research efforts, we introduce a Multi-Source Geo-localization Dataset (MSGD) consisting of 500k geo-tagged pairs collected from 12 cities across six continents, encompassing diverse urban typologies from dense skyscraper districts to low-density areas, providing a valuable resource for future research and advancements in geo-localization methods. Our experiments show that the proposed method outperforms state-of-the-art approaches on the challenging MSGD dataset, highlighting the importance of incorporating SIs as a complementary data source for accurate geo-localization. Our code and dataset will be released at https://github.com/RCAIG/CrowdsourcingGeoLocalization.
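The abstract names InfoNCE as the training loss but does not give its formulation. As a rough illustration only, the sketch below shows the generic in-batch form of InfoNCE, where each query embedding (for example, from a CSI) is pulled toward its paired reference embedding and pushed away from the other references in the batch. The function names, the temperature of 0.07, and the toy vectors are assumptions made for this example; this is not the authors' implementation, and it assumes non-zero embedding vectors.

```python
import math


def l2_normalize(v):
    """Scale a non-zero vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def info_nce(queries, references, temperature=0.07):
    """Mean in-batch InfoNCE loss (hypothetical helper).

    queries[i] and references[i] are treated as the positive pair;
    every other reference in the batch serves as a negative.
    """
    q = [l2_normalize(v) for v in queries]
    r = [l2_normalize(v) for v in references]
    total = 0.0
    for i, qi in enumerate(q):
        # Temperature-scaled cosine similarity to every reference.
        logits = [sum(a * b for a, b in zip(qi, rj)) / temperature for rj in r]
        # Numerically stable log-sum-exp over the batch.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        # Cross-entropy with the matching index as the target class.
        total += log_denom - logits[i]
    return total / len(q)


# Toy embeddings (assumed values): correctly paired references should
# yield a lower loss than the same references with pairs swapped.
csi = [[1.0, 0.0], [0.0, 1.0]]
ref_aligned = [[0.9, 0.1], [0.1, 0.9]]
ref_swapped = [[0.1, 0.9], [0.9, 0.1]]
print(info_nce(csi, ref_aligned) < info_nce(csi, ref_swapped))
```

A lower temperature sharpens the softmax over the batch, so hard negatives contribute more to the loss; the value used in the paper is not stated in the abstract.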
About the journal:
The ISPRS Journal of Photogrammetry and Remote Sensing (P&RS) serves as the official journal of the International Society for Photogrammetry and Remote Sensing (ISPRS). It acts as a platform for scientists and professionals worldwide who are involved in various disciplines that utilize photogrammetry, remote sensing, spatial information systems, computer vision, and related fields. The journal aims to facilitate communication and dissemination of advancements in these disciplines, while also acting as a comprehensive source of reference and archive.
P&RS endeavors to publish high-quality, peer-reviewed research papers that are preferably original and have not been published before. These papers can cover scientific/research, technological development, or application/practical aspects. Additionally, the journal welcomes papers that are based on presentations from ISPRS meetings, as long as they are considered significant contributions to the aforementioned fields.
In particular, P&RS encourages the submission of papers that are of broad scientific interest, showcase innovative applications (especially in emerging fields), have an interdisciplinary focus, discuss topics that have received limited attention in P&RS or related journals, or explore new directions in scientific or professional realms. It is preferred that theoretical papers include practical applications, while papers focusing on systems and applications should include a theoretical background.