A unified multimodal learning method for urban functional zone identification based on street-view and satellite images

Impact factor: 8.6 | JCR: Q1 (Remote Sensing)
Jiajun Chen, Runyu Fan, Hongyang Niu, Zijian Xu, Jining Yan, Weijing Song, Ruyi Feng
DOI: 10.1016/j.jag.2025.104685
Journal: International journal of applied earth observation and geoinformation : ITC journal, volume 142, Article 104685
Publication date: 2025-07-02
Publication type: Journal Article
Open access: no
Source: https://www.sciencedirect.com/science/article/pii/S1569843225003322
Citations: 0

Abstract

A unified multimodal learning method for urban functional zone identification by fusing inner-street visual–textual information from street-view and satellite images
Urban functional zones (UFZ) are areas into which urban space is divided by specific use, based on the distribution of human activities and infrastructure. UFZ mapping analyzes the geographic information data of urban space, combines remote sensing images (RSI), point of interest (POI) data, and other data sources, and uses advanced spatial analysis techniques to delineate and visualize UFZ. Intelligent interpretation of UFZ can support urban management and planning. Previous studies on UFZ mainly relied on remote sensing images and POI data, which capture a city's macroscopic remote sensing visual features and the distribution of land use. However, because these methods lack inner-street perspective data, they often ignore inner-street details and cannot capture the complex spatial relations between objects in complex urban scenes, leading to unsatisfactory UFZ results. To address this, we propose a unified multimodal learning method that interprets UFZ by combining remote sensing images, POI data, and street view data with inner-street details, providing a more comprehensive perspective. To make full use of the inner-street perspective of street view images (SVI), we use not only their visual features but also textual features, extracted through image captioning, that reflect various human activities in street views, so as to better capture subtle socio-economic activity information in urban space. We conduct extensive experiments in Wuhan, Changsha, and Nanchang. The overall accuracy (OA) of this method on the test set reached 91.80%. Experimental results show a significant improvement in the model's performance in interpreting UFZ.
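The abstract names four information sources (RSI visual features, POI data, SVI visual features, and SVI caption text) and reports results as overall accuracy. The sketch below is only an illustration of those two ideas, not the authors' architecture: it assumes each modality has already been encoded as a plain feature vector, fuses them by simple concatenation (a stand-in technique chosen for brevity), and computes OA as the fraction of correctly labelled zones. The modality keys, the function names, and the zone labels are all hypothetical.

```python
from typing import Dict, List, Sequence

# Hypothetical modality names; the fixed order keeps fused vectors comparable.
MODALITIES = ("rsi_visual", "poi", "svi_visual", "svi_caption_text")


def fuse_features(features: Dict[str, Sequence[float]]) -> List[float]:
    """Concatenate per-modality feature vectors in a fixed order.

    Plain concatenation is an assumption made for illustration; the paper's
    actual fusion mechanism is not described in the abstract.
    """
    fused: List[float] = []
    for name in MODALITIES:
        if name not in features:
            raise KeyError(f"missing modality: {name}")
        fused.extend(features[name])
    return fused


def overall_accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Overall accuracy (OA): correctly classified samples / all samples."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one sample is required")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


if __name__ == "__main__":
    # Toy feature vectors for one zone (made-up values, not real data).
    zone = {
        "rsi_visual": [0.1, 0.2],
        "poi": [3.0],
        "svi_visual": [0.5, 0.6],
        "svi_caption_text": [0.9],
    }
    print(fuse_features(zone))

    # Toy labels (hypothetical zone categories).
    truth = ["residential", "commercial", "industrial", "commercial"]
    preds = ["residential", "commercial", "commercial", "commercial"]
    print(overall_accuracy(truth, preds))
```

Under this definition, an OA of 91.80% means that roughly 918 of every 1,000 test zones received the correct functional label.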
Source journal
International journal of applied earth observation and geoinformation : ITC journal
Subject areas: Global and Planetary Change; Management, Monitoring, Policy and Law; Earth-Surface Processes; Computers in Earth Sciences
CiteScore: 12.00
Self-citation rate: 0.00%
Articles published: 0
Review time: 77 days
Journal description: The International Journal of Applied Earth Observation and Geoinformation publishes original papers that utilize earth observation data for natural resource and environmental inventory and management. These data primarily originate from remote sensing platforms, including satellites and aircraft, supplemented by surface and subsurface measurements. Addressing natural resources such as forests, agricultural land, soils, and water, as well as environmental concerns like biodiversity, land degradation, and hazards, the journal explores conceptual and data-driven approaches. It covers geoinformation themes like capturing, databasing, visualization, interpretation, data quality, and spatial uncertainty.