Multimodal Fusion of Satellite Images and Crowdsourced GPS Traces for Robust Road Attribute Detection

Yifang Yin, An Tran, Ying Zhang, Wenmiao Hu, Guanfeng Wang, Jagannadan Varadarajan, Roger Zimmermann, See-Kiong Ng
{"title":"Multimodal Fusion of Satellite Images and Crowdsourced GPS Traces for Robust Road Attribute Detection","authors":"Yifang Yin, An Tran, Ying Zhang, Wenmiao Hu, Guanfeng Wang, Jagannadan Varadarajan, Roger Zimmermann, See-Kiong Ng","doi":"10.1145/3474717.3483917","DOIUrl":null,"url":null,"abstract":"Automatic inference of missing road attributes (e.g., road type and speed limit) for enriching digital maps has attracted significant research attention in recent years. A number of machine learning based approaches have been proposed to detect road attributes from GPS traces, dash-cam videos, or satellite images. However, existing solutions mostly focus on a single modality without modeling the correlations among multiple data sources. To bridge the gap, we present a multimodal road attribute detection method, which improves the robustness by performing pixel-level fusion of crowdsourced GPS traces and satellite images. A GPS trace is usually given by a sequence of location, bearing, and speed. To align it with satellite imagery in the spatial domain, we render GPS traces into a sequence of multi-channel images that simultaneously capture the global distribution of the GPS points, the local distribution of vehicles' moving directions and speeds, and their temporal changes over time, at each pixel. Unlike previous GPS based road feature extraction methods, our proposed GPS rendering does not require map matching in the data preprocessing step. Moreover, our multimodal solution addresses single-modal challenges such as occlusions in satellite images and data sparsity in GPS traces by learning the pixel-wise correspondences among different data sources. Extensive experiments have been conducted on two real-world datasets in Singapore and Jakarta. 
Compared with previous work, our method is able to improve the detection accuracy on road attributes by a large margin.","PeriodicalId":340759,"journal":{"name":"Proceedings of the 29th International Conference on Advances in Geographic Information Systems","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-11-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"7","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 29th International Conference on Advances in Geographic Information Systems","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3474717.3483917","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 7

Abstract

Automatic inference of missing road attributes (e.g., road type and speed limit) for enriching digital maps has attracted significant research attention in recent years. A number of machine-learning-based approaches have been proposed to detect road attributes from GPS traces, dash-cam videos, or satellite images. However, existing solutions mostly focus on a single modality without modeling the correlations among multiple data sources. To bridge the gap, we present a multimodal road attribute detection method, which improves robustness by performing pixel-level fusion of crowdsourced GPS traces and satellite images. A GPS trace is usually given as a sequence of location, bearing, and speed measurements. To align it with satellite imagery in the spatial domain, we render GPS traces into a sequence of multi-channel images that simultaneously capture, at each pixel, the global distribution of the GPS points, the local distribution of vehicles' moving directions and speeds, and their changes over time. Unlike previous GPS-based road feature extraction methods, our proposed GPS rendering does not require map matching in the data preprocessing step. Moreover, our multimodal solution addresses single-modal challenges such as occlusions in satellite images and data sparsity in GPS traces by learning the pixel-wise correspondences among different data sources. Extensive experiments have been conducted on two real-world datasets in Singapore and Jakarta. Compared with previous work, our method improves the detection accuracy of road attributes by a large margin.
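The core preprocessing idea in the abstract is to rasterize a GPS trace into a multi-channel image aligned with satellite imagery, so that each pixel summarizes point density, moving direction, and speed. The following is a minimal illustrative sketch of that rendering step, not the paper's actual implementation: the channel design here (point count, mean speed, mean sin/cos of bearing) and the function and parameter names are assumptions for exposition, and the paper's rendering additionally encodes temporal changes.

```python
import numpy as np

def render_gps_traces(points, bounds, size=(256, 256)):
    """Rasterize GPS points into a multi-channel image (a simplified sketch).

    points : (N, 4) array of (lon, lat, bearing_deg, speed) samples.
    bounds : (min_lon, min_lat, max_lon, max_lat) of the target tile.
    size   : (height, width) of the output raster.

    Returns a (4, H, W) array with channels:
      0: GPS point count per pixel
      1: mean speed per pixel
      2: mean sin(bearing) per pixel   (direction encoded as a unit
      3: mean cos(bearing) per pixel    vector to avoid the 0/360 wrap)
    """
    h, w = size
    min_lon, min_lat, max_lon, max_lat = bounds
    img = np.zeros((4, h, w), dtype=np.float64)

    lon, lat, bearing, speed = points.T
    # Map geographic coordinates to pixel indices (row 0 = north edge).
    col = ((lon - min_lon) / (max_lon - min_lon) * (w - 1)).astype(int)
    row = ((max_lat - lat) / (max_lat - min_lat) * (h - 1)).astype(int)
    inside = (col >= 0) & (col < w) & (row >= 0) & (row < h)
    col, row = col[inside], row[inside]
    bearing, speed = bearing[inside], speed[inside]

    rad = np.deg2rad(bearing)
    # np.add.at accumulates correctly even when many points hit one pixel.
    np.add.at(img[0], (row, col), 1.0)          # point count
    np.add.at(img[1], (row, col), speed)        # speed sum
    np.add.at(img[2], (row, col), np.sin(rad))  # direction (sin component)
    np.add.at(img[3], (row, col), np.cos(rad))  # direction (cos component)

    # Convert the accumulated sums to per-pixel means.
    count = np.maximum(img[0], 1e-9)
    img[1:] /= count
    return img
```

Because the output is a pixel grid in the same coordinate frame as the satellite tile, it can be stacked channel-wise with the image for pixel-level fusion, which is what removes the need for map matching as a preprocessing step.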