Inter- & Intra-City Image Geolocalization

2022 19th Conference on Robots and Vision (CRV). Pub Date: 2022-05-01. DOI: 10.1109/CRV55824.2022.00023

J. Tanner, K. Dick, J.R. Green



Abstract

Can a photo be accurately geolocated within a city from its pixels alone? Image geolocation has been successfully addressed at the planetary and national levels when framed as a classification problem using convolutional neural networks, but no method has yet precisely geolocated images at the city or street level when framed as a latitude/longitude regression problem. We leverage the densely sampled StreetLearn dataset of imagery from Manhattan and Pittsburgh to first develop a highly accurate inter-city predictor and then experimentally resolve, for the first time, the intra-city performance limits of framing image geolocation as a regression problem. We then reformulate the problem as an extreme-resolution classification task by subdividing the city into hundreds of equirectangular-scaled bins and training the respective intra-city deep convolutional neural network on tens of thousands of images. Our experiments serve as a foundation for a scalable inter- and intra-city image geolocation framework that, on average, resolves an image within 250 m². We demonstrate that our models outperform SIFT-based image-retrieval models under differing weather patterns, lighting conditions, and location-specific imagery, and that they are temporally robust when evaluated on both past and future imagery. We also discuss the practical and ethical ramifications of such a model, given the threat to individual privacy in a technocentric surveillance capitalist society.
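To make the reformulation from regression to classification concrete, the following is a minimal sketch of how a latitude/longitude pair might be mapped to a class label on an equirectangular-scaled grid. The abstract does not specify the binning scheme, so the function name `latlon_to_bin`, the bounding-box parameters, the cell size `cell_m`, and the row-major class numbering are all assumptions made for illustration, not the authors' implementation.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres


def latlon_to_bin(lat, lon, south, west, north, east, cell_m):
    """Map a coordinate to a grid cell inside a bounding box.

    Hypothetical helper: returns (row, col, class_id, n_bins), where
    class_id is a row-major index usable as a classification label.
    """
    if not (south <= lat <= north and west <= lon <= east):
        raise ValueError("coordinate lies outside the bounding box")

    # Equirectangular approximation: one degree of longitude shrinks by
    # cos(latitude); the box's mid-latitude is used for the whole city.
    mid_lat = math.radians((south + north) / 2.0)
    m_per_deg_lat = math.pi * EARTH_RADIUS_M / 180.0
    m_per_deg_lon = m_per_deg_lat * math.cos(mid_lat)

    # Grid dimensions, rounding up so the box is fully covered.
    n_rows = max(1, math.ceil((north - south) * m_per_deg_lat / cell_m))
    n_cols = max(1, math.ceil((east - west) * m_per_deg_lon / cell_m))

    # Clamp so points on the north/east edge fall in the last cell.
    row = min(int((lat - south) * m_per_deg_lat / cell_m), n_rows - 1)
    col = min(int((lon - west) * m_per_deg_lon / cell_m), n_cols - 1)
    return row, col, row * n_cols + col, n_rows * n_cols


# Illustrative bounding box only (roughly city-sized); not from the paper.
BOX = dict(south=40.70, west=-74.02, north=40.80, east=-73.93)
row, col, class_id, n_bins = latlon_to_bin(40.75, -73.98, cell_m=500.0, **BOX)
print(f"cell ({row}, {col}) -> class {class_id} of {n_bins}")
```

With a grid like this, a network predicts a class label rather than raw coordinates, and a predicted cell can be converted back to a location by taking the cell centre. A smaller `cell_m` gives finer spatial resolution at the cost of more classes and fewer training images per class.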


