Global Assists Local: Effective Aerial Representations for Field of View Constrained Image Geo-Localization

Royston Rodrigues, Masahiro Tani
{"title":"Global Assists Local: Effective Aerial Representations for Field of View Constrained Image Geo-Localization","authors":"Royston Rodrigues, Masahiro Tani","doi":"10.1109/WACV51458.2022.00275","DOIUrl":null,"url":null,"abstract":"When we humans recognize places from images, we not only infer about the objects that are available but even think about landmarks that might be surrounding it. Current place recognition approaches lack the ability to go beyond objects that are available in the image and hence miss out on understanding the scene completely. In this paper, we take a step towards holistic scene understanding. We address the problem of image geo-localization by retrieving corresponding aerial views from a large database of geotagged aerial imagery. One of the main challenges in tackling this problem is the limited Field of View (FoV) nature of query images which needs to be matched to aerial views which contain 360°FoV details. State-of-the-art method DSM-Net [19] tackles this challenge by matching aerial images locally within fixed FoV sectors. We show that local matching limits complete scene understanding and is inadequate when partial buildings are visible in query images or when local sectors of aerial images are covered by dense trees. Our approach considers both local and global properties of aerial images and hence is robust to such conditions. Experiments on standard benchmarks demonstrates that the proposed approach improves top-1% image recall rate on the CVACT [9] data-set from 57.08% to 77.19% and from 61.20% to 75.21% on the CVUSA [28] data-set for 70°FoV. 
We also achieve state-of-the art results for 90°FoV on both CVACT [9] and CVUSA [28] data-sets demonstrating the effectiveness of our proposed method.","PeriodicalId":297092,"journal":{"name":"2022 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)","volume":"9 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"8","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/WACV51458.2022.00275","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 8

Abstract

When we humans recognize places from images, we not only reason about the objects that are visible but also think about landmarks that might surround them. Current place recognition approaches cannot go beyond the objects present in the image and hence fail to understand the scene completely. In this paper, we take a step towards holistic scene understanding. We address the problem of image geo-localization by retrieving corresponding aerial views from a large database of geotagged aerial imagery. One of the main challenges in tackling this problem is the limited Field of View (FoV) of query images, which must be matched to aerial views containing 360° FoV details. The state-of-the-art method DSM-Net [19] tackles this challenge by matching aerial images locally within fixed FoV sectors. We show that local matching limits complete scene understanding and is inadequate when only parts of buildings are visible in query images or when local sectors of aerial images are covered by dense trees. Our approach considers both local and global properties of aerial images and is hence robust to such conditions. Experiments on standard benchmarks demonstrate that the proposed approach improves the top-1% image recall rate for 70° FoV from 57.08% to 77.19% on the CVACT [9] dataset and from 61.20% to 75.21% on the CVUSA [28] dataset. We also achieve state-of-the-art results for 90° FoV on both the CVACT [9] and CVUSA [28] datasets, demonstrating the effectiveness of our proposed method.
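To illustrate the retrieval setting the abstract describes, the sketch below combines a sector-wise local score (as in DSM-Net-style matching) with a whole-image global score when ranking aerial candidates. This is a minimal, hypothetical illustration, not the paper's actual architecture: the descriptor shapes, the `alpha` blending weight, and the function names are assumptions introduced here for clarity.

```python
import numpy as np

def l2_normalize(x, axis=-1):
    """Scale vectors to unit length so dot products equal cosine similarity."""
    return x / np.linalg.norm(x, axis=axis, keepdims=True)

def retrieve(query_local, query_global, db_local, db_global, alpha=0.5):
    """Rank aerial candidates by blending local-sector and global similarity.

    query_local:  (d,) descriptor of the limited-FoV query's visible content
    query_global: (d,) descriptor summarizing the broader scene context
    db_local:     (n, s, d) per-sector descriptors for n aerial images, s sectors
    db_global:    (n, d) one global descriptor per aerial image
    alpha:        local-vs-global weight (a hypothetical choice, not tuned)
    """
    ql = l2_normalize(query_local)
    qg = l2_normalize(query_global)
    dl = l2_normalize(db_local)
    dg = l2_normalize(db_global)

    # Local score: best-matching sector of each aerial image. Used alone, this
    # fails when the matching sector is occluded (e.g. covered by dense trees).
    local = (dl @ ql).max(axis=1)      # shape (n,)
    # Global score: whole-image similarity, robust to such local occlusions.
    global_ = dg @ qg                  # shape (n,)
    scores = alpha * local + (1.0 - alpha) * global_
    return np.argsort(-scores)         # candidate indices, best match first
```

The point of the blend is the one the abstract makes: a purely local score is brittle under occlusion or partial visibility, while the global term supplies scene-level context that survives those failures.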