Text Spotting towards Perceptually Aliased Urban Place Recognition

Dulmini Hettiarachchi, Ye Tian, Han Yu, S. Kamijo
Multimodal Technol. Interact., published 2022-11-18. DOI: 10.3390/mti6110102 (https://doi.org/10.3390/mti6110102)

Abstract

Recognizing places of interest (POIs) can be challenging for humans, especially in unfamiliar environments. In this study, we combine smartphone sensors (camera and GPS) with deep learning algorithms to recognize POIs in an urban environment. Recent studies have treated landmark recognition as an image retrieval problem. However, visual similarity alone is not robust to challenging conditions such as extreme appearance variance and perceptual aliasing, both of which are common in urban environments. We therefore propose to fuse visual, textual, and positioning information for visual place recognition (VPR). Our contributions are as follows. First, we propose a VPR-through-text-reading pipeline (VPRText) that uses off-the-shelf text spotting algorithms for word spotting, followed by layout analysis and text similarity search modules. Second, we propose a hierarchical architecture that combines VPRText with image retrieval. Third, we perform a comprehensive empirical study of how well state-of-the-art text spotting methods apply to the VPR task. Additionally, we introduce a challenging, purpose-built urban dataset for VPR evaluation. The proposed VPR architecture achieves superior overall performance, especially in challenging conditions (i.e., perceptually aliased and illuminated environments).
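The abstract does not give implementation details, so the following is only a minimal sketch of how a text-first, image-fallback hierarchy with a GPS prior might be wired together. Everything in it is an assumption made for illustration: the `POIS` table, the function names, the 200 m radius, and the 0.8 threshold are hypothetical, joining the spotted words stands in for the layout analysis step, and Python's `difflib.SequenceMatcher` stands in for whatever text similarity search the authors actually use.

```python
import math
from difflib import SequenceMatcher

# Hypothetical POI database: (name, latitude, longitude). Not from the paper.
POIS = [
    ("Sakura Coffee Roasters", 35.6595, 139.7005),
    ("Sakura Book Store", 35.6597, 139.7008),
    ("Harbor Pharmacy", 35.4437, 139.6380),
]


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    r = 6371000.0  # mean Earth radius in metres
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = p2 - p1
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def text_similarity(a, b):
    """Case-insensitive similarity in [0, 1]; tolerant of small OCR errors."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def recognize_place(spotted_words, lat, lon, radius_m=200.0,
                    threshold=0.8, image_fallback=None):
    """Return (poi_name, stage), where stage is 'text', 'image', or 'none'."""
    # Positioning prior: keep only POIs near the GPS fix.
    nearby = [p for p in POIS if haversine_m(lat, lon, p[1], p[2]) <= radius_m]

    # Stand-in for layout analysis: join spotted words into one query string.
    query = " ".join(spotted_words)

    # Text stage: pick the nearby POI whose name best matches the query.
    best_name, best_score = None, 0.0
    for name, _, _ in nearby:
        score = text_similarity(query, name)
        if score > best_score:
            best_name, best_score = name, score
    if best_name is not None and best_score >= threshold:
        return best_name, "text"

    # Image stage: defer to a caller-supplied retrieval function, if any.
    if image_fallback is not None:
        return image_fallback(nearby), "image"
    return None, "none"
```

In this sketch, a misread sign such as `["SAKURA", "COFEE", "ROASTERS"]` near the first POI still resolves through the text stage, which is the kind of case where two visually similar storefronts could confuse retrieval based on appearance alone. When no text match clears the threshold, the decision passes to `image_fallback`, a placeholder for an image retrieval model.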