To Keystone or Not to Keystone, that is the Correction

K. Dick, J. Tanner, J. Green
Published in: 2021 18th Conference on Robots and Vision (CRV), 2021-05-01
DOI: 10.1109/CRV52889.2021.00027 (https://doi.org/10.1109/CRV52889.2021.00027)
Citations: 1

Abstract

To Keystone or Not to Keystone, that is the Correction
"To Keystone or not to Keystone, that is the correction"... and indeed the question! Outside of highly constrained conditions, the vast majority of photographs of the natural environment are taken non-square to the objects they depict. Consequently, objects that appear at a distorted perspective may be computationally corrected via Keystone correction. This distortion is frequently observed in imagery sourced from vehicle-mounted cameras, such as those used in autonomous vehicle infrastructure or by streetscape collection initiatives such as Google Street View. Because we are visual creatures, the lived environment proximal to roadways is filled with text- and numeric-based advertisements vying for our attention and, conveniently, this signage is not placed perpendicular to a vehicle's forward-facing camera. Given the perspective distortion of the text and values the signage contains, its automated detection and reading may benefit from Keystone correction. In this work, we address the yet-unanswered question: what benefit might we expect from Keystone correction as an image preprocessing step? We do not explicitly promote the use of Keystone correction but rather evaluate its utility within a prediction pipeline. To this end, we leverage the Gas Prices of America (GPA) dataset, which contains multi-digit, multi-price values, and the French Street Name Signs (FSNS) multi-word text dataset; the known geometry of both enables automated Keystone correction of their images. We compare the outcomes of Keystoned imagery versus non-Keystoned imagery along five axes: 1) predictive performance, 2) annotation correctness, 3) algorithmic computational complexity and empirical time estimation, 4) image scaling, and 5) degree of perspective transform. From our findings, we arrive at several recommendations on both the benefits and burdens of Keystone correction to inform future research on extracting information in the wild.
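
For readers unfamiliar with the operation, Keystone correction amounts to a perspective (homography) transform that maps the four corners of a distorted quadrilateral onto an upright rectangle. The abstract does not say how the authors implemented it, so the following is only a minimal sketch under stated assumptions: the four corner coordinates are taken as known, and the function names (`solve_linear`, `homography`, `warp_point`) and the sample coordinates are hypothetical, chosen for illustration.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def solve_linear(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, so each row operation updates both sides at once.
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate corner configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def homography(src: Sequence[Point], dst: Sequence[Point]) -> List[List[float]]:
    """Return the 3x3 matrix H, with H[2][2] fixed to 1, mapping src[i] to dst[i].

    Each correspondence (x, y) -> (u, v) contributes two linear equations in
    the eight unknown entries of H, so four corners give an 8x8 system.
    """
    a: List[List[float]] = []
    b: List[float] = []
    for (x, y), (u, v) in zip(src, dst):
        a.append([x, y, 1.0, 0.0, 0.0, 0.0, -u * x, -u * y])
        b.append(u)
        a.append([0.0, 0.0, 0.0, x, y, 1.0, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def warp_point(h: List[List[float]], p: Point) -> Point:
    """Apply H to a point, including the projective division by w."""
    x, y = p
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    u = (h[0][0] * x + h[0][1] * y + h[0][2]) / w
    v = (h[1][0] * x + h[1][1] * y + h[1][2]) / w
    return (u, v)


# Hypothetical pixel coordinates of a sign photographed at an angle, and the
# upright rectangle it should fill after correction (illustrative values only).
SIGN_CORNERS = [(20.0, 10.0), (90.0, 30.0), (90.0, 80.0), (20.0, 100.0)]
RECT_CORNERS = [(0.0, 0.0), (100.0, 0.0), (100.0, 50.0), (0.0, 50.0)]
H = homography(SIGN_CORNERS, RECT_CORNERS)
```

Correcting a whole image would additionally require resampling every output pixel through the inverse mapping. Libraries such as OpenCV bundle both steps (`cv2.getPerspectiveTransform` and `cv2.warpPerspective`), though the abstract does not state which tooling the authors used.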