To Keystone or Not to Keystone, that is the Correction

2021 18th Conference on Robots and Vision (CRV) Pub Date : 2021-05-01 DOI:10.1109/CRV52889.2021.00027

K. Dick, J. Tanner, J. Green


Citations: 1

Abstract

To Keystone or Not to Keystone, that is the Correction

"To Keystone or not to Keystone, that is the correction"... and indeed the question! Outside of highly constrained conditions, the vast majority of photographed imagery of the natural environment is taken non-square to the objects that it represents. Consequently, objects that appear at a distorted perspective may be computationally corrected via Keystone correction. This disparity is frequently observed in imagery sourced from vehicle-mounted cameras, such as those used in autonomous vehicle infrastructure or by streetscape collection initiatives such as Google Street View. Because we are visual creatures, the lived environment proximal to roadways is filled with text- and numeric-based advertisements vying for our attention and, conveniently, this signage isn’t placed perpendicular to a vehicle’s forward-facing camera. Given the perspective distortion of the text and/or values contained therein, their automated detection and reading may benefit from Keystone correction. In this work, we address the yet-unanswered question: what benefit might we expect from Keystone correction as an image preprocessing step? We do not explicitly promote the use of Keystone correction but rather evaluate its utility within a prediction pipeline. To this end, we leverage the Gas Prices of America (GPA) dataset, which contains multi-digit, multi-price values, and the French Street Name Signs (FSNS) multi-word text dataset, because their known geometry enables automated Keystone correction of the images. We compare the outcomes of Keystoned imagery versus non-Keystoned imagery along five axes: 1) predictive performance, 2) annotation correctness, 3) algorithmic computational complexity and empirical time estimation, 4) image scaling, and 5) degree of perspective transform. From our findings, we arrive at several recommendations on both the benefits and burdens of Keystone correction to inform future research on extracting information in the wild.
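For readers unfamiliar with the operation, Keystone correction amounts to estimating a 3x3 perspective transform (homography) that maps the four corners of a distorted quadrilateral onto an upright rectangle. The sketch below is an illustration only, not the authors' pipeline: the corner coordinates, the target size, and the function names (`solve_linear`, `homography`, `apply_h`) are assumptions made for this example, and it transforms points rather than resampling a whole image.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def solve_linear(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate corner configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def homography(src: Sequence[Point], dst: Sequence[Point]) -> List[List[float]]:
    """Perspective transform sending four src corners to four dst corners.

    The bottom-right entry is fixed to 1, leaving eight unknowns that the
    four point pairs determine (two equations per pair).
    """
    a: List[List[float]] = []
    b: List[float] = []
    for (x, y), (u, v) in zip(src, dst):
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], h[6:8] + [1.0]]


def apply_h(h: List[List[float]], p: Point) -> Point:
    """Map one point through the homography (with perspective divide)."""
    x, y = p
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)


# Hypothetical sign corners in the photo (TL, TR, BR, BL), in pixels,
# and the upright 200x100 rectangle they should be mapped onto.
sign_corners = [(120.0, 80.0), (300.0, 110.0), (290.0, 220.0), (115.0, 240.0)]
target = [(0.0, 0.0), (200.0, 0.0), (200.0, 100.0), (0.0, 100.0)]

H = homography(sign_corners, target)
for corner in sign_corners:
    print(corner, "->", apply_h(H, corner))
```

In practice the same matrix would be handed to an image-warping routine, for example OpenCV's `cv2.getPerspectiveTransform` followed by `cv2.warpPerspective`, which also performs the pixel resampling that this sketch omits.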
