Authors: K. Dick, J. Tanner, J. Green
Venue: 2021 18th Conference on Robots and Vision (CRV), published 2021-05-01
DOI: 10.1109/CRV52889.2021.00027 (https://doi.org/10.1109/CRV52889.2021.00027)
To Keystone or Not to Keystone, that is the Correction
"To Keystone or not to Keystone, that is the correction"... and indeed the question! Outside of highly constrained conditions, the vast majority of photographed imagery of the natural environment is taken non-square to the objects that it represents. Consequently, objects appearing at a distorted perspective may be computationally corrected via Keystone correction. This distortion is frequently observed in imagery sourced from vehicle-mounted cameras, such as those deployed in autonomous vehicle infrastructure or by streetscape collection initiatives such as Google Street View. Because we are visual creatures, the lived environment proximal to roadways is filled with text- and numeric-based advertisements vying for our attention and, conveniently, this signage isn't placed perpendicular to a vehicle's forward-facing camera. Given the perspective distortion of the text and/or values contained therein, their automated detection and reading may benefit from Keystone correction. In this work, we address the yet-unanswered question: what benefit might we expect from Keystone correction preprocessing of images? We do not explicitly promote the use of Keystone correction but rather evaluate its utility within a prediction pipeline. To this end, we leverage the Gas Prices of America (GPA) dataset, containing multi-digit, multi-price values, and the French Street Sign Names (FSNS) multi-word text dataset, since their known geometry enables the automation of image Keystone correction. We compare the outcomes of Keystoned imagery versus non-Keystoned imagery along five axes: 1) predictive performance, 2) annotation correctness, 3) algorithmic computational complexity and empirical time estimation, 4) image scaling, and 5) degree of perspective transform. From our findings, we arrive at several recommendations on both the benefits and burdens of Keystone correction to inform future research on extracting information in the wild.
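The abstract notes that known geometry makes it possible to automate Keystone correction. As a minimal sketch of what such a correction involves, and not the authors' implementation, the pure-Python example below estimates the 3x3 perspective transform (homography) that maps the four corners of a skewed sign onto an upright rectangle. The corner coordinates, the output rectangle size, and all function names are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Matrix = List[List[float]]


def solve_linear_system(a: Matrix, b: List[float]) -> List[float]:
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, so row operations act on both sides at once.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate corner configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def homography_from_corners(src: Sequence[Point], dst: Sequence[Point]) -> Matrix:
    """Return the 3x3 matrix H (with h33 fixed to 1) mapping src corners to dst."""
    if len(src) != 4 or len(dst) != 4:
        raise ValueError("exactly four point pairs are required")
    a: Matrix = []
    b: List[float] = []
    for (x, y), (u, v) in zip(src, dst):
        # u = (h11*x + h12*y + h13) / (h31*x + h32*y + 1), rearranged to be linear.
        a.append([x, y, 1.0, 0.0, 0.0, 0.0, -u * x, -u * y])
        b.append(u)
        # v = (h21*x + h22*y + h23) / (h31*x + h32*y + 1), rearranged to be linear.
        a.append([0.0, 0.0, 0.0, x, y, 1.0, -v * x, -v * y])
        b.append(v)
    h = solve_linear_system(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def apply_homography(h: Matrix, point: Point) -> Point:
    """Map one image point through H, including the perspective divide."""
    x, y = point
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    u = (h[0][0] * x + h[0][1] * y + h[0][2]) / w
    v = (h[1][0] * x + h[1][1] * y + h[1][2]) / w
    return (u, v)


# Hypothetical corners of a sign photographed at an angle (a trapezoid),
# listed clockwise from the top-left corner.
SIGN_CORNERS: List[Point] = [(10.0, 20.0), (110.0, 40.0), (110.0, 100.0), (10.0, 120.0)]
# Hypothetical upright target rectangle of 200 x 100 pixels.
RECT_CORNERS: List[Point] = [(0.0, 0.0), (200.0, 0.0), (200.0, 100.0), (0.0, 100.0)]

H = homography_from_corners(SIGN_CORNERS, RECT_CORNERS)
```

This sketch only maps points. Correcting a full image would additionally require resampling every pixel through the transform, which libraries such as OpenCV provide (`cv2.getPerspectiveTransform` and `cv2.warpPerspective`); whether the paper used these routines is not stated in the abstract.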