Semantic Scene Models for Visual Localization under Large Viewpoint Changes
J. Li, Zhaoqi Xu, D. Meger, G. Dudek
2018 15th Conference on Computer and Robot Vision (CRV), May 2018
DOI: 10.1109/CRV.2018.00033
Citations: 10
Abstract
We propose an approach for camera pose estimation under large viewpoint changes using only 2D RGB images. This enables a mobile robot to relocalize itself with respect to a previously visited scene when seeing it again from a completely new vantage point. To overcome large appearance changes, we integrate a variety of cues, including object detections, vanishing points, structure from motion, and object-to-object context, to constrain the camera geometry, while simultaneously estimating the 3D poses of covisible objects represented as bounding cuboids. We propose an efficient sampling-based approach that quickly cuts down the high-dimensional search space, and a robust correspondence algorithm that matches covisible objects via inter-object spatial relationships. We validate our approach on the publicly available Sun3D dataset, where we demonstrate the ability to handle camera translations of up to 5.9 meters and camera rotations of up to 110 degrees.
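The key intuition behind matching covisible objects via inter-object spatial relationships is that pairwise distances between object centroids are invariant to the camera's rigid motion, so a correct object-to-object correspondence preserves them even under an extreme viewpoint change. The following is a minimal, hypothetical sketch of that idea only, not the authors' algorithm: the scenes, centroids, tolerance, and the exhaustive search over permutations are all illustrative assumptions.

```python
import itertools
import math

def pairwise_distances(centroids):
    """Map each unordered pair of object indices to its Euclidean distance."""
    d = {}
    for (i, a), (j, b) in itertools.combinations(enumerate(centroids), 2):
        d[(i, j)] = math.dist(a, b)
    return d

def score_correspondence(match, dists_a, dists_b, tol=0.2):
    """Count how many object pairs keep roughly the same spacing under `match`.

    `match` maps object indices in scene A to indices in scene B. Because
    inter-object distances are invariant to a rigid camera motion, a correct
    matching preserves them up to detection noise (`tol`, in meters).
    """
    score = 0
    for (i, j), d_ab in dists_a.items():
        if i in match and j in match:
            p, q = sorted((match[i], match[j]))
            if abs(dists_b.get((p, q), float("inf")) - d_ab) < tol:
                score += 1
    return score

# Hypothetical cuboid centroids (meters) of three objects seen from two very
# different viewpoints: scene B is scene A rigidly moved, with order shuffled.
scene_a = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (0.0, 1.5, 0.0)]
scene_b = [(5.0, 1.5, 0.0), (5.0, 0.0, 0.0), (7.0, 0.0, 0.0)]

dists_a = pairwise_distances(scene_a)
dists_b = pairwise_distances(scene_b)

# Toy exhaustive search over all assignments; a practical system would
# instead sample or prune this combinatorial space.
best = max(
    (dict(enumerate(perm)) for perm in itertools.permutations(range(3))),
    key=lambda m: score_correspondence(m, dists_a, dists_b),
)
print(best)  # → {0: 1, 1: 2, 2: 0}: the shuffle that rebuilt scene B
```

Only the correct assignment preserves all three pairwise distances here; in a real scene, such consistency scores would serve to rank sampled pose/correspondence hypotheses rather than to pick a single winner outright.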