Changchang Wu, F. Fraundorfer, Jan-Michael Frahm, M. Pollefeys
{"title":"3D model search and pose estimation from single images using VIP features","authors":"Changchang Wu, F. Fraundorfer, Jan-Michael Frahm, M. Pollefeys","doi":"10.1109/CVPRW.2008.4563037","DOIUrl":null,"url":null,"abstract":"This paper describes a method to efficiently search for 3D models in a city-scale database and to compute the camera poses from single query images. The proposed method matches SIFT features (from a single image) to viewpoint invariant patches (VIP) from a 3D model by warping the SIFT features approximately into the orthographic frame of the VIP features. This significantly increases the number of feature correspondences which results in a reliable and robust pose estimation. We also present a 3D model search tool that uses a visual word based search scheme to efficiently retrieve 3D models from large databases using individual query images. Together the 3D model search and the pose estimation represent a highly scalable and efficient city-scale localization system. The performance of the 3D model search and pose estimation is demonstrated on urban image data.","PeriodicalId":102206,"journal":{"name":"2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops","volume":"94 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2008-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"32","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CVPRW.2008.4563037","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 32
Abstract
This paper describes a method to efficiently search for 3D models in a city-scale database and to compute the camera poses from single query images. The proposed method matches SIFT features (from a single image) to viewpoint invariant patches (VIP) from a 3D model by warping the SIFT features approximately into the orthographic frame of the VIP features. This significantly increases the number of feature correspondences which results in a reliable and robust pose estimation. We also present a 3D model search tool that uses a visual word based search scheme to efficiently retrieve 3D models from large databases using individual query images. Together the 3D model search and the pose estimation represent a highly scalable and efficient city-scale localization system. The performance of the 3D model search and pose estimation is demonstrated on urban image data.