3D Pose Estimation Based on Multiple Monocular Cues
Björn Barrois, C. Wöhler
2007 IEEE Conference on Computer Vision and Pattern Recognition, published 2007-06-17. DOI: 10.1109/CVPR.2007.383352
In this study we propose an integrated approach to the problem of 3D pose estimation. The main difference from the majority of known methods is its use of complementary image information: the intensity and polarisation state of the light reflected from the object surface, edge information, and absolute depth values obtained with a depth-from-defocus approach. Our method compares the input image with synthetic images generated by an OpenGL-based renderer, using model information about the object provided by CAD data. This comparison yields an error term that is minimised by an iterative optimisation algorithm. Although all six degrees of freedom are estimated, our method requires only a monocular camera, circumventing disadvantages of multiocular camera systems such as the need for external camera calibration. Our framework is open to the inclusion of independently acquired depth data. We evaluate our method on a toy example as well as in two realistic scenarios in the domain of industrial quality inspection. Our experiments on complex real-world objects located about 0.5 m from the camera show that the algorithm achieves typical accuracies of better than 1 degree for the rotation angles, 1-2 image pixels for the lateral translations, and several millimetres, or about 1 percent, for the object distance.
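The render-and-compare scheme the abstract describes can be sketched in miniature. This is a hedged toy illustration, not the paper's implementation: the orthographic point "renderer", the square model, and the shrinking-step coordinate search below all stand in for the OpenGL renderer, the CAD model, and the paper's iterative optimiser, and the pose is reduced to three parameters (one rotation, two translations) instead of six.

```python
import numpy as np

# Toy 3D "CAD model": corners of a unit square in the object frame (z = 0).
MODEL = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0],
                  [1.0, 1.0, 0.0], [0.0, 1.0, 0.0]])

def render(pose):
    """Stand-in for the OpenGL renderer: rotate the model about the
    z axis, translate, and orthographically project to the image plane."""
    theta, tx, ty = pose
    c, s = np.cos(theta), np.sin(theta)
    R = np.array([[c, -s], [s, c]])
    return MODEL[:, :2] @ R.T + np.array([tx, ty])

def error(pose, observed):
    """Error term: sum of squared distances between the rendered
    and the observed feature positions."""
    return float(np.sum((render(pose) - observed) ** 2))

def estimate_pose(observed, init=(0.0, 0.0, 0.0), sweeps=200):
    """Iterative minimisation: a coordinate-wise pattern search with a
    geometrically shrinking step (a crude proxy for the paper's optimiser)."""
    pose = np.array(init, dtype=float)
    step = np.array([0.5, 0.5, 0.5])
    for _ in range(sweeps):
        for i in range(3):
            for delta in (+step[i], -step[i]):
                trial = pose.copy()
                trial[i] += delta
                if error(trial, observed) < error(pose, observed):
                    pose = trial
                    break
        step *= 0.9  # refine the search scale each sweep
    return pose

# Synthetic "measurement": render the model at a known ground-truth pose,
# then recover that pose from the rendered features alone.
true_pose = (0.3, 1.5, -0.7)
observed = render(true_pose)
est = estimate_pose(observed)
print(np.round(est, 3))
```

The same loop structure carries over to the full problem: only `render` (a real renderer driven by CAD data), `error` (built from intensity, polarisation, edge, and defocus cues), and the optimiser change, while the pose vector grows to all six degrees of freedom.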