3D Pose Estimation Based on Multiple Monocular Cues
Björn Barrois, C. Wöhler
2007 IEEE Conference on Computer Vision and Pattern Recognition, published 2007-06-17. DOI: 10.1109/CVPR.2007.383352
In this study we propose an integrated approach to the problem of 3D pose estimation. The main difference from the majority of known methods is its use of complementary image information: the intensity and polarisation state of the light reflected from the object surface, edge information, and absolute depth values obtained with a depth-from-defocus approach. Our method compares the input image with synthetic images generated by an OpenGL-based renderer, using model information about the object provided by CAD data. This comparison yields an error term that is minimised by an iterative optimisation algorithm. Although all six degrees of freedom are estimated, our method requires only a monocular camera, circumventing disadvantages of multiocular camera systems such as the need for external camera calibration. Our framework is open to the inclusion of independently acquired depth data. We evaluate our method on a toy example as well as in two realistic scenarios in the domain of industrial quality inspection. Our experiments on complex real-world objects located about 0.5 m from the camera show that the algorithm achieves typical accuracies of better than 1 degree for the rotation angles, 1-2 image pixels for the lateral translations, and several millimetres, or about 1 percent, for the object distance.
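The render-and-compare scheme the abstract describes can be sketched in miniature. This is a hedged toy illustration, not the paper's implementation: the orthographic point "renderer", the square model, and the shrinking-step coordinate search below all stand in for the OpenGL renderer, the CAD model, and the paper's iterative optimiser, and the pose is reduced to three parameters (one rotation, two translations) instead of six.

```python
import numpy as np

# Toy 3D "CAD model": corners of a unit square in the object frame (z = 0).
MODEL = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0],
                  [1.0, 1.0, 0.0], [0.0, 1.0, 0.0]])

def render(pose):
    """Stand-in for the OpenGL renderer: rotate the model about the
    z axis, translate, and orthographically project to the image plane."""
    theta, tx, ty = pose
    c, s = np.cos(theta), np.sin(theta)
    R = np.array([[c, -s], [s, c]])
    return MODEL[:, :2] @ R.T + np.array([tx, ty])

def error(pose, observed):
    """Error term: sum of squared distances between the rendered
    and the observed feature positions."""
    return float(np.sum((render(pose) - observed) ** 2))

def estimate_pose(observed, init=(0.0, 0.0, 0.0), sweeps=200):
    """Iterative minimisation: a coordinate-wise pattern search with a
    geometrically shrinking step (a crude proxy for the paper's optimiser)."""
    pose = np.array(init, dtype=float)
    step = np.array([0.5, 0.5, 0.5])
    for _ in range(sweeps):
        for i in range(3):
            for delta in (+step[i], -step[i]):
                trial = pose.copy()
                trial[i] += delta
                if error(trial, observed) < error(pose, observed):
                    pose = trial
                    break
        step *= 0.9  # refine the search scale each sweep
    return pose

# Synthetic "measurement": render the model at a known ground-truth pose,
# then recover that pose from the rendered features alone.
true_pose = (0.3, 1.5, -0.7)
observed = render(true_pose)
est = estimate_pose(observed)
print(np.round(est, 3))
```

The same loop structure carries over to the full problem: only `render` (a real renderer driven by CAD data), `error` (built from intensity, polarisation, edge, and defocus cues), and the optimiser change, while the pose vector grows to all six degrees of freedom.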