HyperDepth: Learning Depth from Structured Light without Matching

2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Pub Date: 2016-06-27. DOI: 10.1109/CVPR.2016.587

S. Fanello, Christoph Rhemann, V. Tankovich, Adarsh Kowdle, Sergio Orts, David Kim, S. Izadi

Pages: 5441-5450

Citations: 99

Abstract

Structured light sensors are popular due to their robustness to untextured scenes and multipath. These systems triangulate depth by solving a correspondence problem between each camera and projector pixel. This is often framed as a local stereo matching task, correlating patches of pixels in the observed and reference images. However, this is computationally intensive, leading to reduced depth accuracy and framerate. We contribute an algorithm for solving this correspondence problem efficiently, without compromising depth accuracy. For the first time, this problem is cast as a classification-regression task, which we solve extremely efficiently using an ensemble of cascaded random forests. Our algorithm scales in the number of disparities, and each pixel can be processed independently and in parallel. No matching or even access to the corresponding reference pattern is required at runtime, and regressed labels are directly mapped to depth. Our GPU-based algorithm runs at 1 kHz for 1.3 MP input/output images, with a disparity error of 0.1 subpixels. We show a prototype high-framerate depth camera running at 375 Hz, useful for solving tracking-related problems. We demonstrate our algorithmic performance, creating high-resolution real-time depth maps that surpass the quality of current state-of-the-art depth technologies, highlighting quantization-free results with reduced holes, edge fattening and other stereo-based depth artifacts.
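The abstract says that regressed disparity labels are mapped directly to depth. A minimal sketch of that last step follows, assuming the standard rectified triangulation relation Z = f * b / d rather than anything specific to the paper. The function names, focal length, and baseline are hypothetical values chosen only for illustration. The second function gives a first-order estimate of how a disparity error, such as the 0.1 subpixels quoted above, would translate into depth error under the same assumption.

```python
def disparity_to_depth(disparity_px, focal_px, baseline_m):
    """Rectified triangulation: depth Z = f * b / d (metres)."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


def depth_error(depth_m, focal_px, baseline_m, disparity_error_px):
    """First-order depth uncertainty: |dZ| ~= Z^2 / (f * b) * |dd|."""
    return depth_m ** 2 / (focal_px * baseline_m) * disparity_error_px


# Hypothetical calibration, NOT taken from the paper.
FOCAL_PX = 1000.0   # focal length in pixels (assumed)
BASELINE_M = 0.1    # camera-projector baseline in metres (assumed)

# A regressed disparity of 50 px maps to a depth of 2 m with these values.
depth = disparity_to_depth(50.0, FOCAL_PX, BASELINE_M)

# A 0.1 px disparity error at that depth is roughly 4 mm of depth error.
error = depth_error(depth, FOCAL_PX, BASELINE_M, 0.1)
```

Because the depth error grows with the square of the depth, subpixel disparity precision matters most for distant surfaces.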
