SimpleRecon: 3D Reconstruction Without 3D Convolutions

Computer vision - ECCV ... : ... European Conference on Computer Vision : proceedings. European Conference on Computer Vision
Pub Date: 2022-08-31
DOI: 10.48550/arXiv.2208.14743

Mohamed Sayed, J. Gibson, Jamie Watson, V. Prisacariu, Michael Firman, Clément Godard

Volume: 170 1, Pages: 1-19

Citations: 31

Abstract

Traditionally, 3D indoor scene reconstruction from posed images happens in two phases: per-image depth estimation, followed by depth merging and surface reconstruction. Recently, a family of methods has emerged that perform reconstruction directly in a final 3D volumetric feature space. While these methods have shown impressive reconstruction results, they rely on expensive 3D convolutional layers, limiting their application in resource-constrained environments. In this work, we instead go back to the traditional route and show how focusing on high-quality multi-view depth prediction leads to highly accurate 3D reconstructions using simple off-the-shelf depth fusion. We propose a simple state-of-the-art multi-view depth estimator with two main contributions: 1) a carefully designed 2D CNN which utilizes strong image priors alongside a plane-sweep feature volume and geometric losses, combined with 2) the integration of keyframe and geometric metadata into the cost volume, which allows informed depth plane scoring. On ScanNet and 7-Scenes, our method achieves a significant lead over the current state of the art for depth estimation and results close to or better than it for 3D reconstruction, yet still allows for online real-time low-memory reconstruction. Code, models and results are available at https://nianticlabs.github.io/simplerecon
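To make the plane-sweep idea in the abstract concrete, here is a minimal sketch of scoring depth planes by feature matching. It is not the authors' implementation: it assumes a toy rectified two-view setup on a single scanline, and the function names, the `focal_px` and `baseline_m` parameters, and the choice of metadata (the plane depth and a validity flag) are all hypothetical. The winner-takes-all step stands in for the learned 2D CNN that would score the planes in practice.

```python
from typing import List, Optional, Sequence, Tuple

Feature = Sequence[float]
# One cost-volume entry: (feature dot product, plane depth, validity flag).
Entry = Tuple[float, float, float]


def plane_sweep_scores(
    ref_feats: Sequence[Feature],
    src_feats: Sequence[Feature],
    depth_planes: Sequence[float],
    focal_px: float,
    baseline_m: float,
) -> List[List[Entry]]:
    """Build a toy cost volume indexed as volume[pixel][plane]."""
    volume: List[List[Entry]] = []
    for x, ref in enumerate(ref_feats):
        per_plane: List[Entry] = []
        for depth in depth_planes:
            # Assumed rectified geometry: a point at `depth` is shifted by
            # focal * baseline / depth pixels in the source view.
            src_x = x - round(focal_px * baseline_m / depth)
            valid = 0 <= src_x < len(src_feats)
            dot = sum(a * b for a, b in zip(ref, src_feats[src_x])) if valid else 0.0
            # Illustrative metadata: keep the depth and validity next to the score.
            per_plane.append((dot, depth, 1.0 if valid else 0.0))
        volume.append(per_plane)
    return volume


def best_depth_per_pixel(volume: List[List[Entry]]) -> List[Optional[float]]:
    """Winner-takes-all over valid planes (a stand-in for learned scoring)."""
    depths: List[Optional[float]] = []
    for per_plane in volume:
        valid = [(dot, depth) for dot, depth, ok in per_plane if ok]
        depths.append(max(valid)[1] if valid else None)
    return depths


# Toy scene: every visible pixel lies at depth 2.0, i.e. a 2-pixel shift.
width = 8
src = [[1.0 if i == j else 0.0 for j in range(width)] for i in range(width)]
ref = [src[x - 2] if x >= 2 else [0.0] * width for x in range(width)]
volume = plane_sweep_scores(ref, src, [1.0, 2.0, 4.0], focal_px=4.0, baseline_m=1.0)
print(best_depth_per_pixel(volume))
```

In this toy scene, pixels 2 through 7 recover depth 2.0, while pixel 0 has no valid plane and yields `None`. A real system would warp 2D feature maps using full camera poses and intrinsics rather than a 1D shift.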

