Robustifying the Multi-Scale Representation of Neural Radiance Fields

BMVC: Proceedings of the British Machine Vision Conference. Pub Date: 2022-10-09. DOI: 10.48550/arXiv.2210.04233

Nishant Jain, Suryansh Kumar, L. Gool

Citations: 5

Abstract

Neural Radiance Fields (NeRF) recently emerged as a new paradigm for object representation from multi-view (MV) images. Yet NeRF cannot handle multi-scale (MS) images or camera pose estimation errors, both of which are common in multi-view images captured with an everyday commodity camera. Although the recently proposed Mip-NeRF can handle multi-scale imaging problems with NeRF, it cannot handle camera pose estimation error. Conversely, the newly proposed BARF can solve the camera pose problem with NeRF but fails if the images are multi-scale in nature. This paper presents a robust multi-scale neural radiance field representation that overcomes both real-world imaging issues simultaneously. Our method handles multi-scale imaging effects and camera pose estimation problems with NeRF-inspired approaches by leveraging the fundamentals of scene rigidity. To reduce unpleasant aliasing artifacts caused by multi-scale images in the ray space, we leverage the Mip-NeRF multi-scale representation. For joint estimation of robust camera poses, we propose graph neural network-based multiple motion averaging within the neural volume rendering framework. We demonstrate, with examples, that precise camera pose estimates are crucial for an accurate neural representation of an object from casually acquired multi-view images. Without robustness measures in the camera pose estimation, modeling multi-scale aliasing artifacts via conical frustums can be counterproductive. We present extensive experiments on benchmark datasets to demonstrate that our approach provides better results than recent NeRF-inspired approaches in such realistic settings.
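The abstract gives no formulas, so the following is only an illustrative sketch of the Mip-NeRF idea it relies on, not the authors' implementation. It assumes a one-dimensional coordinate and assumes that the conical frustum has already been approximated by a Gaussian with a given mean and variance; the function names `positional_encoding` and `integrated_positional_encoding` are hypothetical. The sketch shows how taking the expected encoding over the Gaussian damps high-frequency features, which is how aliasing is reduced.

```python
import math


def positional_encoding(x, num_freqs):
    """Standard NeRF-style sin/cos encoding of a scalar coordinate x."""
    feats = []
    for level in range(num_freqs):
        scale = 2.0 ** level
        feats.append(math.sin(scale * x))
        feats.append(math.cos(scale * x))
    return feats


def integrated_positional_encoding(mean, var, num_freqs):
    """Expected sin/cos encoding of x ~ N(mean, var).

    Uses the closed form E[sin(a x)] = sin(a * mean) * exp(-0.5 * a^2 * var),
    and the same for cos. Assumption: the frustum is summarized by a
    1D Gaussian; real scenes use 3D means and covariances.
    """
    feats = []
    for level in range(num_freqs):
        scale = 2.0 ** level
        damp = math.exp(-0.5 * (scale ** 2) * var)  # shrinks high frequencies
        feats.append(math.sin(scale * mean) * damp)
        feats.append(math.cos(scale * mean) * damp)
    return feats


if __name__ == "__main__":
    # Zero variance (an infinitely thin ray) recovers the plain encoding.
    print(positional_encoding(0.3, 4))
    print(integrated_positional_encoding(0.3, 0.0, 4))
    # A wide region (large variance) suppresses the high-frequency features.
    print(integrated_positional_encoding(0.3, 1.0, 4))
```

Because the encoding depends on the Gaussian's mean, an error in the camera pose shifts that mean and corrupts the features, which is consistent with the abstract's point that frustum modeling without robust pose estimation can be counterproductive.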

