FGS-NeRF: A fast glossy surface reconstruction method based on voxel and reflection directions

IF 4.2 · CAS Zone 3 (Computer Science) · JCR Q2, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Image and Vision Computing Pub Date : 2025-02-14 DOI:10.1016/j.imavis.2025.105455

Han Hong, Qing Ye, Keyun Xiong, Qing Tao, Yiqian Wan

Image and Vision Computing, Volume 155, Article 105455. Full text: https://www.sciencedirect.com/science/article/pii/S0262885625000435

Citations: 0

Abstract



Neural surface reconstruction has great potential for recovering 3D surfaces from multiview images. However, surface gloss can severely degrade reconstruction quality. Existing methods address glossy surface reconstruction, but reconstructing such surfaces quickly remains a challenge. DVGO (Direct Voxel Grid Optimization) can search scene geometry rapidly, but it tends to create numerous holes in glossy surfaces during the search. To address this, we design a geometry search method based on a signed distance function (SDF) and reflection directions, and employ a method called progressive voxel-MLP scaling to achieve accurate and efficient geometry search for glossy scenes. To mitigate the object edge artifacts caused by reflection directions, we use a simple loss function called sigmoid RGB loss, which helps reduce artifacts around objects during the early stages of training and promotes efficient surface convergence. In this work, we introduce the FGS-NeRF model, which combines coarse-to-fine training with reflection directions to rapidly reconstruct glossy object surfaces on voxel grids. Training takes 20 min on a single RTX 4080 GPU. Evaluations on the Shiny Blender and Smart Car datasets confirm that our model is significantly faster than existing glossy object reconstruction methods while still producing accurate object surfaces. Code: https://github.com/yosugahhh/FGS-nerf.
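The abstract builds its geometry search on an SDF and reflection directions. As a rough illustration only, the sketch below shows the standard mirror-reflection formula, w_r = 2 (w_o · n) n − w_o, with the normal n estimated as the normalized gradient of an SDF. It is not taken from the FGS-NeRF code: the function names (`normalize`, `sdf_normal`, `reflect`, `sphere_sdf`), the finite-difference normal estimate, and the unit-sphere test scene are all assumptions made for this example.

```python
import math


def normalize(v):
    """Return v scaled to unit length."""
    norm = math.sqrt(sum(c * c for c in v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return tuple(c / norm for c in v)


def sdf_normal(sdf, p, eps=1e-4):
    """Estimate the surface normal at p as the normalized
    central-difference gradient of the SDF (illustrative choice)."""
    grad = []
    for axis in range(3):
        hi = list(p)
        lo = list(p)
        hi[axis] += eps
        lo[axis] -= eps
        grad.append((sdf(hi) - sdf(lo)) / (2.0 * eps))
    return normalize(grad)


def reflect(ray_dir, normal):
    """Mirror the outgoing direction about the normal.

    ray_dir points from the camera into the scene, so the outgoing
    direction is w_o = -ray_dir, and w_r = 2 (w_o . n) n - w_o.
    """
    w_o = normalize(tuple(-c for c in ray_dir))
    n = normalize(normal)
    d = sum(a * b for a, b in zip(w_o, n))
    return tuple(2.0 * d * nc - wc for nc, wc in zip(n, w_o))


def sphere_sdf(p, radius=1.0):
    """Signed distance to a sphere centred at the origin (toy scene)."""
    return math.sqrt(sum(c * c for c in p)) - radius


# A ray travelling along (1, 0, -1) hits the top of the unit sphere.
hit_point = (0.0, 0.0, 1.0)
n = sdf_normal(sphere_sdf, hit_point)
w_r = reflect((1.0, 0.0, -1.0), n)
```

In a radiance-field setting, a reflection direction like `w_r` would typically replace the raw view direction as input to the color network, which is the general idea behind reflection-direction parameterizations; how FGS-NeRF wires this into its voxel grid and MLP is described in the paper, not here.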


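Source journal

Image and Vision Computing (Engineering Technology: Engineering, Electrical & Electronic)

CiteScore: 8.50
Self-citation rate: 8.50%
Articles published: 143
Review time: 7.8 months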

About the journal: Image and Vision Computing has as a primary aim the provision of an effective medium of interchange for the results of high quality theoretical and applied research fundamental to all aspects of image interpretation and computer vision. The journal publishes work that proposes new image interpretation and computer vision methodology or addresses the application of such methods to real world scenes. It seeks to strengthen a deeper understanding in the discipline by encouraging the quantitative comparison and performance evaluation of the proposed methodology. The coverage includes: image interpretation, scene modelling, object recognition and tracking, shape analysis, monitoring and surveillance, active vision and robotic systems, SLAM, biologically-inspired computer vision, motion analysis, stereo vision, document image understanding, character and handwritten text recognition, face and gesture recognition, biometrics, vision-based human-computer interaction, human activity and behavior understanding, data fusion from multiple sensor inputs, image databases.