ATM-NeRF: Accelerating Training for NeRF Rendering on Mobile Devices via Geometric Regularization
Authors: Yang Chen; Lin Zhang; Shengjie Zhao; Yicong Zhou
Published in: IEEE Transactions on Multimedia, vol. 27, pp. 3279-3293 (publication date 2025-01-28)
DOI: 10.1109/TMM.2025.3535288
URL: https://ieeexplore.ieee.org/document/10856404/
Journal metrics: impact factor 9.7; JCR Q1 (Computer Science, Information Systems)
Recently, a growing number of researchers have worked on bringing the impressive novel view synthesis capability of Neural Radiance Fields (NeRF) to resource-constrained mobile devices. One common solution is to pre-train a NeRF and bake it into textured meshes, which are well supported by mobile graphics hardware. However, training with existing methods often takes several hours even on multiple high-end NVIDIA V100 GPUs. The underlying reason is that these schemes rely mainly on a photometric rendering loss and neglect the geometric relationship between the pre-trained NeRF and the baked results. Based on this observation, we present ATM-NeRF (Accelerating Training for Mobile rendering based on NeRF), which is the first to apply effective geometric regularization constraints during both the pre-training and the baking training stages for faster convergence. Specifically, in the initial NeRF pre-training stage, we enforce consistency among the multi-resolution density grids that represent the scene geometry, which mitigates the shape-radiance ambiguity problem to some extent and yields a smooth coarse mesh. In the second stage, we use the positions and geometric features of 3D points projected from the pre-trained posed depths to provide geometric supervision for jointly refining the geometry and appearance of the coarse mesh. As a result, ATM-NeRF achieves rendering quality comparable to MobileNeRF while training about $30\times \sim 70\times$ faster and preserving finer structural details in the exported mesh.
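The second stage relies on 3D points obtained from posed depth maps. The abstract does not give the exact formulation, so the following is only a minimal sketch of the standard pinhole back-projection that such a step typically builds on. The function name `backproject_depth`, the OpenCV-style camera convention (x right, y down, z forward), the 3x4 camera-to-world pose layout, and the use of non-positive depth as "invalid" are all assumptions made for illustration, not details taken from the paper.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def backproject_depth(
    depth: Sequence[Sequence[float]],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
    cam_to_world: Sequence[Sequence[float]],
) -> List[Point]:
    """Lift every pixel with a valid depth to a world-space 3D point.

    depth        : H x W grid of z-depths along the camera's forward axis;
                   values <= 0 are treated as invalid (assumption).
    fx, fy       : focal lengths in pixels.
    cx, cy       : principal point in pixels.
    cam_to_world : 3x4 row-major [R | t] pose (assumed layout).
    """
    points: List[Point] = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0.0:
                continue  # skip pixels without a usable depth
            # Pinhole model: pixel (u, v) at depth z in camera coordinates.
            x_cam = (u - cx) * z / fx
            y_cam = (v - cy) * z / fy
            cam = (x_cam, y_cam, z)
            # Apply the rigid transform p_world = R * p_cam + t.
            world = tuple(
                r[0] * cam[0] + r[1] * cam[1] + r[2] * cam[2] + r[3]
                for r in cam_to_world
            )
            points.append((world[0], world[1], world[2]))
    return points


if __name__ == "__main__":
    identity_pose = [
        [1.0, 0.0, 0.0, 0.0],
        [0.0, 1.0, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
    ]
    toy_depth = [[2.0, 0.0], [2.0, 2.0]]  # one invalid pixel
    for p in backproject_depth(toy_depth, 1.0, 1.0, 0.0, 0.0, identity_pose):
        print(p)
```

How the resulting points and their geometric features enter the refinement loss is not specified in the abstract, so the sketch stops at the point cloud itself.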
About the journal:
The IEEE Transactions on Multimedia delves into diverse aspects of multimedia technology and applications, covering circuits, networking, signal processing, systems, software, and systems integration. The scope aligns with the Fields of Interest of the sponsors, ensuring a comprehensive exploration of research in multimedia.