NeRF-PIM: PIM Hardware-Software Co-Design of Neural Rendering Networks
Authors: Jaeyoung Heo; Sungjoo Yoo
Journal: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 43, no. 11, pp. 3900-3912
Published: 2024-11-06
DOI: 10.1109/TCAD.2024.3443712
URL: https://ieeexplore.ieee.org/document/10745790/
Impact factor: 2.7 (JCR Q2, Computer Science, Hardware & Architecture)
Neural radiance field (NeRF) has emerged as a state-of-the-art technique, offering unprecedented realism in rendering. Despite its advancements, the adoption of NeRF is constrained by its high computational cost, which leads to slow rendering. Voxel-based optimization of NeRF reduces the computational cost but introduces substantial memory overhead. To address this problem, we propose NeRF-PIM, a hardware-software co-design approach. Because memory accesses to the large voxel-grid model have poor locality and low compute density, we propose exploiting processing-in-memory (PIM) together with PIM-aware software optimizations of the data layout, redundancy removal, and computation reuse. Our PIM hardware aims to accelerate the trilinear interpolation and dot product operations. Specifically, to address the low utilization of internal bandwidth caused by random accesses to the voxels, we propose a data layout that exploits the characteristics of the interpolation operation on the voxel grid; it helps remove bank conflicts in voxel accesses and improves the efficiency of PIM command issue by exploiting the all-bank mode of the existing PIM device. As PIM-aware software optimizations, we also propose occupancy-grid-aware pruning and one-voxel two-sampling (1V2S) methods, which improve compute efficiency (by avoiding redundant computation on empty space) and reduce memory traffic (by reusing the per-voxel dot product results). We conduct experiments using an actual baseline HBM-PIM device. NeRF-PIM demonstrates speedups of $7.4\times$ and $5.0\times$ over the baseline on the Synthetic-NeRF and Tanks and Temples datasets, respectively.
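As a rough illustration of the two operations the abstract names (trilinear interpolation over a voxel grid followed by a dot product) and of skipping samples in empty space, the following is a minimal software sketch. It is not the paper's implementation: the dictionary-based grid, the `occupancy` set of occupied cell origins, and the function names `trilinear_weights` and `sample_dot` are all assumptions made for this example, and no PIM hardware or data layout is modeled.

```python
from itertools import product
from math import floor


def trilinear_weights(fx, fy, fz):
    """Return {(dx, dy, dz): weight} for the 8 corners of a unit cell.

    fx, fy, fz are the fractional offsets of the sample inside the cell.
    """
    weights = {}
    for dx, dy, dz in product((0, 1), repeat=3):
        wx = fx if dx else 1.0 - fx
        wy = fy if dy else 1.0 - fy
        wz = fz if dz else 1.0 - fz
        weights[(dx, dy, dz)] = wx * wy * wz
    return weights


def sample_dot(grid, occupancy, point, w):
    """Interpolate the feature vector at `point`, then dot it with `w`.

    grid[(x, y, z)] holds the feature list stored at an integer voxel corner.
    occupancy is a set of occupied cell origins (hypothetical format).
    Returns None when the sample falls in an empty cell, so no voxel is
    fetched and no arithmetic is done for it.
    """
    x0, y0, z0 = (floor(c) for c in point)
    if (x0, y0, z0) not in occupancy:
        return None  # empty space: skip the sample entirely
    fx, fy, fz = point[0] - x0, point[1] - y0, point[2] - z0
    feat = [0.0] * len(w)
    for (dx, dy, dz), wt in trilinear_weights(fx, fy, fz).items():
        corner = grid[(x0 + dx, y0 + dy, z0 + dz)]
        for i, v in enumerate(corner):
            feat[i] += wt * v
    return sum(f * wi for f, wi in zip(feat, w))


# Toy grid: each corner stores its own coordinates as a 3-element feature.
grid = {c: [float(c[0]), float(c[1]), float(c[2])]
        for c in product((0, 1), repeat=3)}
occupied = {(0, 0, 0)}
print(sample_dot(grid, occupied, (0.25, 0.5, 0.75), [1.0, 2.0, 3.0]))  # prints 3.5
```

Each sample reads eight neighboring voxels and does only a few multiply-adds per fetched value, which is consistent with the low compute density the abstract describes. Both the interpolation and the dot product are linear, so the dot product could equally be applied per voxel before interpolating. Whether that is how 1V2S reuses per-voxel results is not stated in the abstract and is only a guess here.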
About the journal:
The purpose of this Transactions is to publish papers of interest to individuals in the area of computer-aided design of integrated circuits and systems composed of analog, digital, mixed-signal, optical, or microwave components. The aids include methods, models, algorithms, and man-machine interfaces for system-level, physical and logical design including: planning, synthesis, partitioning, modeling, simulation, layout, verification, testing, hardware-software co-design and documentation of integrated circuit and system designs of all complexities. Design tools and techniques for evaluating and designing integrated circuits and systems for metrics such as performance, power, reliability, testability, and security are a focus.