An Efficient Accelerator for Deep Learning-based Point Cloud Registration on FPGAs

2023 31st Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP). Pub Date: 2022-03-11. DOI: 10.1109/PDP59025.2023.00018

K. Sugiura, Hiroki Matsutani



Abstract

Point cloud registration is the basis for many robotic applications such as odometry and Simultaneous Localization And Mapping (SLAM), which are increasingly important for autonomous mobile robots. The limited computational resources and power budgets of such robots motivate us to study resource-efficient registration methods for low-cost edge devices. In this paper, we propose a novel FPGA-based pipeline for 3D point cloud registration built upon a recent deep learning-based method, PointNetLK. Profiling shows that PointNet feature extraction is a major bottleneck, so we focus on it: we improve its scalability and memory efficiency by consuming the input points one by one in a pipelined manner instead of processing the whole point cloud at once. We then design a fully parallelized and pipelined accelerator built around a custom PointNet IP core, which fits within both low-cost and mid-range FPGAs (e.g., Avnet Ultra96v2 and Xilinx ZCU104). Experimental results show that the proposed pipeline achieves up to 21.34x and 69.60x faster registration than the vanilla PointNetLK and ICP, respectively, while consuming only 722 mW and maintaining the same level of accuracy.
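The point-by-point idea in the abstract can be illustrated with a small software sketch. PointNet applies the same MLP to every point and then max-pools over all points, so the global feature can be accumulated as a running maximum without storing the per-point features. The code below is only an illustration under stated assumptions, not the authors' FPGA design: the layer sizes (3, 8, 16), the random weights, and all function names are hypothetical, and batch normalization is omitted.

```python
import random


def relu(v):
    return [x if x > 0.0 else 0.0 for x in v]


def linear(w, b, v):
    # w is a list of rows (out_dim x in_dim); b holds out_dim biases.
    return [sum(wi * xi for wi, xi in zip(row, v)) + bi for row, bi in zip(w, b)]


def pointwise_mlp(layers, point):
    # The same weights are applied to every point (shared MLP).
    h = list(point)
    for w, b in layers:
        h = relu(linear(w, b, h))
    return h


def global_feature_batch(layers, points):
    # Whole-cloud version: keeps all N per-point features, O(N * D) memory.
    feats = [pointwise_mlp(layers, p) for p in points]
    return [max(col) for col in zip(*feats)]


def global_feature_streaming(layers, points):
    # Point-by-point version: keeps only a running max, O(D) memory.
    running = None
    for p in points:
        f = pointwise_mlp(layers, p)
        running = f if running is None else [max(a, b) for a, b in zip(running, f)]
    return running


def make_layers(dims, rng):
    # Hypothetical random weights, for illustration only.
    return [
        ([[rng.uniform(-1, 1) for _ in range(i)] for _ in range(o)],
         [rng.uniform(-1, 1) for _ in range(o)])
        for i, o in zip(dims, dims[1:])
    ]


rng = random.Random(0)
layers = make_layers([3, 8, 16], rng)  # assumed layer sizes
points = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(100)]

batch = global_feature_batch(layers, points)
stream = global_feature_streaming(layers, points)
print(batch == stream)
```

Because max pooling is order-independent and element-wise, the streaming variant yields the same global feature while its working memory no longer grows with the number of points, which is the property the abstract relies on for scalability.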
