An End-to-End Pillar Feature Based Neural Network Improved by Attention Modules for Object Detection of Autonomous Vehicles

IF 1.1 · Zone 4 (Engineering Technology) · JCR Q4 · ENGINEERING, ELECTRICAL & ELECTRONIC

IEEJ Transactions on Electrical and Electronic Engineering · Pub Date: 2025-02-05 · DOI: 10.1002/tee.24276

Bin Zhang, Congzhi Ren, Hun-ok Lim

Volume 20, Issue 8, Pages 1212-1218 · Journal Article

Citations: 0

Abstract

An End-to-End Pillar Feature Based Neural Network Improved by Attention Modules for Object Detection of Autonomous Vehicles

Developing 3D object detectors for the 3D point clouds generated by LiDAR sensors remains a significant challenge in real-world autonomous driving scenarios. Current research mainly focuses on voxel-based detectors, which use sparse convolution for training and inference. These models often require substantial computational resources for training, which makes them hard to deploy on real autonomous vehicles. Among these models, PointPillars and CenterPoint (pillar version) stand out because they are based on 2D pillar encoding, which makes inference fast. However, their detection accuracy is relatively low compared with other models. To improve the detection accuracy of pillar-encoding models without significantly increasing computational complexity, this paper proposes adding attention modules within the pillar encoder. These modules apply the attention mechanism while reducing the input dimensions. Attention modules are also added to the CNN backbone network to further increase detection accuracy. Compared with the fastest PointPillars model, the inference time increases from 16 to 17 ms. Experiments demonstrate the effectiveness of the proposed network. © 2025 Institute of Electrical Engineers of Japan and Wiley Periodicals LLC.
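The abstract does not describe the internals of the proposed attention modules, so the following is only a toy sketch of the two ideas it names: grouping LiDAR points into 2D pillars on the x-y plane, and collapsing each pillar into one feature vector using attention weights. The function names, the 0.5 m cell size, and the fixed `weights` vector are all assumptions made for illustration; in a real network the scoring weights would be learned, and this is not the authors' architecture.

```python
import math
from collections import defaultdict


def pillarize(points, cell_size=0.5):
    """Group (x, y, z) points into vertical pillars on an x-y grid.

    cell_size is a hypothetical grid resolution in metres.
    """
    pillars = defaultdict(list)
    for x, y, z in points:
        key = (math.floor(x / cell_size), math.floor(y / cell_size))
        pillars[key].append((x, y, z))
    return dict(pillars)


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(pillar_points, weights):
    """Collapse the points of one pillar into a single feature vector.

    Each point is scored by a dot product with `weights` (a stand-in for
    learned parameters); the softmax of the scores weights the average.
    """
    scores = [sum(w * v for w, v in zip(weights, p)) for p in pillar_points]
    alphas = softmax(scores)
    dim = len(pillar_points[0])
    return [
        sum(a * p[d] for a, p in zip(alphas, pillar_points))
        for d in range(dim)
    ]


# Three made-up points: two fall into one pillar, one into another.
points = [(0.1, 0.2, 0.0), (0.3, 0.4, 1.0), (1.2, 0.1, 0.5)]
pillars = pillarize(points)
features = {
    key: attention_pool(pts, weights=(0.0, 0.0, 1.0))
    for key, pts in pillars.items()
}
```

Each pillar ends up with one fixed-length vector regardless of how many points it contains, which is what lets pillar-based detectors hand the result to an ordinary 2D CNN backbone. With the weights chosen here, points with larger z receive more attention, so the pooled feature of the two-point pillar leans toward the higher point rather than the plain mean.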

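Source journal

IEEJ Transactions on Electrical and Electronic Engineering (Engineering Technology: Engineering, Electrical & Electronic)

CiteScore: 2.70

Self-citation rate: 10.00%

Articles published: 199

Review time: 4.3 months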

Journal description: IEEJ Transactions on Electrical and Electronic Engineering (hereinafter "TEEE") is published 6 times per year as an official journal of the Institute of Electrical Engineers of Japan (hereinafter "IEEJ"). This peer-reviewed journal contains original research papers and review articles on the most important and latest technological advances in core areas of Electrical and Electronic Engineering and in related disciplines. The journal also publishes short communications reporting on the results of the latest research activities. TEEE aims to provide a new forum for IEEJ members in Japan as well as fellow researchers in Electrical and Electronic Engineering from around the world to exchange ideas and research findings.