{"title":"基于语义分割的自动驾驶汽车BEV检测与定位","authors":"Ashwin Nayak U, Nachiket Naganure, S. S.","doi":"10.1109/CONECCT52877.2021.9622702","DOIUrl":null,"url":null,"abstract":"In autonomous vehicles, the perception system plays an important role in environment modeling and object detection in 3D space. Existing perception systems use various sensors to localize and track the surrounding obstacles, but have some limitations. Most existing end-to-end autonomous systems are computationally heavy as they are built on multiple deep networks that are trained to detect and localize objects, thus requiring custom, high-end computation devices with high compute power. To address this issue, we propose and experiment with different semantic segmentation-based models for Birds Eye View (BEV) detection and localization of surrounding objects like vehicles and pedestrians from LiDAR (light detection, and ranging) point clouds. Voxelisation techniques are used to transform 3D LiDAR point clouds to 2D RGB images. The semantic segmentation models are trained from the ground up on the Lyft Level 5 dataset. During experimental evaluation, the proposed approach achieved a mean average precision score of 0.044 for UNET, 0.041 for SegNet and 0.033 for FCN, while being significantly less compute-intensive when compared to the state-of-the-art approaches.","PeriodicalId":164499,"journal":{"name":"2021 IEEE International Conference on Electronics, Computing and Communication Technologies (CONECCT)","volume":"36 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"BEV Detection and Localisation using Semantic Segmentation in Autonomous Car Driving Systems\",\"authors\":\"Ashwin Nayak U, Nachiket Naganure, S. S.\",\"doi\":\"10.1109/CONECCT52877.2021.9622702\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In autonomous vehicles, the perception system plays an important role in environment modeling and object detection in 3D space. Existing perception systems use various sensors to localize and track the surrounding obstacles, but have some limitations. Most existing end-to-end autonomous systems are computationally heavy as they are built on multiple deep networks that are trained to detect and localize objects, thus requiring custom, high-end computation devices with high compute power. To address this issue, we propose and experiment with different semantic segmentation-based models for Birds Eye View (BEV) detection and localization of surrounding objects like vehicles and pedestrians from LiDAR (light detection, and ranging) point clouds. Voxelisation techniques are used to transform 3D LiDAR point clouds to 2D RGB images. The semantic segmentation models are trained from the ground up on the Lyft Level 5 dataset. 
During experimental evaluation, the proposed approach achieved a mean average precision score of 0.044 for UNET, 0.041 for SegNet and 0.033 for FCN, while being significantly less compute-intensive when compared to the state-of-the-art approaches.\",\"PeriodicalId\":164499,\"journal\":{\"name\":\"2021 IEEE International Conference on Electronics, Computing and Communication Technologies (CONECCT)\",\"volume\":\"36 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2021-07-09\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2021 IEEE International Conference on Electronics, Computing and Communication Technologies (CONECCT)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/CONECCT52877.2021.9622702\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 IEEE International Conference on Electronics, Computing and Communication Technologies (CONECCT)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CONECCT52877.2021.9622702","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
BEV Detection and Localisation using Semantic Segmentation in Autonomous Car Driving Systems
In autonomous vehicles, the perception system plays an important role in environment modeling and object detection in 3D space. Existing perception systems use various sensors to localize and track surrounding obstacles, but they have limitations. Most existing end-to-end autonomous systems are computationally heavy because they are built on multiple deep networks trained to detect and localize objects, and therefore require custom high-end hardware with substantial compute power. To address this issue, we propose and experiment with different semantic-segmentation-based models for Bird's Eye View (BEV) detection and localization of surrounding objects, such as vehicles and pedestrians, from LiDAR (Light Detection and Ranging) point clouds. Voxelisation techniques are used to transform 3D LiDAR point clouds into 2D RGB images. The semantic segmentation models are trained from scratch on the Lyft Level 5 dataset. In experimental evaluation, the proposed approach achieved a mean average precision (mAP) of 0.044 for UNET, 0.041 for SegNet, and 0.033 for FCN, while being significantly less compute-intensive than state-of-the-art approaches.
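The key preprocessing step described in the abstract is voxelising a 3D LiDAR point cloud into a 2D pseudo-RGB BEV image that a standard 2D segmentation network can consume. The sketch below shows one common encoding of this idea, assuming three channels (max height, max intensity, point density); the grid extents, 0.2 m/pixel resolution, and the function name `pointcloud_to_bev` are illustrative assumptions, not the paper's exact parameters.

```python
# Minimal LiDAR-to-BEV voxelisation sketch (assumed encoding, not the
# paper's exact settings): channels are max height, max intensity, and
# log-scaled point density per grid cell.
import numpy as np

def pointcloud_to_bev(points, x_range=(-50.0, 50.0), y_range=(-50.0, 50.0),
                      z_range=(-3.0, 3.0), resolution=0.2):
    """Project an (N, 4) point cloud [x, y, z, intensity] onto a BEV grid.

    Returns an (H, W, 3) float32 image with channels:
      0: max height, 1: max intensity, 2: normalised point density.
    """
    # Keep only points inside the region of interest.
    mask = ((points[:, 0] >= x_range[0]) & (points[:, 0] < x_range[1]) &
            (points[:, 1] >= y_range[0]) & (points[:, 1] < y_range[1]) &
            (points[:, 2] >= z_range[0]) & (points[:, 2] < z_range[1]))
    pts = points[mask]

    # Discretise x/y coordinates into pixel indices.
    width = int((x_range[1] - x_range[0]) / resolution)
    height = int((y_range[1] - y_range[0]) / resolution)
    xi = ((pts[:, 0] - x_range[0]) / resolution).astype(np.int32)
    yi = ((pts[:, 1] - y_range[0]) / resolution).astype(np.int32)

    bev = np.zeros((height, width, 3), dtype=np.float32)

    # Channel 0: max height per cell, normalised to [0, 1].
    z_norm = (pts[:, 2] - z_range[0]) / (z_range[1] - z_range[0])
    np.maximum.at(bev[:, :, 0], (yi, xi), z_norm)

    # Channel 1: max intensity per cell (intensity assumed in [0, 1]).
    np.maximum.at(bev[:, :, 1], (yi, xi), pts[:, 3])

    # Channel 2: log-scaled point count per cell, capped at 1.0.
    density = np.zeros((height, width), dtype=np.float32)
    np.add.at(density, (yi, xi), 1.0)
    bev[:, :, 2] = np.minimum(1.0, np.log1p(density) / np.log(64.0))
    return bev
```

Once point clouds are rendered this way, each BEV image can be paired with a rasterised ground-truth mask of vehicle and pedestrian footprints and fed to an off-the-shelf 2D segmentation network such as UNET, SegNet, or FCN, which is what makes the approach lighter than multi-network 3D detection pipelines.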