Vehicle detection and localization on bird's eye view elevation images using convolutional neural network

2017 IEEE International Symposium on Safety, Security and Rescue Robotics (SSRR). Pub Date: 2017-10-01. DOI: 10.1109/SSRR.2017.8088147

Shang-Lin Yu, Thomas Westfechtel, Ryunosuke Hamada, K. Ohno, S. Tadokoro

Citations: 47

Abstract

For autonomous vehicles, the ability to detect and localize surrounding vehicles is critical. It is fundamental for further processing steps like collision avoidance or path planning. This paper introduces a convolutional neural network-based vehicle detection and localization method using point cloud data acquired by a LIDAR sensor. Acquired point clouds are transformed into bird's eye view elevation images, where each pixel represents a grid cell of the horizontal x-y plane. We intentionally encode each pixel using three channels, namely the maximal, median and minimal height values of all points within the respective grid cell. A major advantage of this three-channel representation is that it allows us to utilize common RGB image-based detection networks without modification. The bird's eye view elevation images are processed by a two-stage detector. Due to the nature of the bird's eye view, each pixel of the image represents ground coordinates, meaning that the bounding boxes of detected vehicles correspond directly to the horizontal positions of the vehicles. Therefore, in contrast to RGB-based detectors, we not only detect the vehicles but simultaneously localize them in ground coordinates. To assess the accuracy of our method and its usefulness for further high-level applications like path planning, we evaluate the detection results based on the localization error in ground coordinates. Our proposed method achieves an average precision of 87.9% at an intersection over union (IoU) threshold of 0.5. In addition, 75% of the detected cars are localized with an absolute positioning error below 0.2 m.
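The encoding described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `elevation_image` and `pixel_box_to_ground`, the grid extent, the cell size, the zero fill for empty cells, and the convention that row 0 lies at the minimum y value are all assumptions chosen for illustration. The sketch bins LIDAR points into x-y grid cells, stores the maximal, median and minimal height per cell as three channels, and shows how a pixel bounding box maps back to ground coordinates.

```python
from collections import defaultdict
from statistics import median


def elevation_image(points, x_range, y_range, cell_size):
    """Rasterize (x, y, z) points into a bird's eye view elevation image.

    Returns a rows x cols grid of (max, median, min) height triples.
    Empty cells hold (0.0, 0.0, 0.0); that fill value is an assumption.
    """
    x_min, x_max = x_range
    y_min, y_max = y_range
    cols = int(round((x_max - x_min) / cell_size))
    rows = int(round((y_max - y_min) / cell_size))

    # Collect the heights of all points that fall into each grid cell.
    heights = defaultdict(list)
    for x, y, z in points:
        if not (x_min <= x < x_max and y_min <= y < y_max):
            continue  # point lies outside the rasterized area
        # Clamp to guard against floating-point rounding at the upper edge.
        col = min(int((x - x_min) / cell_size), cols - 1)
        row = min(int((y - y_min) / cell_size), rows - 1)
        heights[(row, col)].append(z)

    # Three channels per pixel: maximal, median and minimal height.
    image = [[(0.0, 0.0, 0.0)] * cols for _ in range(rows)]
    for (row, col), zs in heights.items():
        image[row][col] = (max(zs), median(zs), min(zs))
    return image


def pixel_box_to_ground(box, x_range, y_range, cell_size):
    """Map a pixel box (col0, row0, col1, row1) to ground coordinates.

    Assumes row 0 lies at y_min and column 0 at x_min (no axis flip).
    """
    col0, row0, col1, row1 = box
    return (
        x_range[0] + col0 * cell_size,
        y_range[0] + row0 * cell_size,
        x_range[0] + col1 * cell_size,
        y_range[0] + row1 * cell_size,
    )


# Hypothetical toy input: three points in one cell, one point in another.
demo_points = [
    (0.05, 0.05, 0.2),
    (0.06, 0.04, 1.4),
    (0.07, 0.03, 0.9),
    (1.55, 0.45, 0.3),
]
demo_image = elevation_image(demo_points, (0.0, 2.0), (0.0, 1.0), 0.5)
demo_box = pixel_box_to_ground((1, 0, 3, 2), (0.0, 2.0), (0.0, 1.0), 0.5)
print(demo_image[0][0])
print(demo_box)
```

Because each pixel corresponds to a fixed ground cell, converting a detection back to ground coordinates is a single scale-and-offset step, which is why detection and localization coincide in this representation. A practical version would likely use array operations and scale the heights into the value range expected by the chosen detection network.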
