Automated AI-Based Annotation Framework for 3D Object Detection from LIDAR Data in Industrial Areas
Gina Abdelhalim, Kevin Simon, Robert Bensch, Sai Parimi, Bilal Ahmed Qureshi
SAE Technical Paper Series, published 2024-07-02. DOI: 10.4271/2024-01-2999
Autonomous driving is used in a variety of settings, including indoor areas such as industrial halls and warehouses. For perception in these environments, LIDAR is currently very popular because of its high accuracy compared to RADAR and its robustness to varying lighting conditions compared to cameras. However, freely available labeled LIDAR data for these settings is notably scarce, and most public datasets, such as KITTI and Waymo, focus on public road scenarios. As a result, specialized publicly available annotation frameworks are rare as well. This work addresses these shortcomings by developing an automated AI-based labeling tool to generate a LIDAR dataset with 3D ground truth annotations for industrial warehouse scenarios. The base pipeline of the annotation framework first upsamples the incoming 16-channel scans into dense 64-channel data. The upsampled data is then manually annotated for the defined classes, and this annotated 64-channel dataset is used to fine-tune Part-A2-Net, which has been pretrained on the KITTI dataset. The fine-tuned network shows promising results for the defined classes. To overcome the main shortcomings of this pipeline, namely artefacts introduced by upsampling and the effort of manual labeling, we extend it to use SLAM both to generate the dense point cloud and to provide poses that speed up the labeling process. The progression therefore covers three generations of the framework: a fully manual first stage with upsampling and hand labeling; a semi-automated second stage in which SLAM builds the dense map automatically and annotations of all static classes are propagated to every scan; and a fully automatic third stage in which ground truth is generated by Part-A2-Net trained on the data produced by the manual and semi-automated pipelines. The dataset generated for this warehouse environment will be continuously extended and is publicly available at https://github.com/anavsgmbh/lidar-warehouse-dataset.
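The abstract does not specify how the 16-channel scans are densified into 64-channel data. A minimal sketch of one common approach, linear interpolation over the rows of the range image, is shown below; the function name, the 16 × 1800 layout, and the interpolation scheme are illustrative assumptions, not the authors' method.

```python
import numpy as np

def upsample_range_image(range_img_16: np.ndarray, target_rings: int = 64) -> np.ndarray:
    """Interpolate a (16, W) LIDAR range image to (target_rings, W) by
    linear interpolation along the vertical (ring) axis.

    Illustrative stand-in for the densification step described in the paper;
    the authors' actual upsampling method may differ.
    """
    rings, width = range_img_16.shape
    src_rows = np.linspace(0.0, 1.0, rings)          # original ring positions
    dst_rows = np.linspace(0.0, 1.0, target_rings)   # target ring positions
    out = np.empty((target_rings, width), dtype=range_img_16.dtype)
    for col in range(width):
        out[:, col] = np.interp(dst_rows, src_rows, range_img_16[:, col])
    return out

# Example: a synthetic 16 x 1800 range image (VLP-16-like resolution)
scan = np.random.uniform(0.5, 30.0, size=(16, 1800)).astype(np.float32)
dense = upsample_range_image(scan, target_rings=64)
print(dense.shape)  # (64, 1800)
```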
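In the semi-automated pipeline, static-class annotations made once in the dense SLAM map are propagated to every individual scan via the estimated poses. The sketch below illustrates that propagation for KITTI-style boxes (x, y, z, dx, dy, dz, yaw) under the assumption of a 4×4 map-to-scan pose per frame; the function and variable names are hypothetical and do not reflect the authors' code.

```python
import numpy as np

def propagate_boxes(boxes_map: np.ndarray, T_map_to_scan: np.ndarray) -> np.ndarray:
    """Transform 3D boxes annotated in the dense map frame into a single
    scan's frame using the SLAM pose of that scan.

    boxes_map:     (N, 7) array [x, y, z, dx, dy, dz, yaw] in the map frame.
    T_map_to_scan: (4, 4) homogeneous transform from map to scan coordinates.
    Returns:       (N, 7) boxes expressed in the scan frame.

    Illustrative sketch only: assumes the pose rotation is (close to) a pure
    yaw, which is reasonable for a ground vehicle on a flat warehouse floor.
    """
    boxes_scan = boxes_map.copy()
    centers = np.hstack([boxes_map[:, :3], np.ones((len(boxes_map), 1))])   # (N, 4) homogeneous
    boxes_scan[:, :3] = (T_map_to_scan @ centers.T).T[:, :3]                # move box centers
    yaw_offset = np.arctan2(T_map_to_scan[1, 0], T_map_to_scan[0, 0])        # rotation about z
    boxes_scan[:, 6] = boxes_map[:, 6] + yaw_offset                          # rotate headings
    return boxes_scan

# Example: one static box (e.g. a pallet) annotated in the map,
# propagated into a scan whose SLAM pose happens to be the identity.
boxes_map = np.array([[10.0, 2.0, 0.5, 1.2, 0.8, 1.8, 0.0]])
T_scan_to_map = np.eye(4)                       # pose delivered by SLAM
T_map_to_scan = np.linalg.inv(T_scan_to_map)
print(propagate_boxes(boxes_map, T_map_to_scan))
```

Because static objects do not move between scans, one annotation pass over the dense map plus this per-scan transform is enough to label every frame, which is the main source of the labeling speed-up described above.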