Hendrik Königshof, Niels Ole Salscheider, C. Stiller
Realtime 3D Object Detection for Automated Driving Using Stereo Vision and Semantic Information
2019 IEEE Intelligent Transportation Systems Conference (ITSC), pp. 1405-1410, published 2019-10-01
DOI: 10.1109/ITSC.2019.8917330 (https://doi.org/10.1109/ITSC.2019.8917330)
Citations: 53
Abstract
We propose a 3D object detection and pose estimation method for automated driving using stereo images. In contrast to existing stereo-based approaches, we focus not only on cars but on all types of road users, and we ensure real-time capability through a GPU implementation of the entire processing chain. Both are essential conditions for using an algorithm in highly automated driving. Semantic information is provided by a deep convolutional neural network and is used together with disparity and geometric constraints to recover accurate 3D bounding boxes. Experiments on the challenging KITTI 3D object detection benchmark show results within the range of the best image-based algorithms, at only about a fifth of their runtime. This makes our algorithm the first real-time image-based approach on KITTI.
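The abstract names disparity as one of the cues used to recover 3D bounding boxes. As a minimal illustration of how a disparity value relates to a 3D position, the sketch below applies the standard rectified pinhole stereo relation Z = f * B / d. It is not the authors' pipeline: the function name `pixel_to_point` and the camera parameters in the usage note are hypothetical and chosen only for illustration.

```python
from typing import Optional, Tuple


def pixel_to_point(
    u: float,
    v: float,
    disparity: float,
    focal_px: float,
    baseline_m: float,
    cx: float,
    cy: float,
) -> Optional[Tuple[float, float, float]]:
    """Back-project one pixel of a rectified stereo pair into camera coordinates.

    Assumes an ideal pinhole model with the same focal length (in pixels)
    along both image axes. Returns None when the disparity is not positive,
    since depth is undefined in that case.
    """
    if disparity <= 0:
        return None
    # Depth from disparity: Z = f * B / d
    z = focal_px * baseline_m / disparity
    # Lateral and vertical offsets scale with depth.
    x = (u - cx) * z / focal_px
    y = (v - cy) * z / focal_px
    return (x, y, z)
```

For example, with a hypothetical camera (`focal_px=700.0`, `baseline_m=0.5`, principal point `(600.0, 180.0)`), a pixel at `(700, 200)` with a disparity of 35 pixels maps to a depth of 10 m. Because depth is inversely proportional to disparity, a fixed disparity error costs more accuracy at long range, which is one reason additional cues such as semantic and geometric constraints can help.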