{"title":"Evaluation of synthetic data for deep learning stereo depth algorithms on embedded platforms","authors":"Kevin Lee, D. Moloney","doi":"10.1109/ICSAI.2017.8248284","DOIUrl":null,"url":null,"abstract":"Stereo vision is a very active field in the realm of computer vision and in recent years Convolutional Neural Networks (CNNs) have proven to be very competitive against the state-of-the-art. However the performance of these networks are limited by the quality of the data that is used when training the CNNs. Data acquisition of high quality labelled images is a time-consuming and expensive process. By exploiting the power of modern-day powerful GPUs, we present a synthetic dataset with fully rectified stereo image pairs and accompanying accurate ground truth information that can be used for training and testing stereo algorithms. We provide validation of the quality of our dataset by performing quantitative experiments that suggest pre-training deep learning algorithms on synthetic data can perform competitively against networks trained on real life data. Testing on the KITTI data-set[1], we found the accuracy performance difference between the real and synthetically trained networks was within a margin of 1.8%. We also illustrate the functionality synthetic data can provide, by conducting a key performance index on a selection of conventional and deep learning stereo algorithms available on embedded platforms and compared them under common metrics. We also focused on power consumption and performance and we were able to achieve a compute the matching cost from a CNN performing inference on an embedded device at 11.9 FPS at 1.2 Watts.","PeriodicalId":285726,"journal":{"name":"2017 4th International Conference on Systems and Informatics (ICSAI)","volume":"11 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2017-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"5","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2017 4th International Conference on Systems and Informatics (ICSAI)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICSAI.2017.8248284","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 5
Abstract
Stereo vision is a very active field within computer vision, and in recent years Convolutional Neural Networks (CNNs) have proven to be highly competitive with the state of the art. However, the performance of these networks is limited by the quality of the data used to train them, and acquiring high-quality labelled images is a time-consuming and expensive process. By exploiting the power of modern GPUs, we present a synthetic dataset of fully rectified stereo image pairs with accompanying accurate ground-truth information that can be used for training and testing stereo algorithms. We validate the quality of our dataset through quantitative experiments suggesting that deep learning algorithms pre-trained on synthetic data can perform competitively against networks trained on real-world data. Testing on the KITTI dataset [1], we found that the accuracy difference between the real-data-trained and synthetically trained networks was within a margin of 1.8%. We further illustrate the utility of synthetic data by conducting a key-performance-index evaluation of a selection of conventional and deep learning stereo algorithms available on embedded platforms, comparing them under common metrics. Finally, focusing on power consumption and performance, we were able to compute the stereo matching cost with a CNN performing inference on an embedded device at 11.9 FPS while consuming 1.2 Watts.
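The abstract does not spell out the matching-cost network itself, so the following is only a minimal sketch of the general idea of computing a stereo matching cost with a CNN (in the spirit of siamese patch-matching approaches), not the authors' actual architecture. The class name `PatchFeatureNet`, the layer sizes, the 9x9 patch size, and the cosine-based cost are all illustrative assumptions.

```python
# Minimal sketch (assumed architecture, not the paper's): a siamese CNN that
# embeds left/right stereo patches and turns their similarity into a matching cost.
import torch
import torch.nn as nn
import torch.nn.functional as F

class PatchFeatureNet(nn.Module):
    """Small convolutional tower that embeds a 9x9 grayscale patch into a feature vector."""
    def __init__(self, feat_dim: int = 64):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=3), nn.ReLU(inplace=True),   # 9x9 -> 7x7
            nn.Conv2d(32, 32, kernel_size=3), nn.ReLU(inplace=True),  # 7x7 -> 5x5
            nn.Conv2d(32, feat_dim, kernel_size=3), nn.ReLU(inplace=True),  # 5x5 -> 3x3
            nn.Conv2d(feat_dim, feat_dim, kernel_size=3),              # 3x3 -> 1x1
        )

    def forward(self, patch: torch.Tensor) -> torch.Tensor:
        # (N, 1, 9, 9) -> (N, feat_dim)
        return self.conv(patch).flatten(1)

def matching_cost(net: PatchFeatureNet, left: torch.Tensor, right: torch.Tensor) -> torch.Tensor:
    """Cosine-similarity-based matching cost: lower cost means a better patch match."""
    f_l = F.normalize(net(left), dim=1)
    f_r = F.normalize(net(right), dim=1)
    return 1.0 - (f_l * f_r).sum(dim=1)  # (N,) costs in [0, 2]

if __name__ == "__main__":
    net = PatchFeatureNet()
    left = torch.randn(8, 1, 9, 9)    # batch of left-image patches
    right = torch.randn(8, 1, 9, 9)   # candidate right-image patches at some disparity
    print(matching_cost(net, left, right))
```

In a full stereo pipeline, such a cost would be evaluated over a range of candidate disparities per pixel and then refined by a conventional aggregation or optimization step; the paper's embedded deployment (11.9 FPS at 1.2 W) concerns running the CNN inference stage of such a pipeline on-device.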