Dense-depth-net: a spatial-temporal approach on depth completion task
Tri-Hai Nguyen, Myungsik Yoo
2021 IEEE Region 10 Symposium (TENSYMP)
Published: 2021-08-23
DOI: https://doi.org/10.1109/TENSYMP52854.2021.9550990
Citations: 1
Abstract
Depth completion is an essential function in the perception system of an autonomous vehicle. Scene geometry representation has been studied extensively with various convolutional neural networks (CNNs) under supervised or self-supervised learning. This paper uses recurrent neural networks (RNNs) to exploit temporal information from camera video sequences, which can help mitigate the mismatch between two consecutive data frames. We propose an architecture consisting of two sequential processing stages: a spatial exploitation stage built from a two-branch network, and a temporal exploitation stage built from a novel convolutional LSTM (ConvLSTM). Furthermore, we use the ability of long short-term memory (LSTM)-based RNNs to estimate a one-step depth map, so that object representations draw not only on a single data frame but also on its temporal neighborhood. Moreover, the proposed ConvLSTM network is shown to be able to forecast depth for future or occluded parts of an image frame. We evaluate the proposed architecture on the KITTI dataset under supervised learning, and the results show improved accuracy.
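The abstract does not specify how the proposed ConvLSTM is built, so the following is only a minimal sketch of a standard ConvLSTM step (the widely used formulation without peephole connections), not the authors' network. It assumes a single-channel feature map and 3x3 kernels, and the names `conv2d_same`, `convlstm_step`, and `params` are hypothetical. It illustrates the general idea of the temporal stage: the gates are computed by convolutions over the current frame and the previous hidden state, so the memory keeps its spatial layout across frames.

```python
import math


def conv2d_same(x, k):
    """Zero-padded 2D cross-correlation of grid x with an odd-sized square kernel k."""
    h, w, r = len(x), len(x[0]), len(k) // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            s = 0.0
            for di in range(-r, r + 1):
                for dj in range(-r, r + 1):
                    ii, jj = i + di, j + dj
                    if 0 <= ii < h and 0 <= jj < w:
                        s += x[ii][jj] * k[di + r][dj + r]
            out[i][j] = s
    return out


def sigmoid(v):
    """Logistic function written via tanh so large |v| cannot overflow."""
    return 0.5 * (1.0 + math.tanh(0.5 * v))


def convlstm_step(x, h_prev, c_prev, params):
    """One ConvLSTM time step on single-channel grids.

    params maps each gate name ("i", "f", "o", "g") to a tuple
    (kernel_for_input, kernel_for_hidden, scalar_bias). This layout is an
    assumption made for the sketch.
    """
    rows, cols = len(x), len(x[0])

    def gate(name, act):
        kx, kh, b = params[name]
        ax, ah = conv2d_same(x, kx), conv2d_same(h_prev, kh)
        return [[act(ax[r][q] + ah[r][q] + b) for q in range(cols)]
                for r in range(rows)]

    i_g = gate("i", sigmoid)    # input gate
    f_g = gate("f", sigmoid)    # forget gate
    o_g = gate("o", sigmoid)    # output gate
    g = gate("g", math.tanh)    # candidate cell update

    # Cell state: keep part of the old memory, add part of the candidate.
    c = [[f_g[r][q] * c_prev[r][q] + i_g[r][q] * g[r][q] for q in range(cols)]
         for r in range(rows)]
    # Hidden state: gated, squashed view of the cell state.
    h = [[o_g[r][q] * math.tanh(c[r][q]) for q in range(cols)]
         for r in range(rows)]
    return h, c


if __name__ == "__main__":
    # Toy usage: carry the state across a short sequence of 2x3 "frames".
    k = [[0.0, 0.1, 0.0], [0.1, 0.2, 0.1], [0.0, 0.1, 0.0]]
    params = {name: (k, k, 0.0) for name in ("i", "f", "o", "g")}
    frames = [[[0.0, 1.0, 2.0], [3.0, 4.0, 5.0]],
              [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]]
    h = [[0.0] * 3 for _ in range(2)]
    c = [[0.0] * 3 for _ in range(2)]
    for frame in frames:
        h, c = convlstm_step(frame, h, c, params)
    print(len(h), len(h[0]))
```

In a practical model the grids would be multi-channel tensors and the kernels would be learned. The pure-Python loops here are only meant to make the gate arithmetic explicit.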