Ana Daysi Ruvalcaba-Cardenas, T. Scoleri, Geoffrey Day
{"title":"Object Classification using Deep Learning on Extremely Low-Resolution Time-of-Flight Data","authors":"Ana Daysi Ruvalcaba-Cardenas, T. Scoleri, Geoffrey Day","doi":"10.1109/DICTA.2018.8615877","DOIUrl":null,"url":null,"abstract":"This paper proposes two novel deep learning models for 2D and 3D classification of objects in extremely low-resolution time-of-flight imagery. The models have been developed to suit contemporary range imaging hardware based on a recently fabricated Single Photon Avalanche Diode (SPAD) camera with 64 χ 64 pixel resolution. Being the first prototype of its kind, only a small data set has been collected so far which makes it challenging for training models. To bypass this hurdle, transfer learning is applied to the widely used VGG-16 convolutional neural network (CNN), with supplementary layers added specifically to handle SPAD data. This classifier and the renowned Faster-RCNN detector offer benchmark models for comparison to a newly created 3D CNN operating on time-of-flight data acquired by the SPAD sensor. Another contribution of this work is the proposed shot noise removal algorithm which is particularly useful to mitigate the camera sensitivity in situations of excessive lighting. Models have been tested in both low-light indoor settings and outdoor daytime conditions, on eight objects exhibiting small physical dimensions, low reflectivity, featureless structures and located at ranges from 25m to 700m. Despite antagonist factors, the proposed 2D model has achieved 95% average precision and recall, with higher accuracy for the 3D model.","PeriodicalId":130057,"journal":{"name":"2018 Digital Image Computing: Techniques and Applications (DICTA)","volume":"75 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2018-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"7","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2018 Digital Image Computing: Techniques and Applications (DICTA)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/DICTA.2018.8615877","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 7
Abstract
This paper proposes two novel deep learning models for 2D and 3D classification of objects in extremely low-resolution time-of-flight imagery. The models have been developed to suit contemporary range imaging hardware based on a recently fabricated Single Photon Avalanche Diode (SPAD) camera with 64 χ 64 pixel resolution. Being the first prototype of its kind, only a small data set has been collected so far which makes it challenging for training models. To bypass this hurdle, transfer learning is applied to the widely used VGG-16 convolutional neural network (CNN), with supplementary layers added specifically to handle SPAD data. This classifier and the renowned Faster-RCNN detector offer benchmark models for comparison to a newly created 3D CNN operating on time-of-flight data acquired by the SPAD sensor. Another contribution of this work is the proposed shot noise removal algorithm which is particularly useful to mitigate the camera sensitivity in situations of excessive lighting. Models have been tested in both low-light indoor settings and outdoor daytime conditions, on eight objects exhibiting small physical dimensions, low reflectivity, featureless structures and located at ranges from 25m to 700m. Despite antagonist factors, the proposed 2D model has achieved 95% average precision and recall, with higher accuracy for the 3D model.
本文提出了两种新的深度学习模型,用于极低分辨率飞行时间图像中物体的二维和三维分类。这些模型的开发是为了适应基于最近制造的64 x 64像素分辨率的单光子雪崩二极管(SPAD)相机的当代距离成像硬件。作为该类型的第一个原型,到目前为止只收集了一小部分数据集,这给训练模型带来了挑战。为了绕过这个障碍,迁移学习被应用到广泛使用的VGG-16卷积神经网络(CNN)中,并添加了专门用于处理SPAD数据的补充层。该分类器和著名的Faster-RCNN检测器提供了基准模型,用于与SPAD传感器获取的飞行时间数据上运行的新创建的3D CNN进行比较。这项工作的另一个贡献是提出的镜头噪声去除算法,该算法特别有助于在过度照明的情况下降低相机的灵敏度。模型在室内低光环境和室外白天条件下进行了测试,测试对象为8个物体,它们的物理尺寸小,反射率低,结构无特征,距离从25米到700米不等。尽管存在拮抗剂因素,所提出的2D模型达到了95%的平均精度和召回率,3D模型的准确率更高。