Benefits of Synthetically Pre-trained Depth-Prediction Networks for Indoor/Outdoor Image Classification

2023 IEEE/CVF Winter Conference on Applications of Computer Vision Workshops (WACVW), Pub Date: 2023-01-01, DOI: 10.1109/WACVW58289.2023.00040

Ke Lin, Irene Cho, Ameya S. Walimbe, Bryan A. Zamora, Alex Rich, Sirius Z. Zhang, Tobias Höllerer



Abstract

Ground truth depth information is necessary for many computer vision tasks. Collecting this information is challenging, especially for outdoor scenes. In this work, we propose utilizing single-view depth prediction neural networks pre-trained on synthetic scenes to generate relative depth, which we call pseudo-depth. This approach is a less expensive option as the pre-trained neural network obtains accurate depth information from synthetic scenes, which does not require any expensive sensor equipment and takes less time. We measure the usefulness of pseudo-depth from pre-trained neural networks by training indoor/outdoor binary classifiers with and without it. We also compare the difference in accuracy between using pseudo-depth and ground truth depth. We experimentally show that adding pseudo-depth to training achieves a 4.4% performance boost over the non-depth baseline model on DIODE, a large standard test dataset, retaining 63.8% of the performance boost achieved from training a classifier on RGB and ground truth depth. It also boosts performance by 1.3% on another dataset, SUN397, for which ground truth depth is not available. Our result shows that it is possible to take information obtained from a model pre-trained on synthetic scenes and successfully apply it beyond the synthetic domain to real-world data.
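
The abstract does not say how pseudo-depth is combined with the RGB input, so the following is only a minimal sketch under one possible assumption: the relative depth map is min-max scaled and appended to the image as a fourth channel before it reaches the classifier. The function names `normalize_relative_depth` and `stack_rgbd` are hypothetical, and random arrays stand in for a real image and for the output of a pre-trained depth network. The last line works out what the two reported DIODE figures imply about the boost from ground truth depth, assuming both are measured on the same scale.

```python
import numpy as np


def normalize_relative_depth(depth: np.ndarray) -> np.ndarray:
    """Min-max scale a relative depth map to [0, 1]; a constant map becomes all zeros."""
    depth = depth.astype(np.float32)
    span = float(depth.max() - depth.min())
    if span == 0.0:
        return np.zeros_like(depth)
    return (depth - depth.min()) / span


def stack_rgbd(rgb: np.ndarray, depth: np.ndarray) -> np.ndarray:
    """Append an H x W depth map to an H x W x 3 RGB image, giving H x W x 4."""
    if rgb.shape[:2] != depth.shape:
        raise ValueError("RGB image and depth map must share height and width")
    depth_channel = normalize_relative_depth(depth)[..., None]
    return np.concatenate([rgb.astype(np.float32), depth_channel], axis=-1)


# Stand-in data (assumption): a real pipeline would load an image and obtain
# pseudo_depth from a depth-prediction network pre-trained on synthetic scenes.
rng = np.random.default_rng(0)
rgb = rng.random((4, 6, 3))
pseudo_depth = rng.random((4, 6)) * 10.0

rgbd = stack_rgbd(rgb, pseudo_depth)
print(rgbd.shape)

# If a 4.4 boost retains 63.8% of the ground-truth-depth boost, the latter is
# roughly 4.4 / 0.638 on the same scale.
implied_gt_boost = 4.4 / 0.638
print(round(implied_gt_boost, 1))
```

Under that reading, training on RGB plus ground truth depth would improve on the RGB-only baseline by about 6.9 on DIODE, compared with 4.4 for pseudo-depth.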


