A Comparison Study of Depth Map Estimation in Indoor Environments Using pix2pix and CycleGAN

IF 1.3 · CAS Zone 4 (Engineering & Technology) · JCR Q3 (COMPUTER SCIENCE, INFORMATION SYSTEMS)
Ricardo Salvino Casado, Emerson Carlos Pedrino
DOI: 10.1109/TLA.2024.10431422
Journal: IEEE Latin America Transactions
Published: 2024-02-09 (Journal Article)
Full text: https://ieeexplore.ieee.org/document/10431422/
Citations: 0

Abstract

This article presents a Deep Learning-based approach for comparing automatic depth map estimation in indoor environments, with the aim of using the resulting maps in navigation aid systems for visually impaired individuals. Depth map estimation is a laborious process, as most high-precision systems rely on complex stereo vision setups. The methodology uses Generative Adversarial Network (GAN) techniques to generate depth maps from single RGB images. The study introduces methods for generating depth maps using pix2pix and CycleGAN. The major challenges still lie in the need for large datasets, coupled with long training times. Additionally, the L1 Loss was compared against variations of the MonoDepth2 and DenseDepth systems, using ResNet50 and ResNet18 as encoders, to validate the presented method. The results demonstrate that CycleGAN generates more reliable maps than pix2pix and DepthNetResNet50, with an L1 Loss approximately 2.5 times smaller than pix2pix, approximately 2.4 times smaller than DepthNetResNet50, and approximately 14 times smaller than DepthNetResNet18.
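The comparison metric throughout is the L1 Loss, i.e. the mean absolute difference between a predicted depth map and its ground truth. A minimal sketch of how that metric could be computed is shown below; the function name and toy values are illustrative, not taken from the paper, and a real evaluation would use full-resolution maps from an indoor RGB-D dataset.

```python
import numpy as np

def l1_loss(pred: np.ndarray, target: np.ndarray) -> float:
    """Mean absolute error between a predicted and a ground-truth depth map."""
    return float(np.mean(np.abs(pred - target)))

# Toy 2x2 depth maps (values in meters), purely for illustration.
pred = np.array([[1.0, 2.0], [3.0, 4.0]])
target = np.array([[1.5, 2.0], [2.0, 4.0]])
print(l1_loss(pred, target))  # 0.375
```

Under this metric, a model whose loss is "2.5 times smaller" simply produces per-pixel depth errors that are, on average, 2.5 times closer to the ground truth.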
Source Journal

IEEE Latin America Transactions
Categories: COMPUTER SCIENCE, INFORMATION SYSTEMS; ENGINEERING, ELECTRICAL & ELECTRONIC
CiteScore: 3.50
Self-citation rate: 7.70%
Articles per year: 192
Review time: 3-8 weeks
Journal description: IEEE Latin America Transactions (IEEE LATAM) is an interdisciplinary journal focused on the dissemination of original, quality research papers and review articles in Spanish and Portuguese on emerging topics in three main areas: Computing, Electric Energy, and Electronics. The sub-areas of the journal include, but are not limited to: automatic control, communications, instrumentation, artificial intelligence, power and industrial electronics, fault diagnosis and detection, transportation electrification, internet of things, electrical machines, circuits and systems, biomedicine and biomedical/haptic applications, secure communications, robotics, sensors and actuators, computer networks, and smart grids, among others.