FoggyDepth: Leveraging Channel Frequency and Non-Local Features for Depth Estimation in Fog

IF 11.1 | Zone 1, Engineering and Technology | Q1, ENGINEERING, ELECTRICAL & ELECTRONIC
Mengjiao Shen;Liuyi Wang;Xianyou Zhong;Chengju Liu;Qijun Chen
Journal: IEEE Transactions on Circuits and Systems for Video Technology, vol. 35, no. 4, pp. 3589-3602
Publication type: Journal Article
Published: 2024-12-02
DOI: 10.1109/TCSVT.2024.3509696
URL: https://ieeexplore.ieee.org/document/10772035/

Abstract

Unsupervised depth estimation from single images has advanced significantly under normal weather conditions and shows highly promising results. Estimating depth in adverse weather, particularly fog, nevertheless remains a substantial challenge. In this paper, we propose FoggyDepth, which uses a channel-wise Fourier transform to address this limitation. Specifically, because the photometric consistency assumption of the unsupervised framework does not hold in foggy scenes, we apply a Fourier transform along the channel dimension to obtain global channel statistics, thereby enhancing the discriminative ability of the global representation. We also generate foggy counterparts of the normal training samples and use them for self-supervised training, guiding the model to recover depth accurately in foggy conditions. To further improve performance, we use a non-local network to capture long-range spatial dependencies in depth estimation. Comprehensive evaluations on the Oxford RobotCar, nuScenes, and DrivingStereo datasets substantiate the precision and reliability of the proposed method. Compared with existing state-of-the-art depth estimation algorithms, our approach demonstrates superior performance both qualitatively and quantitatively.
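To make the phrase "Fourier transform along the channel dimension" concrete, here is a minimal pure-Python sketch. It is not the authors' implementation: the abstract does not say how the transform is wired into the network, so the function names `channel_dft` and `channel_spectrum`, the nested-list `(C, H, W)` layout, and the toy values are all assumptions made for illustration. The sketch only shows why such a transform carries global channel information: every output coefficient at a pixel is a weighted sum over all channels at that pixel, and the zero-frequency coefficient is exactly the channel sum.

```python
import cmath
from typing import List


def channel_dft(values: List[float]) -> List[complex]:
    """Naive discrete Fourier transform over the channel values of one pixel."""
    n = len(values)
    return [
        sum(values[c] * cmath.exp(-2j * cmath.pi * k * c / n) for c in range(n))
        for k in range(n)
    ]


def channel_spectrum(feat: List[List[List[float]]]) -> List[List[List[complex]]]:
    """Apply the DFT along the channel axis of a (C, H, W) nested list.

    Hypothetical helper: the output has the same shape as the input, with the
    first axis indexing channel frequency instead of channel.
    """
    C, H, W = len(feat), len(feat[0]), len(feat[0][0])
    out = [[[0j] * W for _ in range(H)] for _ in range(C)]
    for y in range(H):
        for x in range(W):
            coeffs = channel_dft([feat[c][y][x] for c in range(C)])
            for k in range(C):
                out[k][y][x] = coeffs[k]
    return out


# Toy feature map with 4 channels on a 1x2 spatial grid (made-up values).
feat = [[[1.0, 0.0]], [[2.0, 1.0]], [[3.0, 0.0]], [[4.0, 1.0]]]
spec = channel_spectrum(feat)

# Amplitude of the zero-frequency coefficient at each pixel: the channel sum.
dc = [abs(spec[0][0][x]) for x in range(2)]
```

In a real network one would more likely call a library FFT along the channel axis of a tensor and feed the amplitude or phase into learned layers; which of those the paper does is not stated in the abstract.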
Source journal
CiteScore: 13.80
Self-citation rate: 27.40%
Articles published: 660
Review time: 5 months
Journal description: The IEEE Transactions on Circuits and Systems for Video Technology (TCSVT) is dedicated to covering all aspects of video technologies from a circuits and systems perspective. We encourage submissions of general, theoretical, and application-oriented papers related to image and video acquisition, representation, presentation, and display. Additionally, we welcome contributions in areas such as processing, filtering, and transforms; analysis and synthesis; learning and understanding; compression, transmission, communication, and networking; as well as storage, retrieval, indexing, and search. Furthermore, papers focusing on hardware and software design and implementation are highly valued. Join us in advancing the field of video technology through innovative research and insights.