Towards Depth-Continuous Scene Representation With a Displacement Field for Robust Light Field Depth Estimation

IF 9.7 | CAS Tier 1 (Computer Science) | JCR Q1: COMPUTER SCIENCE, INFORMATION SYSTEMS
Rongshan Chen;Hao Sheng;Da Yang;Ruixuan Cong;Zhenglong Cui;Sizhe Wang;Tun Wang;Mingyuan Zhao
{"title":"Towards Depth-Continuous Scene Representation With a Displacement Field for Robust Light Field Depth Estimation","authors":"Rongshan Chen;Hao Sheng;Da Yang;Ruixuan Cong;Zhenglong Cui;Sizhe Wang;Tun Wang;Mingyuan Zhao","doi":"10.1109/TMM.2025.3535352","DOIUrl":null,"url":null,"abstract":"Light field (LF) captures both spatial and angular information of scenes, enabling accurate depth estimation. However, previous deep learning methods have typically model surface depth only, while ignoring the continuous nature of depth in 3D scenes. In this paper, we use displacement field (DF) to describe this continuous property, and propose a novel depth-continuous scene representation for robust LF depth estimation. Experiments demonstrate that our representation enables the network to generate highly detailed depth maps with fewer parameters and faster speed. Specifically, inspired by signed distance field in 3D object description, we aim to exploit the intrinsic depth-continuous property of 3D scenes using DF, and define a novel depth-continuous scene representation. Then, we introduce a simple yet general learning framework for depth-continuous scene embedding, and the proposed network, DepthDF, achieves state-of-the-art performance on both synthetic and real-world LF datasets, ranking 1st on the HCI 4D Light Field benchmark. Furthermore, previous LF depth estimation methods can also be seamlessly integrated into this framework. Finally, we extend this framework beyond LF depth estimation to various tasks, including multi-view stereo depth inference, LF super-resolution, and LF salient object detection. Experiments demonstrate improved performance when the continuous scene representation is applied, suggesting that our framework can potentially bring insights to more fields.","PeriodicalId":13273,"journal":{"name":"IEEE Transactions on Multimedia","volume":"27 ","pages":"3637-3649"},"PeriodicalIF":9.7000,"publicationDate":"2025-01-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Multimedia","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10855497/","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
Citations: 0

Abstract

Light field (LF) captures both spatial and angular information of scenes, enabling accurate depth estimation. However, previous deep learning methods typically model surface depth only, ignoring the continuous nature of depth in 3D scenes. In this paper, we use a displacement field (DF) to describe this continuous property, and propose a novel depth-continuous scene representation for robust LF depth estimation. Experiments demonstrate that our representation enables the network to generate highly detailed depth maps with fewer parameters and at higher speed. Specifically, inspired by the signed distance field used in 3D object description, we exploit the intrinsic depth-continuous property of 3D scenes using the DF, and define a novel depth-continuous scene representation. We then introduce a simple yet general learning framework for depth-continuous scene embedding; the proposed network, DepthDF, achieves state-of-the-art performance on both synthetic and real-world LF datasets, ranking 1st on the HCI 4D Light Field benchmark. Furthermore, previous LF depth estimation methods can be seamlessly integrated into this framework. Finally, we extend the framework beyond LF depth estimation to various tasks, including multi-view stereo depth inference, LF super-resolution, and LF salient object detection. Experiments demonstrate improved performance when the continuous scene representation is applied, suggesting that our framework can bring insights to more fields.
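The displacement-field idea in the abstract lends itself to a small worked example. The following is a minimal NumPy sketch, not the paper's implementation: it assumes a toy ground-truth surface depth map and a set of candidate depths (the names surface_depth, depth_hypotheses, and df are all illustrative), and it demonstrates the property that makes the representation depth-continuous: every 3D sample carries the offset that moves it onto the scene surface, so the surface can be recovered from any depth hypothesis, analogous to how a signed distance field describes a surface from any point in space.

```python
# Minimal sketch of a depth-axis displacement field (DF), assuming a toy
# ground-truth depth map. Illustrative only; not the paper's DepthDF network.
import numpy as np

H, W = 4, 4                                    # tiny spatial resolution
rng = np.random.default_rng(0)
surface_depth = rng.uniform(0.5, 1.5, (H, W))  # stand-in for true surface depth

# Candidate depths sampled through the scene, like a plane-sweep volume.
depth_hypotheses = np.linspace(0.0, 2.0, 9)    # shape (D,)

# Displacement field: for each 3D sample (x, y, d), the signed offset along
# the depth axis that moves the sample onto the surface. This is the
# depth-axis analogue of a signed distance field.
df = surface_depth[None, :, :] - depth_hypotheses[:, None, None]  # (D, H, W)

# Depth-continuity: ANY hypothesis plus its displacement recovers the surface,
# so a network regressing df is supervised at every sampled depth, not only
# at the surface itself.
for i, d in enumerate(depth_hypotheses):
    recovered = d + df[i]                      # (H, W)
    assert np.allclose(recovered, surface_depth)

print("every depth hypothesis maps back onto the surface via the DF")
```

In this reading, a surface-only regressor learns one value per pixel, while a DF regressor learns a field that is consistent across the whole depth range, which is one plausible way to interpret the robustness and detail gains claimed in the abstract.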
Source journal
IEEE Transactions on Multimedia (Engineering & Technology – Telecommunications)
CiteScore: 11.70
Self-citation rate: 11.00%
Annual publications: 576
Review time: 5.5 months
About the journal: The IEEE Transactions on Multimedia delves into diverse aspects of multimedia technology and applications, covering circuits, networking, signal processing, systems, software, and systems integration. The scope aligns with the Fields of Interest of the sponsors, ensuring a comprehensive exploration of research in multimedia.