CUBE360: Learning Cubic Field Representation for Monocular Panoramic Depth Estimation

IF 4.6, CAS Zone 2 (Computer Science), JCR Q2 (Robotics)

IEEE Robotics and Automation Letters, Pub Date: 2025-04-22, DOI: 10.1109/LRA.2025.3563827

Wenjie Chang; Hao Ai; Tianzhu Zhang; Lin Wang

IEEE Robotics and Automation Letters, vol. 10, no. 6, pp. 6264-6271. https://ieeexplore.ieee.org/document/10974579/


Abstract

Panoramic depth estimation presents significant challenges due to the severe distortion caused by equirectangular projection (ERP) and the limited availability of panoramic RGB-D datasets. Inspired by the recent success of neural rendering, we propose a self-supervised method, named CUBE360, that learns a cubic field composed of multiple Multi-Plane Images (MPIs) from a single panoramic image for continuous depth estimation in any view direction. CUBE360 employs cubemap projection to transform an ERP image into six faces and extracts MPIs for each face, thereby reducing the memory consumption required for MPI processing of high-resolution data. An attention-based blending module is then employed to learn correlations among the MPIs of the cube faces, constructing a cubic field representation with color and density information at various depth levels. Furthermore, a dual-sampling strategy is introduced to render novel views from the cubic field at both cubic and planar scales. The entire pipeline is trained in a self-supervised learning (SSL) manner using a photometric loss computed from the rendered views, which enables training without depth annotations. Experiments on synthetic and real-world datasets demonstrate that CUBE360 outperforms previous SSL methods.
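Two steps named in the abstract can be illustrated with a minimal Python sketch: mapping a cube-face pixel back to ERP coordinates, and compositing color and density along one ray through the MPI planes to obtain a continuous depth value. This is not the authors' implementation. The function names, the axis convention (x right, y up, z forward), the face orientations, the scalar (grayscale) color, and the density-to-alpha conversion are all assumptions, chosen to follow standard cubemap and MPI/volume-rendering practice.

```python
import math


def face_pixel_to_erp(face, i, j, face_size, erp_w, erp_h):
    """Map the centre of pixel (row i, col j) on a cube face to ERP coordinates.

    Hypothetical helper: the axis convention and face orientations are assumed.
    """
    # Normalised face coordinates in [-1, 1]: a is horizontal, b is vertical (up positive).
    a = 2.0 * (j + 0.5) / face_size - 1.0
    b = 1.0 - 2.0 * (i + 0.5) / face_size
    # Viewing direction through the pixel for each of the six faces.
    directions = {
        "front": (a, b, 1.0),
        "back": (-a, b, -1.0),
        "right": (1.0, b, -a),
        "left": (-1.0, b, a),
        "up": (a, 1.0, -b),
        "down": (a, -1.0, b),
    }
    x, y, z = directions[face]
    norm = math.sqrt(x * x + y * y + z * z)
    lon = math.atan2(x, z)       # longitude in [-pi, pi]
    lat = math.asin(y / norm)    # latitude in [-pi/2, pi/2]
    u = (lon / (2.0 * math.pi) + 0.5) * erp_w
    v = (0.5 - lat / math.pi) * erp_h
    return u, v


def composite_ray(densities, colors, plane_depths):
    """Front-to-back compositing of one ray through planes ordered near to far.

    Returns (color, depth). The depth is a weight-normalised average of the
    plane depths, so it varies continuously between the discrete planes.
    """
    transmittance = 1.0
    color = 0.0
    depth = 0.0
    total_weight = 0.0
    for k, (sigma, c, d) in enumerate(zip(densities, colors, plane_depths)):
        # Spacing to the next plane; the last plane reuses the previous spacing.
        if k + 1 < len(plane_depths):
            delta = plane_depths[k + 1] - d
        elif k > 0:
            delta = d - plane_depths[k - 1]
        else:
            delta = 1.0
        alpha = 1.0 - math.exp(-sigma * delta)  # assumed density-to-opacity conversion
        weight = transmittance * alpha
        color += weight * c
        depth += weight * d
        total_weight += weight
        transmittance *= 1.0 - alpha
    return color, depth / max(total_weight, 1e-8)
```

A photometric loss in this setting would compare the composited color of a rendered novel view with the corresponding pixels of the input panorama, so no ground-truth depth enters the computation; the depth is only read out from the same compositing weights.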


Source journal

IEEE Robotics and Automation Letters (Computer Science: Computer Science Applications)

CiteScore: 9.60

Self-citation rate: 15.40%

Articles published: 1428

About the journal: The scope of this journal is to publish peer-reviewed articles that provide a timely and concise account of innovative research ideas and application results, reporting significant theoretical findings and application case studies in areas of robotics and automation.