Depth on Demand: Streaming Dense Depth from a Low Frame Rate Active Sensor
Andrea Conti, Matteo Poggi, Valerio Cambareri, Stefano Mattoccia
arXiv - CS - Computer Vision and Pattern Recognition, 2024-09-12
https://doi.org/arxiv-2409.08277
Abstract
Accurate depth estimation at a high frame rate plays an important role in
several tasks crucial to robotics and automotive perception. To date, this can
be achieved with ToF and LiDAR devices for indoor and outdoor applications,
respectively. However, their applicability is limited by low frame rate, energy
consumption, and spatial sparsity. Depth on Demand (DoD) enables accurate
temporal and spatial depth densification by exploiting a high-frame-rate RGB
sensor coupled with a potentially lower-frame-rate, sparse active depth sensor.
Our proposal jointly enables lower energy consumption and denser shape
reconstruction by significantly reducing the streaming requirements on the
depth sensor, thanks to its three core stages: i) multi-modal encoding, ii)
iterative multi-modal integration, and iii) depth decoding. We present
extensive evidence assessing the effectiveness of DoD on indoor and outdoor
video datasets, covering both environment scanning and automotive perception
use cases.
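
To make the streaming setup concrete, the following is a minimal sketch of how a high-frame-rate RGB stream might be paired with a lower-frame-rate depth stream. It is not the authors' implementation: the names `FrameSchedule`, `build_schedule`, and `depth_duty_cycle` are hypothetical, the frame rates are illustrative values rather than figures from the paper, and the sketch assumes synchronized sensors with the RGB rate an integer multiple of the depth rate. It only models which depth capture each RGB frame would rely on, not the encoding, integration, or decoding stages.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FrameSchedule:
    rgb_index: int      # index of the RGB frame in the stream
    depth_source: int   # RGB index at which the depth used here was captured
    fresh_depth: bool   # True if the depth sensor fired on this very frame


def build_schedule(num_frames: int, rgb_fps: int, depth_fps: int) -> List[FrameSchedule]:
    """Pair every RGB frame with the most recent sparse depth capture.

    Assumption (not from the paper): sensors are synchronized and
    rgb_fps is a positive integer multiple of depth_fps.
    """
    if rgb_fps <= 0 or depth_fps <= 0 or rgb_fps % depth_fps != 0:
        raise ValueError("rgb_fps must be a positive integer multiple of depth_fps")
    stride = rgb_fps // depth_fps  # RGB frames per depth capture
    schedule = []
    for i in range(num_frames):
        last_capture = (i // stride) * stride  # most recent frame with depth
        schedule.append(FrameSchedule(i, last_capture, i == last_capture))
    return schedule


def depth_duty_cycle(schedule: List[FrameSchedule]) -> float:
    """Fraction of RGB frames on which the depth sensor actually streams."""
    if not schedule:
        return 0.0
    return sum(f.fresh_depth for f in schedule) / len(schedule)


if __name__ == "__main__":
    # Illustrative rates only: 30 fps RGB, 1 fps sparse depth, 2 seconds.
    plan = build_schedule(num_frames=60, rgb_fps=30, depth_fps=1)
    print(plan[31])
    print(depth_duty_cycle(plan))
```

Under these assumed rates the depth sensor streams on 2 of 60 frames, and every other frame would need its dense depth predicted from the RGB image together with an older sparse capture. This is the densification problem the abstract describes.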