A lightweight depth completion network with spatial efficient fusion

Impact Factor 4.2 · CAS Tier 3 (Computer Science) · JCR Q2, Computer Science, Artificial Intelligence
Zhichao Fu, Anran Wu, Zisong Zhuang, Xingjiao Wu, Jun He
Image and Vision Computing, Volume 153, Article 105335. DOI: 10.1016/j.imavis.2024.105335. Published online: 14 November 2024. Available at: https://www.sciencedirect.com/science/article/pii/S0262885624004402
Citations: 0

Abstract

Depth completion is a low-level task that rebuilds a dense depth map from a sparse set of LiDAR measurements and the corresponding RGB image. Current state-of-the-art depth completion methods rely on complicated network designs that greatly increase computational cost, making them ill-suited to the limited computational budgets of realistic deployment environments. In this paper, we explore a lightweight and efficient depth completion model named Light-SEF. Light-SEF is a two-stage framework that introduces local-fusion and global-fusion modules to extract and fuse local and global information from the sparse LiDAR data and RGB images. We also propose a unit convolutional structure named the spatial efficient block (SEB), which has a lightweight design and extracts spatial features efficiently. As the unit block of the whole network, SEB is far more cost-efficient than the baseline design. Experimental results on the KITTI benchmark demonstrate that Light-SEF achieves significant reductions in computational cost (about 53% fewer parameters, 50% fewer FLOPs & MACs, and 36% less running time) while remaining competitive with state-of-the-art methods.
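The abstract reports roughly 53% fewer parameters and 50% fewer FLOPs/MACs than the baseline design, but does not describe the internals of the spatial efficient block. As a rough, hypothetical illustration of how lightweight unit convolutional blocks achieve savings of this magnitude, the sketch below compares the parameter and multiply-accumulate counts of a standard 3×3 convolution against a depthwise-separable factorization (a common lightweight design, not necessarily the SEB itself); all layer sizes are assumed for the example.

```python
# Compare the cost of a standard 3x3 convolution against a
# depthwise-separable factorization (depthwise 3x3 + pointwise 1x1),
# a common lightweight-block design. Illustrative only -- this is
# NOT the paper's SEB, whose structure the abstract does not specify.

def conv_cost(c_in, c_out, k, h, w):
    """Parameters and multiply-accumulates of a standard k x k conv."""
    params = c_in * c_out * k * k
    macs = params * h * w  # one MAC per weight per output position
    return params, macs

def dw_separable_cost(c_in, c_out, k, h, w):
    """Depthwise k x k conv followed by a pointwise 1 x 1 conv."""
    dw_params = c_in * k * k       # one k x k filter per input channel
    pw_params = c_in * c_out       # 1 x 1 conv mixes channels
    params = dw_params + pw_params
    macs = params * h * w
    return params, macs

if __name__ == "__main__":
    c_in = c_out = 64
    k, h, w = 3, 128, 256  # hypothetical feature-map size
    std_params, std_macs = conv_cost(c_in, c_out, k, h, w)
    sep_params, sep_macs = dw_separable_cost(c_in, c_out, k, h, w)
    print(f"standard:  {std_params:,} params, {std_macs:,} MACs")
    print(f"separable: {sep_params:,} params, {sep_macs:,} MACs")
    print(f"MAC reduction: {1 - sep_macs / std_macs:.1%}")
```

The factorization trades a single dense c_in × c_out × k × k kernel for two much smaller ones, which is why such blocks can cut parameters and MACs by large constant factors at the same spatial resolution.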


Source journal
Image and Vision Computing (Engineering, Technology – Electrical & Electronic Engineering)
CiteScore: 8.50
Self-citation rate: 8.50%
Articles per year: 143
Review time: 7.8 months
Journal scope: Image and Vision Computing has as a primary aim the provision of an effective medium of interchange for the results of high quality theoretical and applied research fundamental to all aspects of image interpretation and computer vision. The journal publishes work that proposes new image interpretation and computer vision methodology or addresses the application of such methods to real world scenes. It seeks to strengthen a deeper understanding in the discipline by encouraging the quantitative comparison and performance evaluation of the proposed methodology. The coverage includes: image interpretation, scene modelling, object recognition and tracking, shape analysis, monitoring and surveillance, active vision and robotic systems, SLAM, biologically-inspired computer vision, motion analysis, stereo vision, document image understanding, character and handwritten text recognition, face and gesture recognition, biometrics, vision-based human-computer interaction, human activity and behavior understanding, data fusion from multiple sensor inputs, image databases.