DIFNet: Dual-Information Fusion Network for depth completion

Impact Factor: 15.5 · CAS Tier 1 (Computer Science) · JCR Q1 (COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE)
Kunyang Wu, Jun Lin, Jiawei Miao, Zhengpeng Li, Xiucai Zhang, Genyuan Xing, Yiyao Fan, Jinxin Luo, Huanyu Zhao, Yang Liu, Guanyu Zhang
{"title":"DIFNet:深度完井双信息融合网络","authors":"Kunyang Wu ,&nbsp;Jun Lin ,&nbsp;Jiawei Miao ,&nbsp;Zhengpeng Li ,&nbsp;Xiucai Zhang ,&nbsp;Genyuan Xing ,&nbsp;Yiyao Fan ,&nbsp;Jinxin Luo ,&nbsp;Huanyu Zhao ,&nbsp;Yang Liu ,&nbsp;Guanyu Zhang","doi":"10.1016/j.inffus.2025.103424","DOIUrl":null,"url":null,"abstract":"<div><div>Depth completion, the task of reconstructing dense depth maps from sparse measurements, is crucial for scene understanding and autonomous systems. Leveraging aligned, high-resolution RGB images as guidance is a common and powerful approach, yet the inherent frequency heterogeneity between RGB and sparse depth data presents a significant challenge for effective cross-modal fusion. Conventional methods often employ simplistic fusion strategies that overlook these distinct frequency characteristics, limiting their ability to fully exploit the complementary nature of RGB and depth information. In this paper, we introduce DIFNet: a Dual-Information Fusion Network, based on a novel frequency-aware fusion paradigm focused on image-guided completion. The core of DIFNet is the Dual Stream Modeling (DSM) block, which explicitly decouples and processes high-frequency edge details and low-frequency smooth regions with tailored architectures, leveraging a spatially-aware Mamba architecture for high-frequency streams and densely connected convolutions for low-frequency streams. Furthermore, DIFNet incorporates an innovative Initial Feature Fusion (IFF) layer to facilitate synergistic multi-scale RGB and depth feature integration from the input stage. Extensive evaluations on KITTI and NYUv2 datasets demonstrate that DIFNet achieves state-of-the-art performance in depth completion with competitive computational efficiency, highlighting the efficacy of our frequency-aware dual information fusion strategy. The code for this work is publicly available at <span><span>https://github.com/wuky2000/DIFNet</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":50367,"journal":{"name":"Information Fusion","volume":"125 ","pages":"Article 103424"},"PeriodicalIF":15.5000,"publicationDate":"2025-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"DIFNet: Dual-Information Fusion Network for depth completion\",\"authors\":\"Kunyang Wu ,&nbsp;Jun Lin ,&nbsp;Jiawei Miao ,&nbsp;Zhengpeng Li ,&nbsp;Xiucai Zhang ,&nbsp;Genyuan Xing ,&nbsp;Yiyao Fan ,&nbsp;Jinxin Luo ,&nbsp;Huanyu Zhao ,&nbsp;Yang Liu ,&nbsp;Guanyu Zhang\",\"doi\":\"10.1016/j.inffus.2025.103424\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>Depth completion, the task of reconstructing dense depth maps from sparse measurements, is crucial for scene understanding and autonomous systems. Leveraging aligned, high-resolution RGB images as guidance is a common and powerful approach, yet the inherent frequency heterogeneity between RGB and sparse depth data presents a significant challenge for effective cross-modal fusion. Conventional methods often employ simplistic fusion strategies that overlook these distinct frequency characteristics, limiting their ability to fully exploit the complementary nature of RGB and depth information. In this paper, we introduce DIFNet: a Dual-Information Fusion Network, based on a novel frequency-aware fusion paradigm focused on image-guided completion. 
The core of DIFNet is the Dual Stream Modeling (DSM) block, which explicitly decouples and processes high-frequency edge details and low-frequency smooth regions with tailored architectures, leveraging a spatially-aware Mamba architecture for high-frequency streams and densely connected convolutions for low-frequency streams. Furthermore, DIFNet incorporates an innovative Initial Feature Fusion (IFF) layer to facilitate synergistic multi-scale RGB and depth feature integration from the input stage. Extensive evaluations on KITTI and NYUv2 datasets demonstrate that DIFNet achieves state-of-the-art performance in depth completion with competitive computational efficiency, highlighting the efficacy of our frequency-aware dual information fusion strategy. The code for this work is publicly available at <span><span>https://github.com/wuky2000/DIFNet</span><svg><path></path></svg></span>.</div></div>\",\"PeriodicalId\":50367,\"journal\":{\"name\":\"Information Fusion\",\"volume\":\"125 \",\"pages\":\"Article 103424\"},\"PeriodicalIF\":15.5000,\"publicationDate\":\"2025-06-26\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Information Fusion\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S156625352500497X\",\"RegionNum\":1,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Information Fusion","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S156625352500497X","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Citations: 0

Abstract

Depth completion, the task of reconstructing dense depth maps from sparse measurements, is crucial for scene understanding and autonomous systems. Leveraging aligned, high-resolution RGB images as guidance is a common and powerful approach, yet the inherent frequency heterogeneity between RGB and sparse depth data presents a significant challenge for effective cross-modal fusion. Conventional methods often employ simplistic fusion strategies that overlook these distinct frequency characteristics, limiting their ability to fully exploit the complementary nature of RGB and depth information. In this paper, we introduce DIFNet: a Dual-Information Fusion Network, based on a novel frequency-aware fusion paradigm focused on image-guided completion. The core of DIFNet is the Dual Stream Modeling (DSM) block, which explicitly decouples and processes high-frequency edge details and low-frequency smooth regions with tailored architectures, leveraging a spatially-aware Mamba architecture for high-frequency streams and densely connected convolutions for low-frequency streams. Furthermore, DIFNet incorporates an innovative Initial Feature Fusion (IFF) layer to facilitate synergistic multi-scale RGB and depth feature integration from the input stage. Extensive evaluations on KITTI and NYUv2 datasets demonstrate that DIFNet achieves state-of-the-art performance in depth completion with competitive computational efficiency, highlighting the efficacy of our frequency-aware dual information fusion strategy. The code for this work is publicly available at https://github.com/wuky2000/DIFNet.
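The abstract describes the architecture only at a high level. As a reading aid, below is a minimal, hypothetical PyTorch sketch of the dual-stream frequency-decoupling idea behind the DSM block: features are split into a low-frequency (smooth) component and a high-frequency (edge) residual, each stream is processed by a tailored branch, and the results are fused. All names, channel widths, the blur-based frequency split, and the plain convolution standing in for the paper's spatially-aware Mamba branch are illustrative assumptions, not the authors' actual DIFNet code; their implementation is at the GitHub link above.

import torch
import torch.nn as nn
import torch.nn.functional as F

class DualStreamSketch(nn.Module):
    """Illustrative stand-in for the DSM block described in the abstract."""

    def __init__(self, channels: int = 32):
        super().__init__()
        # Low-frequency stream: densely connected convolutions, i.e. each
        # layer consumes the concatenation of all earlier feature maps.
        self.low1 = nn.Conv2d(channels, channels, 3, padding=1)
        self.low2 = nn.Conv2d(2 * channels, channels, 3, padding=1)
        # High-frequency stream: the paper uses a spatially-aware Mamba
        # block here; a plain convolution serves as a placeholder.
        self.high = nn.Conv2d(channels, channels, 3, padding=1)
        # 1x1 convolution to fuse the two streams back together.
        self.fuse = nn.Conv2d(2 * channels, channels, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Frequency decoupling: a local average keeps the smooth
        # (low-frequency) content; the residual keeps the edges.
        low = F.avg_pool2d(x, kernel_size=3, stride=1, padding=1)
        high = x - low
        # Densely connected convolutions on the low-frequency stream.
        l1 = F.relu(self.low1(low))
        l2 = F.relu(self.low2(torch.cat([low, l1], dim=1)))
        # Tailored processing of the high-frequency stream.
        h = F.relu(self.high(high))
        return self.fuse(torch.cat([l2, h], dim=1))

if __name__ == "__main__":
    block = DualStreamSketch(32)
    feats = torch.randn(1, 32, 64, 64)  # e.g. fused RGB + depth features
    print(block(feats).shape)  # -> torch.Size([1, 32, 64, 64])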
Source journal: Information Fusion (Engineering & Technology — Computer Science: Theory & Methods)
CiteScore: 33.20
Self-citation rate: 4.30%
Articles published: 161
Review time: 7.9 months
Journal description: Information Fusion serves as a central platform for showcasing advancements in multi-sensor, multi-source, multi-process information fusion, fostering collaboration among the diverse disciplines driving its progress. It is the leading outlet for sharing research and development in this field, focusing on architectures, algorithms, and applications. Papers presenting fundamental theoretical analyses, as well as those demonstrating their application to real-world problems, are welcome.