Kunyang Wu, Jun Lin, Jiawei Miao, Zhengpeng Li, Xiucai Zhang, Genyuan Xing, Yiyao Fan, Jinxin Luo, Huanyu Zhao, Yang Liu, Guanyu Zhang
{"title":"DIFNet:深度完井双信息融合网络","authors":"Kunyang Wu , Jun Lin , Jiawei Miao , Zhengpeng Li , Xiucai Zhang , Genyuan Xing , Yiyao Fan , Jinxin Luo , Huanyu Zhao , Yang Liu , Guanyu Zhang","doi":"10.1016/j.inffus.2025.103424","DOIUrl":null,"url":null,"abstract":"<div><div>Depth completion, the task of reconstructing dense depth maps from sparse measurements, is crucial for scene understanding and autonomous systems. Leveraging aligned, high-resolution RGB images as guidance is a common and powerful approach, yet the inherent frequency heterogeneity between RGB and sparse depth data presents a significant challenge for effective cross-modal fusion. Conventional methods often employ simplistic fusion strategies that overlook these distinct frequency characteristics, limiting their ability to fully exploit the complementary nature of RGB and depth information. In this paper, we introduce DIFNet: a Dual-Information Fusion Network, based on a novel frequency-aware fusion paradigm focused on image-guided completion. The core of DIFNet is the Dual Stream Modeling (DSM) block, which explicitly decouples and processes high-frequency edge details and low-frequency smooth regions with tailored architectures, leveraging a spatially-aware Mamba architecture for high-frequency streams and densely connected convolutions for low-frequency streams. Furthermore, DIFNet incorporates an innovative Initial Feature Fusion (IFF) layer to facilitate synergistic multi-scale RGB and depth feature integration from the input stage. Extensive evaluations on KITTI and NYUv2 datasets demonstrate that DIFNet achieves state-of-the-art performance in depth completion with competitive computational efficiency, highlighting the efficacy of our frequency-aware dual information fusion strategy. The code for this work is publicly available at <span><span>https://github.com/wuky2000/DIFNet</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":50367,"journal":{"name":"Information Fusion","volume":"125 ","pages":"Article 103424"},"PeriodicalIF":15.5000,"publicationDate":"2025-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"DIFNet: Dual-Information Fusion Network for depth completion\",\"authors\":\"Kunyang Wu , Jun Lin , Jiawei Miao , Zhengpeng Li , Xiucai Zhang , Genyuan Xing , Yiyao Fan , Jinxin Luo , Huanyu Zhao , Yang Liu , Guanyu Zhang\",\"doi\":\"10.1016/j.inffus.2025.103424\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>Depth completion, the task of reconstructing dense depth maps from sparse measurements, is crucial for scene understanding and autonomous systems. Leveraging aligned, high-resolution RGB images as guidance is a common and powerful approach, yet the inherent frequency heterogeneity between RGB and sparse depth data presents a significant challenge for effective cross-modal fusion. Conventional methods often employ simplistic fusion strategies that overlook these distinct frequency characteristics, limiting their ability to fully exploit the complementary nature of RGB and depth information. In this paper, we introduce DIFNet: a Dual-Information Fusion Network, based on a novel frequency-aware fusion paradigm focused on image-guided completion. 
The core of DIFNet is the Dual Stream Modeling (DSM) block, which explicitly decouples and processes high-frequency edge details and low-frequency smooth regions with tailored architectures, leveraging a spatially-aware Mamba architecture for high-frequency streams and densely connected convolutions for low-frequency streams. Furthermore, DIFNet incorporates an innovative Initial Feature Fusion (IFF) layer to facilitate synergistic multi-scale RGB and depth feature integration from the input stage. Extensive evaluations on KITTI and NYUv2 datasets demonstrate that DIFNet achieves state-of-the-art performance in depth completion with competitive computational efficiency, highlighting the efficacy of our frequency-aware dual information fusion strategy. The code for this work is publicly available at <span><span>https://github.com/wuky2000/DIFNet</span><svg><path></path></svg></span>.</div></div>\",\"PeriodicalId\":50367,\"journal\":{\"name\":\"Information Fusion\",\"volume\":\"125 \",\"pages\":\"Article 103424\"},\"PeriodicalIF\":15.5000,\"publicationDate\":\"2025-06-26\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Information Fusion\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S156625352500497X\",\"RegionNum\":1,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Information Fusion","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S156625352500497X","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
DIFNet: Dual-Information Fusion Network for depth completion
Depth completion, the task of reconstructing dense depth maps from sparse measurements, is crucial for scene understanding and autonomous systems. Leveraging aligned, high-resolution RGB images as guidance is a common and powerful approach, yet the inherent frequency heterogeneity between RGB and sparse depth data poses a significant challenge for effective cross-modal fusion. Conventional methods often employ simplistic fusion strategies that overlook these distinct frequency characteristics, limiting their ability to fully exploit the complementary nature of RGB and depth information. In this paper, we introduce DIFNet, a Dual-Information Fusion Network built on a novel frequency-aware fusion paradigm for image-guided depth completion. The core of DIFNet is the Dual Stream Modeling (DSM) block, which explicitly decouples high-frequency edge details from low-frequency smooth regions and processes each with a tailored architecture: a spatially-aware Mamba architecture for the high-frequency stream and densely connected convolutions for the low-frequency stream. Furthermore, DIFNet incorporates an innovative Initial Feature Fusion (IFF) layer that integrates multi-scale RGB and depth features from the input stage onward. Extensive evaluations on the KITTI and NYUv2 datasets demonstrate that DIFNet achieves state-of-the-art depth-completion performance with competitive computational efficiency, highlighting the efficacy of our frequency-aware dual-information fusion strategy. The code for this work is publicly available at https://github.com/wuky2000/DIFNet.
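To make the dual-stream idea concrete, below is a minimal PyTorch sketch of a DSM-style block. It is an illustration under stated assumptions, not the paper's implementation: the pooling-based low/high-frequency split, the `DualStreamBlock` name, and the plain convolutional stand-in for the paper's spatially-aware Mamba stream are all our own placeholder choices, since the abstract does not specify these details.

import torch
import torch.nn as nn
import torch.nn.functional as F

class DualStreamBlock(nn.Module):
    """Sketch of frequency-decoupled dual-stream modeling (assumed design)."""

    def __init__(self, channels: int, pool_size: int = 4):
        super().__init__()
        self.pool_size = pool_size
        # Stand-in for the spatially-aware Mamba stream (high frequency);
        # the actual DSM block uses a Mamba-based layer here.
        self.high_stream = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
        )
        # Densely connected convolutions for the low-frequency stream.
        self.low_conv1 = nn.Conv2d(channels, channels, 3, padding=1)
        self.low_conv2 = nn.Conv2d(2 * channels, channels, 3, padding=1)
        self.fuse = nn.Conv2d(2 * channels, channels, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Low-pass: blur by average pooling then upsampling back;
        # high-pass: the residual between the input and its blurred copy.
        low = F.interpolate(F.avg_pool2d(x, self.pool_size),
                            size=x.shape[-2:], mode="bilinear",
                            align_corners=False)
        high = self.high_stream(x - low)
        l1 = F.relu(self.low_conv1(low))
        l2 = F.relu(self.low_conv2(torch.cat([low, l1], dim=1)))  # dense link
        return self.fuse(torch.cat([high, l2], dim=1))

# Usage: shapes are preserved, so the block can be stacked inside an encoder.
block = DualStreamBlock(32)
out = block(torch.randn(1, 32, 64, 64))  # -> torch.Size([1, 32, 64, 64])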
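In the same spirit, the following hedged sketch shows what an early multi-scale RGB-depth fusion layer in the style of IFF could look like. The `InitialFeatureFusion` name, the two-scale design, and all layer widths are hypothetical placeholders (reusing the imports from the sketch above); the abstract only states that RGB and depth features are integrated at multiple scales from the input stage.

class InitialFeatureFusion(nn.Module):
    """Sketch of early multi-scale RGB + sparse-depth fusion (assumed design)."""

    def __init__(self, channels: int = 32):
        super().__init__()
        self.rgb_stem = nn.Conv2d(3, channels, 3, padding=1)
        self.depth_stem = nn.Conv2d(1, channels, 3, padding=1)
        # A coarser scale adds context: stride-2 convs, upsampled back later.
        self.rgb_down = nn.Conv2d(channels, channels, 3, stride=2, padding=1)
        self.depth_down = nn.Conv2d(channels, channels, 3, stride=2, padding=1)
        self.merge = nn.Conv2d(4 * channels, channels, 1)

    def forward(self, rgb: torch.Tensor, sparse_depth: torch.Tensor) -> torch.Tensor:
        size = rgb.shape[-2:]
        fr = F.relu(self.rgb_stem(rgb))
        fd = F.relu(self.depth_stem(sparse_depth))
        # Coarse-scale features, upsampled to input resolution before merging.
        fr2 = F.interpolate(F.relu(self.rgb_down(fr)), size=size,
                            mode="bilinear", align_corners=False)
        fd2 = F.interpolate(F.relu(self.depth_down(fd)), size=size,
                            mode="bilinear", align_corners=False)
        return self.merge(torch.cat([fr, fd, fr2, fd2], dim=1))

# Usage: fuses a 3-channel image with a 1-channel sparse depth map.
iff = InitialFeatureFusion(32)
feat = iff(torch.randn(1, 3, 64, 64), torch.randn(1, 1, 64, 64))  # (1, 32, 64, 64)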
About the journal:
Information Fusion serves as a central platform for showcasing advancements in multi-sensor, multi-source, multi-process information fusion, fostering collaboration among the diverse disciplines driving its progress. It is the leading outlet for sharing research and development in this field, with a focus on architectures, algorithms, and applications. Papers presenting fundamental theoretical analyses, as well as those demonstrating their application to real-world problems, are welcome.