Combining depth and frequency features with Mamba for multi-focus image fusion
Xin Jin, Pengcheng Zhu, Dongjian Yu, Michal Wozniak, Qian Jiang, Puming Wang, Wei Zhou
Information Fusion, Volume 124, Article 103355 (published 2025-06-04). DOI: 10.1016/j.inffus.2025.103355
Citations: 0
Abstract
Deep neural network (DNN)-based multi-focus image fusion (MFIF) methods have achieved significant success in generating an all-focus image by extracting visual features from multiple partially focused images. However, these methods fail to fully exploit frequency-domain and depth information, limiting their handling of uniform and boundary regions. To address this issue, we propose a Mamba-based multi-focus image fusion framework to enhance fusion quality. Specifically, we introduce the Wavelet Mamba Module, which applies multi-level wavelet transforms to decompose the image into different frequency components, thereby enhancing contrast differences in focused regions and improving feature extraction across different focal planes. Meanwhile, a depth estimation network is employed to predict the foreground depth map, aiding the precise identification of boundary regions. Finally, the CSmamba decoder effectively integrates frequency and depth features and leverages channel and spatial attention mechanisms to generate an optimized decision map, enabling precise selection of focused pixels and producing a high-quality fused image. Experimental results on both synthetic and real-world datasets demonstrate that the proposed method outperforms state-of-the-art MFIF approaches in terms of quantitative metrics and visual detail, validating that effective utilization of frequency and depth information can significantly enhance multi-focus image fusion performance.
About the journal
Information Fusion serves as a central platform for showcasing advances in multi-sensor, multi-source, multi-process information fusion, fostering collaboration among the diverse disciplines that drive its progress. It is the leading outlet for sharing research and development in this field, with a focus on architectures, algorithms, and applications. Papers presenting fundamental theoretical analyses, as well as those demonstrating their application to real-world problems, are welcome.