{"title":"F2Fusion: Frequency Feature Fusion Network for Infrared and Visible Image via Contourlet Transform and Mamba-UNet","authors":"Renhe Liu;Han Wang;Kai Hu;Shaochu Wang;Yu Liu","doi":"10.1109/TIM.2025.3580829","DOIUrl":null,"url":null,"abstract":"To integrate complementary thermal and texture information from source infrared (IR) and visible (VIS) images into a comprehensive fused image, traditional multiscale transform algorithms, and deep neural networks have been extensively explored for IR and VIS image fusion (IVIF). However, existing methods often face difficulties combining the strengths of these two approaches, particularly when it comes to balancing the preservation of salient and texture information in challenging conditions such as low light, glare, and overexposure. This article proposes a novel frequency feature fusion network (F2Fusion) that exploits detailed space-frequency transformation through contourlet transform (CT) and multiscale long-range learning via the Mamba-UNet architecture. The Mamba block is embedded into the multiscale encoder and decoder structures to improve feature extraction and image reconstruction performance. The CT operation replaces the conventional pooling layer in the multiscale encoder, converting spatial features into high- and low-frequency subbands. We then introduce a dual-branch frequency feature fusion module to facilitate the fusion of cross-modality illumination information and fine details based on the distinct characteristics of different frequency subbands. In addition, we design a composite loss function, which includes both gradient and salient constraints, to guide the precise synthesis of salient targets and texture regions. Qualitative and quantitative comparisons across three benchmark datasets demonstrate that the proposed method outperforms recent state-of-the-art (SOTA) fusion techniques. Extended experimental results on downstream object detection tasks further validate the distinct advantages of the proposed architecture for fusion through precise frequency decomposition. The code is available at: <uri>https://github.com/lrh-1994/F2Fusion</uri>","PeriodicalId":13341,"journal":{"name":"IEEE Transactions on Instrumentation and Measurement","volume":"74 ","pages":"1-17"},"PeriodicalIF":5.9000,"publicationDate":"2025-06-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Instrumentation and Measurement","FirstCategoryId":"5","ListUrlMain":"https://ieeexplore.ieee.org/document/11042881/","RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"ENGINEERING, ELECTRICAL & ELECTRONIC","Score":null,"Total":0}
Citations: 0
Abstract
To integrate complementary thermal and texture information from source infrared (IR) and visible (VIS) images into a comprehensive fused image, traditional multiscale transform algorithms and deep neural networks have been extensively explored for IR and VIS image fusion (IVIF). However, existing methods often face difficulties in combining the strengths of these two approaches, particularly in balancing the preservation of salient and texture information under challenging conditions such as low light, glare, and overexposure. This article proposes a novel frequency feature fusion network (F2Fusion) that exploits detailed space-frequency transformation through the contourlet transform (CT) and multiscale long-range learning via the Mamba-UNet architecture. The Mamba block is embedded into the multiscale encoder and decoder structures to improve feature extraction and image reconstruction performance. The CT operation replaces the conventional pooling layer in the multiscale encoder, converting spatial features into high- and low-frequency subbands. We then introduce a dual-branch frequency feature fusion module to fuse cross-modality illumination information and fine details according to the distinct characteristics of the different frequency subbands. In addition, we design a composite loss function, which includes both gradient and salient constraints, to guide the precise synthesis of salient targets and texture regions. Qualitative and quantitative comparisons on three benchmark datasets demonstrate that the proposed method outperforms recent state-of-the-art (SOTA) fusion techniques. Extended experiments on downstream object detection tasks further validate the distinct advantages of the proposed architecture for fusion through precise frequency decomposition. The code is available at: https://github.com/lrh-1994/F2Fusion
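For readers who want a concrete sense of the composite loss described in the abstract, below is a minimal PyTorch sketch of a gradient-plus-salient objective of the kind commonly used in IVIF work. This is an illustrative assumption rather than the authors' released code: the exact formulation, saliency targets, and weights in F2Fusion may differ, and all names here (sobel_gradient, composite_fusion_loss, w_grad, w_sal) are hypothetical.

import torch
import torch.nn.functional as F

def sobel_gradient(img: torch.Tensor) -> torch.Tensor:
    # Per-pixel gradient magnitude of a single-channel batch (B, 1, H, W),
    # computed with fixed Sobel kernels.
    kx = torch.tensor([[-1., 0., 1.],
                       [-2., 0., 2.],
                       [-1., 0., 1.]], device=img.device).view(1, 1, 3, 3)
    ky = kx.transpose(2, 3)  # Sobel y-kernel is the transpose of the x-kernel
    gx = F.conv2d(img, kx, padding=1)
    gy = F.conv2d(img, ky, padding=1)
    return gx.abs() + gy.abs()

def composite_fusion_loss(fused, ir, vis, w_grad=1.0, w_sal=1.0):
    # Gradient constraint: the fused image should reproduce the stronger
    # edge response of the two sources at every pixel (texture preservation).
    grad_target = torch.maximum(sobel_gradient(ir), sobel_gradient(vis))
    loss_grad = F.l1_loss(sobel_gradient(fused), grad_target)

    # Salient constraint: pull fused intensities toward the element-wise
    # maximum of the sources, which favors bright thermal targets.
    sal_target = torch.maximum(ir, vis)
    loss_sal = F.l1_loss(fused, sal_target)

    return w_grad * loss_grad + w_sal * loss_sal

With an objective of this shape, the gradient term steers the network toward the fine texture of the VIS input while the salient term preserves high-intensity IR targets; the paper's actual gradient and salient constraints may define or weight these targets differently.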
About the Journal
Papers are sought that address innovative solutions to the development and use of electrical and electronic instruments and equipment to measure, monitor, and/or record physical phenomena for the purpose of advancing measurement science, methods, functionality, and applications. The scope of these papers may encompass: (1) the theory, methodology, and practice of measurement; (2) the design, development, and evaluation of instrumentation and measurement systems and components used in generating, acquiring, conditioning, and processing signals; (3) the analysis, representation, display, and preservation of the information obtained from a set of measurements; and (4) scientific and technical support for the establishment and maintenance of technical standards in the field of instrumentation and measurement.