Denoising diffusion based infrared and visible image fusion with transformer
YuXuan Chen, Gang Liu, MengLiang Xing, KaiXin Li, Gang Xiao
Infrared Physics & Technology, vol. 148, Article 105834. Published 2025-04-07. DOI: 10.1016/j.infrared.2025.105834
https://www.sciencedirect.com/science/article/pii/S1350449525001276
Journal impact factor: 3.1 (JCR Q2, Instruments & Instrumentation)
Citations: 0
Abstract
Infrared-visible image fusion (IVF) aims to combine the complementary information from infrared and visible images. In the field of image fusion, generative adversarial networks (GANs) have achieved promising results. However, unstable training and mode collapse remain challenging to resolve. Therefore, we propose a novel IVF method based on a diffusion model combined with fusion knowledge priors, termed DDFT. DDFT is divided into two parts: a pre-fusion module and a diffusion model. Specifically, it first obtains images containing the prior distribution of the fusion task through the pre-fusion module. Subsequently, the forward diffusion process gradually removes distinguishable features from the output of the pre-fusion module. The reverse diffusion process learns the fusion knowledge prior distribution and leverages a Transformer module to capture global features, generating high-quality fused images. Comparative experiments demonstrate that DDFT excels in IVF tasks, especially in preserving weak textures. Generalization experiments illustrate that DDFT preserves image features in both simple and complex environments. Ablation experiments further validate the crucial role of the fusion prior information obtained through pre-fusion in DDFT. Unlike existing approaches, which use diffusion models only as feature extractors or with fixed parameters, DDFT is the first end-to-end trainable diffusion framework that directly generates fused images.
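The forward diffusion step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives neither the pre-fusion module nor the noise schedule, so the pixel-wise average used as `pre_fused`, the DDPM-style linear schedule in `linear_beta_schedule`, and all function names here are assumptions chosen only for illustration.

```python
import numpy as np


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Assumed DDPM-style linear noise schedule (not taken from the paper)."""
    return np.linspace(beta_start, beta_end, num_steps)


def forward_diffuse(x0, t, alpha_bar, rng):
    """Sample x_t ~ q(x_t | x_0) = N(sqrt(a_bar_t) * x0, (1 - a_bar_t) * I)."""
    noise = rng.standard_normal(x0.shape)
    x_t = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * noise
    return x_t, noise


rng = np.random.default_rng(0)

# Toy 8x8 "images"; real inputs would be registered infrared/visible pairs.
infrared = rng.random((8, 8))
visible = rng.random((8, 8))

# Hypothetical stand-in for the pre-fusion module: a plain pixel-wise average.
pre_fused = 0.5 * (infrared + visible)

betas = linear_beta_schedule(1000)
alpha_bar = np.cumprod(1.0 - betas)  # cumulative signal-retention factor

# Early step: mostly signal. Final step: almost pure Gaussian noise.
x_early, _ = forward_diffuse(pre_fused, 10, alpha_bar, rng)
x_late, _ = forward_diffuse(pre_fused, 999, alpha_bar, rng)

corr_early = np.corrcoef(pre_fused.ravel(), x_early.ravel())[0, 1]
corr_late = np.corrcoef(pre_fused.ravel(), x_late.ravel())[0, 1]
print(f"correlation with pre-fused image: t=10 {corr_early:.3f}, t=999 {corr_late:.3f}")
```

As `t` grows, `alpha_bar[t]` shrinks toward zero, so the pre-fused image's distinguishable features are progressively replaced by noise. A reverse process would then be trained to undo this corruption step by step.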
About the journal:
The Journal covers the entire field of infrared physics and technology: theory, experiment, application, devices and instrumentation. "Infrared" is defined as covering the near, mid and far infrared (terahertz) regions from 0.75 µm (750 nm) to 1 mm (300 GHz). Submissions in the 300 GHz to 100 GHz region may be accepted at the editors' discretion if their content is relevant to shorter wavelengths. Submissions must be primarily concerned with and directly relevant to this spectral region.
Its core topics can be summarized as the generation, propagation and detection of infrared radiation; the associated optics, materials and devices; and its use in all fields of science, industry, engineering and medicine.
Infrared techniques occur in many different fields, notably spectroscopy and interferometry; material characterization and processing; atmospheric physics, astronomy and space research. Scientific aspects include lasers, quantum optics, quantum electronics, image processing and semiconductor physics. Some important applications are medical diagnostics and treatment, industrial inspection and environmental monitoring.