MRFuse: Metric learning and masked autoencoder for fusing real infrared and visible images

YuBin Li, Weida Zhan, Jinxin Guo, Depeng Zhu, Yichun Jiang, Yu Chen, Xiaoyu Xu, Deng Han

Optics and Laser Technology, Volume 189, Article 112971 (published 2025-05-07). DOI: 10.1016/j.optlastec.2025.112971
Citations: 0
Abstract
The task of infrared and visible image fusion aims to retain the thermal targets from infrared images while preserving the details, brightness, and other important features from visible images. Current methods face challenges such as unclear fusion objectives, difficulty in interpreting the learning process, and uncontrollable auxiliary learning weights. To address these issues, this paper proposes a novel fusion method based on metric learning and masked autoencoders for real infrared and visible image fusion, termed MRFuse. MRFuse operates through a combination of a metric mapping space, auxiliary networks, and fusion networks. First, we introduce a Real Degradation Estimation Module (RDEM), which employs a simple neural network to establish a controllable degradation estimation scheme within the metric space. Additionally, to learn the metric mapping space, we propose a sample generation method that provides complex training samples for the metric learning pipeline. Next, we present a fusion network based on masked autoencoding. Specifically, we construct hybrid masked infrared and visible image pairs and design a U-shaped ViT encoder–decoder architecture. This architecture leverages hierarchical feature representation and layer-wise fusion to reconstruct high-quality fused images. Finally, to train the fusion network, we design a masked region loss to constrain reconstruction errors within masked regions, and further employ gradient loss, structural consistency loss, and perceptual loss to enhance the quality of the fused images. Extensive experiments demonstrate that MRFuse exhibits superior controllability and excels in suppressing noise, blur, and glare, outperforming other state-of-the-art methods in both subjective and objective evaluations.
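The abstract names two concrete mechanisms: hybrid masked infrared–visible pairs fed to the encoder, and a masked region loss that restricts reconstruction error to the masked patches. Since no formulas are given on this page, the PyTorch sketch below is only a rough illustration of how such a pairing and loss could look; `hybrid_mask`, `masked_region_loss`, the patch size, the mask ratio, and the L1 penalty are all assumptions, not the authors' actual formulation.

```python
import torch
import torch.nn.functional as F

def hybrid_mask(ir, vis, patch=16, mask_ratio=0.5):
    """Build a hybrid masked pair (illustrative, not the paper's exact scheme):
    randomly chosen patches are taken from the infrared image, the remaining
    patches from the visible image. Assumes H and W are divisible by `patch`.
    Returns the hybrid input and the binary patch mask (1 = IR patch)."""
    b, c, h, w = ir.shape
    gh, gw = h // patch, w // patch
    # Random binary mask over the patch grid, upsampled to pixel resolution.
    grid = (torch.rand(b, 1, gh, gw, device=ir.device) < mask_ratio).float()
    m = F.interpolate(grid, scale_factor=patch, mode="nearest")
    return m * ir + (1.0 - m) * vis, m

def masked_region_loss(recon, target, mask):
    """L1 reconstruction error confined to the masked patches and normalised
    by the masked area -- one plausible reading of a 'masked region loss';
    the paper's exact formulation is not reproduced here."""
    return (mask * (recon - target).abs()).sum() / mask.sum().clamp(min=1.0)
```

Per the abstract, such a masked term would be combined with gradient, structural consistency, and perceptual losses, so the full training objective would be a weighted sum of those components.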
About the journal:
Optics & Laser Technology aims to provide a vehicle for the publication of a broad range of high-quality research and review papers in fields of scientific and engineering research appertaining to the development and application of the technology of optics and lasers. Papers describing original work in these areas are subjected to rigorous refereeing prior to acceptance for publication.
The scope of Optics & Laser Technology encompasses, but is not restricted to, the following areas:
• developments in all types of lasers
• developments in optoelectronic devices and photonics
• developments in new photonics and optical concepts
• developments in conventional optics, optical instruments and components
• techniques of optical metrology, including interferometry and optical fibre sensors
• LIDAR and other non-contact optical measurement techniques, including optical methods in heat and fluid flow
• applications of lasers to materials processing, optical NDT, display (including holography) and optical communication
• research and development in the field of laser safety, including studies of hazards resulting from the applications of lasers (laser safety, hazards of laser fume)
• developments in optical computing and optical information processing
• developments in new optical materials
• developments in new optical characterization methods and techniques
• developments in quantum optics
• developments in light-assisted micro- and nanofabrication methods and techniques
• developments in nanophotonics and biophotonics
• developments in image processing and systems