MRFuse: Metric learning and masked autoencoder for fusing real infrared and visible images

IF 4.6 | Zone 2, Physics and Astrophysics | JCR Q1, Optics
YuBin Li, Weida Zhan, Jinxin Guo, Depeng Zhu, Yichun Jiang, Yu Chen, Xiaoyu Xu, Deng Han
Journal: Optics and Laser Technology, vol. 189, Article 112971
Published: 2025-05-07
DOI: 10.1016/j.optlastec.2025.112971
URL: https://www.sciencedirect.com/science/article/pii/S0030399225005626
Citations: 0

Abstract

The task of infrared and visible image fusion aims to retain the thermal targets from infrared images while preserving the details, brightness, and other important features from visible images. Current methods face challenges such as unclear fusion objectives, difficulty in interpreting the learning process, and uncontrollable auxiliary learning weights. To address these issues, this paper proposes a novel fusion method based on metric learning and masked autoencoders for real infrared and visible image fusion, termed MRFuse. MRFuse operates through a combination of metric mapping space, auxiliary networks, and fusion networks. First, we introduce a Real Degradation Estimation Module (RDEM), which employs a simple neural network to establish a controllable degradation estimation scheme within the metric space. Additionally, to train the metric space, we propose a sample generation method that provides complex training samples for the metric learning pipeline. Next, we present a fusion network based on masked autoencoding. Specifically, we construct hybrid masked infrared and visible image pairs and design a U-shaped ViT encoder–decoder architecture. This architecture leverages hierarchical feature representation and layer-wise fusion to reconstruct high-quality fused images. Finally, to train the fusion network, we design a masked region loss to constrain reconstruction errors within masked regions, and further employ gradient loss, structural consistency loss, and perceptual loss to enhance the quality of the fused images. Extensive experiments demonstrate that MRFuse exhibits superior controllability and excels in suppressing noise, blur, and glare, outperforming other state-of-the-art methods in both subjective and objective evaluations.
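
The abstract does not give the masking scheme or the exact form of the masked region loss. The following is only a minimal sketch of the idea in plain Python, assuming complementary random patch masks for the two modalities and an L1 error restricted to masked pixels. Both choices are assumptions made for illustration, not the paper's confirmed design, and all function names are hypothetical.

```python
import random


def random_patch_mask(grid_h, grid_w, mask_ratio, seed=None):
    """Return a grid of booleans; True marks a patch hidden from the encoder."""
    rng = random.Random(seed)
    n = grid_h * grid_w
    masked = set(rng.sample(range(n), round(n * mask_ratio)))
    return [[(r * grid_w + c) in masked for c in range(grid_w)]
            for r in range(grid_h)]


def apply_patch_mask(image, mask, patch, fill=0.0):
    """Replace every pixel that lies in a masked patch with `fill`."""
    return [[fill if mask[y // patch][x // patch] else v
             for x, v in enumerate(row)]
            for y, row in enumerate(image)]


def masked_region_l1(pred, target, mask, patch):
    """Mean absolute error over masked pixels only (assumed loss form)."""
    total, count = 0.0, 0
    for y, row in enumerate(target):
        for x, t in enumerate(row):
            if mask[y // patch][x // patch]:
                total += abs(pred[y][x] - t)
                count += 1
    return total / count if count else 0.0


# Toy 4x4 "infrared" image split into a 2x2 grid of 2x2 patches.
ir = [[1.0] * 4 for _ in range(4)]
ir_mask = random_patch_mask(2, 2, mask_ratio=0.5, seed=0)
# Assumption: the visible image is masked where the infrared image is not.
vis_mask = [[not m for m in row] for row in ir_mask]
ir_masked = apply_patch_mask(ir, ir_mask, patch=2)
# Treating the masked input as a (bad) reconstruction gives a nonzero loss.
loss = masked_region_l1(ir_masked, ir, ir_mask, patch=2)
```

With complementary masks, every patch position stays visible in exactly one modality, so a reconstruction network would have to fill each hidden region from the other modality. Whether MRFuse builds its hybrid masks this way is not stated in the abstract.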
Source journal
CiteScore: 8.50
Self-citation rate: 10.00%
Articles published: 1060
Review time: 3.4 months
Journal description: Optics & Laser Technology aims to provide a vehicle for the publication of a broad range of high quality research and review papers in those fields of scientific and engineering research appertaining to the development and application of the technology of optics and lasers. Papers describing original work in these areas are submitted to rigorous refereeing prior to acceptance for publication. The scope of Optics & Laser Technology encompasses, but is not restricted to, the following areas:
• development in all types of lasers
• developments in optoelectronic devices and photonics
• developments in new photonics and optical concepts
• developments in conventional optics, optical instruments and components
• techniques of optical metrology, including interferometry and optical fibre sensors
• LIDAR and other non-contact optical measurement techniques, including optical methods in heat and fluid flow
• applications of lasers to materials processing, optical NDT display (including holography) and optical communication
• research and development in the field of laser safety including studies of hazards resulting from the applications of lasers (laser safety, hazards of laser fume)
• developments in optical computing and optical information processing
• developments in new optical materials
• developments in new optical characterization methods and techniques
• developments in quantum optics
• developments in light assisted micro and nanofabrication methods and techniques
• developments in nanophotonics and biophotonics
• developments in imaging processing and systems