MSCS: Multi-stage feature learning with channel-spatial attention mechanism for infrared and visible image fusion

IF 3.1 | Zone 3, Physics and Astrophysics | Q2, Instruments & Instrumentation
Zhenghua Huang, Biyun Xu, Menghan Xia, Qian Li, Lianying Zou, Shaoyi Li, Xi Li
Journal: Infrared Physics & Technology, volume 142, Article 105514
Publication date: 2024-08-22
Publication type: Journal Article
DOI: 10.1016/j.infrared.2024.105514
URL: https://www.sciencedirect.com/science/article/pii/S1350449524003980
Citations: 0

Abstract

Infrared and visible image fusion aims to combine images of the same scene captured by sensors of different modalities in order to improve scene understanding. Deep learning has proven powerful for image fusion owing to its generalization, robustness, and the representational capacity of deep features. However, the performance of deep learning-based methods depends heavily on illumination conditions. In dark or over-exposed scenes in particular, the fused results are over-smoothed and low in contrast, which leads to inaccurate object detection. To address these issues, this paper develops MSCS, a multi-stage feature learning approach with a channel-spatial attention mechanism for infrared and visible image fusion. MSCS consists of four key procedures. First, the infrared and visible images are decomposed into illumination and reflectance components by a proposed network called Retinex_Net. Then, the components are passed to an encoder for feature encoding. Next, an adaptive fusion module with attention mechanisms fuses the features. Finally, a decoder decodes the fused features to generate the fused image. In addition, a novel fusion loss function and a multi-stage training strategy are proposed to train these modules. Subjective and objective results on the TNO, LLVIP, and MSRS datasets show that the proposed method is effective and outperforms state-of-the-art fusion methods in producing visually pleasing results in dark or over-exposed scenes. Further object detection experiments on the fused images demonstrate that the fusion outputs of MSCS are more beneficial for detection tasks.
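
The abstract gives no architectural details, so the following is only a toy sketch of two ideas it names: a Retinex-style split of an image into reflectance and illumination, and a fusion step weighted by channel and spatial attention. Everything here is an assumption made for illustration: the function names (`retinex_decompose`, `softmax_pair`, `channel_spatial_fuse`), the 3x3 local-maximum illumination estimate, and the activation-magnitude attention scores are hand-written stand-ins, not the learned Retinex_Net, encoder, fusion module, or decoder of the paper.

```python
import numpy as np

EPS = 1e-6  # floor on illumination to avoid division by zero


def retinex_decompose(img):
    """Split a grayscale image (H, W) in [0, 1] into reflectance R and
    illumination L with img == R * L.

    Assumption: L is a 3x3 local maximum, a hand-crafted stand-in for
    a learned decomposition network.
    """
    h, w = img.shape
    padded = np.pad(img, 1, mode="edge")
    # Nine shifted views of the image, one per position in the 3x3 window.
    windows = [padded[dy:dy + h, dx:dx + w] for dy in range(3) for dx in range(3)]
    illumination = np.maximum(np.max(windows, axis=0), EPS)
    reflectance = img / illumination
    return reflectance, illumination


def softmax_pair(a, b):
    """Elementwise two-way softmax; the returned weights sum to 1."""
    m = np.maximum(a, b)  # subtract the max for numerical stability
    ea, eb = np.exp(a - m), np.exp(b - m)
    return ea / (ea + eb), eb / (ea + eb)


def channel_spatial_fuse(feat_ir, feat_vis):
    """Fuse two feature stacks of shape (C, H, W) as a convex combination.

    Assumption: channel scores are per-channel mean magnitudes and spatial
    scores are per-pixel mean magnitudes; a trained module would learn these.
    """
    # Channel attention: one score per channel, shape (C, 1, 1).
    ch_ir = np.abs(feat_ir).mean(axis=(1, 2), keepdims=True)
    ch_vis = np.abs(feat_vis).mean(axis=(1, 2), keepdims=True)
    w_ch_ir, w_ch_vis = softmax_pair(ch_ir, ch_vis)

    # Spatial attention: one score per pixel, shape (1, H, W).
    sp_ir = np.abs(feat_ir).mean(axis=0, keepdims=True)
    sp_vis = np.abs(feat_vis).mean(axis=0, keepdims=True)
    w_sp_ir, w_sp_vis = softmax_pair(sp_ir, sp_vis)

    # Combine both attentions by broadcasting, then renormalize.
    w_ir = w_ch_ir * w_sp_ir
    w_vis = w_ch_vis * w_sp_vis
    return (w_ir * feat_ir + w_vis * feat_vis) / (w_ir + w_vis)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    infrared = rng.random((8, 8))
    visible = rng.random((8, 8)) * 0.2  # simulate a dark visible image

    # Stack reflectance and illumination as a 2-channel "feature" tensor.
    feat_ir = np.stack(retinex_decompose(infrared))
    feat_vis = np.stack(retinex_decompose(visible))
    fused = channel_spatial_fuse(feat_ir, feat_vis)
    print(fused.shape)
```

In this sketch the stacked reflectance and illumination maps stand in for encoder features, and no decoder is modeled. Because the fused value at each position is a convex combination of the two inputs, it always lies between them, which makes the weighting easy to inspect.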

Source journal
CiteScore: 5.70
Self-citation rate: 12.10%
Articles published: 400
Review time: 67 days
Journal description: The journal covers the entire field of infrared physics and technology: theory, experiment, application, devices and instrumentation. "Infrared" is defined as covering the near, mid and far infrared (terahertz) regions from 0.75 µm (750 nm) to 1 mm (300 GHz). Submissions in the 300 GHz to 100 GHz region may be accepted at the editors' discretion if their content is relevant to shorter wavelengths. Submissions must be primarily concerned with and directly relevant to this spectral region. Its core topics can be summarized as the generation, propagation and detection of infrared radiation; the associated optics, materials and devices; and its use in all fields of science, industry, engineering and medicine. Infrared techniques occur in many different fields, notably spectroscopy and interferometry; material characterization and processing; atmospheric physics, astronomy and space research. Scientific aspects include lasers, quantum optics, quantum electronics, image processing and semiconductor physics. Some important applications are medical diagnostics and treatment, industrial inspection and environmental monitoring.