MSCS: Multi-stage feature learning with channel-spatial attention mechanism for infrared and visible image fusion

IF 3.1; CAS Region 3 (Physics and Astrophysics); JCR Q2 (Instruments & Instrumentation)

Infrared Physics & Technology; Pub Date: 2024-08-22; DOI: 10.1016/j.infrared.2024.105514

Zhenghua Huang, Biyun Xu, Menghan Xia, Qian Li, Lianying Zou, Shaoyi Li, Xi Li

Volume 142, Article 105514. Full text: https://www.sciencedirect.com/science/article/pii/S1350449524003980

Citations: 0

Abstract


MSCS: Multi-stage feature learning with channel-spatial attention mechanism for infrared and visible image fusion

Infrared and visible image fusion aims to combine images of the same scene captured by sensors of different modalities in order to improve scene understanding. Deep learning has proven effective for image fusion owing to its generalization, robustness, and ability to represent deep features. However, the performance of deep learning-based methods depends heavily on illumination conditions. In dark or over-exposed scenes in particular, the fused results are over-smoothed and low in contrast, which makes object detection inaccurate. To address these issues, this paper develops MSCS, a multi-stage feature learning approach with a channel-spatial attention mechanism for infrared and visible image fusion. MSCS consists of four key steps. First, the infrared and visible images are decomposed into illumination and reflectance components by a proposed network called Retinex_Net. Second, the components are passed to an encoder for feature encoding. Third, a proposed adaptive fusion module with attention mechanisms fuses the features. Finally, a decoder decodes the fused features to generate the fused image. In addition, a novel fusion loss function and a multi-stage training strategy are proposed to train these modules. Subjective and objective results of experiments on the TNO, LLVIP and MSRS datasets show that the proposed method is effective and outperforms state-of-the-art fusion methods, producing pleasing results in dark or over-exposed scenes. Further object detection experiments on the fused images demonstrate that the fusion outputs produced by MSCS are more beneficial for detection tasks.
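To make the idea of channel-spatial attention fusion more concrete, the following is a minimal NumPy sketch. It is not the MSCS fusion module, which the abstract describes as a learned, adaptive component trained with a dedicated loss. Instead it assumes a simple parameter-free scheme: channel weights come from global average pooling, spatial weights come from the per-pixel mean absolute activation, and both are normalized across the two modalities with a softmax. The function names (`channel_attention_fuse`, `spatial_attention_fuse`, `attention_fuse`) and the `(C, H, W)` feature layout are hypothetical choices made for illustration.

```python
import numpy as np


def _softmax_pair(a, b):
    """Softmax across two same-shaped arrays; the two weights sum to 1."""
    m = np.maximum(a, b)  # subtract the max for numerical stability
    ea, eb = np.exp(a - m), np.exp(b - m)
    return ea / (ea + eb), eb / (ea + eb)


def channel_attention_fuse(f_ir, f_vis):
    """Weight each channel by its global average activation (assumed scheme)."""
    g_ir = f_ir.mean(axis=(1, 2), keepdims=True)    # shape (C, 1, 1)
    g_vis = f_vis.mean(axis=(1, 2), keepdims=True)
    w_ir, w_vis = _softmax_pair(g_ir, g_vis)
    return w_ir * f_ir + w_vis * f_vis


def spatial_attention_fuse(f_ir, f_vis):
    """Weight each pixel by its mean absolute activation over channels."""
    s_ir = np.abs(f_ir).mean(axis=0, keepdims=True)   # shape (1, H, W)
    s_vis = np.abs(f_vis).mean(axis=0, keepdims=True)
    w_ir, w_vis = _softmax_pair(s_ir, s_vis)
    return w_ir * f_ir + w_vis * f_vis


def attention_fuse(f_ir, f_vis):
    """Average the channel-attended and spatially attended fusions."""
    return 0.5 * (channel_attention_fuse(f_ir, f_vis)
                  + spatial_attention_fuse(f_ir, f_vis))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Stand-ins for encoder outputs: 8 channels on a 16x16 grid (hypothetical).
    f_ir = rng.random((8, 16, 16))
    f_vis = rng.random((8, 16, 16))
    fused = attention_fuse(f_ir, f_vis)
    print(fused.shape)
```

Because both weightings are convex combinations, every fused value lies between the corresponding infrared and visible feature values. In a pipeline like the one the abstract outlines, such a step would sit between the encoder and the decoder; a learned module would replace these fixed pooling rules with trainable layers.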


Source journal

Infrared Physics & Technology (Physics: Optics)

CiteScore

5.70

Self-citation rate

12.10%

Articles published

400

Review time

67 days

About the journal: The Journal covers the entire field of infrared physics and technology: theory, experiment, application, devices and instrumentation. "Infrared" is defined as covering the near, mid and far infrared (terahertz) regions from 0.75 µm (750 nm) to 1 mm (300 GHz). Submissions in the 300 GHz to 100 GHz region may be accepted at the editors' discretion if their content is relevant to shorter wavelengths. Submissions must be primarily concerned with and directly relevant to this spectral region. Its core topics can be summarized as the generation, propagation and detection of infrared radiation; the associated optics, materials and devices; and its use in all fields of science, industry, engineering and medicine. Infrared techniques occur in many different fields, notably spectroscopy and interferometry; material characterization and processing; atmospheric physics, astronomy and space research. Scientific aspects include lasers, quantum optics, quantum electronics, image processing and semiconductor physics. Some important applications are medical diagnostics and treatment, industrial inspection and environmental monitoring.