Infrared and visible image fusion based on global context network

IF 1.0 | CAS Zone 4 (Computer Science) | JCR Q4 (Engineering, Electrical & Electronic)
Yonghong Li, Yu Shi, Xingcheng Pu, Suqiang Zhang
{"title":"基于全球背景网络的红外和可见光图像融合","authors":"Yonghong Li, Yu Shi, Xingcheng Pu, Suqiang Zhang","doi":"10.1117/1.jei.33.5.053016","DOIUrl":null,"url":null,"abstract":"Thermal radiation and texture data from two different sensor types are usually combined in the fusion of infrared and visible images for generating a single image. In recent years, convolutional neural network (CNN) based on deep learning has become the mainstream technology for many infrared and visible image fusion methods, which often extracts shallow features and ignores the role of long-range dependencies in the fusion task. However, due to its local perception characteristics, CNN can only obtain global contextual information by continuously stacking convolutional layers, which leads to low network efficiency and difficulty in optimization. To address this issue, we proposed a global context fusion network (GCFN) to model context using a global attention pool, which adopts a two-stage strategy. First, a GCFN-based autoencoder network is trained for extracting multi-scale local and global contextual features. To effectively incorporate the complementary information of the input image, a dual branch fusion network combining CNN and transformer is designed in the second step. Experimental results on a publicly available dataset demonstrate that the proposed method outperforms nine advanced methods in fusion performance on both subjective and objective metrics.","PeriodicalId":54843,"journal":{"name":"Journal of Electronic Imaging","volume":"23 1","pages":""},"PeriodicalIF":1.0000,"publicationDate":"2024-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Infrared and visible image fusion based on global context network\",\"authors\":\"Yonghong Li, Yu Shi, Xingcheng Pu, Suqiang Zhang\",\"doi\":\"10.1117/1.jei.33.5.053016\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Thermal radiation and texture data from two different sensor types are usually combined in the fusion of infrared and visible images for generating a single image. In recent years, convolutional neural network (CNN) based on deep learning has become the mainstream technology for many infrared and visible image fusion methods, which often extracts shallow features and ignores the role of long-range dependencies in the fusion task. However, due to its local perception characteristics, CNN can only obtain global contextual information by continuously stacking convolutional layers, which leads to low network efficiency and difficulty in optimization. To address this issue, we proposed a global context fusion network (GCFN) to model context using a global attention pool, which adopts a two-stage strategy. First, a GCFN-based autoencoder network is trained for extracting multi-scale local and global contextual features. To effectively incorporate the complementary information of the input image, a dual branch fusion network combining CNN and transformer is designed in the second step. 
Experimental results on a publicly available dataset demonstrate that the proposed method outperforms nine advanced methods in fusion performance on both subjective and objective metrics.\",\"PeriodicalId\":54843,\"journal\":{\"name\":\"Journal of Electronic Imaging\",\"volume\":\"23 1\",\"pages\":\"\"},\"PeriodicalIF\":1.0000,\"publicationDate\":\"2024-09-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Journal of Electronic Imaging\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://doi.org/10.1117/1.jei.33.5.053016\",\"RegionNum\":4,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q4\",\"JCRName\":\"ENGINEERING, ELECTRICAL & ELECTRONIC\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Electronic Imaging","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1117/1.jei.33.5.053016","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q4","JCRName":"ENGINEERING, ELECTRICAL & ELECTRONIC","Score":null,"Total":0}
Citations: 0

Abstract

Thermal radiation and texture data from two different sensor types are usually combined in the fusion of infrared and visible images for generating a single image. In recent years, convolutional neural network (CNN) based on deep learning has become the mainstream technology for many infrared and visible image fusion methods, which often extracts shallow features and ignores the role of long-range dependencies in the fusion task. However, due to its local perception characteristics, CNN can only obtain global contextual information by continuously stacking convolutional layers, which leads to low network efficiency and difficulty in optimization. To address this issue, we proposed a global context fusion network (GCFN) to model context using a global attention pool, which adopts a two-stage strategy. First, a GCFN-based autoencoder network is trained for extracting multi-scale local and global contextual features. To effectively incorporate the complementary information of the input image, a dual branch fusion network combining CNN and transformer is designed in the second step. Experimental results on a publicly available dataset demonstrate that the proposed method outperforms nine advanced methods in fusion performance on both subjective and objective metrics.
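The abstract does not give implementation details of the global attention pool used in GCFN, but the idea of pooling a whole feature map into a single context vector and redistributing it to every position can be sketched as follows. This is a minimal illustration in the spirit of GCNet-style context modeling; the module name GlobalContextBlock, the reduction ratio, and the residual fusion are assumptions for illustration, not the authors' exact design.

```python
import torch
import torch.nn as nn


class GlobalContextBlock(nn.Module):
    """Illustrative global-attention-pool context block (GCNet-style sketch),
    not the paper's exact GCFN module."""

    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        # 1x1 conv producing one attention logit per spatial position.
        self.context_mask = nn.Conv2d(channels, 1, kernel_size=1)
        # Channel-wise transform of the pooled context vector.
        self.transform = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, kernel_size=1),
            nn.LayerNorm([channels // reduction, 1, 1]),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, kernel_size=1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        # Global attention pooling: softmax over all H*W positions, then weighted sum.
        weights = torch.softmax(self.context_mask(x).view(b, 1, h * w), dim=-1)  # (B, 1, HW)
        context = torch.matmul(x.view(b, c, h * w), weights.transpose(1, 2))     # (B, C, 1)
        context = context.view(b, c, 1, 1)
        # Broadcast the transformed global context back to every spatial position.
        return x + self.transform(context)


if __name__ == "__main__":
    block = GlobalContextBlock(channels=64)
    out = block(torch.randn(1, 64, 120, 160))
    print(out.shape)  # torch.Size([1, 64, 120, 160])
```

Because the context vector is computed from every spatial position at once, long-range dependencies are captured in a single step rather than by stacking many convolutional layers, which is the efficiency argument the abstract makes against plain CNNs.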
Source journal
Journal of Electronic Imaging (Engineering & Technology — Imaging Science & Photographic Technology)
CiteScore: 1.70
Self-citation rate: 27.30%
Articles published per year: 341
Review time: 4.0 months
Journal description: The Journal of Electronic Imaging publishes peer-reviewed papers in all technology areas that make up the field of electronic imaging and are normally considered in the design, engineering, and applications of electronic imaging systems.