Dual-branch visible and infrared image fusion transformer
Xiao-jing Shi, Zhen Wang, Xinping Pan, Junjie Li, Ke Wang
International Conference on Optoelectronic Information and Computer Engineering (OICE), 2023-08-01. DOI: 10.1117/12.2691207
Abstract
The process of combining features from two images of different sources to generate a new image is called image fusion. To adapt to different application scenarios, deep learning has been widely used for this task. However, existing fusion networks focus on extracting local information and neglect long-range dependencies. To address this defect, we propose a fusion network based on the Transformer, with some modifications to accommodate our experimental equipment. We design a dual-branch autoencoder network with detail and semantic branches; the fusion layer consists of a CNN and a Transformer, and the decoder reconstructs the features to obtain the fused image. A new loss function is proposed to train the network. Based on the results, an infrared feature compensation network is designed to enhance the fusion effect. We compared our method with several other algorithms on the metrics we focus on. In experiments on several datasets, our method improved on the SCD, SSIM, and MS-SSIM metrics, and was roughly equal to the other algorithms on saliency-based structural similarity, weighted quality assessment, and edge-based structural similarity. The experimental results show that our method is feasible.
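To make the dual-branch design concrete, below is a minimal PyTorch sketch of such a fusion autoencoder: a convolutional detail branch for local information, a Transformer-based semantic branch operating on patch tokens for long-range dependencies, and a decoder that reconstructs the fused image from the concatenated features of both modalities. All layer widths, the patch size, and the concatenation fusion rule are illustrative assumptions; the abstract does not specify the paper's exact architecture or its loss function.

```python
# Illustrative sketch of a dual-branch (CNN detail + Transformer semantic)
# fusion autoencoder. Channel widths, patch size, and the concatenation
# fusion rule are assumptions, not the paper's reported configuration.
import torch
import torch.nn as nn

class DetailBranch(nn.Module):
    """CNN branch: shallow convolutions that preserve local detail."""
    def __init__(self, ch=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.net(x)

class SemanticBranch(nn.Module):
    """Transformer branch: self-attention over patch tokens for global context."""
    def __init__(self, ch=32, patch=8, heads=4):
        super().__init__()
        self.embed = nn.Conv2d(1, ch, patch, stride=patch)  # patch embedding
        layer = nn.TransformerEncoderLayer(d_model=ch, nhead=heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.up = nn.Upsample(scale_factor=patch, mode="bilinear", align_corners=False)

    def forward(self, x):
        t = self.embed(x)                         # (B, C, H/p, W/p)
        b, c, h, w = t.shape
        t = t.flatten(2).transpose(1, 2)          # (B, HW, C) token sequence
        t = self.encoder(t)                       # global self-attention
        t = t.transpose(1, 2).reshape(b, c, h, w)
        return self.up(t)                         # back to input resolution

class FusionAutoencoder(nn.Module):
    def __init__(self, ch=32):
        super().__init__()
        self.detail = DetailBranch(ch)
        self.semantic = SemanticBranch(ch)
        # Decoder: reconstructs the fused image from both branches of
        # both modalities (4 * ch concatenated feature channels).
        self.decoder = nn.Sequential(
            nn.Conv2d(4 * ch, ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(ch, 1, 3, padding=1), nn.Sigmoid(),
        )

    def forward(self, vis, ir):
        feats = [self.detail(vis), self.semantic(vis),
                 self.detail(ir), self.semantic(ir)]
        return self.decoder(torch.cat(feats, dim=1))  # concatenation as fusion rule

# Single-channel visible and infrared inputs; sizes must be divisible by the patch size.
fused = FusionAutoencoder()(torch.rand(1, 1, 64, 64), torch.rand(1, 1, 64, 64))
print(fused.shape)  # torch.Size([1, 1, 64, 64])
```

The split mirrors the abstract's motivation: convolutions alone capture local structure well but miss long-range dependencies, which the attention-based branch supplies; concatenating both branches lets the decoder draw on detail and semantics from each modality when reconstructing the fused image.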