Infrared/Visible Light Fire Image Fusion Method Based on Generative Adversarial Network of Wavelet-Guided Pooling Vision Transformer

IF 2.4 · CAS Region 2 (Agricultural & Forestry Science) · Q1 FORESTRY
Forests · Pub Date: 2024-06-01 · DOI: 10.3390/f15060976
Haicheng Wei, Xinping Fu, Zhuokang Wang, Jing Zhao

Abstract

To address detail loss, limited matched datasets, and low fusion accuracy in infrared/visible light fire image fusion, a novel method based on a Generative Adversarial Network with a Wavelet-Guided Pooling Vision Transformer (VTW-GAN) is proposed. The algorithm employs a generator and discriminator network architecture, combining the efficient global representation capability of Transformers with wavelet-guided pooling to extract finer-grained features and reconstruct higher-quality fused images. To overcome the shortage of image data, transfer learning is used to apply the well-trained model to fire image fusion, thereby improving fusion precision. Experimental results demonstrate that VTW-GAN outperforms the DenseFuse, IFCNN, U2Fusion, SwinFusion, and TGFuse methods both objectively and subjectively. Specifically, on the KAIST dataset, the fused images improve Entropy (EN), Mutual Information (MI), and the gradient-based fusion quality metric (Qabf) by 2.78%, 11.89%, and 10.45%, respectively, over the next-best values. On the Corsican Fire dataset, compared to fusion models trained on limited data, the transfer-learned fused images improve Standard Deviation (SD) and MI by 10.69% and 11.73%, respectively; compared to other methods, they perform well in Average Gradient (AG), SD, and MI, exceeding the next-best values by 3.43%, 4.84%, and 4.21%, respectively. Compared with DenseFuse, runtime efficiency is improved by 78.3%. The method achieves favorable subjective image quality and is effective for fire-detection applications.
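The wavelet-guided pooling mentioned in the abstract can be sketched as a single-level 2D Haar decomposition used as a downsampling step. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `haar_pool` and the split into one low-frequency band plus three detail bands are standard wavelet conventions, and how the paper combines the bands is not specified here.

```python
import numpy as np

def haar_pool(x):
    """One-level 2D Haar decomposition used as a pooling step.

    Splits a feature map into a low-frequency approximation (LL)
    and three high-frequency detail bands (LH, HL, HH), halving
    the spatial resolution. The detail bands retain the fine
    structure that ordinary max/average pooling would discard.
    """
    a = x[0::2, 0::2]  # top-left pixel of each 2x2 block
    b = x[0::2, 1::2]  # top-right
    c = x[1::2, 0::2]  # bottom-left
    d = x[1::2, 1::2]  # bottom-right
    ll = (a + b + c + d) / 2.0   # approximation (coarse features)
    lh = (a - b + c - d) / 2.0   # horizontal details
    hl = (a + b - c - d) / 2.0   # vertical details
    hh = (a - b - c + d) / 2.0   # diagonal details
    return ll, (lh, hl, hh)

x = np.arange(16, dtype=float).reshape(4, 4)
ll, details = haar_pool(x)
print(ll.shape)  # (2, 2)
```

Because the Haar transform is orthogonal, the original block values are exactly recoverable from the four bands, which is why such a pooling step loses no detail information.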
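Some of the objective metrics reported above can be computed as follows. This is a minimal sketch of Entropy (EN), Standard Deviation (SD), and Average Gradient (AG) for an 8-bit grayscale image; the exact formulations used in the paper (and the MI and Qabf metrics, which also need the source images) may differ.

```python
import numpy as np

def entropy(img):
    """Shannon entropy (EN) of an 8-bit grayscale image, in bits."""
    hist = np.bincount(img.ravel(), minlength=256).astype(float)
    p = hist / hist.sum()
    p = p[p > 0]              # ignore empty bins (0 * log 0 := 0)
    return float(-np.sum(p * np.log2(p)))

def std_dev(img):
    """Standard deviation (SD): overall contrast of the fused image."""
    return float(np.std(img.astype(float)))

def avg_gradient(img):
    """Average gradient (AG): mean magnitude of local gradients,
    a common proxy for the sharpness of fused details."""
    f = img.astype(float)
    gx = f[:, 1:] - f[:, :-1]        # horizontal differences
    gy = f[1:, :] - f[:-1, :]        # vertical differences
    gx, gy = gx[:-1, :], gy[:, :-1]  # crop to a common shape
    return float(np.mean(np.sqrt((gx**2 + gy**2) / 2.0)))

img = np.random.randint(0, 256, size=(64, 64), dtype=np.uint8)
print(entropy(img), std_dev(img), avg_gradient(img))
```

Higher values of all three metrics indicate a fused image that carries more information, more contrast, and sharper detail, which is the sense in which the percentage improvements above are reported.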
Source journal
Forests (FORESTRY)
CiteScore: 4.40
Self-citation rate: 17.20%
Articles per year: 1823
Average review time: 19.02 days
Journal introduction: Forests (ISSN 1999-4907) is an international and cross-disciplinary scholarly journal of forestry and forest ecology. It publishes research papers, short communications, and review papers. There is no restriction on the length of the papers. Our aim is to encourage scientists to publish their experimental and theoretical research in as much detail as possible. Full experimental and/or methodical details must be provided for research articles.