{"title":"Enhanced visible light detail and infrared thermal radiation for dual-mode imaging system via multi-information interaction","authors":"Xiaosong Liu, Huaibin Qiu, Zhuolin Ou, Jiazhen Dou, Jianglei Di, Yuwen Qin","doi":"10.1016/j.jvcir.2025.104583","DOIUrl":null,"url":null,"abstract":"<div><div>In the field of dual-mode optical imaging, image fusion techniques offer significant advantages such as improved spatial resolution and the suppression of redundant information, particularly for visible and infrared image. However, existing fusion methods often overlook the interaction between multiple feature information during the extraction and fusion stages, resulting in the inability to extract both visible light detail and infrared thermal radiation effectively. To address this challenge, we construct a dual-mode imaging system and propose an image fusion method that incorporates Convolution-Swin-Transformer Blocks (CSTBs). The block combines Convolution and Shifted Window Transformer to improve the interaction and extraction between local and global information within the images. On the other hand, our proposed method strengthens the comprehensive interaction and fusion between shallow pixel-level information and deeper semantic representation by fusing local and global feature information at various layers. Furthermore, we introduce a multi-component loss function that balances the complementary features extracted from the source images, with a particular focus on enhancing edge texture, structure, and brightness information. Experimental results demonstrate that our method achieves superior performance in simultaneously enhancing both texture details and thermal radiation. 
This is evidenced by results on two publicly available datasets, as well as the Target_GDUT dataset captured using our dual-mode optical imaging system.</div></div>","PeriodicalId":54755,"journal":{"name":"Journal of Visual Communication and Image Representation","volume":"112 ","pages":"Article 104583"},"PeriodicalIF":3.1000,"publicationDate":"2025-09-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Visual Communication and Image Representation","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S104732032500197X","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
Citations: 0
Abstract
In the field of dual-mode optical imaging, image fusion techniques offer significant advantages, such as improved spatial resolution and the suppression of redundant information, particularly for visible and infrared images. However, existing fusion methods often overlook the interaction among multiple types of feature information during the extraction and fusion stages, and therefore fail to extract both visible-light detail and infrared thermal radiation effectively. To address this challenge, we construct a dual-mode imaging system and propose an image fusion method built on Convolution-Swin-Transformer Blocks (CSTBs). Each block combines convolution with a shifted-window Transformer to improve the interaction between, and the extraction of, local and global information within the images. In addition, the proposed method strengthens the interaction and fusion between shallow pixel-level information and deeper semantic representations by fusing local and global features at multiple layers. Furthermore, we introduce a multi-component loss function that balances the complementary features extracted from the source images, with particular emphasis on edge texture, structure, and brightness information. Experimental results demonstrate that our method achieves superior performance in simultaneously enhancing texture details and thermal radiation, as evidenced by results on two publicly available datasets and on the Target_GDUT dataset captured with our dual-mode optical imaging system.
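The abstract describes the CSTB as combining convolution (local information) with shifted-window Transformer attention (global information) but does not give the exact architecture. The following is a minimal illustrative sketch, not the authors' implementation: pure NumPy, single channel, single attention head, with made-up names (`cstb`, `window_attention`) and an arbitrary residual combination of the two branches.

```python
import numpy as np

def conv3x3(x, kernel):
    """3x3 'same' convolution on a single-channel map, via edge padding and slicing."""
    H, W = x.shape
    p = np.pad(x, 1, mode="edge")
    out = np.zeros_like(x)
    for i in range(3):
        for j in range(3):
            out += kernel[i, j] * p[i:i + H, j:j + W]
    return out

def window_attention(x, win, d, rng):
    """Self-attention restricted to non-overlapping win x win windows (single head)."""
    H, W = x.shape
    Wq, Wk, Wv = (rng.standard_normal((1, d)) * 0.1 for _ in range(3))
    out = np.zeros_like(x)
    for r in range(0, H, win):
        for c in range(0, W, win):
            tokens = x[r:r + win, c:c + win].reshape(-1, 1)  # one scalar feature per pixel
            q, k, v = tokens @ Wq, tokens @ Wk, tokens @ Wv
            att = q @ k.T / np.sqrt(d)
            att = np.exp(att - att.max(axis=1, keepdims=True))
            att /= att.sum(axis=1, keepdims=True)            # softmax over window tokens
            y = (att @ v).mean(axis=1)                       # project back to one channel
            out[r:r + win, c:c + win] = y.reshape(win, win)
    return out

def cstb(x, win=4, d=8, seed=0):
    """Toy CSTB forward pass: convolution branch + regular and shifted window attention."""
    rng = np.random.default_rng(seed)
    laplacian = np.array([[0, 1, 0], [1, -4, 1], [0, 1, 0]], float)
    local = conv3x3(x, laplacian)                  # local detail (convolution branch)
    glob = window_attention(x, win, d, rng)        # global context within windows
    shifted = np.roll(x, (win // 2, win // 2), axis=(0, 1))
    glob_s = window_attention(shifted, win, d, rng)
    glob_s = np.roll(glob_s, (-(win // 2), -(win // 2)), axis=(0, 1))
    return x + local + 0.5 * (glob + glob_s)       # residual combination (illustrative)

feat = np.random.default_rng(1).random((16, 16))
out = cstb(feat)
```

The cyclic `np.roll` before the second attention pass is what lets information cross window boundaries, which is the idea behind shifted-window attention that the block name refers to.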
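The abstract names the three emphases of the multi-component loss (edge texture, structure, brightness) but not its exact terms or weights. The sketch below uses choices common in infrared-visible fusion work as stand-ins: an elementwise-max brightness target, a Sobel-gradient edge term, and a correlation-based structure term; the weights `w` and the function names are illustrative assumptions, not the paper's definition.

```python
import numpy as np

def grad_mag(img):
    """Sobel gradient magnitude via edge padding and slicing."""
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], float)
    ky = kx.T
    H, W = img.shape
    p = np.pad(img, 1, mode="edge")
    gx, gy = np.zeros_like(img), np.zeros_like(img)
    for i in range(3):
        for j in range(3):
            win = p[i:i + H, j:j + W]
            gx += kx[i, j] * win
            gy += ky[i, j] * win
    return np.hypot(gx, gy)

def fusion_loss(fused, ir, vis, w=(1.0, 1.0, 0.5), eps=1e-8):
    """Illustrative multi-component fusion loss: brightness + edge texture + structure."""
    # Brightness: keep the brighter of the two sources at every pixel.
    l_int = np.abs(fused - np.maximum(ir, vis)).mean()
    # Edge texture: keep the stronger gradient of the two sources.
    l_grad = np.abs(grad_mag(fused) - np.maximum(grad_mag(ir), grad_mag(vis))).mean()
    # Structure: 1 - Pearson correlation with the visible image.
    f, v = fused - fused.mean(), vis - vis.mean()
    l_struct = 1.0 - (f * v).sum() / (np.sqrt((f * f).sum() * (v * v).sum()) + eps)
    parts = {"intensity": l_int, "gradient": l_grad, "structure": l_struct}
    total = w[0] * l_int + w[1] * l_grad + w[2] * l_struct
    return total, parts

rng = np.random.default_rng(0)
ir, vis = rng.random((32, 32)), rng.random((32, 32))
total, parts = fusion_loss(np.maximum(ir, vis), ir, vis)
```

Returning the components alongside the total makes it easy to see how the weights trade brightness retention against edge and structure fidelity during training.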
Journal description:
The Journal of Visual Communication and Image Representation publishes papers on state-of-the-art visual communication and image representation, with emphasis on novel technologies and theoretical work in this multidisciplinary area of pure and applied research. The field of visual communication and image representation is considered in its broadest sense and covers both digital and analog aspects as well as processing and communication in biological visual systems.