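Adversarial network for unsupervised infrared image colorization based on full-scale feature fusion and cosine contrastive learning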

Impact Factor 6.5 · Zone 2, Computer Science · JCR Q1, Computer Science, Artificial Intelligence

Neurocomputing, Volume 649, Article 130713 · Pub Date: 2025-06-18 · DOI: 10.1016/j.neucom.2025.130713

Tingting Liu, Yujue Cai, Guiping Chen, Hongguang Wei, Junqi Bai, Yuan Liu, Xiubao Sui, Qian Chen


Citations: 0

Abstract


Adversarial network for unsupervised infrared image colorization based on full-scale feature fusion and cosine contrastive learning

Thermal infrared images, unaffected by lighting and haze, are widely used in security surveillance, autonomous vehicles, and nighttime traffic monitoring. However, their grayscale nature lacks color and texture details, limiting applications in image recognition and object detection. Converting infrared images to daytime color enhances visual perception and broadens their utility. Despite advancements in infrared image colorization, challenges such as texture distortion, detail blurring, and poor image quality persist. To address these issues, a novel unsupervised learning framework, termed Cosine Contrastive Learning Generative Adversarial Network (CCLGAN), is proposed. Firstly, the traditional UNet architecture is improved by introducing full-scale skip connections and deep supervision. Full-scale skip connections integrate low-level details with high-level semantic features, while deep supervision aids in learning hierarchical feature maps. Additionally, a parameter-free neuron-based 3D attention mechanism is incorporated into the Mamba module to capture long-range dependencies and enable effective feature selection and fusion. Secondly, a novel contrastive loss function is designed, incorporating cosine distance metrics into the traditional contrastive loss framework. By maximizing cosine decision margins and normalizing, intra-class variance is minimized, and inter-class variance is maximized, ensuring consistency between input infrared image patches and output color image patches. Finally, extensive comparative analysis on common datasets demonstrates that the proposed method outperforms existing state-of-the-art techniques in colorization performance. This research advances infrared image processing and enhances the visual quality of converted images. The code is available at https://github.com/LTTdouble/CCLGAN.
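The abstract describes the contrastive loss only in words: cosine similarity between normalized patch features, with a decision margin that pulls matching patches together and pushes non-matching ones apart. The sketch below is one plausible reading of that idea, not the authors' implementation (which is in the linked repository). It assumes an additive margin on the positive pair inside an InfoNCE-style softmax; the function names, the `margin` and `scale` parameters, and their default values are hypothetical.

```python
import math


def l2_normalize(v):
    """Scale a feature vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    u, v = l2_normalize(u), l2_normalize(v)
    return sum(a * b for a, b in zip(u, v))


def cosine_margin_patch_loss(query, positive, negatives, margin=0.2, scale=10.0):
    """Cross-entropy over cosine logits with an additive margin on the positive.

    query:     feature of an output (colorized) patch
    positive:  feature of the input infrared patch at the same location
    negatives: features of input patches at other locations
    margin, scale: hypothetical hyperparameters, not taken from the paper
    """
    # Subtracting the margin makes the positive pair harder to satisfy,
    # which encourages a wider gap between positive and negative similarities.
    pos_logit = scale * (cosine(query, positive) - margin)
    neg_logits = [scale * cosine(query, n) for n in negatives]
    logits = [pos_logit] + neg_logits

    # Numerically stable log-sum-exp for the softmax denominator.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - pos_logit


# Usage: an aligned positive gives a small loss, a misaligned one a large loss.
aligned = cosine_margin_patch_loss([1.0, 0.0], [2.0, 0.0], [[0.0, 1.0], [-1.0, 0.0]])
misaligned = cosine_margin_patch_loss([1.0, 0.0], [0.0, 1.0], [[2.0, 0.0], [-1.0, 0.0]])
print(aligned < misaligned)
```

Because every feature is normalized before comparison, only the direction of a feature vector matters, so the loss is insensitive to the overall magnitude differences between infrared and color features.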


Source journal

Neurocomputing (Engineering & Technology; Computer Science: Artificial Intelligence)

CiteScore: 13.10

Self-citation rate: 10.00%

Articles published: 1382

Review time: 70 days

About the journal: Neurocomputing publishes articles describing recent fundamental contributions in the field of neurocomputing. Neurocomputing theory, practice and applications are the essential topics being covered.