{"title":"基于改进u-net和条件扩散的多光谱可见-热红外图像转换","authors":"Mahroosh Banday, Brejesh Lall","doi":"10.1016/j.neucom.2025.131006","DOIUrl":null,"url":null,"abstract":"<div><div>Translating images from visible spectrum to thermal IR (TIR) domain to achieve precise and realistic representations of TIR images is a challenging task. Thermal infrared imaging is of great significance in scenarios where vision is severely impaired especially in difficult lighting conditions such as night, haze, fog or cloudy weather. With these advantages, infrared imaging finds extensive applicability in navigation, surveillance, object detection, product inspection, agriculture as well as remote sensing. In order to build high performance deep models for such wide range of applications, it is necessary to have large amount of TIR data for training. However, there is unavailability of sufficient IR based datasets due to high cost of thermal infrared camera setups. While large number of visible image datasets are available, this scarcity of TIR datasets can be addressed by translating visible images to their TIR counterparts. In this paper, we leverage the widely available visible range data to propose two visible to TIR domain translation approaches, one is modified U-Net based non-generative approach called TIR-UNet and the other is conditional diffusion based generative approach that also uses U-Net as neural backbone for synthesizing TIR images. Both the proposed methods have been evaluated on four benchmark datasets and demonstrate high qualitative as well as quantitative performance in generating perceptually realistic, visually plausible and high quality TIR equivalents of given visible images. Compared to state-of-the-art methods which include U-Net and powerful GAN variants, our methods achieve remarkable performance increase on the metrics of MSE, PSNR and SSIM for both day and night images.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"651 ","pages":"Article 131006"},"PeriodicalIF":6.5000,"publicationDate":"2025-07-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Multi spectral visible-thermal IR image translation using improved u-net & conditional diffusion\",\"authors\":\"Mahroosh Banday, Brejesh Lall\",\"doi\":\"10.1016/j.neucom.2025.131006\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>Translating images from visible spectrum to thermal IR (TIR) domain to achieve precise and realistic representations of TIR images is a challenging task. Thermal infrared imaging is of great significance in scenarios where vision is severely impaired especially in difficult lighting conditions such as night, haze, fog or cloudy weather. With these advantages, infrared imaging finds extensive applicability in navigation, surveillance, object detection, product inspection, agriculture as well as remote sensing. In order to build high performance deep models for such wide range of applications, it is necessary to have large amount of TIR data for training. However, there is unavailability of sufficient IR based datasets due to high cost of thermal infrared camera setups. While large number of visible image datasets are available, this scarcity of TIR datasets can be addressed by translating visible images to their TIR counterparts. 
In this paper, we leverage the widely available visible range data to propose two visible to TIR domain translation approaches, one is modified U-Net based non-generative approach called TIR-UNet and the other is conditional diffusion based generative approach that also uses U-Net as neural backbone for synthesizing TIR images. Both the proposed methods have been evaluated on four benchmark datasets and demonstrate high qualitative as well as quantitative performance in generating perceptually realistic, visually plausible and high quality TIR equivalents of given visible images. Compared to state-of-the-art methods which include U-Net and powerful GAN variants, our methods achieve remarkable performance increase on the metrics of MSE, PSNR and SSIM for both day and night images.</div></div>\",\"PeriodicalId\":19268,\"journal\":{\"name\":\"Neurocomputing\",\"volume\":\"651 \",\"pages\":\"Article 131006\"},\"PeriodicalIF\":6.5000,\"publicationDate\":\"2025-07-14\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Neurocomputing\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0925231225016789\",\"RegionNum\":2,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Neurocomputing","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0925231225016789","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Multi spectral visible-thermal IR image translation using improved u-net & conditional diffusion
Abstract:
Translating images from the visible spectrum to the thermal infrared (TIR) domain to achieve precise and realistic representations of TIR images is a challenging task. Thermal infrared imaging is of great significance in scenarios where vision is severely impaired, especially under difficult lighting conditions such as night, haze, fog or cloudy weather. With these advantages, infrared imaging finds extensive applicability in navigation, surveillance, object detection, product inspection, agriculture as well as remote sensing. Building high-performance deep models for such a wide range of applications requires a large amount of TIR data for training. However, sufficient IR datasets are unavailable due to the high cost of thermal infrared camera setups. Since a large number of visible image datasets are available, this scarcity of TIR data can be addressed by translating visible images to their TIR counterparts. In this paper, we leverage the widely available visible-range data to propose two visible-to-TIR domain translation approaches: one is a modified U-Net based non-generative approach called TIR-UNet, and the other is a conditional diffusion based generative approach that also uses a U-Net as the neural backbone for synthesizing TIR images. Both proposed methods have been evaluated on four benchmark datasets and demonstrate high qualitative as well as quantitative performance in generating perceptually realistic, visually plausible and high-quality TIR equivalents of given visible images. Compared to state-of-the-art methods, which include U-Net and powerful GAN variants, our methods achieve a remarkable performance increase on the metrics of MSE, PSNR and SSIM for both day and night images.
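To make the conditional-diffusion idea in the abstract concrete, the sketch below shows one possible DDPM-style training step in PyTorch in which the visible image conditions the denoiser by channel-wise concatenation with the noisy TIR image. This is a minimal illustration, not the authors' code: the tiny convolutional stack stands in for the U-Net backbone, and the schedule length, channel counts and hyperparameters are assumptions the abstract does not specify.

```python
# Minimal sketch (assumed details, not the paper's implementation) of one
# DDPM-style training step for visible -> TIR translation: the visible image
# conditions the noise predictor via channel-wise concatenation.
import torch
import torch.nn as nn
import torch.nn.functional as F

T = 1000                                   # number of diffusion steps (assumed)
betas = torch.linspace(1e-4, 0.02, T)      # linear noise schedule (assumed)
alphas_cumprod = torch.cumprod(1.0 - betas, dim=0)

class TinyDenoiser(nn.Module):
    """Stand-in for the U-Net backbone: predicts the noise added to the TIR image."""
    def __init__(self, cond_ch=3, tir_ch=1, hidden=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(cond_ch + tir_ch + 1, hidden, 3, padding=1), nn.ReLU(),
            nn.Conv2d(hidden, hidden, 3, padding=1), nn.ReLU(),
            nn.Conv2d(hidden, tir_ch, 3, padding=1),
        )

    def forward(self, noisy_tir, visible, t):
        # Broadcast the normalized timestep as an extra input channel.
        t_map = (t.float() / T).view(-1, 1, 1, 1).expand(-1, 1, *noisy_tir.shape[2:])
        x = torch.cat([noisy_tir, visible, t_map], dim=1)  # conditioning by concatenation
        return self.net(x)

def training_step(model, visible, tir, optimizer):
    """One noise-prediction step: q-sample the TIR image, predict the noise, MSE loss."""
    b = tir.size(0)
    t = torch.randint(0, T, (b,))
    noise = torch.randn_like(tir)
    a_bar = alphas_cumprod[t].view(-1, 1, 1, 1)
    noisy_tir = a_bar.sqrt() * tir + (1.0 - a_bar).sqrt() * noise
    pred = model(noisy_tir, visible, t)
    loss = F.mse_loss(pred, noise)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

if __name__ == "__main__":
    model = TinyDenoiser()
    opt = torch.optim.Adam(model.parameters(), lr=1e-4)
    visible = torch.rand(2, 3, 64, 64)     # dummy RGB batch
    tir = torch.rand(2, 1, 64, 64)         # dummy single-channel TIR batch
    print(training_step(model, visible, tir, opt))
```

At inference time, a model trained this way would start from Gaussian noise and iteratively denoise while keeping the same visible-image conditioning at every step; the reported MSE, PSNR and SSIM would then be computed between the synthesized TIR image and its ground-truth counterpart.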
Journal introduction:
Neurocomputing publishes articles describing recent fundamental contributions in the field of neurocomputing. Neurocomputing theory, practice and applications are the essential topics covered.