Diverse Visible-to-Thermal Image Translation via Controllable Temperature Encoding
Authors: Lei Zhao; Mengwei Li; Bo Li; Xingxing Wei
Journal: IEEE Transactions on Multimedia, vol. 27, pp. 5685-5695 (published 2025-02-21)
Impact factor: 9.7; JCR: Q1 (Computer Science, Information Systems)
DOI: 10.1109/TMM.2025.3543053
URL: https://ieeexplore.ieee.org/document/10897888/
Translating readily available visible (VIS) images into thermal infrared (TIR) images helps alleviate the shortage of TIR data. Although current methods achieve commendable results, they fall short in generating diverse and realistic TIR images, primarily because they give insufficient consideration to temperature variations. In this paper, we propose a Thermally Controlled GAN (TC-GAN) that uses VIS images to generate diverse TIR images, with the ability to control the relative temperatures of multiple objects, particularly those whose temperatures vary. First, we introduce a physical coding module, which employs a conditional variational autoencoder GAN to learn the distributions of the objects' relative temperature information and of the environmental state information. Physical information is then obtained by sampling from these distributions and, when fused with the visible image, enables the generation of diverse TIR images. To ensure authenticity and strengthen the physical constraints across different regions of the image, we introduce a self-attention mechanism in the generator that prioritizes the relative temperature relationships within the image. Additionally, we use a local discriminator that focuses on objects with actively changing temperatures and their interactions with the surrounding environment, thereby reducing the discontinuity between the target and the background. Experiments on the Drone Vehicle and AVIID datasets show that our approach outperforms mainstream diverse-generation methods in both authenticity and diversity.
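The sample-then-fuse step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the sampled physical information is fused with the visible image, so the sketch assumes a common choice for conditional VAE-GANs, namely drawing a code with the reparameterization trick and appending each code entry to the image as a constant channel. The names `sample_code` and `fuse`, the 2-D code, and the toy image size are all hypothetical.

```python
import math
import random


def sample_code(mu, log_var, rng):
    """Reparameterized draw z = mu + sigma * eps, with eps ~ N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def fuse(image, code):
    """Append each code entry as a constant H x W plane to a C x H x W image.

    Assumption: channel-wise concatenation; the paper may fuse differently.
    """
    height, width = len(image[0]), len(image[0][0])
    planes = [[[value] * width for _ in range(height)] for value in code]
    return image + planes


rng = random.Random(0)
vis = [[[0.5] * 4 for _ in range(3)] for _ in range(3)]  # toy 3-channel, 3x4 image
mu, log_var = [0.0, 0.0], [0.0, 0.0]  # hypothetical 2-D code distribution

# Two draws from the same distribution give two different generator inputs.
fused_a = fuse(vis, sample_code(mu, log_var, rng))
fused_b = fuse(vis, sample_code(mu, log_var, rng))
print(len(fused_a), len(fused_a[0]), len(fused_a[0][0]))  # prints 5 3 4
```

Under this assumption, diversity comes entirely from the sampled code: the visible channels are identical in both inputs, and only the appended planes differ, so a generator conditioned on them can produce different TIR renderings of the same scene.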
About the journal:
The IEEE Transactions on Multimedia delves into diverse aspects of multimedia technology and applications, covering circuits, networking, signal processing, systems, software, and systems integration. The scope aligns with the Fields of Interest of the sponsors, ensuring a comprehensive exploration of research in multimedia.