{"title":"基于视觉转换器的CGAN的sar -光学图像转换","authors":"Shinyoung Park;Hojung Lee;Seongwook Lee","doi":"10.1109/JSEN.2025.3555933","DOIUrl":null,"url":null,"abstract":"In this article, we propose a modified conditional generative adversarial network (CGAN) for synthetic aperture radar (SAR)-to-optical image translation. SAR imaging is widely employed in remote sensing due to its all-weather capability, yet its inability to capture color information can limit the extraction of crucial details. Consequently, several attempts aim to convert SAR images into optical images to overcome these shortcomings. Our research adopts a multilevel approach that jointly incorporates high-level and low-level terrain extracted from SAR images and color features. Our proposed method is composed of multiscale vision Transformer (viT) blocks, enhanced loss function based on perceptual differences, and a two-phase transfer learning technique to guide the network properly. First, we obtain the multiaspect features of SAR images by using the intermediate viT processor and viT bottleneck. Second, we add a perceptual loss function that is driven from the pretrained VGG-19 network to overcome the blurring nature of traditional L1 loss. Finally, we employ a two-phase transfer learning using patched grayscale optical images and noise-removed images pre-processed by the Lucy-Richardson filter. Compared to previous studies that use only a small portion of the SEN1-2 dataset under 20000 image pairs, our model is trained on 75724 image pairs from this dataset. It ensures a significantly broader and more representative coverage of conditions and scenes compared to previous studies. The output image of our proposed method stands out with the highest peak signal-to-noise ratio (PSNR) of 15.96 dB and structural similarity index measure (SSIM) of 0.2805, along with the lowest mean-squared error (mse) 0.0363 and Fréchet inception distance (FID) score of 142.1333, proving that it generates superior images compared to conventional models in all respects.","PeriodicalId":447,"journal":{"name":"IEEE Sensors Journal","volume":"25 10","pages":"18503-18514"},"PeriodicalIF":4.3000,"publicationDate":"2025-04-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"SAR-to-Optical Image Translation Using Vision Transformer-Based CGAN\",\"authors\":\"Shinyoung Park;Hojung Lee;Seongwook Lee\",\"doi\":\"10.1109/JSEN.2025.3555933\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In this article, we propose a modified conditional generative adversarial network (CGAN) for synthetic aperture radar (SAR)-to-optical image translation. SAR imaging is widely employed in remote sensing due to its all-weather capability, yet its inability to capture color information can limit the extraction of crucial details. Consequently, several attempts aim to convert SAR images into optical images to overcome these shortcomings. Our research adopts a multilevel approach that jointly incorporates high-level and low-level terrain extracted from SAR images and color features. Our proposed method is composed of multiscale vision Transformer (viT) blocks, enhanced loss function based on perceptual differences, and a two-phase transfer learning technique to guide the network properly. First, we obtain the multiaspect features of SAR images by using the intermediate viT processor and viT bottleneck. 
Second, we add a perceptual loss function that is driven from the pretrained VGG-19 network to overcome the blurring nature of traditional L1 loss. Finally, we employ a two-phase transfer learning using patched grayscale optical images and noise-removed images pre-processed by the Lucy-Richardson filter. Compared to previous studies that use only a small portion of the SEN1-2 dataset under 20000 image pairs, our model is trained on 75724 image pairs from this dataset. It ensures a significantly broader and more representative coverage of conditions and scenes compared to previous studies. The output image of our proposed method stands out with the highest peak signal-to-noise ratio (PSNR) of 15.96 dB and structural similarity index measure (SSIM) of 0.2805, along with the lowest mean-squared error (mse) 0.0363 and Fréchet inception distance (FID) score of 142.1333, proving that it generates superior images compared to conventional models in all respects.\",\"PeriodicalId\":447,\"journal\":{\"name\":\"IEEE Sensors Journal\",\"volume\":\"25 10\",\"pages\":\"18503-18514\"},\"PeriodicalIF\":4.3000,\"publicationDate\":\"2025-04-03\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Sensors Journal\",\"FirstCategoryId\":\"103\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10948876/\",\"RegionNum\":2,\"RegionCategory\":\"综合性期刊\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"ENGINEERING, ELECTRICAL & ELECTRONIC\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Sensors Journal","FirstCategoryId":"103","ListUrlMain":"https://ieeexplore.ieee.org/document/10948876/","RegionNum":2,"RegionCategory":"综合性期刊","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"ENGINEERING, ELECTRICAL & ELECTRONIC","Score":null,"Total":0}
SAR-to-Optical Image Translation Using Vision Transformer-Based CGAN
Abstract:
In this article, we propose a modified conditional generative adversarial network (CGAN) for synthetic aperture radar (SAR)-to-optical image translation. SAR imaging is widely employed in remote sensing due to its all-weather capability, yet its inability to capture color information can limit the extraction of crucial details. Consequently, several studies have attempted to convert SAR images into optical images to overcome this shortcoming. Our research adopts a multilevel approach that jointly incorporates high- and low-level terrain features extracted from SAR images together with color features. The proposed method is composed of multiscale vision transformer (ViT) blocks, an enhanced loss function based on perceptual differences, and a two-phase transfer learning technique that guides the network properly. First, we obtain the multiaspect features of SAR images by using an intermediate ViT processor and a ViT bottleneck. Second, we add a perceptual loss function derived from the pretrained VGG-19 network to overcome the blurring nature of the traditional L1 loss. Finally, we employ two-phase transfer learning using patched grayscale optical images and noise-removed images preprocessed by the Lucy-Richardson filter. Whereas previous studies use only a small portion of the SEN1-2 dataset (under 20,000 image pairs), our model is trained on 75,724 image pairs from this dataset, ensuring significantly broader and more representative coverage of conditions and scenes. The output images of our proposed method stand out with the highest peak signal-to-noise ratio (PSNR) of 15.96 dB and structural similarity index measure (SSIM) of 0.2805, along with the lowest mean-squared error (MSE) of 0.0363 and Fréchet inception distance (FID) of 142.1333, showing that the method generates superior images compared to conventional models in all respects.
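The abstract describes combining a VGG-19-based perceptual loss with the traditional L1 loss, but does not state which feature layers or weights are used. The sketch below is a minimal PyTorch illustration of that general technique; the layer indices (ReLU outputs of VGG-19) and the 100:1 pixel-to-perceptual weighting are assumptions borrowed from common pix2pix-style practice, not the paper's actual configuration.

```python
# Minimal sketch of an L1 + VGG-19 perceptual loss in PyTorch.
# Layer indices and loss weights are illustrative assumptions,
# not the paper's reported settings.
import torch.nn as nn
from torchvision.models import vgg19, VGG19_Weights

class PerceptualL1Loss(nn.Module):
    def __init__(self, layer_ids=(3, 8, 17, 26), l1_weight=100.0, perc_weight=1.0):
        super().__init__()
        # Frozen ImageNet-pretrained VGG-19 feature extractor.
        self.vgg = vgg19(weights=VGG19_Weights.IMAGENET1K_V1).features.eval()
        for p in self.vgg.parameters():
            p.requires_grad_(False)
        self.layer_ids = set(layer_ids)  # ReLU layers after selected convs
        self.last = max(layer_ids)
        self.l1 = nn.L1Loss()
        self.l1_weight = l1_weight
        self.perc_weight = perc_weight

    def forward(self, fake, real):
        # Pixel-space L1 term (the part prone to blurring on its own).
        loss = self.l1_weight * self.l1(fake, real)
        # Feature-space terms; inputs assumed ImageNet-normalized RGB.
        x, y = fake, real
        for i, layer in enumerate(self.vgg):
            x, y = layer(x), layer(y)
            if i in self.layer_ids:
                loss = loss + self.perc_weight * self.l1(x, y)
            if i == self.last:
                break
        return loss
```

In a CGAN setup, this term would replace the plain L1 reconstruction term in the generator objective, while the adversarial term is computed separately by the discriminator.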
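The abstract also mentions preprocessing with the Lucy-Richardson filter but does not give the point-spread function (PSF) or iteration count. A rough sketch of such a step using scikit-image's richardson_lucy, with an assumed small Gaussian PSF and 30 iterations (both hypothetical), might look like:

```python
# Sketch of Lucy-Richardson deconvolution preprocessing with scikit-image.
# The PSF and iteration count are assumptions; the paper's settings may differ.
import numpy as np
from skimage import img_as_float
from skimage.restoration import richardson_lucy

def make_gaussian_psf(size=5, sigma=1.0):
    """Small normalized Gaussian kernel used as an assumed PSF."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    psf = np.exp(-(xx**2 + yy**2) / (2.0 * sigma**2))
    return psf / psf.sum()

def preprocess(gray_image, num_iter=30):
    """Deconvolve a grayscale image with Richardson-Lucy iteration."""
    img = img_as_float(gray_image)
    return richardson_lucy(img, make_gaussian_psf(), num_iter=num_iter)
```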
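For reference, the reported PSNR, SSIM, and MSE figures can be reproduced on generated/reference pairs with standard scikit-image metrics, as sketched below (FID is typically computed separately over the whole image set, e.g., with torchmetrics' FrechetInceptionDistance); the [0, 1] float range here is an assumption.

```python
# Computing the reported per-image quality metrics with scikit-image.
from skimage.metrics import (
    peak_signal_noise_ratio,
    structural_similarity,
    mean_squared_error,
)

def evaluate_pair(optical, generated):
    """Compare a generated image against its reference optical image.

    Both inputs are assumed to be float arrays in [0, 1], shape (H, W, 3).
    """
    psnr = peak_signal_noise_ratio(optical, generated, data_range=1.0)
    ssim = structural_similarity(optical, generated, data_range=1.0,
                                 channel_axis=-1)
    mse = mean_squared_error(optical, generated)
    return psnr, ssim, mse
```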
Journal Introduction:
The fields of interest of the IEEE Sensors Journal are the theory, design, fabrication, manufacturing, and applications of devices for sensing and transducing physical, chemical, and biological phenomena, with emphasis on the electronics and physics aspects of sensors and integrated sensors-actuators. IEEE Sensors Journal deals with the following:
-Sensor Phenomenology, Modelling, and Evaluation
-Sensor Materials, Processing, and Fabrication
-Chemical and Gas Sensors
-Microfluidics and Biosensors
-Optical Sensors
-Physical Sensors: Temperature, Mechanical, Magnetic, and others
-Acoustic and Ultrasonic Sensors
-Sensor Packaging
-Sensor Networks
-Sensor Applications
-Sensor Systems: Signals, Processing, and Interfaces
-Actuators and Sensor Power Systems
-Sensor Signal Processing for high precision and stability (amplification, filtering, linearization, modulation/demodulation) and under harsh conditions (EMC, radiation, humidity, temperature); energy consumption/harvesting
-Sensor Data Processing (soft computing with sensor data, e.g., pattern recognition, machine learning, evolutionary computation; sensor data fusion; processing of wave (e.g., electromagnetic and acoustic) and non-wave (e.g., chemical, gravity, particle, thermal, radiative and non-radiative) sensor data; detection, estimation, and classification based on sensor data)
-Sensors in Industrial Practice