SAR-to-Optical Image Translation Using Vision Transformer-Based CGAN

IF 4.3 · CAS Tier 2 (Comprehensive) · JCR Q1 · ENGINEERING, ELECTRICAL & ELECTRONIC
Shinyoung Park, Hojung Lee, Seongwook Lee
{"title":"SAR-to-Optical Image Translation Using Vision Transformer-Based CGAN","authors":"Shinyoung Park;Hojung Lee;Seongwook Lee","doi":"10.1109/JSEN.2025.3555933","DOIUrl":null,"url":null,"abstract":"In this article, we propose a modified conditional generative adversarial network (CGAN) for synthetic aperture radar (SAR)-to-optical image translation. SAR imaging is widely employed in remote sensing due to its all-weather capability, yet its inability to capture color information can limit the extraction of crucial details. Consequently, several attempts aim to convert SAR images into optical images to overcome these shortcomings. Our research adopts a multilevel approach that jointly incorporates high-level and low-level terrain extracted from SAR images and color features. Our proposed method is composed of multiscale vision Transformer (viT) blocks, enhanced loss function based on perceptual differences, and a two-phase transfer learning technique to guide the network properly. First, we obtain the multiaspect features of SAR images by using the intermediate viT processor and viT bottleneck. Second, we add a perceptual loss function that is driven from the pretrained VGG-19 network to overcome the blurring nature of traditional L1 loss. Finally, we employ a two-phase transfer learning using patched grayscale optical images and noise-removed images pre-processed by the Lucy-Richardson filter. Compared to previous studies that use only a small portion of the SEN1-2 dataset under 20000 image pairs, our model is trained on 75724 image pairs from this dataset. It ensures a significantly broader and more representative coverage of conditions and scenes compared to previous studies. The output image of our proposed method stands out with the highest peak signal-to-noise ratio (PSNR) of 15.96 dB and structural similarity index measure (SSIM) of 0.2805, along with the lowest mean-squared error (mse) 0.0363 and Fréchet inception distance (FID) score of 142.1333, proving that it generates superior images compared to conventional models in all respects.","PeriodicalId":447,"journal":{"name":"IEEE Sensors Journal","volume":"25 10","pages":"18503-18514"},"PeriodicalIF":4.3000,"publicationDate":"2025-04-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Sensors Journal","FirstCategoryId":"103","ListUrlMain":"https://ieeexplore.ieee.org/document/10948876/","RegionNum":2,"RegionCategory":"综合性期刊","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"ENGINEERING, ELECTRICAL & ELECTRONIC","Score":null,"Total":0}
Citations: 0

Abstract

In this article, we propose a modified conditional generative adversarial network (CGAN) for synthetic aperture radar (SAR)-to-optical image translation. SAR imaging is widely employed in remote sensing due to its all-weather capability, yet its inability to capture color information can limit the extraction of crucial details. Consequently, several studies have attempted to convert SAR images into optical images to overcome this shortcoming. Our research adopts a multilevel approach that jointly incorporates high- and low-level terrain features extracted from SAR images together with color features. The proposed method is composed of multiscale vision Transformer (ViT) blocks, an enhanced loss function based on perceptual differences, and a two-phase transfer learning technique that guides the network properly. First, we obtain multiaspect features of SAR images by using an intermediate ViT processor and a ViT bottleneck. Second, we add a perceptual loss function derived from the pretrained VGG-19 network to overcome the blurring tendency of the traditional L1 loss. Finally, we employ two-phase transfer learning using patched grayscale optical images and noise-removed images preprocessed with the Lucy-Richardson filter. Whereas previous studies use only a small portion of the SEN1-2 dataset (fewer than 20,000 image pairs), our model is trained on 75,724 image pairs from this dataset, ensuring significantly broader and more representative coverage of conditions and scenes. The output of the proposed method achieves the highest peak signal-to-noise ratio (PSNR) of 15.96 dB and structural similarity index measure (SSIM) of 0.2805, along with the lowest mean-squared error (MSE) of 0.0363 and Fréchet inception distance (FID) of 142.1333, showing that it generates superior images compared to conventional models in all respects.
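The abstract describes the perceptual loss only as being derived from a pretrained VGG-19. Below is a minimal PyTorch sketch of one common formulation: the L1 distance between frozen VGG-19 feature maps at assumed cut points (relu1_2, relu2_2, relu3_4). The layer indices, the equal weighting of the terms, and the PerceptualLoss name are illustrative assumptions, not the authors' exact configuration.

```python
# Minimal sketch of a VGG-19 perceptual loss (a common formulation;
# the paper's exact layers and weights are not specified in the abstract).
import torch
import torch.nn as nn
from torchvision.models import vgg19, VGG19_Weights

class PerceptualLoss(nn.Module):
    def __init__(self, layer_ids=(3, 8, 17)):  # assumed cut points: relu1_2, relu2_2, relu3_4
        super().__init__()
        features = vgg19(weights=VGG19_Weights.IMAGENET1K_V1).features.eval()
        self.slices = nn.ModuleList()
        prev = 0
        for idx in layer_ids:
            self.slices.append(nn.Sequential(*features[prev:idx + 1]))
            prev = idx + 1
        for p in self.parameters():
            p.requires_grad_(False)  # VGG-19 stays a fixed feature extractor

    def forward(self, generated, target):
        # Inputs: 3-channel images, ImageNet-normalized (normalization omitted here).
        loss, x, y = 0.0, generated, target
        for block in self.slices:
            x, y = block(x), block(y)
            loss = loss + nn.functional.l1_loss(x, y)
        return loss
```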
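Similarly, the Lucy-Richardson noise-removal step can be sketched with scikit-image's Richardson-Lucy deconvolution (scikit-image >= 0.19 uses the num_iter keyword; older releases use iterations). The 5x5 uniform PSF and the iteration count below are hypothetical stand-ins, since the abstract does not give the paper's settings.

```python
# Sketch of the Lucy-Richardson preprocessing step using scikit-image;
# the PSF shape and iteration count are assumptions.
import numpy as np
from skimage import restoration

def lucy_richardson_clean(image: np.ndarray, num_iter: int = 30) -> np.ndarray:
    """Deconvolve a grayscale image scaled to [0, 1] with an assumed 5x5 uniform PSF."""
    psf = np.ones((5, 5)) / 25.0  # hypothetical point-spread function
    return restoration.richardson_lucy(image, psf, num_iter=num_iter)
```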
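For reference, the reported per-image metrics (MSE, PSNR, SSIM) map directly onto scikit-image's metrics module, as in the sketch below; FID requires a separate Inception-based pipeline and is omitted. The evaluate_pair helper and the [0, 1] scaling are assumptions for illustration.

```python
# Computing the reported per-image metrics with scikit-image
# (FID needs an Inception-based pipeline and is not shown).
import numpy as np
from skimage.metrics import (mean_squared_error,
                             peak_signal_noise_ratio,
                             structural_similarity)

def evaluate_pair(generated: np.ndarray, reference: np.ndarray) -> dict:
    """Both images are float arrays scaled to [0, 1] with shape (H, W, 3)."""
    return {
        "mse": mean_squared_error(reference, generated),
        "psnr_db": peak_signal_noise_ratio(reference, generated, data_range=1.0),
        "ssim": structural_similarity(reference, generated,
                                      data_range=1.0, channel_axis=-1),
    }
```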
Source Journal

IEEE Sensors Journal (Engineering: Electrical & Electronic)
CiteScore: 7.70
Self-citation rate: 14.00%
Annual articles: 2058
Review time: 5.2 months
Journal Description

The fields of interest of the IEEE Sensors Journal are the theory, design, fabrication, manufacturing, and applications of devices for sensing and transducing physical, chemical, and biological phenomena, with emphasis on the electronics and physics aspects of sensors and integrated sensors-actuators. IEEE Sensors Journal deals with the following:

- Sensor Phenomenology, Modelling, and Evaluation
- Sensor Materials, Processing, and Fabrication
- Chemical and Gas Sensors
- Microfluidics and Biosensors
- Optical Sensors
- Physical Sensors: Temperature, Mechanical, Magnetic, and others
- Acoustic and Ultrasonic Sensors
- Sensor Packaging
- Sensor Networks
- Sensor Applications
- Sensor Systems: Signals, Processing, and Interfaces
- Actuators and Sensor Power Systems
- Sensor Signal Processing for high precision and stability (amplification, filtering, linearization, modulation/demodulation) and under harsh conditions (EMC, radiation, humidity, temperature); energy consumption/harvesting
- Sensor Data Processing (soft computing with sensor data, e.g., pattern recognition, machine learning, evolutionary computation; sensor data fusion; processing of wave, e.g., electromagnetic and acoustic, and non-wave, e.g., chemical, gravity, particle, thermal, radiative and non-radiative, sensor data; detection, estimation, and classification based on sensor data)
- Sensors in Industrial Practice