Kexin Zhang, Lingling Li, Licheng Jiao, Xu Liu, Wenping Ma, Fang Liu, Shuyuan Yang
Title: CSCT: Channel–Spatial Coherent Transformer for Remote Sensing Image Super-Resolution
DOI: 10.1109/TGRS.2025.3540260
Journal: IEEE Transactions on Geoscience and Remote Sensing, vol. 63, pp. 1–14 (JCR Q1, Engineering, Electrical & Electronic; IF 8.6)
Published: 2025-02-10
URL: https://ieeexplore.ieee.org/document/10879083/
Citations: 0
Abstract
Remote sensing image super-resolution (RSISR) techniques are crucial in practice as an economical approach to enhancing the resolution of remote sensing images (RSIs). The scale of structural information and the richness of texture details in RSIs far exceed those in natural images. Therefore, accurately restoring and preserving edge and detail information is a critical challenge in the super-resolution (SR) process. Currently, convolutional neural network (CNN)-based methods rely primarily on local feature extraction, which fails to effectively capture and integrate global contextual information. Generative adversarial network (GAN)-based methods, while improving visual quality, often suffer from artifacts and training instability, adversely affecting image quality. Moreover, these approaches struggle to accurately represent high-frequency features, leading to blurriness or distortion when reconstructing fine details and edges. To address these limitations, we introduce the channel–spatial coherent transformer (CSCT). The core of CSCT comprises the channel–spatial coherent attention (CSCA) and the frequency-gated feed-forward network (FGFN), which work synergistically to enhance edge and detail preservation while significantly improving overall image clarity. CSCA efficiently aggregates channel and spatial information, while FGFN adaptively adjusts frequency information to enhance high-frequency details and suppress low-frequency noise. In addition, this article leverages advanced data augmentation methods that markedly boost RSISR performance, offering new avenues for further exploration. The empirical analysis across several remote sensing SR benchmark datasets reveals that our approach excels in detail restoration, effectively reduces artifacts and noise, and significantly enhances the quality of SR images.
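The abstract describes FGFN as adaptively reweighting frequency components to boost high-frequency detail and suppress low-frequency noise, but gives no equations. A minimal conceptual sketch of that idea, in numpy, is a radial gate in the 2-D Fourier domain; the function name, cutoff, and gain values below are all hypothetical illustrations, not the paper's actual learned gating.

```python
import numpy as np

def frequency_gate(img, cutoff=0.25, high_boost=1.5, low_suppress=0.5):
    """Conceptual frequency gating (illustrative, not the paper's FGFN):
    amplify spectral components above a radial cutoff, attenuate those below."""
    H, W = img.shape
    # 2-D spectrum, centered so low frequencies sit in the middle
    F = np.fft.fftshift(np.fft.fft2(img))
    # Normalized radial frequency grid (0 at center, ~0.7 at corners)
    fy = np.fft.fftshift(np.fft.fftfreq(H))
    fx = np.fft.fftshift(np.fft.fftfreq(W))
    r = np.sqrt(fy[:, None] ** 2 + fx[None, :] ** 2)
    # Hard two-level gate; a learned FGFN would predict this per feature map
    gate = np.where(r > cutoff, high_boost, low_suppress)
    return np.fft.ifft2(np.fft.ifftshift(F * gate)).real

img = np.random.rand(32, 32)
out = frequency_gate(img)
print(out.shape)
```

In the actual network such a gate would be predicted from the features themselves rather than fixed, which is what "adaptively adjusts frequency information" suggests.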
Journal description:
IEEE Transactions on Geoscience and Remote Sensing (TGRS) is a monthly publication that focuses on the theory, concepts, and techniques of science and engineering as applied to sensing the land, oceans, atmosphere, and space; and the processing, interpretation, and dissemination of this information.