CR-former: Single-Image Cloud Removal With Focused Taylor Attention

Yang Wu; Ye Deng; Sanping Zhou; Yuhan Liu; Wenli Huang; Jinjun Wang

IEEE Transactions on Geoscience and Remote Sensing, vol. 62, pp. 1-14. Published 2024-11-26. DOI: 10.1109/TGRS.2024.3506780. https://ieeexplore.ieee.org/document/10767603/
Abstract
Cloud removal aims to restore high-quality images from cloud-contaminated captures, which is essential in remote sensing applications. Effectively modeling the long-range relationships between image features is key to achieving high-quality cloud-free images. While self-attention mechanisms excel at modeling long-distance relationships, their computational complexity scales quadratically with image resolution, limiting their applicability to high-resolution remote sensing images. Current cloud removal methods mitigate this issue by restricting the global receptive field to smaller regions or by adopting channel attention to model long-range relationships. However, these methods either compromise pixel-level long-range dependencies or lose spatial information, potentially leading to structural inconsistencies in restored images. In this work, we propose focused Taylor attention (FT-Attention), which captures pixel-level long-range relationships without limiting the spatial extent of attention and achieves $\mathcal{O}(N)$ computational complexity, where $N$ denotes the image resolution. Specifically, we utilize Taylor series expansions to reduce the computational complexity of the attention mechanism from $\mathcal{O}(N^{2})$ to $\mathcal{O}(N)$, enabling efficient capture of pixel relationships directly in high-resolution images. Additionally, to fully leverage informative pixels, we develop a new normalization function for the query and key, which produces more distinguishable attention weights and enhances focus on important features. Building on FT-Attention, we design a U-Net-style network, termed CR-former, specifically for cloud removal. Extensive experimental results on representative cloud removal datasets demonstrate the superior performance of our CR-former. The code is available at https://github.com/wuyang2691/CR-former.
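The abstract only outlines the mechanism, but the core idea of linearizing attention with a Taylor expansion can be illustrated compactly. The PyTorch sketch below approximates the softmax kernel by its first-order expansion, $\exp(q \cdot k) \approx 1 + q \cdot k$, which makes the similarity separable so that key-value statistics can be aggregated once and reused for every query, giving $\mathcal{O}(N)$ cost in the number of pixels $N$. The function name, shapes, and the L2 normalization used here as a stand-in for the paper's focused normalization of the query and key are illustrative assumptions, not the authors' implementation; see the linked repository for the actual CR-former code.

```python
import torch
import torch.nn.functional as F


def taylor_linear_attention(q, k, v, eps=1e-6):
    """Hypothetical sketch of first-order Taylor linear attention.

    Uses exp(q . k) ~= 1 + q . k so the kernel factorizes and the
    (key, value) products can be aggregated once, avoiding the N x N
    attention matrix. Inputs have shape (batch, heads, N, dim).
    """
    # Normalize q and k so each expanded term 1 + q . k stays non-negative
    # and numerically stable (stand-in for the paper's focused normalization).
    q = F.normalize(q, dim=-1)
    k = F.normalize(k, dim=-1)

    b, h, n, d = q.shape

    # Aggregate key/value statistics once: O(N * d) and O(N * d^2).
    k_sum = k.sum(dim=2)                          # (b, h, d)
    kv = torch.einsum('bhnd,bhne->bhde', k, v)    # (b, h, d, d_v)
    v_sum = v.sum(dim=2)                          # (b, h, d_v)

    # Numerator: sum_j (1 + q_i . k_j) v_j = sum_j v_j + q_i (sum_j k_j v_j^T)
    num = v_sum.unsqueeze(2) + torch.einsum('bhnd,bhde->bhne', q, kv)

    # Denominator: sum_j (1 + q_i . k_j) = N + q_i . (sum_j k_j)
    den = n + torch.einsum('bhnd,bhd->bhn', q, k_sum)

    return num / (den.unsqueeze(-1) + eps)


if __name__ == '__main__':
    # Toy example: 1024 "pixels" per head, 32-dimensional features.
    q = torch.randn(1, 4, 1024, 32)
    k = torch.randn(1, 4, 1024, 32)
    v = torch.randn(1, 4, 1024, 32)
    out = taylor_linear_attention(q, k, v)
    print(out.shape)  # torch.Size([1, 4, 1024, 32])
```

Because the aggregated tensors `k_sum` and `kv` do not depend on the query index, the per-query cost is independent of the sequence length, which is what makes the approach practical for high-resolution imagery.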
About the journal
IEEE Transactions on Geoscience and Remote Sensing (TGRS) is a monthly publication that focuses on the theory, concepts, and techniques of science and engineering as applied to sensing the land, oceans, atmosphere, and space; and the processing, interpretation, and dissemination of this information.