CR-former: Single-Image Cloud Removal With Focused Taylor Attention

IF 7.5 | Tier 1 (Earth Science) | Q1 ENGINEERING, ELECTRICAL & ELECTRONIC
Yang Wu;Ye Deng;Sanping Zhou;Yuhan Liu;Wenli Huang;Jinjun Wang
{"title":"CR-former:通过聚焦泰勒注意力去除单张图像上的云雾","authors":"Yang Wu;Ye Deng;Sanping Zhou;Yuhan Liu;Wenli Huang;Jinjun Wang","doi":"10.1109/TGRS.2024.3506780","DOIUrl":null,"url":null,"abstract":"Cloud removal aims to restore high-quality images from cloud-contaminated captures, which is essential in remote sensing applications. Effectively modeling the long-range relationships between image features is key to achieving high-quality cloud-free images. While self-attention mechanisms excel at modeling long-distance relationships, their computational complexity scales quadratically with image resolution, limiting their applicability to high-resolution remote sensing images. Current cloud removal methods have mitigated this issue by restricting the global receptive field to smaller regions or adopting channel attention to model long-range relationships. However, these methods either compromise pixel-level long-range dependencies or lose spatial information, potentially leading to structural inconsistencies in restored images. In this work, we propose the focused Taylor attention (FT-Attention), which captures pixel-level long-range relationships without limiting the spatial extent of attention and achieves the \n<inline-formula> <tex-math>$\\mathcal {O}(N)$ </tex-math></inline-formula>\n computational complexity, where N represents the image resolution. Specifically, we utilize Taylor series expansions to reduce the computational complexity of the attention mechanism from \n<inline-formula> <tex-math>$\\mathcal {O}(N^{2})$ </tex-math></inline-formula>\n to \n<inline-formula> <tex-math>$\\mathcal {O}(N)$ </tex-math></inline-formula>\n, enabling efficient capture of pixel relationships directly in high-resolution images. Additionally, to fully leverage the informative pixel, we develop a new normalization function for the query and key, which produces more distinguishable attention weights, enhancing focus on important features. Building on FT-Attention, we design a U-net style network, termed the CR-former, specifically for cloud removal. Extensive experimental results on representative cloud removal datasets demonstrate the superior performance of our CR-former. The code is available at \n<uri>https://github.com/wuyang2691/CR-former</uri>\n.","PeriodicalId":13213,"journal":{"name":"IEEE Transactions on Geoscience and Remote Sensing","volume":"62 ","pages":"1-14"},"PeriodicalIF":7.5000,"publicationDate":"2024-11-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"CR-former: Single-Image Cloud Removal With Focused Taylor Attention\",\"authors\":\"Yang Wu;Ye Deng;Sanping Zhou;Yuhan Liu;Wenli Huang;Jinjun Wang\",\"doi\":\"10.1109/TGRS.2024.3506780\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Cloud removal aims to restore high-quality images from cloud-contaminated captures, which is essential in remote sensing applications. Effectively modeling the long-range relationships between image features is key to achieving high-quality cloud-free images. While self-attention mechanisms excel at modeling long-distance relationships, their computational complexity scales quadratically with image resolution, limiting their applicability to high-resolution remote sensing images. Current cloud removal methods have mitigated this issue by restricting the global receptive field to smaller regions or adopting channel attention to model long-range relationships. 
However, these methods either compromise pixel-level long-range dependencies or lose spatial information, potentially leading to structural inconsistencies in restored images. In this work, we propose the focused Taylor attention (FT-Attention), which captures pixel-level long-range relationships without limiting the spatial extent of attention and achieves the \\n<inline-formula> <tex-math>$\\\\mathcal {O}(N)$ </tex-math></inline-formula>\\n computational complexity, where N represents the image resolution. Specifically, we utilize Taylor series expansions to reduce the computational complexity of the attention mechanism from \\n<inline-formula> <tex-math>$\\\\mathcal {O}(N^{2})$ </tex-math></inline-formula>\\n to \\n<inline-formula> <tex-math>$\\\\mathcal {O}(N)$ </tex-math></inline-formula>\\n, enabling efficient capture of pixel relationships directly in high-resolution images. Additionally, to fully leverage the informative pixel, we develop a new normalization function for the query and key, which produces more distinguishable attention weights, enhancing focus on important features. Building on FT-Attention, we design a U-net style network, termed the CR-former, specifically for cloud removal. Extensive experimental results on representative cloud removal datasets demonstrate the superior performance of our CR-former. The code is available at \\n<uri>https://github.com/wuyang2691/CR-former</uri>\\n.\",\"PeriodicalId\":13213,\"journal\":{\"name\":\"IEEE Transactions on Geoscience and Remote Sensing\",\"volume\":\"62 \",\"pages\":\"1-14\"},\"PeriodicalIF\":7.5000,\"publicationDate\":\"2024-11-26\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Transactions on Geoscience and Remote Sensing\",\"FirstCategoryId\":\"5\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10767603/\",\"RegionNum\":1,\"RegionCategory\":\"地球科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"ENGINEERING, ELECTRICAL & ELECTRONIC\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Geoscience and Remote Sensing","FirstCategoryId":"5","ListUrlMain":"https://ieeexplore.ieee.org/document/10767603/","RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"ENGINEERING, ELECTRICAL & ELECTRONIC","Score":null,"Total":0}
Citations: 0

Abstract

Cloud removal aims to restore high-quality images from cloud-contaminated captures, which is essential in remote sensing applications. Effectively modeling the long-range relationships between image features is key to achieving high-quality cloud-free images. While self-attention mechanisms excel at modeling long-distance relationships, their computational complexity scales quadratically with image resolution, limiting their applicability to high-resolution remote sensing images. Current cloud removal methods have mitigated this issue by restricting the global receptive field to smaller regions or adopting channel attention to model long-range relationships. However, these methods either compromise pixel-level long-range dependencies or lose spatial information, potentially leading to structural inconsistencies in restored images. In this work, we propose the focused Taylor attention (FT-Attention), which captures pixel-level long-range relationships without limiting the spatial extent of attention and achieves the $\mathcal {O}(N)$ computational complexity, where N represents the image resolution. Specifically, we utilize Taylor series expansions to reduce the computational complexity of the attention mechanism from $\mathcal {O}(N^{2})$ to $\mathcal {O}(N)$ , enabling efficient capture of pixel relationships directly in high-resolution images. Additionally, to fully leverage the informative pixel, we develop a new normalization function for the query and key, which produces more distinguishable attention weights, enhancing focus on important features. Building on FT-Attention, we design a U-net style network, termed the CR-former, specifically for cloud removal. Extensive experimental results on representative cloud removal datasets demonstrate the superior performance of our CR-former. The code is available at https://github.com/wuyang2691/CR-former .
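The abstract's efficiency claim rests on replacing the exact softmax kernel with a low-order Taylor expansion, which lets the attention computation be rearranged so that a key-value summary is aggregated once before any query touches it, making the cost linear in the number of pixels. The following is a minimal PyTorch sketch of that general idea, not the authors' implementation (see the linked repository for that): the function name, the first-order truncation exp(q·k) ≈ 1 + q·k, and the plain L2 normalization of queries and keys (a stand-in for the paper's focused normalization, whose exact form the abstract does not give) are all illustrative assumptions.

```python
import torch
import torch.nn.functional as F


def taylor_linear_attention(q, k, v, eps=1e-6):
    """Hypothetical sketch of linear attention via a first-order Taylor
    expansion of the softmax kernel: exp(q.k) ~ 1 + q.k.

    q, k, v: (batch, heads, n, d) tensors, where n is the number of
    pixels/tokens. Forming the (d x d) key-value summary first makes the
    cost O(n * d^2) instead of O(n^2 * d), i.e. linear in resolution n.
    """
    # L2-normalize q and k so the first-order expansion stays well behaved
    # (illustrative stand-in for the paper's focused normalization).
    q = F.normalize(q, dim=-1)
    k = F.normalize(k, dim=-1)

    # Numerator: sum_j (1 + q.k_j) v_j = sum_j v_j + q @ (K^T V)
    kv = torch.einsum('bhnd,bhne->bhde', k, v)                 # (b, h, d, d_v)
    num = v.sum(dim=2, keepdim=True) + torch.einsum('bhnd,bhde->bhne', q, kv)

    # Denominator: sum_j (1 + q.k_j) = n + q @ (sum of keys)
    k_sum = k.sum(dim=2)                                        # (b, h, d)
    den = k.shape[2] + torch.einsum('bhnd,bhd->bhn', q, k_sum)  # (b, h, n)

    return num / (den.unsqueeze(-1) + eps)


if __name__ == "__main__":
    b, h, n, d = 2, 4, 64 * 64, 32      # n = H*W pixels of a feature map
    q = torch.randn(b, h, n, d)
    k = torch.randn(b, h, n, d)
    v = torch.randn(b, h, n, d)
    out = taylor_linear_attention(q, k, v)
    print(out.shape)                     # torch.Size([2, 4, 4096, 32])
```

Because the key-value summary has shape (d, d) regardless of how many pixels there are, memory and compute grow with the token count N rather than N^2, which is what makes full-image, pixel-level attention feasible at remote sensing resolutions.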
Source Journal
IEEE Transactions on Geoscience and Remote Sensing (Engineering & Technology: Geochemistry & Geophysics)
CiteScore: 11.50
Self-citation rate: 28.00%
Articles published: 1912
Review time: 4.0 months
Journal Description: IEEE Transactions on Geoscience and Remote Sensing (TGRS) is a monthly publication that focuses on the theory, concepts, and techniques of science and engineering as applied to sensing the land, oceans, atmosphere, and space; and the processing, interpretation, and dissemination of this information.