Licheng Liu, Qibin Zhang, Tingyun Liu, C. L. Philip Chen
IEEE Transactions on Neural Networks and Learning Systems, vol. 33, no. 1
DOI: 10.1109/tnnls.2025.3577669
Published: 2025-06-23
TC3Net: Transformer and Convolution Coupled Contrastive Network for Single Image Super-Resolution.
Convolutional neural networks (CNNs) and transformers have attracted significant attention in the field of single image super-resolution (SISR) owing to their powerful capacity for nonlinear feature extraction. Nonetheless, both types of approaches have their own limitations: the interaction between convolutional kernels and image content is content-agnostic in CNNs, while the computational complexity of the transformer grows quadratically with spatial resolution. To address these concerns, this article proposes a novel unified framework, the transformer and convolution coupled contrastive network (TC3Net), for SISR, which adopts a triple-branch structure to integrate the merits of both CNNs and transformers. The proposed TC3Net is mainly composed of stacked CNN feature extraction (CFE) blocks, transformer feature extraction (TFE) blocks, and coupled contrastive blocks (CCBs) for diverse feature extraction. In particular, the CCB, which consists of a coupled attention block (CAB) and a local-global feature extraction (LGFE) block, is designed to fuse feature maps and extract coupled information for better image reconstruction. Moreover, a contrastive loss between the transformer and CNN feature maps is introduced to enhance their discriminative characteristics and complement the fused features. Experimental results demonstrate that TC3Net outperforms several state-of-the-art (SOTA) methods, achieving a better balance between model size and performance.
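The abstract does not give the exact form of the contrastive loss between the transformer and CNN feature maps, but a common choice for pulling corresponding features from two branches together while pushing non-corresponding ones apart is an InfoNCE-style objective. The sketch below is a hypothetical simplification, not the paper's implementation: `contrastive_loss`, the flattened `[batch, dim]` feature inputs, and the `temperature` parameter are all assumptions made for illustration.

```python
import numpy as np

def contrastive_loss(f_cnn, f_trans, temperature=0.5):
    """InfoNCE-style contrastive loss between flattened CNN and
    transformer features of shape [batch, dim]. Row i of each input
    is assumed to come from the same image (the positive pair);
    all other rows serve as negatives."""
    # L2-normalize each feature vector so similarities are cosines
    a = f_cnn / np.linalg.norm(f_cnn, axis=1, keepdims=True)
    b = f_trans / np.linalg.norm(f_trans, axis=1, keepdims=True)
    # pairwise cosine similarities, scaled by temperature
    logits = a @ b.T / temperature
    # log-softmax over each row; positives lie on the diagonal
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    idx = np.arange(len(a))
    return -log_prob[idx, idx].mean()
```

Under this formulation, aligned CNN/transformer features for the same image yield a lower loss than mismatched ones, which is the sense in which such a loss encourages the two branches to produce complementary yet consistent representations.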
Journal Introduction:
The focus of IEEE Transactions on Neural Networks and Learning Systems is to present scholarly articles discussing the theory, design, and applications of neural networks as well as other learning systems. The journal primarily highlights technical and scientific research in this domain.