{"title":"TCC-SemCom: A Transformer-CNN Complementary Block-Based Image Semantic Communication","authors":"Guo Cheng;Baolin Chong;Hancheng Lu","doi":"10.1109/LCOMM.2025.3538486","DOIUrl":null,"url":null,"abstract":"Semantic communication (SemCom), as a paradigm beyond bit communication, is regarded as an effective solution to address the challenges posed by the growing volume of vision-based traffic. Existing semantic image communication methods are mostly based on convolutional neural networks (CNNs) or Transformers, which focus on different structural semantics. Specifically, CNNs with local convolution operations excel at capturing local semantic features, while Transformers based on multi-head attention mechanism, are better at modeling long-range dependencies and global semantic information. To effectively fuse these two models and leverage both advantages, we propose a parallel Transformer-CNN complementary (TCC) block, where CNNs and Transformers are combined to enhance the extraction of both local and global semantic information. Furthermore, we propose a TCC-based SemCom (TCC-SemCom) scheme for wireless image transmission. 
Experimental results verify that TCC-SemCom significantly outperforms existing schemes in terms of peak signal-to-noise ratio (PSNR) and multi-scale structural similarity index (MS-SSIM).","PeriodicalId":13197,"journal":{"name":"IEEE Communications Letters","volume":"29 3","pages":"625-629"},"PeriodicalIF":3.7000,"publicationDate":"2025-02-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Communications Letters","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10870291/","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"TELECOMMUNICATIONS","Score":null,"Total":0}
Citations: 0
Abstract
Semantic communication (SemCom), as a paradigm beyond bit communication, is regarded as an effective solution to the challenges posed by the growing volume of vision-based traffic. Existing semantic image communication methods are mostly based on convolutional neural networks (CNNs) or Transformers, which focus on different structural semantics. Specifically, CNNs with local convolution operations excel at capturing local semantic features, while Transformers, based on the multi-head attention mechanism, are better at modeling long-range dependencies and global semantic information. To effectively fuse these two models and leverage the advantages of both, we propose a parallel Transformer-CNN complementary (TCC) block, where CNNs and Transformers are combined to enhance the extraction of both local and global semantic information. Furthermore, we propose a TCC-based SemCom (TCC-SemCom) scheme for wireless image transmission. Experimental results verify that TCC-SemCom significantly outperforms existing schemes in terms of peak signal-to-noise ratio (PSNR) and multi-scale structural similarity index (MS-SSIM).
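The core idea of the abstract — a parallel block whose convolutional branch captures local features while a self-attention branch models global dependencies, with the two outputs fused — can be illustrated with a toy NumPy sketch. This is not the authors' architecture: the 3x3 depthwise-style convolution, single-head attention, and element-wise-sum fusion are all simplifying assumptions made here for illustration.

```python
import numpy as np

def local_branch(x, kernel):
    """Toy local branch: a 3x3 spatial convolution with one kernel
    shared across channels (a stand-in for a CNN branch)."""
    H, W, C = x.shape
    pad = np.pad(x, ((1, 1), (1, 1), (0, 0)))
    out = np.zeros_like(x)
    for i in range(H):
        for j in range(W):
            patch = pad[i:i + 3, j:j + 3, :]          # (3, 3, C) neighborhood
            out[i, j] = np.einsum('klc,kl->c', patch, kernel)
    return out

def global_branch(x, Wq, Wk, Wv):
    """Toy global branch: single-head self-attention over the
    flattened spatial tokens (a stand-in for a Transformer branch)."""
    H, W, C = x.shape
    tokens = x.reshape(H * W, C)
    q, k, v = tokens @ Wq, tokens @ Wk, tokens @ Wv
    scores = q @ k.T / np.sqrt(C)                     # scaled dot-product
    scores -= scores.max(axis=-1, keepdims=True)      # numerically stable softmax
    attn = np.exp(scores)
    attn /= attn.sum(axis=-1, keepdims=True)
    return (attn @ v).reshape(H, W, C)

def tcc_block(x, kernel, Wq, Wk, Wv):
    """Parallel local + global branches, fused by element-wise addition
    (one simple fusion choice; the paper's fusion may differ)."""
    return local_branch(x, kernel) + global_branch(x, Wq, Wk, Wv)

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 4, 8))                    # small (H, W, C) feature map
kernel = rng.standard_normal((3, 3))
Wq, Wk, Wv = (rng.standard_normal((8, 8)) for _ in range(3))
y = tcc_block(x, kernel, Wq, Wk, Wv)
print(y.shape)                                        # shape is preserved: (4, 4, 8)
```

Because both branches preserve the feature-map shape, the fused output can be dropped into an encoder stack in place of a plain convolutional or attention block, which is what makes the parallel design composable.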
Journal Description:
The IEEE Communications Letters publishes short papers in a rapid publication cycle on advances in the state-of-the-art of communication over different media and channels including wire, underground, waveguide, optical fiber, and storage channels. Both theoretical contributions (including new techniques, concepts, and analyses) and practical contributions (including system experiments and prototypes, and new applications) are encouraged. This journal focuses on the physical layer and the link layer of communication systems.