Xiang Peng;Zhijin Qin;Xiaoming Tao;Jianhua Lu;Khaled B. Letaief
Title: A Robust Image Semantic Communication System With Multi-Scale Vision Transformer
Journal: IEEE Journal on Selected Areas in Communications, vol. 43, no. 4, pp. 1278-1291
Published: 2025-01-24
DOI: 10.1109/JSAC.2025.3531413
URL: https://ieeexplore.ieee.org/document/10854360/
Citation count: 0
Abstract
Semantic communication systems have demonstrated strong performance across various tasks, yet they are susceptible to semantic impairments because of the inherent vulnerability of deep neural networks. This paper focuses on semantic impairments in images, particularly those caused by adversarial perturbations. We introduce a novel metric for quantifying the level of semantic impairment and create a semantic impairment dataset. Furthermore, we propose a deep-learning-enabled semantic communication system for robust image transmission, termed DeepSC-RI. The proposed system uses a multi-scale semantic extractor with a dual-branch design that extracts semantics at different granularities, thereby improving the robustness of the system. The fine-grained branch incorporates a semantic importance evaluation module that identifies and prioritizes crucial semantics by manipulating self-attention scores, while the coarse-grained branch adopts a hierarchical approach to progressively capture robust semantics. The two streams of semantics are integrated by a cross-attention-based semantic fusion module. Experimental results show that DeepSC-RI performs better under diverse channel conditions, across various levels of semantic impairment intensity, and in multiple tasks.
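The abstract does not give implementation details, so the following is only a minimal sketch of the two attention-based ideas it names: scoring tokens by self-attention and fusing two token streams by cross-attention. Everything here is an assumption made for illustration: the function names (`token_importance`, `select_top_k`, `cross_attention_fuse`), the use of mean received attention as the importance score, the top-k selection rule, the residual addition, and the omission of learned projection matrices are hypothetical and are not taken from DeepSC-RI. Fine and coarse tokens are assumed to share the same embedding dimension.

```python
import math

def softmax(row):
    # Numerically stable softmax over a list of floats.
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def attention_weights(queries, keys):
    # Scaled dot-product attention weights; one softmax row per query.
    scale = math.sqrt(len(keys[0]))
    return [softmax([dot(q, k) / scale for k in keys]) for q in queries]

def token_importance(tokens):
    # Assumed heuristic: a token's importance is the mean self-attention
    # it receives from all tokens (column mean of the attention matrix).
    weights = attention_weights(tokens, tokens)
    n = len(tokens)
    return [sum(weights[i][j] for i in range(n)) / n for j in range(n)]

def select_top_k(tokens, k):
    # Keep the k highest-scoring tokens, preserving their original order.
    scores = token_importance(tokens)
    keep = sorted(range(len(tokens)), key=lambda j: scores[j], reverse=True)[:k]
    return [tokens[j] for j in sorted(keep)]

def cross_attention_fuse(fine, coarse):
    # Fine tokens act as queries; coarse tokens act as keys and values.
    # A residual connection adds the attended coarse context to each fine token.
    weights = attention_weights(fine, coarse)
    dim = len(fine[0])
    fused = []
    for f, w in zip(fine, weights):
        context = [sum(w[j] * coarse[j][d] for j in range(len(coarse)))
                   for d in range(dim)]
        fused.append([a + b for a, b in zip(f, context)])
    return fused

# Toy 2-dimensional tokens (made-up values, not from the paper).
fine_tokens = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [0.1, 0.2]]
coarse_tokens = [[0.5, 0.5], [1.0, -1.0]]

kept = select_top_k(fine_tokens, k=2)
fused = cross_attention_fuse(kept, coarse_tokens)
```

In a trained system the queries, keys, and values would come from learned projections, and the importance scores would likely be derived from the extractor's own attention heads; the sketch keeps only the data flow.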