Improved Nonlinear Transform Source-Channel Coding to Catalyze Semantic Communications
Authors: Sixian Wang; Jincheng Dai; Xiaoqi Qin; Zhongwei Si; Kai Niu; Ping Zhang
Journal: IEEE Journal of Selected Topics in Signal Processing, vol. 17, no. 5, pp. 1022-1037
Publication date: 2023-08-10
DOI: 10.1109/JSTSP.2023.3304140
URL: https://ieeexplore.ieee.org/document/10214392/
Impact factor: 8.7 (JCR Q1, ENGINEERING, ELECTRICAL & ELECTRONIC)
Citations: 1
Abstract
Recent deep learning methods have led to increased interest in solving high-efficiency end-to-end transmission problems. These methods, which we call nonlinear transform source-channel coding (NTSCC), extract the semantic latent features of the source signal and learn an entropy model that guides variable-rate joint source-channel coding for transmitting the latent features over wireless channels. In this article, we propose a comprehensive framework for improving NTSCC that achieves higher system coding gain, better model compatibility, and a more flexible adaptation strategy aligned with semantic guidance. The improved NTSCC model is ready to support large-size data interaction in emerging XR, which catalyzes the application of semantic communications. Specifically, we propose three improvements. First, we introduce a contextual entropy model to better capture the spatial correlations among the semantic latent features; more accurate rate allocation and a contextual joint source-channel coding method are developed accordingly to enable higher coding gain. On that basis, we further propose a response network architecture to formulate compatible NTSCC, i.e., a once-learned model that supports various bandwidth ratios and channel states, which greatly benefits practical deployment. Following this, we propose an online latent feature editing mechanism to enable more flexible coding rate allocation aligned with specific semantic guidance. Applying these three improvements together yields a deployment-friendly semantic coded transmission system. Our improved NTSCC system has been experimentally verified to achieve better rate-distortion efficiency than the state-of-the-art engineered VTM + 5G LDPC coded transmission system, with lower processing latency.
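The entropy-guided rate allocation mentioned in the abstract can be pictured with a small sketch. This is only an illustration under stated assumptions, not the authors' implementation: the function name `allocate_symbols`, the scaling factor `eta`, and the set of supported lengths `allowed_lengths` are hypothetical, and the likelihoods stand in for the output of a learned entropy model.

```python
import math

def allocate_symbols(likelihoods, eta, allowed_lengths):
    """Map each latent feature's estimated information content to a symbol budget.

    likelihoods: per-feature probabilities from a (hypothetical) entropy model
    eta: assumed scaling factor from bits to channel symbols
    allowed_lengths: assumed list of symbol counts the codec supports
    """
    budgets = []
    for p in likelihoods:
        bits = -math.log2(p)      # estimated information content in bits
        target = eta * bits       # desired number of channel symbols
        # Quantize the target to the nearest supported length.
        budgets.append(min(allowed_lengths, key=lambda k: abs(k - target)))
    return budgets

# Features the model finds less probable (more informative) get more symbols.
likelihoods = [0.5, 0.0625, 2 ** -9]  # 1, 4, and 9 bits respectively
print(allocate_symbols(likelihoods, eta=2.0, allowed_lengths=[2, 4, 8, 16]))
# prints [2, 8, 16]
```

In this sketch, a more accurate entropy model changes only the likelihoods. The abstract's point is that better likelihood estimates (here, from context) lead to a better match between channel bandwidth and the information each feature carries.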
About the journal:
The IEEE Journal of Selected Topics in Signal Processing (JSTSP) focuses on the Field of Interest of the IEEE Signal Processing Society, which encompasses the theory and application of various signal processing techniques. These techniques include filtering, coding, transmitting, estimating, detecting, analyzing, recognizing, synthesizing, recording, and reproducing signals using digital or analog devices. The term "signal" covers a wide range of data types, including audio, video, speech, image, communication, geophysical, sonar, radar, medical, musical, and others.
The journal format allows for in-depth exploration of signal processing topics, enabling the Society to cover both established and emerging areas. This includes interdisciplinary fields such as biomedical engineering and language processing, as well as areas not traditionally associated with engineering.