{"title":"Improved Nonlinear Transform Source-Channel Coding to Catalyze Semantic Communications","authors":"Sixian Wang;Jincheng Dai;Xiaoqi Qin;Zhongwei Si;Kai Niu;Ping Zhang","doi":"10.1109/JSTSP.2023.3304140","DOIUrl":"10.1109/JSTSP.2023.3304140","url":null,"abstract":"Recent deep learning methods have led to increased interest in solving high-efficiency end-to-end transmission problems. These methods, we call <italic>nonlinear transform source-channel coding (NTSCC)</italic>, extract the semantic latent features of source signal, and learn entropy model to guide the joint source-channel coding with variable rate to transmit latent features over wireless channels. In this article, we propose a comprehensive framework for improving NTSCC, thereby higher system coding gain, better model compatibility, more flexible adaptation strategy aligned with semantic guidance are all achieved. This new sophisticated NTSCC model is now ready to support large-size data interaction in emerging XR, which catalyzes the application of semantic communications. Specifically, we propose three useful improvement approaches. First, we introduce a contextual entropy model to better capture the spatial correlations among the semantic latent features, thereby more accurate rate allocation and contextual joint source-channel coding method are developed accordingly to enable higher coding gain. On that basis, we further propose a response network architecture to formulate <italic>compatible</italic> NTSCC, i.e., once-learned model supports various bandwidth ratios and channel states that benefits practical deployment greatly. Following this, we propose an online latent feature editing mechanism to enable more flexible coding rate allocation aligned with some specific semantic guidance. By comprehensively applying the above three improvement methods for NTSCC, a deployment-friendly semantic coded transmission system stands out finally. 
Our improved NTSCC system has been experimentally verified to achieve a better rate-distortion efficiency versus the state-of-the-art engineered VTM + 5G LDPC coded transmission system with lower processing latency.","PeriodicalId":13038,"journal":{"name":"IEEE Journal of Selected Topics in Signal Processing","volume":"17 5","pages":"1022-1037"},"PeriodicalIF":7.5,"publicationDate":"2023-08-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136029380","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Quality-of-Experience Database for Adaptive Omnidirectional Video Streaming","authors":"Xuelin Liu;Jiebin Yan;Zheng Wan;Yuming Fang;Zhou Wang","doi":"10.1109/JSTSP.2023.3300529","DOIUrl":"10.1109/JSTSP.2023.3300529","url":null,"abstract":"Recent advances in virtual reality (VR) technologies and devices have enabled new forms of media content, such as omnidirectional video (ODV) that attracts increasing attention of both academic and industrial communities. Omnidirectional video, which is also called <inline-formula><tex-math>$360^\\circ$</tex-math></inline-formula> video, represents panoramic spherical video that can give users an immersive viewing experience. Compared with traditional 2D video, the complex characteristic (high resolution, bandwidth intensive, etc.) of ODVs brings up new challenges to stream them under volatile network conditions and model the quality-of-experience (QoE) of end-users. To meet the requirements of practical VR applications, it is desired to design effective methods for adaptive bitrate streaming (ABR) and QoE evaluation of ODVs. In this article, we establish a large omnidirectional video streaming QoE database (VRQoE-JUFE), containing 1,440 adaptive streaming ODVs generated with diverse content. A comprehensive subjective experiment is conducted, where viewing behaviors and human opinions of total 180 subjects are collected. We provide a thorough statistical data analysis and carry out performance evaluation of the-state-of-art objective QoE models on the proposed database. The results suggest that QoE modeling for ODV streaming is an extremely challenging problem and there is a large space for improvement. Many interesting observations are made that may shed light on the improvement in both omnidirectional video QoE modeling and ABR strategies in the future. 
The annotated dataset from the tests is made publicly available for the research community.","PeriodicalId":13038,"journal":{"name":"IEEE Journal of Selected Topics in Signal Processing","volume":"17 5","pages":"949-963"},"PeriodicalIF":7.5,"publicationDate":"2023-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81739407","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Semantic Communications With Variable-Length Coding for Extended Reality","authors":"Bowen Zhang;Zhijin Qin;Geoffrey Ye Li","doi":"10.1109/JSTSP.2023.3300509","DOIUrl":"10.1109/JSTSP.2023.3300509","url":null,"abstract":"Wireless extended reality (XR) has attracted wide attentions as a promising technology to improve users' mobility and quality of experience. However, the ultra-high data rate requirement of wireless XR has hindered its development for many years. To overcome this challenge, we develop a semantic communication framework, where semantically-unimportant information is highly-compressed or discarded in semantic coders, significantly improving the transmission efficiency. Besides, considering the fact that some source content may have less amount of semantic information or have higher tolerance to channel noise, we propose a universal variable-length semantic-channel coding method. In particular, we first use a rate allocation network to estimate the best code length for semantic information and then adjust the coding process accordingly. By adopting some proxy functions, the whole framework is trained in an end-to-end manner. 
Numerical results show that our semantic system significantly outperforms traditional transmission methods and the proposed variable-length coding scheme is superior to the fixed-length coding methods.","PeriodicalId":13038,"journal":{"name":"IEEE Journal of Selected Topics in Signal Processing","volume":"17 5","pages":"1038-1051"},"PeriodicalIF":7.5,"publicationDate":"2023-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10198383","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79633368","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Covert Wireless Communications for Augmented Reality Systems With Dual Cooperative UAVs","authors":"Guo Yang;Yuwen Qian;Ke Ren;Zhen Mei;Feng Shu;Xiangwei Zhou;Wen Wu","doi":"10.1109/JSTSP.2023.3299116","DOIUrl":"10.1109/JSTSP.2023.3299116","url":null,"abstract":"Unmanned aerial vehicle (UAV) aided augmented reality (AR) has developed rapidly in recent years and has become a promising technology in disaster rescue, transportation, agriculture, and environmental monitoring. However, the information leakage is challenging the usage of UAV-aided AR systems with wireless communications. In this article, a dual UAVs assisted covert communication system (CCS) is proposed, where one UAV transmits the covert message to ground receivers and a cooperative UAV performs as a jammer to interfere with the malicious eavesdropper. However, flying UAVs result in fast channel fading between the UAVs and ground nodes, which reduces the covert rate of the CCSs. To maximize the average covert rate, we formulated a non-convex optimization problem under the constraint that the detection error probability (DEP) of the monitor is minimum. Furthermore, the problem is decomposed into three subproblems and transformed into convex, and these subproblems are solved alternately by designing an iterative algorithm. 
Simulation results reveal that the average covert rate performance of the proposed optimization algorithm can respectively achieve 16% and 40% gains than those without covert constraint and without the cooperative UAV used as a jammer.","PeriodicalId":13038,"journal":{"name":"IEEE Journal of Selected Topics in Signal Processing","volume":"17 5","pages":"1119-1130"},"PeriodicalIF":7.5,"publicationDate":"2023-07-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73176703","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Hybrid Knowledge-Data Driven Channel Semantic Acquisition and Beamforming for Cell-Free Massive MIMO","authors":"Zhen Gao;Shicong Liu;Yu Su;Zhongxiang Li;Dezhi Zheng","doi":"10.1109/JSTSP.2023.3299175","DOIUrl":"https://doi.org/10.1109/JSTSP.2023.3299175","url":null,"abstract":"This article focuses on advancing outdoor wireless systems to better support ubiquitous extended reality (XR) applications, and close the gap with current indoor wireless transmission capabilities. We propose a hybrid knowledge-data driven method for channel semantic acquisition and multi-user beamforming in cell-free massive multiple-input multiple-output (MIMO) systems. Specifically, we firstly propose a data-driven multiple layer perceptron (MLP)-Mixer-based auto-encoder for channel semantic acquisition, where the pilot signals, CSI quantizer for channel semantic embedding, and CSI reconstruction for channel semantic extraction are jointly optimized in an end-to-end manner. Moreover, based on the acquired channel semantic, we further propose a knowledge-driven deep-unfolding multi-user beamformer, which is capable of achieving good spectral efficiency with robustness to imperfect CSI in outdoor XR scenarios. By unfolding conventional successive over-relaxation (SOR)-based linear beamforming scheme with deep learning, the proposed beamforming scheme is capable of adaptively learning the optimal parameters to accelerate convergence and improve the robustness to imperfect CSI. The proposed deep unfolding beamforming scheme can be used for access points (APs) with fully-digital array and APs with hybrid analog-digital array. Simulation results demonstrate the effectiveness of our proposed scheme in improving the accuracy of channel acquisition, as well as reducing complexity in both CSI acquisition and beamformer design. 
The proposed beamforming method achieves approximately 96% of the converged spectrum efficiency performance after only three iterations in downlink transmission, demonstrating its efficacy and potential to improve outdoor XR applications.","PeriodicalId":13038,"journal":{"name":"IEEE Journal of Selected Topics in Signal Processing","volume":"17 5","pages":"964-979"},"PeriodicalIF":7.5,"publicationDate":"2023-07-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138138069","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Secure Hybrid Analog and Digital Beamforming for mmWave XR Communications With Mixed-DAC","authors":"Di Wu;Tong Shen;Feng Shu;Yuanyuan Wu;Lingling Zhu;Siling Feng;Mengxing Huang;Jiangzhou Wang","doi":"10.1109/JSTSP.2023.3298474","DOIUrl":"10.1109/JSTSP.2023.3298474","url":null,"abstract":"To achieve a balance between performance and implementation complexity in extended reality (XR)-aided millimeter wave (mmWave) communication, secure hybrid digital and analog (HDA) beamforming with mixed digital-to-analog converters (DACs) is established by partially replacing costly full-resolution DACs with some cheap low-resolution DACs. We focus on secure HDA beamforming for such a system. Furthermore, XR technology is aided to improve the operating efficiency in this complex scenario. First, a closed-form approximation of the average secrecy rate (ASR) is derived. To maximize ASR with partial eavesdropping channel knowledge available, we propose an algorithm of skillfully utilizing the alternating iteration to design beamformers of analog, confidential message (CM) and artificial noise (AN). Given the analog and CM/AN beamformers, the updated AN/CM beamformer is addressed by a gradient descent algorithm. Then, given the beamformers of CM and AN, Dinkelbach and Majorization-Minimization are combined to optimize analog beamformer. 
Simulation results show that the proposed algorithm achieves much better ASR performance than existing methods in the medium and high signal-to-noise ratio regions.","PeriodicalId":13038,"journal":{"name":"IEEE Journal of Selected Topics in Signal Processing","volume":"17 5","pages":"995-1006"},"PeriodicalIF":7.5,"publicationDate":"2023-07-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85453322","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Utility-Driven Joint Caching and Bitrate Allocation for Real-Time Immersive Videos","authors":"Jinxi Li;Yutong Xu;Yang Cao;Jiaxin Zhu;Desheng Wang","doi":"10.1109/JSTSP.2023.3295597","DOIUrl":"10.1109/JSTSP.2023.3295597","url":null,"abstract":"Real-time immersive video demands high network bandwidth and low transmission delay. Limited communication resources make it time-consuming to deliver immersive videos in cloud service scenarios. To overcome this, we design a utility-driven <italic>JOint Caching and Bitrate allocation (JOCB)</italic> algorithm for the real-time immersive video to better utilize network and caching resources through the Mobile Edge Computing (MEC) technique. Firstly, we coin a concept, the unfreshness indicator, to reflect the obsolescence level of cached tiles in MEC. Secondly, we define the Quality of Immersive videos (QoI) to evaluate the users' experience, including content characteristics, unfreshness levels, and spatial and temporal quality loss. Thirdly, we formulate the system utility that increases effective quality at the cost of transmission loss. The utility optimization problem can be formulated as an integer programming problem and decomposed into the cache update subproblem and the viewing probability-based adaptive bitrate allocation subproblem, which are solved by the branch-and-bound algorithm and the greedy algorithm, respectively. We have implemented an immersive video transmission system to perform experiments. 
Both simulation and experimental results further imply that <italic>JOCB</italic> can achieve utility maximization through balancing the transmission cost and the QoI.","PeriodicalId":13038,"journal":{"name":"IEEE Journal of Selected Topics in Signal Processing","volume":"17 5","pages":"1106-1118"},"PeriodicalIF":7.5,"publicationDate":"2023-07-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"72769091","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Generative AI-Empowered Simulation for Autonomous Driving in Vehicular Mixed Reality Metaverses","authors":"Minrui Xu;Dusit Niyato;Junlong Chen;Hongliang Zhang;Jiawen Kang;Zehui Xiong;Shiwen Mao;Zhu Han","doi":"10.1109/JSTSP.2023.3293650","DOIUrl":"10.1109/JSTSP.2023.3293650","url":null,"abstract":"In the vehicular mixed reality (MR) Metaverse, the discrepancy between physical and virtual entities can be overcome by fusing the physical and virtual environments with multi-dimensional communications in autonomous driving systems. Assisted by digital twin (DT) technologies, connected autonomous vehicles (AVs), roadside units (RSUs), and virtual simulators can maintain the vehicular MR Metaverse via simulations for sharing data and making driving decisions collaboratively. However, it is challenging and costly to enable large-scale traffic and driving simulation via realistic data collection and fusion from the physical world for online prediction and offline training in autonomous driving systems. In this paper, we propose an autonomous driving architecture, where generative AI is leveraged to synthesize unlimited conditioned traffic and driving data via simulations for improving driving safety and traffic control efficiency. First, we propose a multi-task DT offloading model for the reliable execution of heterogeneous DT tasks with different requirements at RSUs. Then, based on the preferences of AV's DTs and real-world data, virtual simulators can synthesize unlimited conditioned driving and traffic datasets for improved robustness. Finally, we propose a multi-task enhanced auction-based mechanism to provide fine-grained incentives for RSUs on providing resources for autonomous driving. 
The property analysis and experimental results demonstrate that the proposed mechanism and architecture are strategy-proof and effective.","PeriodicalId":13038,"journal":{"name":"IEEE Journal of Selected Topics in Signal Processing","volume":"17 5","pages":"1064-1079"},"PeriodicalIF":7.5,"publicationDate":"2023-07-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136297688","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Virtual Antenna Array for W-Band Channel Sounding: Design, Implementation, and Experimental Validation","authors":"Yejian Lyu;Zhiqiang Yuan;Fengchun Zhang;Pekka Kyösti;Wei Fan","doi":"10.1109/JSTSP.2023.3301135","DOIUrl":"https://doi.org/10.1109/JSTSP.2023.3301135","url":null,"abstract":"Sub-Terahertz (sub-THz) (i.e., 100–300 GHz) communication is envisioned as one of the key components for future beyond fifth-generation (B5G) communication systems due to its large untapped bandwidth. Sub-THz channel measurements are essential for building accurate and realistic sub-THz channel models. Virtual antenna array (VAA) scheme has been widely employed for radio channel sounding purposes in the literature. However, its application for the W-band (i.e., 75–110 GHz) has been rarely discussed due to system phase instability issues. To tackle this problem, a long-range phase-compensated vector network analyzer (VNA)-based channel sounder at the W-band is proposed. First, the back-to-back measurement of the developed channel sounder is carried out with the presence of cable bending, where the initial phase variation beyond <inline-formula><tex-math>$180^{\\circ}$</tex-math></inline-formula> range due to cable effects can be well corrected to within <inline-formula><tex-math>$10^{\\circ}$</tex-math></inline-formula> range with the proposed phase-compensation scheme, clearly validating its effectiveness. To examine how well it works in practical deployment scenarios, the proposed channel sounder is then employed for channel sounding with two measurement distances, covering both near-field (with a line-of-sight (LoS) distance of 7.3 m) and long-range (with a LoS distance of 84.5 m) cases. Based on the measured data, a high-resolution channel parameter estimator is applied to extract the channel multipath parameters for the large-scale VAA at the W-band, both in the near-field and long-range scenarios, respectively. 
The high-resolution algorithm was extended to support virtual arrays composed of both omnidirectional antenna and directive antenna in this work. The conventional directional scanning scheme (DSS) measurement is adopted as the reference measurement to validate the effectiveness and robustness of the developed channel sounder. In the end, to demonstrate the state-of-art channel sounding capabilities of the developed channel sounder, ultra-wideband (UWB) channel measurements at 104.5 GHz with 11 GHz bandwidth using the VAA scheme are conducted in a hall scenario with the measurement range up to 58 m with omnidirectional antennas, and the channel parameters are extracted using the validated high-resolution channel parameter estimator for channel modeling purposes.","PeriodicalId":13038,"journal":{"name":"IEEE Journal of Selected Topics in Signal Processing","volume":"17 4","pages":"729-744"},"PeriodicalIF":7.5,"publicationDate":"2023-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"50397107","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"IEEE Signal Processing Society Information","authors":"","doi":"10.1109/JSTSP.2023.3306635","DOIUrl":"https://doi.org/10.1109/JSTSP.2023.3306635","url":null,"abstract":"","PeriodicalId":13038,"journal":{"name":"IEEE Journal of Selected Topics in Signal Processing","volume":"17 4","pages":"C3-C3"},"PeriodicalIF":7.5,"publicationDate":"2023-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/iel7/4200690/10284021/10284026.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"50274207","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}