Yun Tian;Jingkai Ying;Zhijin Qin;Ye Jin;Xiaoming Tao
Title: Synchronous Multi-Modal Semantic Communication System With Packet-Level Coding
Journal: IEEE Transactions on Wireless Communications, vol. 24, no. 5, pp. 3684-3697
Published: 2025-02-04
DOI: 10.1109/TWC.2025.3534995
URL: https://ieeexplore.ieee.org/document/10872781/
Impact factor: 10.7 (JCR Q1, Engineering, Electrical & Electronic)
Citations: 0
Abstract
Although semantic communication with joint semantic-channel coding has shown promising performance in transmitting data of different modalities over physical-layer channels, the synchronization and packet-level forward error correction (FEC) of multimodal semantics have not been well studied. Synchronizing multimodal features in both the semantic and time domains is challenging because the semantic encoders are designed independently. In this paper, we take facial video and speech transmission as an example and propose a Synchronous Multi-modal Semantic Communication System with Packet-Level Coding (SyncSC). To achieve semantic and time synchronization, 3D Morphable Model (3DMM) coefficients and text are transmitted as semantics. We propose a semantic codec that achieves similar reconstruction quality with lower bandwidth. A visual-guided speech synthesis module is designed to synchronize video, text, and speech. We propose a packet-level FEC method for video semantics, called PacSC, that maintains visual quality even at high packet loss rates. For text packets, we propose a text packet loss concealment module, called TextPC, based on Bidirectional Encoder Representations from Transformers (BERT), which improves the performance of traditional FEC methods. Simulation results show that SyncSC reduces transmission overhead while ensuring high-quality synchronous transmission of video and speech over networks with packet loss.
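For readers unfamiliar with packet-level FEC, the following is a minimal sketch of the general idea using a single XOR parity packet per group. It is a generic textbook scheme chosen for illustration and is not the PacSC method, whose design the abstract does not detail. The function names (`xor_bytes`, `make_parity`, `recover`) and the assumption of equal-length packets are hypothetical.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_parity(packets: List[bytes]) -> bytes:
    """Build one parity packet as the XOR of all equal-length data packets."""
    return reduce(xor_bytes, packets)


def recover(received: List[Optional[bytes]], parity: bytes) -> List[bytes]:
    """Restore at most one lost packet (marked None) using the parity packet."""
    missing = [i for i, p in enumerate(received) if p is None]
    if len(missing) > 1:
        raise ValueError("single-parity FEC can repair only one loss per group")
    repaired = list(received)
    if missing:
        present = [p for p in received if p is not None]
        # XOR of the parity with every surviving packet yields the lost one.
        repaired[missing[0]] = reduce(xor_bytes, present, parity)
    return repaired
```

A single parity packet repairs only one loss per group, which is the kind of limitation that motivates stronger packet-level coding and loss concealment at high packet loss rates.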
About the journal:
The IEEE Transactions on Wireless Communications is a prestigious publication that showcases cutting-edge advancements in wireless communications. It welcomes both theoretical and practical contributions in various areas. The scope of the Transactions encompasses a wide range of topics, including modulation and coding, detection and estimation, propagation and channel characterization, and diversity techniques. The journal also emphasizes the physical and link layer communication aspects of network architectures and protocols.
The journal is open to papers on specific topics or non-traditional topics related to specific application areas. This includes simulation tools and methodologies, orthogonal frequency division multiplexing, MIMO systems, and wireless over optical technologies.
Overall, the IEEE Transactions on Wireless Communications serves as a platform for high-quality manuscripts that push the boundaries of wireless communications and contribute to advancements in the field.