Yongyi Miao, Zhongdang Li, Yang Wang, Die Hu, Jun Yan, Youfang Wang
arXiv - CS - Networking and Internet Architecture, published 2024-09-05, https://doi.org/arxiv-2409.03393
VQ-DeepVSC: A Dual-Stage Vector Quantization Framework for Video Semantic Communication
In response to the rapid growth of global video traffic and the limitations
of traditional wireless transmission systems, we propose a novel dual-stage
vector quantization framework, VQ-DeepVSC, tailored to enhance video
transmission over wireless channels. In the first stage, we design the adaptive
keyframe extractor and interpolator, deployed respectively at the transmitter
and receiver, which intelligently select key frames to minimize inter-frame
redundancy and mitigate the cliff effect under challenging channel conditions.
In the second stage, we propose the semantic vector quantization encoder and
decoder, placed respectively at the transmitter and receiver, which efficiently
compress key frames using advanced indexing and spatial normalization modules
to reduce redundancy. Additionally, we propose adjustable index selection and
recovery modules, enhancing compression efficiency and enabling flexible
compression ratio adjustment. Compared to the joint source-channel coding
(JSCC) framework, the proposed framework exhibits superior compatibility with
current digital communication systems. Experimental results demonstrate that
VQ-DeepVSC achieves substantial improvements over the H.265 standard in both
Multi-Scale Structural Similarity (MS-SSIM) and Learned Perceptual Image Patch
Similarity (LPIPS), particularly at low channel signal-to-noise ratio (SNR) or
over multi-path channels, highlighting the significantly enhanced transmission
capabilities of our approach.
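
The abstract's second stage rests on vector quantization, in which the transmitter sends codebook indices rather than continuous features. The sketch below illustrates only that generic idea with a nearest-neighbour lookup. It is not the VQ-DeepVSC encoder or decoder: the toy codebook, the feature vectors, and the function names `quantize` and `dequantize` are all hypothetical, and a learned system would obtain both features and codebook from trained networks.

```python
from typing import List, Sequence

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(features: List[Vector], codebook: List[Vector]) -> List[int]:
    """Transmitter side: map each feature vector to its nearest codebook index."""
    return [
        min(range(len(codebook)), key=lambda k: squared_distance(f, codebook[k]))
        for f in features
    ]


def dequantize(indices: List[int], codebook: List[Vector]) -> List[Vector]:
    """Receiver side: recover approximate features by codebook lookup."""
    return [codebook[i] for i in indices]


# Hypothetical toy codebook shared by transmitter and receiver (assumption).
codebook = [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0), (1.0, 0.0)]
# Hypothetical feature vectors standing in for encoded key-frame features.
features = [(0.1, -0.2), (0.9, 1.1), (0.2, 0.8)]

indices = quantize(features, codebook)        # integers sent over the channel
reconstructed = dequantize(indices, codebook)  # receiver-side approximation
print(indices)
print(reconstructed)
```

Because only integer indices cross the channel (2 bits each for this 4-entry codebook), the output can be carried by conventional digital modulation and channel coding, which is the kind of compatibility the abstract contrasts with JSCC.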