{"title":"用于VVC数据驱动环内滤波器的qp -自适应双径残差集成变频器。","authors":"Cheng-Hsuan Yeh, Chi-Ting Ni, Kuan-Yu Huang, Zheng-Wei Wu, Cheng-Pin Peng, Pei-Yin Chen","doi":"10.3390/s25134234","DOIUrl":null,"url":null,"abstract":"<p><p>As AI-enabled embedded systems such as smart TVs and edge devices demand efficient video processing, Versatile Video Coding (VVC/H.266) becomes essential for bandwidth-constrained Multimedia Internet of Things (M-IoT) applications. However, its block-based coding often introduces compression artifacts. While CNN-based methods effectively reduce these artifacts, maintaining robust performance across varying quantization parameters (QPs) remains challenging. Recent QP-adaptive designs like QA-Filter show promise but are still limited. This paper proposes DRIFT, a QP-adaptive in-loop filtering network for VVC. DRIFT combines a lightweight frequency fusion CNN (LFFCNN) for local enhancement and a Swin Transformer-based global skip connection for capturing long-range dependencies. LFFCNN leverages octave convolution and introduces a novel residual block (FFRB) that integrates multiscale extraction, QP adaptivity, frequency fusion, and spatial-channel attention. A QP estimator (QPE) is further introduced to mitigate double enhancement in inter-coded frames. Experimental results demonstrate that DRIFT achieves BD rate reductions of 6.56% (intra) and 4.83% (inter), with an up to 10.90% gain on the BasketballDrill sequence. 
Additionally, LFFCNN reduces the model size by 32% while slightly improving the coding performance over QA-Filter.</p>","PeriodicalId":21698,"journal":{"name":"Sensors","volume":"25 13","pages":""},"PeriodicalIF":3.4000,"publicationDate":"2025-07-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12252514/pdf/","citationCount":"0","resultStr":"{\"title\":\"QP-Adaptive Dual-Path Residual Integrated Frequency Transformer for Data-Driven In-Loop Filter in VVC.\",\"authors\":\"Cheng-Hsuan Yeh, Chi-Ting Ni, Kuan-Yu Huang, Zheng-Wei Wu, Cheng-Pin Peng, Pei-Yin Chen\",\"doi\":\"10.3390/s25134234\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><p>As AI-enabled embedded systems such as smart TVs and edge devices demand efficient video processing, Versatile Video Coding (VVC/H.266) becomes essential for bandwidth-constrained Multimedia Internet of Things (M-IoT) applications. However, its block-based coding often introduces compression artifacts. While CNN-based methods effectively reduce these artifacts, maintaining robust performance across varying quantization parameters (QPs) remains challenging. Recent QP-adaptive designs like QA-Filter show promise but are still limited. This paper proposes DRIFT, a QP-adaptive in-loop filtering network for VVC. DRIFT combines a lightweight frequency fusion CNN (LFFCNN) for local enhancement and a Swin Transformer-based global skip connection for capturing long-range dependencies. LFFCNN leverages octave convolution and introduces a novel residual block (FFRB) that integrates multiscale extraction, QP adaptivity, frequency fusion, and spatial-channel attention. A QP estimator (QPE) is further introduced to mitigate double enhancement in inter-coded frames. Experimental results demonstrate that DRIFT achieves BD rate reductions of 6.56% (intra) and 4.83% (inter), with an up to 10.90% gain on the BasketballDrill sequence. 
Additionally, LFFCNN reduces the model size by 32% while slightly improving the coding performance over QA-Filter.</p>\",\"PeriodicalId\":21698,\"journal\":{\"name\":\"Sensors\",\"volume\":\"25 13\",\"pages\":\"\"},\"PeriodicalIF\":3.4000,\"publicationDate\":\"2025-07-07\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12252514/pdf/\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Sensors\",\"FirstCategoryId\":\"103\",\"ListUrlMain\":\"https://doi.org/10.3390/s25134234\",\"RegionNum\":3,\"RegionCategory\":\"综合性期刊\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"CHEMISTRY, ANALYTICAL\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Sensors","FirstCategoryId":"103","ListUrlMain":"https://doi.org/10.3390/s25134234","RegionNum":3,"RegionCategory":"综合性期刊","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"CHEMISTRY, ANALYTICAL","Score":null,"Total":0}
QP-Adaptive Dual-Path Residual Integrated Frequency Transformer for Data-Driven In-Loop Filter in VVC.
As AI-enabled embedded systems such as smart TVs and edge devices demand efficient video processing, Versatile Video Coding (VVC/H.266) becomes essential for bandwidth-constrained Multimedia Internet of Things (M-IoT) applications. However, its block-based coding often introduces compression artifacts. While CNN-based methods effectively reduce these artifacts, maintaining robust performance across varying quantization parameters (QPs) remains challenging. Recent QP-adaptive designs like QA-Filter show promise but are still limited. This paper proposes DRIFT, a QP-adaptive in-loop filtering network for VVC. DRIFT combines a lightweight frequency fusion CNN (LFFCNN) for local enhancement and a Swin Transformer-based global skip connection for capturing long-range dependencies. LFFCNN leverages octave convolution and introduces a novel residual block (FFRB) that integrates multiscale extraction, QP adaptivity, frequency fusion, and spatial-channel attention. A QP estimator (QPE) is further introduced to mitigate double enhancement in inter-coded frames. Experimental results demonstrate that DRIFT achieves BD rate reductions of 6.56% (intra) and 4.83% (inter), with an up to 10.90% gain on the BasketballDrill sequence. Additionally, LFFCNN reduces the model size by 32% while slightly improving the coding performance over QA-Filter.
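Two ideas named in the abstract, octave convolution's split of features into high- and low-frequency branches and QP conditioning of a single model, can be sketched minimally in NumPy. This is an illustrative sketch only: the function names `octave_split` and `qp_plane`, the channel ratio `alpha`, and the VVC QP range of 0–63 used for normalization are assumptions for the example, not the paper's actual DRIFT/LFFCNN implementation.

```python
import numpy as np

def octave_split(x, alpha=0.5):
    # Split a C,H,W feature map into a high-frequency branch kept at
    # full resolution and a low-frequency branch downsampled 2x by
    # 2x2 average pooling, as in octave convolution. `alpha` is the
    # fraction of channels routed to the low-frequency branch.
    c, h, w = x.shape
    c_low = int(alpha * c)
    high = x[c_low:]
    low = x[:c_low].reshape(c_low, h // 2, 2, w // 2, 2).mean(axis=(2, 4))
    return high, low

def qp_plane(h, w, qp, qp_max=63.0):
    # Normalized QP map appended as an extra input plane so a single
    # network can condition its filtering strength on the quantization
    # parameter (one common QP-adaptive scheme).
    return np.full((1, h, w), qp / qp_max, dtype=np.float32)

x = np.random.rand(16, 64, 64).astype(np.float32)  # toy C,H,W features
high, low = octave_split(x, alpha=0.5)
inp = np.concatenate([x, qp_plane(64, 64, qp=37)], axis=0)
```

The low-frequency branch carries smooth content at quarter the spatial cost, which is where the lightweight-model savings of octave-style designs come from; the QP plane lets one set of weights serve all quantization levels instead of training one model per QP.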
About the journal:
Sensors (ISSN 1424-8220) provides an advanced forum for the science and technology of sensors and biosensors. It publishes reviews (including comprehensive reviews of complete sensor products), regular research papers, and short notes. Our aim is to encourage scientists to publish their experimental and theoretical results in as much detail as possible. There is no restriction on the length of the papers. Full experimental details must be provided so that the results can be reproduced.