Enhancing real-time UHD intra-frame coding with parallel–serial hybrid neural networks
Yucheng Jiang, Songping Mai, Peng Zhang, Junwei Hu, Jie Yu, Jian Cheng
Displays, volume 88, Article 103034. Published 2025-03-25.
DOI: 10.1016/j.displa.2025.103034
URL: https://www.sciencedirect.com/science/article/pii/S014193822500071X
Journal impact factor: 3.7; JCR Q1 (Computer Science, Hardware & Architecture).
Abstract
The primary objective of a video encoder is to achieve both high real-time performance and a high compression ratio. Delivering these capabilities in a cost-effective hardware environment is crucial for practical applications. Numerous institutions have developed highly optimized implementations of the mainstream video coding standards, such as x265 for HEVC, VVenC for VVC, and uAVS3e for AVS3. However, these implementations still cannot encode 4K/8K UHD video in real time without significantly reducing compression complexity. This paper presents a parallel–serial hybrid neural network scheme tailored to speed up intra-frame block partitioning decisions. The parallel network is designed to extract effective features while minimizing the impact of network inference time. The lightweight serial network, in turn, overcomes the accuracy issue caused by the data dependency on reconstructed pixels. The proposed scheme is integrated into the uAVS3 Real-Time encoder. Experimental results on the CTC 4K UHD sequences show a 30.2% increase in encoding speed together with an improvement in encoding quality, as evidenced by a 0.24% reduction in BD-BR. Compared with previous work, we achieve the best trade-off between these two critical metrics. Furthermore, we integrated the enhanced encoder into the FFmpeg framework, yielding an efficient video encoding system capable of 4K@50FPS and 8K@9FPS on affordable hardware configurations.
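To put the two frame rates quoted in the abstract on a common scale, they can be converted to raw pixel throughput. The following is only an illustrative sketch: the abstract does not state exact frame dimensions, so 3840x2160 for 4K and 7680x4320 for 8K are assumptions, and the names `RESOLUTIONS` and `pixel_rate` are hypothetical helpers rather than anything from the paper or from FFmpeg.

```python
# Assumed frame sizes; the abstract only says "4K" and "8K".
RESOLUTIONS = {"4K": (3840, 2160), "8K": (7680, 4320)}


def pixel_rate(label: str, fps: int) -> int:
    """Return luma samples per second for a resolution label and frame rate."""
    width, height = RESOLUTIONS[label]
    return width * height * fps


rate_4k = pixel_rate("4K", 50)  # 4K@50FPS, as quoted in the abstract
rate_8k = pixel_rate("8K", 9)   # 8K@9FPS, as quoted in the abstract

print(f"4K@50: {rate_4k / 1e6:.1f} Mpixel/s")
print(f"8K@9:  {rate_8k / 1e6:.1f} Mpixel/s")
print(f"8K/4K throughput ratio: {rate_8k / rate_4k:.2f}")
```

Under these assumed frame sizes, the 8K operating point processes about 72% as many samples per second as the 4K one, so the two reported figures are not a simple fourfold scaling of each other.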
About the journal:
Displays is the international journal covering the research and development of display technology, its effective presentation and perception of information, and applications and systems including display-human interface.
Technical papers on practical developments in Displays technology provide an effective channel to promote greater understanding and cross-fertilization across the diverse disciplines of the Displays community. Original research papers solving ergonomics issues at the display-human interface advance effective presentation of information. Tutorial papers covering fundamentals intended for display technologies and human factor engineers new to the field will also occasionally be featured.