On Pre-chewing Compression Degradation for Learned Video Compression
Man M. Ho, Heming Sun, Zhiqiang Zhang, Jinjia Zhou
2022 IEEE International Conference on Visual Communications and Image Processing (VCIP), 2022-12-13
DOI: 10.1109/VCIP56404.2022.10008873
Abstract
Artificial Intelligence (AI) needs huge amounts of data, and so does Learned Restoration for Video Compression. There are two main problems regarding training data. 1) Preparing training compression degradation with a video codec (e.g., Versatile Video Coding, VVC) consumes considerable resources. In particular, the more Quantization Parameters (QPs) we compress with, the more coding time and storage are required. 2) The common practice of training a newly initialized Restoration Network on pure compression degradation from the start is not effective. To solve these problems, we propose a Degradation Network to pre-chew (generalize and learn to synthesize) the real compression degradation, then present a hybrid training scheme that allows a Restoration Network to be trained on an unlimited number of videos without compressing them. Concretely, we propose a QP-wise Degradation Network that learns to compress video frames like VVC in real time and can transform its degradation output linearly between QPs. The real compression degradation is thus pre-chewed: our Degradation Network synthesizes a more generalized degradation that a newly initialized Restoration Network can learn more easily. To diversify training video content without compression and to avoid overfitting, we design a Training Framework for Semi-Compression Degradation (TF-SCD) that trains our model on many fake compressed videos together with real compressed videos. As a result, the Restoration Network quickly reaches a near-best optimum at the beginning of training, which supports our scheme of using pre-chewed data for the very first steps of training. In other words, a newly initialized Learned Video Compression model can be warmed up both efficiently and effectively with our pre-trained Degradation Network. In addition, our proposed TF-SCD can further enhance restoration performance in a specific range of QPs and generalizes better across QPs than the common way of training a restoration model. Our work is available at https://minhmanho.github.io/prechewing_degradation.
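The abstract gives no implementation details, so the following is only a minimal sketch of the two ideas it names: moving a degradation output linearly between QPs, and mixing fake (synthesized) with real (codec-compressed) training pairs. Everything here is an assumption for illustration: the function names `blend_degradation` and `hybrid_batch`, the `real_ratio` parameter, the QP values, and the use of a plain callable `degrade` as a stand-in for the learned Degradation Network. It is not the authors' code, and the paper's network may realize the linear transition quite differently.

```python
import random
from typing import Callable, List, Sequence, Tuple

Frame = List[float]          # a flattened frame, pixel values in [0, 1]
Pair = Tuple[Frame, Frame]   # (degraded frame, clean frame)


def blend_degradation(deg_lo: Frame, deg_hi: Frame,
                      qp_lo: float, qp_hi: float, qp: float) -> Frame:
    """Linearly blend two degraded versions of the same frame.

    Assumed reading of "transform the degradation output between QPs
    linearly": the output at `qp` is a convex combination of the outputs
    at `qp_lo` and `qp_hi`.
    """
    if qp_lo >= qp_hi or not (qp_lo <= qp <= qp_hi):
        raise ValueError("need qp_lo < qp_hi and qp_lo <= qp <= qp_hi")
    w = (qp - qp_lo) / (qp_hi - qp_lo)  # 0 at qp_lo, 1 at qp_hi
    return [(1.0 - w) * lo + w * hi for lo, hi in zip(deg_lo, deg_hi)]


def hybrid_batch(real_pairs: Sequence[Pair],
                 clean_frames: Sequence[Frame],
                 degrade: Callable[[Frame], Frame],
                 batch_size: int,
                 real_ratio: float,
                 rng: random.Random) -> List[Pair]:
    """Build one training batch from real and synthesized pairs.

    `real_pairs` come from an actual codec; the rest are made on the fly
    by applying `degrade` (a placeholder for a pre-trained degradation
    model) to uncompressed frames.
    """
    n_real = round(batch_size * real_ratio)
    batch: List[Pair] = [rng.choice(real_pairs) for _ in range(n_real)]
    for _ in range(batch_size - n_real):
        clean = rng.choice(clean_frames)
        batch.append((degrade(clean), clean))
    rng.shuffle(batch)  # avoid a fixed real/fake ordering inside the batch
    return batch
```

In this sketch, only the pairs in `real_pairs` require running a codec; every other sample is produced from uncompressed frames, which is the property the hybrid scheme relies on to scale the training set.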