Navid Mahmoudian Bidgoli, Thomas Maugey, A. Roumy, F. Nasiri, F. Payan
{"title":"A geometry-aware compression of 3D mesh texture with random access","authors":"Navid Mahmoudian Bidgoli, Thomas Maugey, A. Roumy, F. Nasiri, F. Payan","doi":"10.1109/PCS48520.2019.8954519","DOIUrl":"https://doi.org/10.1109/PCS48520.2019.8954519","url":null,"abstract":"A 3D mesh object is usually represented as a combination of several entities including geometrical information (i.e., the triangles and their position in space) and a texture atlas/map (i.e. a giant 2D image containing all the texture information that is mapped to the 3D object at the rendering stage). This atlas is usually compressed using a conventional 2D image coder, thus without taking into account the geometrical information. Moreover, the whole image is usually decoded even though only a subpart of the mesh is observed by a user. In this paper, we propose a novel approach to compress a texture atlas of a 3D model that enables random access during decoding, and nevertheless takes into account the correlation driven by the geometrical information. The experimental results demonstrate the benefits of the proposed coder.","PeriodicalId":237809,"journal":{"name":"2019 Picture Coding Symposium (PCS)","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129548690","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Hui Su, Mingliang Chen, A. Bokov, D. Mukherjee, Yunqing Wang, Yue Chen
{"title":"Machine Learning Accelerated Transform Search For AV1","authors":"Hui Su, Mingliang Chen, A. Bokov, D. Mukherjee, Yunqing Wang, Yue Chen","doi":"10.1109/PCS48520.2019.8954514","DOIUrl":"https://doi.org/10.1109/PCS48520.2019.8954514","url":null,"abstract":"AV1 is the state-of-the-art open and royalty-free video compression format that achieves significant bitrate savings over previous generation of video codecs. One of AV1’s major improvement over its predecessor VP9 is the support of more diverse and flexible transform size and kernel selection. However, it also drastically increases the search space for transform unit rate-distortion optimization in AV1 encoders. Unlike conventional encoder speed features that are based on heuristics, we propose a machine learning (ML) based approach to accelerate the transform size and kernel search for AV1. The ML models use input features extracted from the prediction residue block such as standard deviation, correlation and energy distribution. The output of the models indicates the estimated likelihood of which transform size and kernel would be selected as the optimal choice. Based on the ML models, the encoder can prune out the transform size and kernel candidates that are unlikely to be selected and save unnecessary computation to compute their rate-distortion cost. The proposed approach is implemented and tested on the AV1 reference library libaom. The experimental results show that satisfactory encoding speed improvement can be achieved with extremely low compression performance loss. The framework and methodology can also be easily migrated to other video codecs and implementations.","PeriodicalId":237809,"journal":{"name":"2019 Picture Coding Symposium (PCS)","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129316759","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Chen Zhu, Li Song, Rong Xie, Jingning Han, Yaowu Xu
{"title":"JND-based Perceptual Rate Distortion Optimization for AV1 Encoder","authors":"Chen Zhu, Li Song, Rong Xie, Jingning Han, Yaowu Xu","doi":"10.1109/PCS48520.2019.8954513","DOIUrl":"https://doi.org/10.1109/PCS48520.2019.8954513","url":null,"abstract":"AV1 is the next-generation open video coding format, and it can achieve significant coding efficiency with novel coding tools. It supports Lagrangian rate distortion optimization (RDO) method to optimize the coding performance. However, the distortion and the Lagrangian multiplier used in RDO ignore the characteristics of human visual system (HVS), which leads to insufficiency for perceptual video coding. To solve this problem, a perceptual RDO scheme based on the Just Noticeable Distortion (JND) threshold of HVS is proposed. The JND for each pixel is first measured according to three perceptual features: luminance adaptation, masking effects and structure sensitivity. Based on the observation that the regions with smaller distortion visibility thresholds are more sensitive to HVS, a JND-based Lagrangian multiplier is derived to adaptively adjust the rate-distortion (RD) performance for each coding block. Experiments demonstrate that the proposed method can achieve an average SSIM-based −3.93% BD-Rate saving compared with the original AV1 encoder, which effectively improve the coding performance.","PeriodicalId":237809,"journal":{"name":"2019 Picture Coding Symposium (PCS)","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114422773","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Intra block copy in Versatile Video Coding with Reference Sample Memory Reuse","authors":"Xiaozhong Xu, Xiang Li, Shan Liu","doi":"10.1109/PCS48520.2019.8954512","DOIUrl":"https://doi.org/10.1109/PCS48520.2019.8954512","url":null,"abstract":"Screen contents such as online gaming streaming, remote desktop and WIFI display, become popular in current mainstream video applications. In versatile video coding (VVC), the most recent international video coding standard development, coding tools have been evaluated for optimizing screen content materials. Intra block copy (IBC) has shown its effectiveness in coding of computer-generated contents such as texts and graphics. Therefore, it has been previously included into the HEVC standard version 4, extensions for screen content coding (SCC). A constrained version of IBC mode has also been adopted in the VVC standard where the compensation range is limited within the current coding-tree unit (CTU), assuming a 1-CTU size of memory is allocated for storing IBC’s reference samples. In this paper, methods are proposed to efficiently utilize this reference sample memory for IBC mode such that effectively the search range for IBC mode can be increased without requiring more memory to store the reference samples. As a result, significant coding efficiency improvement over the traditional 1-CTU search range setting can be achieved. One of the proposed memory reuse strategies is considered practical for implementation and therefore has been included in the VVC standard.","PeriodicalId":237809,"journal":{"name":"2019 Picture Coding Symposium (PCS)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130088371","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Max Bläser, Han Gao, S. Esenlik, E. Alshina, Zhijie Zhao, Christian Rohlfing, E. Steinbach
{"title":"Low-Complexity Geometric Inter-Prediction for Versatile Video Coding","authors":"Max Bläser, Han Gao, S. Esenlik, E. Alshina, Zhijie Zhao, Christian Rohlfing, E. Steinbach","doi":"10.1109/PCS48520.2019.8954504","DOIUrl":"https://doi.org/10.1109/PCS48520.2019.8954504","url":null,"abstract":"Non-rectangular block partitioning is a well-known method for improved inter-picture prediction in video coding, enabling better spatial adaptation to the signal properties. This contribution presents the most recent proposal of geometric inter-prediction (GIP) made to the Versatile Video Coding (VVC) standardization activity led by the Joint Video Experts Team (JVET). Implemented in the latest test model VTM-5.0 and evaluated according to the JVET Common Test Conditions, the proposed low-complexity GIP scheme provides objective luma BD-rate reductions of 0.22 % for random access and 0.44 % for low-delay test cases at 7% encoder runtime increase and negligible decoder runtime increase. The coding gain is provided by non-triangular partitioned blocks and in the presence of multiple other VVC coding tools. Furthermore, BD-rate reductions of 2.58 % and 2.78 % can be achieved specifically for pure screen content by employing an adaptive blending filter.","PeriodicalId":237809,"journal":{"name":"2019 Picture Coding Symposium (PCS)","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134252596","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An IBP-CNN Based Fast Block Partition For Intra Prediction","authors":"Wenpeng Ren, Jia Su, Chang Sun, Zhiping Shi","doi":"10.1109/PCS48520.2019.8954522","DOIUrl":"https://doi.org/10.1109/PCS48520.2019.8954522","url":null,"abstract":"The increase of block size to 64×64 in HEVC leads to the increase of computational complexity of intra prediction. The convolution neural network (CNN) shows advantages in the extraction and application of image features than the traditional intra prediction optimization algorithm which is developed manually. For reducing the computational complexity of intra prediction, a CNN-based algorithm, intra block partition CNN (IBP-CNN) is proposed in this paper to get the block partition. First, a database which is consisted of coding tree unit (CTU) images and label images is established. The position of pixels in the label images are consistent with those in CTU images. Second, the texture features are analyzed by IBP-CNN to get the block partition. Then the output of the network is adjusted according to the quadtree structure of HEVC to facilitate the calculation of rate distortion (RD) cost. The method proposed in this paper reduces the average coding time of about 59.07% and the average BD-rate is about 1.55%.","PeriodicalId":237809,"journal":{"name":"2019 Picture Coding Symposium (PCS)","volume":"39 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114340859","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Dandan Ding, Guangyao Chen, D. Mukherjee, Urvang Joshi, Yue Chen
{"title":"A CNN-based In-loop Filtering Approach for AV1 Video Codec","authors":"Dandan Ding, Guangyao Chen, D. Mukherjee, Urvang Joshi, Yue Chen","doi":"10.1109/PCS48520.2019.8954565","DOIUrl":"https://doi.org/10.1109/PCS48520.2019.8954565","url":null,"abstract":"In-loop filter using Convolutional Neural Network (CNN) has lately attracted lots of attention in video coding. CNN models may be trained to learn how to restore degradation introduced by compression in pictures, and hence effectively help improve the coding efficiency. State-of-the-art work in this field generally employs a single network to enhance reconstructed frames mainly in intra coding. In this paper, we develop a depth-variable network handling both intra and inter coding. The depth of our network is varied with the distortion levels of reconstructed frames. Moreover, we leverage a skip enhancing strategy for inter coding, which improves both the coding efficiency and the resulting visual quality, while maintaining low computational complexity. We apply our approach to AV1, a newly released video coding standard from AOM. Experimental results show that our approach achieves an average BD-rate reduction of 7.27% and 5.57% for intra and inter modes, respectively, compared to AV1 anchor. The code and model of our approach are published in our Github website [1].","PeriodicalId":237809,"journal":{"name":"2019 Picture Coding Symposium (PCS)","volume":"60 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114422878","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Intra Frame Prediction for Video Coding Using a Conditional Autoencoder Approach","authors":"Fabian Brand, Jürgen Seiler, André Kaup","doi":"10.1109/PCS48520.2019.8954546","DOIUrl":"https://doi.org/10.1109/PCS48520.2019.8954546","url":null,"abstract":"Intra prediction is a vital component of most modern image and video codecs. State of the art video codecs like High Efficiency Video Coding (HEVC) or the upcoming Versatile Video Coding (VVC) use a high number of directional modes. With the recent advances in deep learning, it is now possible to use artificial neural networks for intra frame prediction. Previously published approaches usually add additional ANN based modes or replace all modes by training several networks. In our approach, we use a single autoencoder network to first compress the original with help of already transmitted pixels to four parameters. We then use the parameters together with this support area to generate a prediction for the block. This way, we are able to replace all angular intra modes by a single ANN. In the experiments we compare our method with the intra prediction method currently used in the VVC Test Model (VTM). Using our method, we are able to gain up to 0.85 dB prediction PSNR with a comparable amount of side information or reduce the amount of side information by 2 bit per prediction unit with similar PSNR.","PeriodicalId":237809,"journal":{"name":"2019 Picture Coding Symposium (PCS)","volume":"441 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132014317","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Quantization of Depth in Simulcast and Multiview Coding of Stereoscopic Video plus Depth Using HEVC, VVC and MV-HEVC","authors":"Y. Al-Obaidi, T. Grajek, M. Domański","doi":"10.1109/PCS48520.2019.8954539","DOIUrl":"https://doi.org/10.1109/PCS48520.2019.8954539","url":null,"abstract":"Virtual reality, free-viewpoint television, virtual navigation, 360º video are the areas of research and technology that need efficient compression of multiview video plus depth acquired by cameras with arbitrary positions. Astonishingly, proliferation of 3D extensions of AVC and HEVC technology is very low. Therefore, in this paper, we present a study on independent coding of views and depth maps. A simple technique is proposed to estimate quantization step for depth as a function of the quantization step for multiview video. This technique is studied in the context of multiview video plus depth acquired using cameras located around a scene. The approach is based on simple modeling of the relation between quantization parameters for depth and multiview video. The experimental results are obtained for stereoscopic video with two respective depth maps. For standard MPEG test sequences, the results demonstrate usefulness of the approach for HEVC, VVC, MV-HEVC codecs.","PeriodicalId":237809,"journal":{"name":"2019 Picture Coding Symposium (PCS)","volume":"87 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130871404","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"CNN Accelerated Intra Video Coding, Where Is the Upper Bound?","authors":"Y. Huang, Li Song, E. Izquierdo","doi":"10.1109/PCS48520.2019.8954494","DOIUrl":"https://doi.org/10.1109/PCS48520.2019.8954494","url":null,"abstract":"The very high complexity of the High Efficiency Video Coding standard (HEVC) is the main hurdle for its wide deployment and use. To tackle this problem, a number of recent research outcomes exploit Convolutional Neural Network (CNN) in each HEVC module for reducing the coding complexity. In this paper an effective method to analyse the potential of CNN techniques to reduce the computational cost of HEVC is proposed. A theoretical upper bound for the effectiveness of this approach in common HEVC modules is investigated. The theoretical maximum of learning-based complexity reduction in HEVC and possible reasons for Rate-Distortion (RD) loss are investigated. On the basis of this analysis, an Intra Video Coding Acceleration (IVCA) scheme is proposed, where Border Considered CNN (BC-CNN) based Coding Unit (CU) partition and heuristic Prediction Unit (PU) partition are seamlessly integrated. According to the experimental results, 66.7% of intra coding time can be saved with negligible 1.71% Bjøntegaard delta bit-rate (BDBR) loss. These results partially demonstrate the superiority of the proposed technique against other state-of-the-art approaches aiming at reducing HEVC complexity in intra mode.","PeriodicalId":237809,"journal":{"name":"2019 Picture Coding Symposium (PCS)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130877707","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}