Perceptually Optimized Bit-Allocation and Associated Distortion Measure for Block-Based Image or Video Coding
Christian R. Helmrich, S. Bosse, Mischa Siekmann, H. Schwarz, D. Marpe, T. Wiegand. 2019 Data Compression Conference (DCC). DOI: 10.1109/DCC.2019.00025
Abstract: It is well known that input-invariant quantization in perceptual image or video coding often leads to visually suboptimal results, and that quantization parameter adaptation (QPA) based on a model of the human visual system can improve subjective coding quality. This paper introduces a simple low-complexity QPA algorithm controlled by a block-wise perceptually weighted distortion measure that generalizes the PSNR metric. The weighting scheme of this WPSNR metric is based on a psychovisual model. It leads directly to a perceptually adapted scaling of the block-wise Lagrange parameter used in the encoder's bit-allocation process and, consequently, to a block-wise QPA. Unlike prior QPA approaches, the proposal avoids classification of picture regions and extends easily from still-image or grayscale coding to video or chromatic coding. The WPSNR metric also requires fewer algorithmic operations than, e.g., the multiscale structural similarity measure (MS-SSIM). Following two formal subjective tests indicating its visual benefit, the QPA proposal has been adopted into VTM, the reference software of the Versatile Video Coding (VVC) standard currently under development.
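To make the flavor of such a block-wise weighted distortion concrete, here is a minimal sketch in Python. The per-block weighting below is a simple local-activity heuristic standing in for the paper's psychovisual model; the block size, the activity-to-weight mapping, and the function names are assumptions made purely for illustration.

```python
import numpy as np

def block_weights(img, block=16, eps=1e-3):
    """Per-block perceptual weights from local spatial activity (a stand-in
    for the paper's psychovisual model): highly textured blocks get lower
    weight, reflecting visual masking.  The mapping is a hypothetical choice."""
    h, w = img.shape
    weights = np.empty((h // block, w // block))
    for by in range(h // block):
        for bx in range(w // block):
            b = img[by*block:(by+1)*block, bx*block:(bx+1)*block].astype(np.float64)
            gy, gx = np.gradient(b)
            activity = np.mean(np.abs(gx) + np.abs(gy)) + eps
            weights[by, bx] = 1.0 / np.sqrt(activity)
    return weights / weights.mean()          # normalize around 1

def wpsnr(ref, rec, block=16, peak=255.0):
    """Weighted PSNR: block-wise MSE averaged with the perceptual weights."""
    w = block_weights(ref, block)
    hh, ww = (ref.shape[0] // block) * block, (ref.shape[1] // block) * block
    acc = 0.0
    for by in range(hh // block):
        for bx in range(ww // block):
            r = ref[by*block:(by+1)*block, bx*block:(bx+1)*block].astype(np.float64)
            d = rec[by*block:(by+1)*block, bx*block:(bx+1)*block].astype(np.float64)
            acc += w[by, bx] * np.mean((r - d) ** 2)
    wmse = acc / ((hh // block) * (ww // block))
    return 10.0 * np.log10(peak * peak / max(wmse, 1e-12))
```

For an 8-bit image pair `ref` and `rec` (NumPy arrays of the same shape), `wpsnr(ref, rec)` then plays the role that plain PSNR would otherwise play when scaling the block-wise Lagrange parameter.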
Perception-Optimized Encoding for Visually Lossy Image Compression
Yuzhang Lin, Feng Liu, Miguel Hernández-Cabronero, Eze Ahanonu, M. Marcellin, A. Bilgin, A. Ashok. 2019 Data Compression Conference (DCC). DOI: 10.1109/DCC.2019.00104
Abstract: We propose a compression encoding method that perceptually optimizes image quality based on a novel quality metric, which emulates how the human visual system forms an opinion of a compressed image. In contrast to existing perceptually optimized compression methods, which usually aim to minimize the detectability of compression artifacts and are therefore sub-optimal beyond the visually lossless regime, the proposed encoder is designed to operate in the visually lossy regime. We implement the proposed encoder within the JPEG 2000 standard and demonstrate its advantage over both detectability-based and conventional MSE encoders.
Incremental Deep Neural Network Pruning Based on Hessian Approximation
Li Li, Zhu Li, Yue Li, B. Kathariya, S. Bhattacharyya. 2019 Data Compression Conference (DCC). DOI: 10.1109/DCC.2019.00102
Abstract: In this paper, an incremental pruning method based on a Hessian approximation is proposed to compress deep neural networks. The method starts from the idea of using the Hessian to measure the "importance" of each weight in a deep neural network and makes the following key contributions. First, we propose using the second moment in the Adam optimizer as a measure of the "importance" of each weight, avoiding an explicit computation of the Hessian matrix. Second, an incremental procedure prunes the network step by step; after each pruning step, the remaining non-zero weights of the whole network are adjusted, which helps boost the performance of the pruned network. Last but not least, the proposed method applies an automatically generated global threshold to all weights across all layers, which achieves inter-layer bit allocation automatically; this improves performance and avoids the complexity of tuning the pruning threshold layer by layer. We perform a number of experiments on MNIST and ImageNet using commonly used neural networks such as AlexNet and VGG16. The experimental results show that the proposed algorithm compresses the networks significantly with almost no loss of accuracy, demonstrating its effectiveness.
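The core mechanics described above, ranking weights by the Adam second-moment state and pruning with a single global threshold before the next fine-tuning increment, can be sketched as follows. This assumes a PyTorch model trained with torch.optim.Adam; the exact importance score (here, the second moment times the squared weight) and the function name prune_step are illustrative assumptions, not the paper's exact formulation.

```python
import torch

def prune_step(model, optimizer, prune_fraction):
    """One incremental pruning step: score each weight with Adam's
    second-moment estimate (a Hessian proxy), zero the lowest-scoring
    fraction using one global threshold shared across all layers, and
    return the masks so surviving weights can be fine-tuned afterwards."""
    scores, params = [], []
    for p in model.parameters():
        state = optimizer.state.get(p, {})
        if 'exp_avg_sq' not in state:        # e.g. parameter never updated
            continue
        # assumed importance score: second moment weighted by the squared weight
        scores.append((state['exp_avg_sq'] * p.detach() ** 2).flatten())
        params.append(p)
    all_scores = torch.cat(scores)
    k = max(1, int(prune_fraction * all_scores.numel()))
    threshold = torch.kthvalue(all_scores, k).values     # global, not per layer
    masks = []
    with torch.no_grad():
        for p in params:
            score = optimizer.state[p]['exp_avg_sq'] * p ** 2
            mask = score > threshold
            p.mul_(mask.to(p.dtype))         # zero out the pruned weights
            masks.append(mask)
    return masks
```

Calling prune_step repeatedly with a growing prune_fraction, with a few epochs of fine-tuning in between, mirrors the incremental schedule described in the abstract.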
Hybrid Video Coding with Trellis-Coded Quantization
H. Schwarz, Tung Nguyen, D. Marpe, T. Wiegand. 2019 Data Compression Conference (DCC). DOI: 10.1109/DCC.2019.00026
Abstract: In state-of-the-art video coding, the prediction error signals are transmitted using transform coding, which consists of an orthogonal transform, scalar quantization, and entropy coding of the quantization indexes. We show that the coding efficiency of transform coding can be improved by replacing scalar quantization with trellis-coded quantization (TCQ) and using advanced entropy coding techniques for coding the quantization indexes. The proposed approach was implemented into the first test model (VTM-1) of the new standardization project Versatile Video Coding (VVC). Our coding experiments yielded average bit-rate savings of 4.9% for intra-only coding and 3.3% for typical random access configurations, where bit-rate savings of 3.5% (intra-only) and 2.4% (random access) can be attributed to the usage of TCQ. These coding gains are obtained at a 5-10% increase in encoder run time and without any change in decoder run time.
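For readers unfamiliar with TCQ, the toy sketch below shows the basic mechanism: two offset scalar quantizers tied to the states of a small trellis, with a Viterbi search picking the index sequence that minimizes a distortion-plus-rate cost. The 4-state transition table, the rate proxy lam*|k|, and the three-candidate index window are simplifications chosen for illustration; they are not the dependent-quantization design actually adopted in VTM.

```python
# Toy 4-state trellis in the spirit of TCQ: states 0 and 1 use quantizer Q0
# (reconstruction levels 2k*delta), states 2 and 3 use Q1 (levels (2k+1)*delta),
# and the parity of the chosen index k selects the next state.
NEXT_STATE = [[0, 2], [2, 0], [1, 3], [3, 1]]   # illustrative, not the VVC table

def recon(state, k, delta):
    """Reconstruction value of index k under the quantizer tied to `state`."""
    offset = 0 if state < 2 else 1
    return (2 * k + offset) * delta

def tcq_encode(x, delta, lam=0.0):
    """Viterbi search minimizing squared error + lam*|k| (a crude rate proxy).
    Returns one quantization index per input sample."""
    INF = float('inf')
    cost = [0.0, INF, INF, INF]        # encoding starts in state 0
    back = []                          # per sample: (prev_state, k) for each state
    for xt in x:
        new_cost = [INF] * 4
        step = [None] * 4
        for s in range(4):
            if cost[s] == INF:
                continue
            offset = 0 if s < 2 else 1
            k0 = int(round((xt / delta - offset) / 2))   # nearest index in Q(s)
            for k in (k0 - 1, k0, k0 + 1):               # small candidate window
                c = cost[s] + (xt - recon(s, k, delta)) ** 2 + lam * abs(k)
                ns = NEXT_STATE[s][k & 1]
                if c < new_cost[ns]:
                    new_cost[ns] = c
                    step[ns] = (s, k)
        cost = new_cost
        back.append(step)
    s = min(range(4), key=cost.__getitem__)   # best final state
    indices = []
    for step in reversed(back):               # trace back the survivor path
        prev, k = step[s]
        indices.append(k)
        s = prev
    return indices[::-1]
```

The point of the trellis is that the choice made for one coefficient constrains the quantizer available for the next, which is what lets TCQ beat independent scalar quantization at the same rate.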
Texture-Classification Accelerated CNN Scheme for Fast Intra CU Partition in HEVC
Yongfei Zhang, Gang Wang, Rui Tian, Mai Xu, C.-C. Jay Kuo. 2019 Data Compression Conference (DCC). DOI: 10.1109/DCC.2019.00032
Abstract: High Efficiency Video Coding (HEVC) achieves significant coding gains over H.264. However, the gains come at the cost of substantially higher encoding complexity, in which the coding tree unit (CTU) partition is one of the most time-consuming parts due to the rate-distortion-optimized exhaustive search over all possible quad-tree partitions. To address this problem, this paper proposes a texture-classification accelerated convolutional neural network (CNN)-based fast intra CU partition scheme that reduces the encoding complexity of intra coding in HEVC by incorporating heterogeneous texture characteristics into the CNN-based classification. First, a threshold-based texture classification model is developed to identify heterogeneous and homogeneous CTUs through joint consideration of the CU depth, the quantization parameter, and the texture complexity. Second, three different CNN structures are designed and trained to predict the CU partition mode for each CU layer in the heterogeneous CTUs. Finally, extensive experimental results show that the proposed scheme reduces intra-mode encoding time by 62.13% with a negligible BD-rate loss of 2.01%, consistently outperforming two state-of-the-art CNN-based schemes in terms of both coding performance and complexity reduction.
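A minimal sketch of the first stage, a threshold-based homogeneity test whose threshold depends on the CU depth and the QP, might look like the following. The gradient-based complexity measure and the threshold scaling are illustrative assumptions, not the paper's trained thresholds.

```python
import numpy as np

def is_homogeneous(ctu, depth, qp, base_thresh=4.0):
    """Toy threshold-based texture classifier: a CTU whose mean absolute
    gradient falls below a depth- and QP-dependent threshold is treated as
    homogeneous and can skip the CNN partition prediction.  The threshold
    form below is a hypothetical stand-in for the paper's model."""
    gy, gx = np.gradient(ctu.astype(np.float64))
    complexity = np.mean(np.abs(gx) + np.abs(gy))
    threshold = base_thresh * (qp / 32.0) / (depth + 1)   # assumed scaling
    return complexity < threshold
```

Only CTUs flagged as heterogeneous would then be handed to the per-layer CNNs, which is where the complexity savings come from.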
Multi-view Multi-modality Priors Residual Network of Depth Video Enhancement for Bandwidth Limited Asymmetric Coding Framework
Siqi Chen, Qiong Liu, You Yang. 2019 Data Compression Conference (DCC). DOI: 10.1109/DCC.2019.00072
Abstract: Asymmetric coding of multi-view video plus depth is a promising technique for future three-dimensional and multi-view visual applications owing to its superior coding performance under bandwidth-limited conditions. Since the depth video suffers from viewpoint-dependent asymmetric distortions, maintaining smooth and quality-consistent content-based interaction is a challenge. To address it, we propose a residual learning framework that enhances the quality of compression-distorted multi-view depth video. We exploit the correlation between viewpoints to restore the target-viewpoint depth maps using multi-modality priors, namely higher-quality depth maps from adjacent viewpoints and the color frames of the same viewpoint. A residual network is designed to fully exploit the contribution of these priors. Experimental results show the superiority of our framework in improving the quality of both the decoded depth video and the synthesized virtual-viewpoint images.
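The enhancement network can be pictured as a small residual CNN that takes the distorted target-view depth together with the two priors and predicts a correction. The PyTorch sketch below, with assumed layer counts and channel widths, only illustrates the input concatenation and the residual connection, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class DepthResidualNet(nn.Module):
    """Minimal residual CNN in the spirit of the described framework: the
    distorted target-view depth map is concatenated with two priors (a
    higher-quality depth map from an adjacent view and the same-view color
    frame), and the network predicts a residual added back to the input
    depth.  Layer sizes are illustrative assumptions."""
    def __init__(self, channels=32, num_layers=6):
        super().__init__()
        layers = [nn.Conv2d(1 + 1 + 3, channels, 3, padding=1), nn.ReLU(inplace=True)]
        for _ in range(num_layers - 2):
            layers += [nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(inplace=True)]
        layers += [nn.Conv2d(channels, 1, 3, padding=1)]
        self.body = nn.Sequential(*layers)

    def forward(self, distorted_depth, neighbor_depth, color):
        x = torch.cat([distorted_depth, neighbor_depth, color], dim=1)
        return distorted_depth + self.body(x)   # residual learning
```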
Selective Dynamic Compression
S. T. Klein, Elina Opalinsky, Dana Shapira. 2019 Data Compression Conference (DCC). DOI: 10.1109/DCC.2019.00095
Abstract: Dynamic compression methods continuously update the model of the underlying text file according to the already processed part of the file, assuming that such a model accurately predicts the distribution in the remaining part. Since this premise is not necessarily true, we suggest updating the model only selectively. We give empirical evidence that this hardly affects the compression efficiency, while it obviously may save processing time and allow the compression scheme to be used for cryptographic applications.
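As a toy illustration of the idea (not the authors' scheme), the following estimates the output size of an order-0 adaptive coder when the frequency model is refreshed only every few symbols. The fixed-stride update policy, the byte alphabet, and the add-one smoothing are assumptions made to keep the sketch short.

```python
import math
from collections import Counter

def selective_adaptive_cost(data, update_every=4):
    """Estimate the cost in bits of order-0 adaptive coding of a byte string
    when the symbol-frequency model is updated only every `update_every`
    symbols instead of after every symbol.  The selective-update policy here
    (a fixed stride) is one simple choice; the point is that sparse updates
    barely hurt compression while saving model-update work."""
    counts = Counter()
    total = 0
    bits = 0.0
    alphabet = 256                       # byte alphabet, add-one smoothing
    for i, sym in enumerate(data):
        p = (counts[sym] + 1) / (total + alphabet)
        bits += -math.log2(p)
        if i % update_every == 0:        # selective model update
            counts[sym] += 1
            total += 1
    return bits
```

Running it with update_every=1 reproduces the fully adaptive cost, so the gap to larger strides directly measures the compression loss caused by selective updating.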
Parameterized Text Indexing with One Wildcard
A. Ganguly, W. Hon, Yu-An Huang, S. Pissis, R. Shah, Sharma V. Thankachan. 2019 Data Compression Conference (DCC). DOI: 10.1109/DCC.2019.00023
Abstract: Two equal-length strings X and Y over an alphabet Σ of size σ are a parameterized match iff X can be transformed to Y by renaming the character X[i] to the character Y[i] for 1 ≤ i ≤ |X| using a one-to-one function from the set of characters in X to the set of characters in Y. The parameterized text indexing problem is defined as follows: index a text T of n characters over an alphabet Σ of size σ such that, whenever a pattern P[1, p] comes as a query, we can report all occ parameterized occurrences of P in T. A position i ∊ [1, n] is a parameterized occurrence of P in T iff P and T[i, (i+p-1)] are a parameterized match. We study an interesting generalization of this problem in which the pattern contains one wildcard character ϕ ∉ Σ that matches any other character in Σ. Therefore, for a pattern P[1, p] = P_1 ϕ P_2, our task is to report all positions i in T such that the string P_1 P_2 and the string obtained by concatenating T[i, (i+|P_1|-1)] and T[(i+|P_1|+1), (i+p-1)] are a parameterized match. We show that such queries can be answered in optimal O(p + occ) time per query using an O(n log n)-space index. We then show how to compress our index into O(n log σ) space, but with a higher query cost of O(p(log log n + log σ) + occ log σ).
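The problem statement can be pinned down with a brute-force checker: parameterized_match below tests the one-to-one renaming condition from the abstract, and wildcard_occurrences enumerates the positions the proposed index would report. This naive O(n·p) search is only meant to clarify the definitions; the paper's contribution is answering the same queries in O(p + occ) time from a compact index.

```python
def parameterized_match(x, y):
    """True iff equal-length strings x and y are a parameterized match, i.e.
    x becomes y under a one-to-one (injective) renaming of its characters."""
    if len(x) != len(y):
        return False
    fwd, bwd = {}, {}
    for a, b in zip(x, y):
        if fwd.setdefault(a, b) != b or bwd.setdefault(b, a) != a:
            return False
    return True

def wildcard_occurrences(text, p1, p2):
    """Report every position i (0-indexed) where p1 + p2 parameterized-matches
    text[i:i+|p1|] concatenated with text[i+|p1|+1:i+|p1|+1+|p2|], i.e. the
    pattern p1 ϕ p2 with one wildcard between the two parts."""
    p = len(p1) + 1 + len(p2)
    hits = []
    for i in range(len(text) - p + 1):
        window = text[i:i + len(p1)] + text[i + len(p1) + 1:i + p]
        if parameterized_match(p1 + p2, window):
            hits.append(i)
    return hits
```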
Extended Quad-Tree Partitioning for Future Video Coding
Meng Wang, Junru Li, Li Zhang, Kai Zhang, Hongbin Liu, Shiqi Wang, S. Kwong, Siwei Ma. 2019 Data Compression Conference (DCC). DOI: 10.1109/DCC.2019.00038
Abstract: The quad-tree plus binary-tree (QTBT) coding unit (CU) partitioning structure, which has been adopted into the next-generation video coding standard, shows promising coding performance compared with the conventional quad-tree structure in HEVC. In this paper, we propose Extended Quad-Tree (EQT) partitioning, which further extends the QTBT scheme and increases the partitioning flexibility. More specifically, EQT splits a parent CU into four sub-CUs of different sizes, which can adequately model local image content that cannot be elaborately characterized with QTBT. Meanwhile, EQT partitioning allows interleaving with BT partitioning for enhanced adaptability. Experimental results on the JEM7-QTBT-Only platform show that EQT brings better coding performance, with BD-rate gains of 3.17%, 3.20%, and 3.06% under random access, low-delay P, and low-delay B configurations, respectively.
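To visualize what "four sub-CUs of different sizes" can mean, the helper below enumerates one EQT-style split geometry that is commonly described in the literature: two quarter-size CUs on the outside and two half-size CUs in the middle. Both this geometry and the function name are assumptions for illustration and may differ in detail from the splits evaluated in the paper.

```python
def eqt_split(x, y, w, h, horizontal=True):
    """Return (x, y, width, height) for the four sub-CUs of an assumed EQT
    split: quarter-size CUs at the top/bottom (or left/right) and two
    half-size CUs side by side in the middle.  The four areas sum to w*h."""
    if horizontal:
        return [(x,            y,              w,      h // 4),
                (x,            y + h // 4,     w // 2, h // 2),
                (x + w // 2,   y + h // 4,     w // 2, h // 2),
                (x,            y + 3 * h // 4, w,      h // 4)]
    else:
        return [(x,                y,          w // 4, h),
                (x + w // 4,       y,          w // 2, h // 2),
                (x + w // 4,       y + h // 2, w // 2, h // 2),
                (x + 3 * w // 4,   y,          w // 4, h)]
```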
Hybrid Point Cloud Geometry Coding Using Planes and Octree Representation Models
Antoine Dricot, J. Ascenso. 2019 Data Compression Conference (DCC). DOI: 10.1109/DCC.2019.00081
Abstract: Nowadays, point clouds are considered a promising representation model for future 3D applications such as augmented reality. However, large amounts of data are required to offer high-quality immersive experiences, and therefore compression efficiency is an important factor. Many current coding techniques rely on octree data structures, which offer several advantages such as scalability, with layers providing multiple levels of detail. Although octrees can represent a point cloud efficiently, geometric representations may also play an important role. In this context, this paper proposes a hybrid point cloud compression solution based on plane estimation and coding that enhances the octree structure, thus exploiting two representation models for point clouds. In the proposed solution, the octree partitioning is adaptive and includes a novel plane coding mode for leaf nodes at different layers of the octree. A large increase in coding efficiency over pure octree coding is observed, with average BD-rate gains of 35% and gains of up to 68%. Moreover, in several cases the proposed solution also significantly outperforms state-of-the-art solutions currently considered for standardization.
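A minimal sketch of the geometric side: fit a plane to the points inside an octree node and decide, from how well the plane explains them, whether to use a plane mode or keep subdividing. The least-squares fit is standard; the RMS threshold, minimum point count, and function names are illustrative assumptions, since an actual encoder would make this mode decision by rate-distortion cost.

```python
import numpy as np

def fit_plane(points):
    """Least-squares plane through a set of 3D points: returns the centroid,
    the unit normal, and the RMS distance of the points to the plane."""
    pts = np.asarray(points, dtype=np.float64)
    centroid = pts.mean(axis=0)
    _, _, vt = np.linalg.svd(pts - centroid, full_matrices=False)
    normal = vt[-1]                        # direction of least variance
    dist = (pts - centroid) @ normal
    return centroid, normal, float(np.sqrt(np.mean(dist ** 2)))

def choose_mode(points, max_rms=0.5, min_points=32):
    """Toy mode decision for an octree node: if the points are nearly planar,
    code them with the plane model; otherwise keep subdividing the octree.
    The fixed thresholds are stand-ins for a rate-distortion-based choice."""
    if len(points) < min_points:
        return "octree"
    _, _, rms = fit_plane(points)
    return "plane" if rms <= max_rms else "octree"
```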