Title: Consistent Disparity Synthesis for Inter-View Prediction in Lightfield Compression
Authors: Yue Li, R. Mathew, Dominic Rüfenacht, A. Naman, D. Taubman
Published in: 2019 Picture Coding Symposium (PCS), November 2019. DOI: 10.1109/PCS48520.2019.8954506
Abstract: For efficient compression of lightfields that involve many views, it has been found preferable to explicitly communicate disparity/depth information at only a small subset of the view locations. In this study, we focus solely on inter-view prediction, which is fundamental to multi-view imagery compression and itself depends upon the synthesis of disparity at new view locations. Current HDCA standardization activities consider a framework known as WaSP, which hierarchically predicts views, independently synthesizing the required disparity maps at the reference views for each prediction step. A potentially better approach is to progressively construct a unified multi-layered base-model for consistent disparity synthesis across many views. This paper improves significantly upon an existing base-model approach, demonstrating superior performance to WaSP. More generally, the paper investigates the implications of texture warping and disparity synthesis methods.
Title: Virtual View Synthesis for 3DoF+ Video
Authors: A. Dziembowski, Dawid Mieloch, O. Stankiewicz, M. Domański, Gwangsoon Lee, Jeongil Seo
Published in: 2019 Picture Coding Symposium (PCS), November 2019. DOI: 10.1109/PCS48520.2019.8954502
Abstract: The paper reports a new view synthesis method for omnidirectional video with the ability to slightly displace a virtual viewpoint, i.e. a novel synthesis method for 3DoF+ 360 video. The new method is noteworthy for its high versatility and reliability: it accepts both perspective and omnidirectional input views, it can render both perspective and omnidirectional views, and the produced synthetic views differ from the respective ground-truth images less than with other view synthesis methods. These features result from several innovations, including prioritization of input views, efficient inpainting adapted to equirectangular projections, and efficient color correction. The experimental results demonstrate that the new method outperforms state-of-the-art methods.
{"title":"Complementary Motion Vector for Motion Prediction in Video Coding with Long-Term Reference","authors":"Jue Mao, Hualong Yu, Xiaoding Gao, Lu Yu","doi":"10.1109/PCS48520.2019.8954511","DOIUrl":"https://doi.org/10.1109/PCS48520.2019.8954511","url":null,"abstract":"In HEVC, there are two types of motion vector (MV) when long-term reference is enabled: short-term MV (SMV) pointing to short-term reference frame and long-term MV (LMV) pointing to long-term reference frame. And cross-class prediction between SMV and LMV is not allowed because of their low correlation. Therefore, MV predictor candidates of current block would be inadequate when neighboring MVs are in different types. This paper proposes a complementary MV to enrich MV predictor candidates for current block. There would be two types of MV for each neighboring inter block. In addition to MV used in motion compensation, a complementary MV of the other type is derived by reconstructed pixels. This paper also proposed a reliability-based MV predictor candidate list construction method to improve the prediction efficiency of complementary MVs. Experimental results show that the proposed method can achieve 1.18% coding performance improvement on average.","PeriodicalId":237809,"journal":{"name":"2019 Picture Coding Symposium (PCS)","volume":"52 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117251945","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Derived Tree Block Partition for AVS3 Intra Coding","authors":"Liqiang Wang, Xiaoran Cao, Benben Niu, Quanhe Yu, Jianhua Zheng, Yun He","doi":"10.1109/PCS48520.2019.8954542","DOIUrl":"https://doi.org/10.1109/PCS48520.2019.8954542","url":null,"abstract":"AVS3 is the next generation video coding standard and is being developed by the audio and video standard (AVS) working group of China. As the successor of AVS2, AVS3 is designed to achieve a great coding efficiency beyond AVS2 as well as H.265/HEVC. In this paper, we propose a novel block partition method for intra coding, named derived tree block partition (DTBP), based on the block partition structure in AVS3. By further splitting a leaf node of the partition tree to quad split types or asymmetric types, the signaling efficiency and the flexibility of partitioning asymmetric texture can be improved. When implemented on HPM3.1, up to 1.14% and an average 0.61% BD-rates are achieved under All-Intra configuration.","PeriodicalId":237809,"journal":{"name":"2019 Picture Coding Symposium (PCS)","volume":"139 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131058210","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Title: Dense Inception Attention Neural Network for In-Loop Filter
Authors: Xiaoyu Xu, Jian Qian, Li Yu, Hongkui Wang, Xing Zeng, Zhengang Li, Ning Wang
Published in: 2019 Picture Coding Symposium (PCS), November 2019. DOI: 10.1109/PCS48520.2019.8954499
Abstract: Recently, deep learning has made significant progress in High Efficiency Video Coding (HEVC), especially for the in-loop filter. In this paper, we propose a dense inception attention network (DIA_Net) to better exploit image information and model capacity. DIA_Net contains multiple inception blocks whose kernels have different sizes, so as to extract information at various scales. Meanwhile, an attention mechanism comprising spatial attention and channel attention is used to fully exploit the feature information, and a dense residual structure is adopted to deepen the network. We attach DIA_Net to the end of the in-loop filtering stage in HEVC as a post-processor and apply it to the luma component. Experimental results demonstrate that the proposed DIA_Net yields a remarkable improvement over standard HEVC, achieving BD-rate reductions of 8.2% in the all-intra (AI) configuration and 5.6% in the random-access (RA) configuration.
Title: Adaptive Motion Vector Resolution for Affine-Inter Mode Coding
Authors: Hongbin Liu, Li Zhang, Kai Zhang, Jizheng Xu, Yue Wang, Jiancong Luo, Yuwen He
Published in: 2019 Picture Coding Symposium (PCS), November 2019. DOI: 10.1109/PCS48520.2019.8954531
Abstract: Affine Motion Model (AMM) based inter prediction, which can represent complex motions such as zooming, rotation or shearing, has been adopted into the Versatile Video Coding (VVC) standard, where the AMM is defined by Control Point Motion Vectors (CPMVs). Adaptive Motion Vector Resolution (AMVR) has also been adopted into VVC, as it offers a favorable trade-off between Motion Vector (MV) precision and the bits spent on MV Differences (MVDs). However, AMVR is applied only to the Translational Motion Model (TMM), so the AMM cannot benefit from it. In this paper, we propose to extend AMVR to the AMM: 1-pixel, 1/4-pixel and 1/16-pixel MV precisions are allowed and can be selected adaptively by each affine-inter coded Coding Unit (CU). Simulation results show that the proposed method achieves 0.32% BD-rate saving on average under the Random Access configuration.
{"title":"Visibility Metric for Visually Lossless Image Compression","authors":"Nanyang Ye, M. Pérez-Ortiz, Rafał K. Mantiuk","doi":"10.1109/PCS48520.2019.8954560","DOIUrl":"https://doi.org/10.1109/PCS48520.2019.8954560","url":null,"abstract":"Encoding images in a visually lossless manner helps to achieve the best trade-off between image compression performance and quality and so that compression artifacts are invisible to the majority of users. Visually lossless encoding can often be achieved by manually adjusting compression quality parameters of existing lossy compression methods, such as JPEG or WebP. But the required compression quality parameter can also be determined automatically using visibility metrics. However, creating an accurate visibility metric is challenging because of the complexity of the human visual system and the effort needed to collect the required data. In this paper, we investigate how to train an accurate visibility metric for visually lossless compression from a relatively small dataset. Our experiments show that prediction error can be reduced by 40% compared with the state-of-theart, and that our proposed method can save between 25%-75% of storage space compared with the default quality parameter used in commercial software. We demonstrate how the visibility metric can be used for visually lossless image compression and for benchmarking image compression encoders.","PeriodicalId":237809,"journal":{"name":"2019 Picture Coding Symposium (PCS)","volume":"242 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133695556","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Efficient Delivery of Very High Dynamic Range Compressed Imagery by Dynamic-Range-of-Interest","authors":"Lan Liu, D. Taubman","doi":"10.1109/PCS48520.2019.8954556","DOIUrl":"https://doi.org/10.1109/PCS48520.2019.8954556","url":null,"abstract":"JPEG 2000 allows scenes to be encoded in a highly scalable and accessible manner, so that only the content that is relevant to a region or resolution of interest need be transmitted and decoded. This paper extends this property to allow efficient access into compressed images that have a very high dynamic range, based on a dynamic-range-of-interest. Optimized re-prioritization of the encoded content is used to stream imagery based on a potentially dynamic set of viewing conditions that implicitly identify the visual significance of content. We propose and validate a framework for doing this, based on a single compressed representation, a sparse set of display-independent luminance statistics and a novel algorithm for inferring the display-dependent significance of code-block distortions.","PeriodicalId":237809,"journal":{"name":"2019 Picture Coding Symposium (PCS)","volume":"51 S258","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"113954397","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multi-Pass Renderer in MPEG Test Model for Immersive Video","authors":"Basel Salahieh, S. Bhatia, J. Boyce","doi":"10.1109/PCS48520.2019.8954515","DOIUrl":"https://doi.org/10.1109/PCS48520.2019.8954515","url":null,"abstract":"MPEG is developing an immersive video coding standard aims to deliver 6DoF (degrees of freedom) visual experience enabling motion parallax and omnidirectional viewing capabilities. A reference software has been made available to implement the immersive standard. To make it robust for various immersive content and deliver sharper synthesis with suppressed artifacts, a multi-pass add-on tool has been implemented for rendering at the decoder side. The tool runs view synthesis multiple times with different selection of views or patches of views in each pass based on the distance to the desired viewing position and orientation and merges the intermediate synthesized views to output a coherent and complete desired view. We share the multi-pass rendering results operating on whole views or on atlases (i.e. collection of patches from views) and compare them to the traditional single-pass synthesis results. We report significant subjective and objective improvements (1.7 ~ 4.2 dB gain in PSNR) for the multi-pass technique in the whole views case while having its synthesis results being atlas-dependent in the immersive coding case.","PeriodicalId":237809,"journal":{"name":"2019 Picture Coding Symposium (PCS)","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115127958","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Beyond Coding: Detection-driven Image Compression with Semantically Structured Bit-stream","authors":"Tianyu He, Simeng Sun, Zongyu Guo, Zhibo Chen","doi":"10.1109/PCS48520.2019.8954525","DOIUrl":"https://doi.org/10.1109/PCS48520.2019.8954525","url":null,"abstract":"With the development of 5G and edge computing, it is increasingly important to offload intelligent media computing to edge device. Traditional media coding scheme codes the media into one binary stream without a semantic structure, which prevents many important intelligent applications from operating directly in bit-stream level, including semantic analysis, parsing specific content, media editing, etc. Therefore, in this paper, we propose a learning based Semantically Structured Coding (SSC) framework to generate Semantically Structured Bit-stream (SSB), where each part of bit-stream represents a certain object and can be directly used for aforementioned tasks. Specifically, we integrate an object detection module in our compression framework to locate and align the object in feature domain. After applying quantization and entropy coding, the features are re-organized according to detected and aligned objects to form a bit-stream. Besides, different from existing learning-based compression schemes that individually train models for specific bit-rate, we share most of model parameters among various bit-rates to significantly reduce model size for variable-rate compression. Experimental results demonstrate that only at the cost of negligible overhead, objects can be completely reconstructed from partial bit-stream. We also verified that classification and pose estimation can be directly performed on partial bit-stream without performance degradation.","PeriodicalId":237809,"journal":{"name":"2019 Picture Coding Symposium (PCS)","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115027807","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}