{"title":"Content-adaptive H.264 rate control for live screencasting","authors":"Yi Lin, Weikai Xie, Lei Jin, R. Shen","doi":"10.1109/VCIP.2012.6410797","DOIUrl":"https://doi.org/10.1109/VCIP.2012.6410797","url":null,"abstract":"Live screencasting involves encoding and streaming the screen content of a PC in real time. Most existing H.264 rate control (RC) algorithms are designed for natural scenes and do not perform well with the quite different signal characteristics of screencasting. This paper proposes a content-adaptive H.264 RC scheme which classifies screen content as “slow-motion” phase or “fast-motion” phase on the fly, based on the temporal and spatial frame-area update pattern of recent frames. Then, for frames in the “slow-motion” phase, which usually result from a presenter's GUI operations, a new RC algorithm named Frame Rate Adaptive-CQP (FRA-CQP) is applied, which prioritizes the quality of individual frames over the frame rate. For frames in the “fast-motion” phase, which usually result from playing a movie during the presentation, the classical CRF+VBV RC algorithm is applied. Evaluation results show that this adaptive RC scheme regularly achieves a higher subjective quality assessment score (0.7 to 1.6 points on a 1-5 scale) than existing algorithms while meeting the RC objective.","PeriodicalId":103073,"journal":{"name":"2012 Visual Communications and Image Processing","volume":"55 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125067332","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Perceptual quality metric guided blocking artifact reduction","authors":"Dong-Qing Zhang, H. H. Yu","doi":"10.1109/VCIP.2012.6410816","DOIUrl":"https://doi.org/10.1109/VCIP.2012.6410816","url":null,"abstract":"Blocking artifact reduction, or deblocking, is an important component in modern block-based video encoding architectures and is often used as a post-processing procedure in many encoding/transcoding applications. Most existing video deblocking algorithms do not take into account Human Visual System (HVS) models and employ empirically designed filters, resulting in suboptimal perceptual image quality and difficulty in adjusting parameters for different application scenarios. This paper presents a new deblocking algorithm that automatically generates spatially variant adaptive filters by maximizing a regularized perceptual quality metric. Such an optimization-based approach leads to better human-perceived image quality and much easier filter parameter adjustment.","PeriodicalId":103073,"journal":{"name":"2012 Visual Communications and Image Processing","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128669202","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Combination of SSIM and JND with content-transition classification for image quality assessment","authors":"Ming-Chung Hsu, Guan-Lin Wu, Shao-Yi Chien","doi":"10.1109/VCIP.2012.6410840","DOIUrl":"https://doi.org/10.1109/VCIP.2012.6410840","url":null,"abstract":"Image quality assessment (IQA) is a crucial feature of many image processing algorithms. The state-of-the-art IQA index, the structural similarity (SSIM) index, has been able to accurately predict image quality by assuming that the human visual system (HVS) separates structural information from non-structural information in a scene. However, the precision of SSIM is relatively lacking when used to assess blurred images. This paper proposes a novel metric of image quality assessment, the JND-SSIM, which adopts the just-noticeable difference (JND) algorithm to differentiate between plain, edge, and texture blocks and obtain a visibility threshold map. Based on varying block transition types between the reference and distorted image, SSIM values are assigned respective weights and scaled down by the visibility threshold map. We then test our algorithm on the LIVE and TID Image Quality Databases, thereby demonstrating that our improved IQA index is much closer to human opinion.","PeriodicalId":103073,"journal":{"name":"2012 Visual Communications and Image Processing","volume":"142 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131959326","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A lossless color image compression method based on a new reversible color transform","authors":"Seyun Kim, N. Cho","doi":"10.1109/VCIP.2012.6410808","DOIUrl":"https://doi.org/10.1109/VCIP.2012.6410808","url":null,"abstract":"In many conventional lossless color image compression methods, the pixels or lines from each color component are interleaved, and then they are predicted and coded. Also, it has been reported that the reversible color transform (RCT) followed by a grayscale encoder gives higher coding gain than the independent compression of each channel does. In this paper, we propose a lossless color image compression method that concentrates on the efficient coding of chrominance channels with a new color transform and hierarchical coding of chrominance channel pixels. Specifically, we first transform an input image from the RGB color space into the YCuCv color space using the proposed RCT, which shows better decorrelation performance than the existing RCT. After the color transformation, the luminance channel Y is compressed by a conventional lossless image coder, such as JPEG-LS, CALIC, or JPEG2000 lossless. Unlike the luminance channel, the chrominance channels Cu and Cv are relatively smooth and have different statistical characteristics. Therefore, the chrominance channels are encoded differently, based on a hierarchical decomposition and directional prediction. Finally, effective context modeling for prediction residuals is adopted. Experimental results show that the proposed method improves the compression performance by 40% over the conventional channel-independent compression methods and 5% over the existing methods that exploit the channel correlation.","PeriodicalId":103073,"journal":{"name":"2012 Visual Communications and Image Processing","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130845544","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Content-adaptive inverse tone mapping","authors":"Pin-Hung Kuo, Chi-Sun Tang, Shao-Yi Chien","doi":"10.1109/VCIP.2012.6410798","DOIUrl":"https://doi.org/10.1109/VCIP.2012.6410798","url":null,"abstract":"Tone mapping is an important technique used for displaying high dynamic range (HDR) content on low dynamic range (LDR) devices. On the other hand, inverse tone mapping enables LDR content to appear with an HDR effect on HDR displays. The existing inverse tone mapping algorithms usually focus on enhancing the luminance in over-exposed regions with less (or even no) effort on the processing of the well-exposed regions. In this paper, we propose an algorithm that enhances not only the over-exposed regions but also the remaining well-exposed regions. This paper provides a “histogram-based” method for inverse tone mapping. The proposed algorithm contains a content-adaptive inverse tone mapping operator, which responds differently to different scene characteristics. Scene classification is included in this algorithm to select the environment parameters. Lastly, enhancement of the over-exposed regions, which reconstructs the truncated information, is performed.","PeriodicalId":103073,"journal":{"name":"2012 Visual Communications and Image Processing","volume":"56 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126224012","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Viewpoint-independent hand gesture recognition system","authors":"Shen Wu, F. Jiang, Debin Zhao, Shaohui Liu, Wen Gao","doi":"10.1109/VCIP.2012.6410809","DOIUrl":"https://doi.org/10.1109/VCIP.2012.6410809","url":null,"abstract":"In this paper, we present a viewpoint-free hand gesture recognition system based on the Kinect sensor. From the depth image, we build a point cloud of the user. Then, we estimate the current optimal viewpoint, i.e., the front, and project the point cloud in that direction. Through this process we overcome the viewpoint-dependency issue to a great extent. To match hand types, we propose an improved shape context to describe each hand gesture and use the Hungarian algorithm to calculate the degree of match. Our method is quite straightforward; however, the experimental results show that by this means gestures can be recognized independent of viewpoint with great accuracy. Besides, it is fast and robust, and thus can be applied in various realistic scenarios in real time.","PeriodicalId":103073,"journal":{"name":"2012 Visual Communications and Image Processing","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125727512","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Local adaptive rational interpolation filter and its application for deinterlacing","authors":"Xiangdong Chen, Jechang Jeong, Gwanggil Jeon","doi":"10.1109/VCIP.2012.6410825","DOIUrl":"https://doi.org/10.1109/VCIP.2012.6410825","url":null,"abstract":"This paper proposes an efficient intra-field deinterlacing algorithm using a local adaptive rational interpolation filter (LARF). Experimental results show that the proposed algorithm provides satisfactory performance in terms of both objective and subjective image quality. Moreover, it exploits only the local spatial gradient information among the neighboring pixels, without complex preset conditions, and therefore has lower complexity than most existing algorithms.","PeriodicalId":103073,"journal":{"name":"2012 Visual Communications and Image Processing","volume":"40 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124899343","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A comparative analysis of local binary pattern texture classification","authors":"N. Doshi, G. Schaefer","doi":"10.1109/VCIP.2012.6410773","DOIUrl":"https://doi.org/10.1109/VCIP.2012.6410773","url":null,"abstract":"Texture recognition is an important aspect of many computer vision applications. Local binary pattern (LBP) based texture algorithms have gained significant popularity in recent years and have been shown to be useful for a variety of tasks. While over the years a variety of LBP algorithms have been introduced in the literature, what is missing is a comprehensive evaluation of their performance. In this paper, we fill this gap and benchmark 37 texture descriptors based on 15 LBP variants for texture classification against common standard datasets of textures including those captured at different rotation angles and under different illumination conditions. Overall, LBP variance (LBPV) is found to give the best texture classification performance.","PeriodicalId":103073,"journal":{"name":"2012 Visual Communications and Image Processing","volume":"75 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125160557","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Optimization of variational methods via motion-based weight selection and keypoint matching","authors":"Botao Wang, Qingxiang Zhu, H. Xiong","doi":"10.1109/VCIP.2012.6410761","DOIUrl":"https://doi.org/10.1109/VCIP.2012.6410761","url":null,"abstract":"Variational method is a well-established technique that solves for a dense field, which is widely adopted in the estimation of optical flow field and remains the most accurate technique to date. However, one of the problems in variational method lies in that it is optimized in an iterative manner towards a single objective, but local details may be compromised owing to the “big picture”. In this paper, we address this problem in an optical flow framework by introducing two sparse local rectifications to the global numerical scheme, i.e., motion-based weight selection and keypoint matching. The selection of the weighting parameter in a self-adaptive and content-aware manner provides a more accurate estimation of the optical flow field near motion boundaries, and motion details and small structures are preserved in the optical flow field by keypoint matching in the initialization of the optical flow field. Experimental results using the Middlebury dataset show that the proposed algorithm achieves higher accuracy compared to the original TV-ℓ1 optical flow algorithm and many state-of-the-art methods.","PeriodicalId":103073,"journal":{"name":"2012 Visual Communications and Image Processing","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125177110","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A fast multiview video transcoder for bitrate reduction","authors":"Bing Wang, Xiaopeng Fan, Shaohui Liu, Yan Liu, Debin Zhao, Wen Gao","doi":"10.1109/VCIP.2012.6410847","DOIUrl":"https://doi.org/10.1109/VCIP.2012.6410847","url":null,"abstract":"Video transcoding is an efficient way to reduce the bitrate or convert the format of the original video stream to meet the requirements of different applications and various channel capacities. In this paper, we propose a fast multiview video transcoder (MVT) for bitrate reduction. Unlike the H.264 transcoder, it utilizes the inter-view prediction information in the input video stream to reduce the complexity of transcoding. In addition, we utilize the mode and selected reference frame information in the original stream to accelerate RD optimization calculations. Experimental results show that the proposed transcoder can achieve significant computation reduction while maintaining RD performance close to that of the fully decode and re-encode transcoder (FDET).","PeriodicalId":103073,"journal":{"name":"2012 Visual Communications and Image Processing","volume":"85 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115149401","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}