{"title":"Adaptive high-frequency clipping for improved image quality assessment","authors":"Ke Gu, Guangtao Zhai, Min Liu, Qi Xu, Xiaokang Yang, Jun Zhou, Wenjun Zhang","doi":"10.1109/VCIP.2013.6706347","DOIUrl":"https://doi.org/10.1109/VCIP.2013.6706347","url":null,"abstract":"It is widely known that the human visual system (HVS) applies multi-resolution analysis to the scenes we see. In fact, many of the best image quality metrics, e.g. MS-SSIM and IW-PSNR/SSIM, are based on multi-scale models. However, in existing multi-scale image quality assessment (IQA) methods, the resolution levels are fixed. In this paper, we examine the problem of selecting optimal levels in the multi-resolution analysis to preprocess the image for perceptual quality assessment. According to the contrast sensitivity function (CSF) of the HVS, the sampling of visual information by the human eye approximates a low-pass process. For images, the amount of information we can extract depends on the size of the image (or the object(s) inside it) as well as the viewing distance. Therefore, we propose a wavelet transform based adaptive high-frequency clipping (AHC) model to approximate the effective visual information that enters the HVS. After the high-frequency clipping, rather than processing each level separately, we transform the filtered images back to their original resolutions for quality assessment. Extensive experimental results on various databases (LIVE, IVC, and Toyama-MICT) show that the performance of existing image quality algorithms (PSNR and SSIM) can be substantially improved by applying the metrics to AHC-processed images.","PeriodicalId":407080,"journal":{"name":"2013 Visual Communications and Image Processing (VCIP)","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132617756","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
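The high-frequency clipping idea in the AHC abstract can be sketched with a plain single-wavelet pipeline: decompose, zero the finest detail subbands, and transform back to the original resolution. This is a minimal illustration using a Haar wavelet implemented by hand; the `levels` parameter merely stands in for the paper's CSF/viewing-distance-driven level selection, which the abstract does not specify.

```python
import numpy as np

def haar2d(x):
    # One level of the 2-D Haar transform: returns LL and the LH/HL/HH detail subbands.
    a = (x[0::2, :] + x[1::2, :]) / 2.0   # row averages
    d = (x[0::2, :] - x[1::2, :]) / 2.0   # row differences
    ll = (a[:, 0::2] + a[:, 1::2]) / 2.0
    lh = (a[:, 0::2] - a[:, 1::2]) / 2.0
    hl = (d[:, 0::2] + d[:, 1::2]) / 2.0
    hh = (d[:, 0::2] - d[:, 1::2]) / 2.0
    return ll, lh, hl, hh

def ihaar2d(ll, lh, hl, hh):
    # Exact inverse of haar2d.
    a = np.empty((ll.shape[0], ll.shape[1] * 2))
    a[:, 0::2] = ll + lh
    a[:, 1::2] = ll - lh
    d = np.empty_like(a)
    d[:, 0::2] = hl + hh
    d[:, 1::2] = hl - hh
    x = np.empty((a.shape[0] * 2, a.shape[1]))
    x[0::2, :] = a + d
    x[1::2, :] = a - d
    return x

def ahc(img, levels):
    """Zero (clip) the `levels` finest high-frequency subband levels,
    then reconstruct back to the original resolution."""
    ll = img.astype(float)
    shapes = []
    for _ in range(levels):
        ll, lh, hl, hh = haar2d(ll)
        shapes.append(lh.shape)
    for shape in reversed(shapes):
        zero = np.zeros(shape)
        ll = ihaar2d(ll, zero, zero, zero)
    return ll
```

Applying quality metrics such as PSNR or SSIM to `ahc(reference, k)` and `ahc(distorted, k)` rather than the raw images is the usage pattern the abstract describes.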
{"title":"Two HEVC encoder methods for block artifact reduction","authors":"A. Norkin, K. Andersson, Valentin Kulyk","doi":"10.1109/VCIP.2013.6706452","DOIUrl":"https://doi.org/10.1109/VCIP.2013.6706452","url":null,"abstract":"The HEVC deblocking filter significantly improves the subjective quality of coded video sequences at lower bitrates. During the final phase of HEVC standardization, it was shown that the reference software encoder may produce visible block artifacts on some sequences with content that shows chaotic motion, such as water or fire. The paper analyses the reasons for blocking artifacts in such sequences and describes two simple encoder-side methods that improve the subjective quality on these sequences without degrading the quality on other content and without significant bitrate increase. The effect on subjective quality has been evaluated by a formal subjective test.","PeriodicalId":407080,"journal":{"name":"2013 Visual Communications and Image Processing (VCIP)","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115567550","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Joint image denoising using self-similarity based low-rank approximations","authors":"Yongqin Zhang, Jiaying Liu, Saboya Yang, Zongming Guo","doi":"10.1109/VCIP.2013.6706404","DOIUrl":"https://doi.org/10.1109/VCIP.2013.6706404","url":null,"abstract":"Observed images are usually noisy due to the data acquisition and transmission processes. Therefore, image denoising is a necessary step prior to post-processing applications. The proposed algorithm exploits a self-similarity based low-rank technique to approximate the real-world image in the multivariate-analysis sense. It consists of two successive steps: adaptive dimensionality reduction of similar patch groups, and collaborative filtering. For each target patch, the singular value decomposition (SVD) is used to factorize the similar-patch group collected by block-matching in a local search window. Parallel analysis automatically selects the principal signal components by discarding the non-significant singular values. After the inverse SVD transform, the denoised image is reconstructed by weighted averaging. Finally, collaborative Wiener filtering is applied to further remove noise. Experimental results show that the proposed algorithm surpasses the state-of-the-art methods in most cases.","PeriodicalId":407080,"journal":{"name":"2013 Visual Communications and Image Processing (VCIP)","volume":"69 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114723101","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
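The core SVD step of the denoising abstract above can be sketched in a few lines: stack similar patches as rows, truncate the SVD, and reconstruct. This is only the low-rank kernel, with a fixed `keep` rank standing in for the paper's parallel-analysis rank selection; block-matching, weighted aggregation, and the final Wiener step are omitted.

```python
import numpy as np

def lowrank_denoise_group(group, keep):
    """Truncated-SVD approximation of a similar-patch group
    (one vectorized patch per row). Keeping only the `keep`
    largest singular values discards noise-dominated components."""
    u, s, vt = np.linalg.svd(group, full_matrices=False)
    s[keep:] = 0.0          # zero out the non-significant singular values
    return (u * s) @ vt     # scale columns of u by s, then recombine
```

In the full algorithm each target patch collects its group by block-matching, and overlapping denoised patches are averaged back into the image.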
{"title":"A real-time fall detection system based on HMM and RVM","authors":"Mei Jiang, Yuyang Chen, Yanyun Zhao, A. Cai","doi":"10.1109/VCIP.2013.6706385","DOIUrl":"https://doi.org/10.1109/VCIP.2013.6706385","url":null,"abstract":"The growing population of seniors creates the need for intelligent surveillance systems that ensure the safety of the elderly at home. Falls are among the most life-threatening emergencies for elderly people. A fall detection system based on video surveillance provides an efficient solution for detecting fall events automatically by analyzing human behavior. In this paper, we propose a context-based fall detection system that analyzes human motion and posture using a hidden Markov model (HMM) and a relevance vector machine (RVM), respectively. Additionally, we integrate homography to handle falls in any direction. The system is validated on an open fall database and on our own video dataset. Experimental results demonstrate that our method achieves high robustness and accuracy in detecting different kinds of falls and runs at real-time speed.","PeriodicalId":407080,"journal":{"name":"2013 Visual Communications and Image Processing (VCIP)","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124735654","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Rapid human action recognition in H.264/AVC compressed domain for video surveillance","authors":"Manu Tom, R. Venkatesh Babu","doi":"10.1109/VCIP.2013.6706430","DOIUrl":"https://doi.org/10.1109/VCIP.2013.6706430","url":null,"abstract":"This paper presents a novel high-speed approach for human action recognition in the H.264/AVC compressed domain. The proposed algorithm utilizes cues from quantization parameters and motion vectors extracted from the compressed video sequence for feature extraction, followed by classification using Support Vector Machines (SVM). The goal of our work is an algorithm much faster than pixel-domain counterparts, with comparable accuracy, utilizing only the sparse information available in the compressed video. Partial decoding avoids the complexity of full decoding and minimizes computational load and memory usage, which can result in reduced hardware utilization and fast recognition. The proposed approach handles illumination changes and scale and appearance variations, and is robust in both outdoor and indoor testing scenarios. We have tested our method on two benchmark action datasets and achieved more than 85% accuracy. The proposed algorithm classifies actions at over 2000 fps, approximately 100 times faster than existing state-of-the-art pixel-domain algorithms.","PeriodicalId":407080,"journal":{"name":"2013 Visual Communications and Image Processing (VCIP)","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123668542","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
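A typical compressed-domain feature of the kind this abstract relies on is a magnitude-weighted orientation histogram of the motion vectors parsed from the bitstream. The sketch below is a hypothetical stand-in for the paper's actual descriptor (which also uses quantization-parameter cues); the function name and binning scheme are illustrative choices.

```python
import numpy as np

def mv_orientation_histogram(mvs, bins=8):
    """Quantize motion vectors (an N x 2 array of (dx, dy)) into a
    magnitude-weighted orientation histogram, normalized to sum to 1."""
    dx, dy = mvs[:, 0], mvs[:, 1]
    mag = np.hypot(dx, dy)
    ang = np.mod(np.arctan2(dy, dx), 2 * np.pi)      # angles in [0, 2*pi)
    idx = np.minimum((ang / (2 * np.pi) * bins).astype(int), bins - 1)
    hist = np.bincount(idx, weights=mag, minlength=bins)
    total = hist.sum()
    return hist / total if total > 0 else hist
```

Per-frame histograms like this, concatenated over a temporal window, would then be fed to the SVM classifier; no pixel decoding is required to compute them.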
{"title":"An estimation of the fundamental matrix using hybrid statistics","authors":"Ryo Okutani, Y. Kuroki","doi":"10.1109/VCIP.2013.6706341","DOIUrl":"https://doi.org/10.1109/VCIP.2013.6706341","url":null,"abstract":"The fundamental matrix of the epipolar constraint encodes important geometric information between different viewpoints. This matrix can be estimated from more than seven corresponding keypoints. Maximum-likelihood estimation can correct errors in the coordinates of corresponding keypoints and thus calculate the fundamental matrix accurately. The accuracy of the fundamental matrix depends on the accuracy of the corresponding keypoints; therefore, exact extraction of corresponding keypoints plays an important role. SIFT (Scale Invariant Feature Transform) computes a feature vector for each keypoint that is robust against geometric and photometric changes. This property provides a high level of discrimination for finding corresponding keypoints. However, SIFT may extract correspondences with large errors, such as mismatched keypoints, which affect the accuracy of the fundamental matrix. The proposed method eliminates mismatched corresponding keypoints using not only the statistics of the epipolar equation error but also the ratio of the error variances before and after keypoint elimination. Experimental results demonstrate that the proposed method estimates the fundamental matrix more accurately than conventional methods.","PeriodicalId":407080,"journal":{"name":"2013 Visual Communications and Image Processing (VCIP)","volume":"71 5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129341622","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
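The elimination rule in the abstract above combines two statistics: a deviation test on the epipolar-equation errors and a check on the ratio of error variances before and after removal. The sketch below is a simplified stand-in, assuming a plain k-sigma deviation test and a single `max_ratio` acceptance threshold; both parameters and the function name are illustrative, not the paper's exact formulation.

```python
import numpy as np

def eliminate_outliers(errors, k=1.5, max_ratio=0.9):
    """Flag correspondences whose epipolar-equation error lies more than
    k standard deviations from the mean, and accept the elimination only
    if it shrinks the error variance enough (ratio below max_ratio).
    Returns a boolean mask of correspondences to keep."""
    errors = np.asarray(errors, dtype=float)
    keep = np.abs(errors - errors.mean()) <= k * errors.std()
    if keep.all():
        return keep                                   # nothing to remove
    if np.var(errors[keep]) / np.var(errors) < max_ratio:
        return keep                                   # elimination accepted
    return np.ones_like(errors, dtype=bool)           # elimination rejected
```

The surviving correspondences would then be passed to the maximum-likelihood fundamental-matrix estimation.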
{"title":"Bayesian Chan-Vese segmentation for iris segmentation","authors":"Gradi Yanto, M. Jaward, N. Kamrani","doi":"10.1109/VCIP.2013.6706440","DOIUrl":"https://doi.org/10.1109/VCIP.2013.6706440","url":null,"abstract":"In this paper, we propose a new model for iris segmentation that improves on the active contours without edges model of Chan and Vese. Our proposed algorithm formulates the energy function defined by Chan and Vese as a Bayesian optimization problem. The prior probability is incorporated into the energy function, so the prior information about the curve can be combined with the current information provided by the likelihood. To obtain the desired curve, the energy corresponding to the maximum a posteriori (MAP) estimate is minimized. Experimental results show that our proposed model gives more robust iris segmentation than the original Chan-Vese model.","PeriodicalId":407080,"journal":{"name":"2013 Visual Communications and Image Processing (VCIP)","volume":"207 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121187976","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Laplace distribution based CTU level rate control for HEVC","authors":"Junjun Si, Siwei Ma, Shiqi Wang, Wen Gao","doi":"10.1109/VCIP.2013.6706333","DOIUrl":"https://doi.org/10.1109/VCIP.2013.6706333","url":null,"abstract":"This paper proposes a coding tree unit (CTU) level rate control for HEVC based on Laplace distribution modeling of the transformed residuals. First, we study the relationship among the optimal quantization step, the Laplace parameter, and the Lagrange multiplier. Based on this relationship, the quantization parameter for each CTU can be dynamically adjusted according to the distribution of its transformed residuals. Second, a CTU-level rate control scheme is proposed to achieve accurate rate control as well as high coding performance. Experimental results show that the proposed rate control scheme achieves better coding performance than the state-of-the-art rate control schemes for HEVC in terms of both objective and subjective quality.","PeriodicalId":407080,"journal":{"name":"2013 Visual Communications and Image Processing (VCIP)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128784994","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
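The Laplace modeling step in the rate-control abstract starts from estimating the distribution parameter of each CTU's transformed residuals. For the Laplace density f(x) = (λ/2)·exp(−λ|x|), the maximum-likelihood estimate is λ = 1 / mean(|x|). The QP-offset mapping below is purely hypothetical, to show where such an estimate would plug in; the log2 form and the `strength` constant are illustrative choices, not the paper's relationship model.

```python
import numpy as np

def laplace_lambda(residuals):
    # ML estimate of the Laplace rate parameter: lambda = 1 / E[|x|].
    return 1.0 / np.mean(np.abs(residuals))

def ctu_qp_offset(lam_ctu, lam_frame, strength=3.0):
    """Hypothetical per-CTU QP adjustment: CTUs whose residuals are more
    peaked than the frame average (larger lambda, easier to code) get a
    positive offset; flatter CTUs get a negative one."""
    return int(round(strength * np.log2(lam_ctu / lam_frame)))
```

In the actual scheme the adjustment would come from the studied relationship among quantization step, Laplace parameter, and Lagrange multiplier rather than this ad-hoc mapping.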
{"title":"A novel method for stereo matching using Gabor Feature Image and Confidence Mask","authors":"Haixu Liu, Yang Liu, Shuxin Ouyang, Chenyu Liu, Xueming Li","doi":"10.1109/VCIP.2013.6706388","DOIUrl":"https://doi.org/10.1109/VCIP.2013.6706388","url":null,"abstract":"In this paper, we present a novel local stereo matching algorithm using a Gabor Feature Image and a Confidence Mask. Various local schemes have been proposed in recent years; most of them use color difference as the evaluation criterion when constructing the initial cost volume. However, color channels are highly sensitive to noise, illumination changes, etc. Therefore, we develop a new cost function based on the Gabor Feature Image to obtain a more accurate matching cost volume. Furthermore, to eliminate the matching ambiguities introduced by the winner-takes-all method, an effective disparity refinement strategy using the Confidence Mask is implemented to select and refine the less reliable pixels. The proposed algorithm ranks 23rd out of over 150 (global and local) methods on the Middlebury data sets; both quantitative and qualitative evaluations show that it is comparable to state-of-the-art local stereo matching algorithms.","PeriodicalId":407080,"journal":{"name":"2013 Visual Communications and Image Processing (VCIP)","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127481501","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
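The cost-volume-plus-winner-takes-all pipeline this stereo abstract builds on can be sketched generically. Here the feature images are plain single-channel arrays standing in for the paper's Gabor Feature Images, and the absolute-difference cost is an assumed placeholder for their cost function; the confidence-mask refinement is omitted.

```python
import numpy as np

def build_cost_volume(left_feat, right_feat, max_disp):
    """Per-pixel matching cost for each candidate disparity d:
    |left(x) - right(x - d)|. Returns an (H, W, max_disp) volume;
    pixels with no valid match at disparity d stay at +inf."""
    h, w = left_feat.shape
    vol = np.full((h, w, max_disp), np.inf)
    for d in range(max_disp):
        vol[:, d:, d] = np.abs(left_feat[:, d:] - right_feat[:, :w - d or None])
    return vol

def winner_takes_all(cost_volume):
    # Pick, for each pixel, the disparity with minimum matching cost.
    return np.argmin(cost_volume, axis=2)
```

The abstract's contribution replaces the cost function here and then revisits the unreliable pixels that winner-takes-all leaves behind.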
{"title":"Edge-preserving single depth image interpolation","authors":"G. Zhong, Li Yu, Peng Zhou","doi":"10.1109/VCIP.2013.6706405","DOIUrl":"https://doi.org/10.1109/VCIP.2013.6706405","url":null,"abstract":"Depth image upsampling is an important issue in three-dimensional (3D) applications. However, edge blurring artifacts remain a challenging problem in depth image upsampling, producing jagged artifacts and unpleasant visual quality in synthesized views. In this paper, an edge-preserving single depth image interpolation (ESDI) method is proposed. Specifically, the local planar hypothesis (LPH), which assumes that depth values in natural scenes cluster into locally planar surfaces, is first explored. Then, finite candidate generation (FCG) is proposed to produce a limited set of discrete values consistent with the LPH for the interpolated pixels. Finally, the optimal combination of candidates is formulated as an energy minimization problem with a gradient-domain constraint, solved by the iterated conditional modes (ICM) algorithm. Experiments demonstrate that ESDI produces high resolution (HR) depth images with clear, sharp edges and synthesized views of desirable quality.","PeriodicalId":407080,"journal":{"name":"2013 Visual Communications and Image Processing (VCIP)","volume":"42 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130618793","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}