{"title":"Depth sensing with focus and exposure adaptation","authors":"Zhiwei Xiong, Yueyi Zhang, Pengyu Cong, Feng Wu","doi":"10.1109/VCIP.2012.6410772","DOIUrl":"https://doi.org/10.1109/VCIP.2012.6410772","url":null,"abstract":"Automatic focus and exposure are the key components in digital cameras nowadays, which jointly play an essential role for capturing a high quality image. In this paper, we make an attempt to address these two challenging issues for future depth cameras. Relying on a programmable projector, we establish a structured light system for depth sensing with focus and exposure adaptation. The basic idea is to change current illumination pattern and intensity locally according to the prior depth information. Consequently, object surfaces appearing at different depths in the scene can receive proper illumination respectively. In this way, more flexible and robust depth sensing can be achieved in comparison with fixed illumination, especially at near depth.","PeriodicalId":103073,"journal":{"name":"2012 Visual Communications and Image Processing","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115381033","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Sparse representation of texture patches for low bit-rate image compression","authors":"Mai Xu, Jianhua Lu, Wenwu Zhu","doi":"10.1109/VCIP.2012.6410824","DOIUrl":"https://doi.org/10.1109/VCIP.2012.6410824","url":null,"abstract":"This paper proposes a sparse representation based approach for low bit-rate image compression using the learnt over-complete dictionary of texture patches. We first propose to compress each patch of the image with sparse and compressible linear combinations (via nonzero coefficients) of texture patterns encoded in a dictionary for image patches. Then, we find out that the compressibility and sparsity of coefficients can be achieved by the proposed recursive procedure of solving ℓ1 optimization problem of sparse representation. Moreover, rather than transform-based patterns (e.g. DCT), we explore the basic texture patterns from other training images with a learning algorithm based on the gradient descent, to form the over-complete dictionary. The experimental results demonstrate the effectiveness of the proposed approach.","PeriodicalId":103073,"journal":{"name":"2012 Visual Communications and Image Processing","volume":"109 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115119121","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Image registration by using a descriptor for repetitive patterns","authors":"S. Ha, Seyun Kim, N. Cho","doi":"10.1109/VCIP.2012.6410831","DOIUrl":"https://doi.org/10.1109/VCIP.2012.6410831","url":null,"abstract":"This paper proposes a new feature-based image registration method based on the description of feature clusters. This method can find larger number of correspondences than the conventional methods using singleton feature descriptors, which often fail in repetitive patterns. The reason for the failure of conventional methods in a repeating pattern is due to the existence of too many similar features, which in turn gives geometrically inconsistent matching or do not survive ratio test. Hence the proposed method follows the strategy that first separate the similar features from the repetitive patterns from the others. Then the similar features in a pattern are grouped into a set that is described by a support vector descriptor in terms of the cluster's center and radius. Once the same pattern in different images are matched, the geometric cue is added to find many geometrically consistent correspondences of the features. In the experiments, it has been demonstrated that the larger number of geometrically consistent correspondences from the repetitive pattern give more accurate registration, and thus more pleasing results in image stitching and panoramic image generation.","PeriodicalId":103073,"journal":{"name":"2012 Visual Communications and Image Processing","volume":"47 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124318242","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Affine SKIP and DIRECT modes for efficient video coding","authors":"Han Huang, J. Woods, Yao Zhao, H. Bai","doi":"10.1109/VCIP.2012.6410841","DOIUrl":"https://doi.org/10.1109/VCIP.2012.6410841","url":null,"abstract":"Higher-order motion models were introduced in video coding a couple of decades ago, but have not been widely used due to both difficulty in parameters estimation and their requirement of more side information. Recently, researchers have put them back into consideration. In this paper, the affine motion model is employed in SKIP and DIRECT modes to produce a better prediction. In affine SKIP/DIRECT, candidate predictors of the motion parameters are derived from the motions of neighboring coded blocks, with the best predictor determined by rate-distortion tradeoff. Extensive experiments have shown the efficiency of these new affine modes. No additional motion estimation is needed, so the proposed method is also quite practical.","PeriodicalId":103073,"journal":{"name":"2012 Visual Communications and Image Processing","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123819756","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Canonical Correlation Analysis based motion model for probabilistic visual tracking","authors":"Tom Heyman, Vincent Spruyt, Sebastian Gruenwedel, A. Ledda, W. Philips","doi":"10.1109/VCIP.2012.6410804","DOIUrl":"https://doi.org/10.1109/VCIP.2012.6410804","url":null,"abstract":"Particle filters are often used for tracking objects within a scene. As the prediction model of a particle filter is often implemented using basic movement predictions such as random walk, constant velocity or acceleration, these models will usually be incorrect. Therefore, this paper proposes a new approach, based on a Canonical Correlation Analysis (CCA) tracking method which provides an object specific motion model. This model is used to construct a proposal distribution of the prediction model which predicts new states, increasing the robustness of the particle filter. Results confirm an increase in accuracy compared to state-of-the-art methods.","PeriodicalId":103073,"journal":{"name":"2012 Visual Communications and Image Processing","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128558799","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Adaptive rate control for Wyner-Ziv video coding","authors":"Ghazaleh Esmaili, P. Cosman","doi":"10.1109/VCIP.2012.6410822","DOIUrl":"https://doi.org/10.1109/VCIP.2012.6410822","url":null,"abstract":"In Wyner-Ziv video coding architectures, the available bit budget to each GOP is shared between key frames and Wyner-Ziv frames. In this work, we first propose a model to express the relationship between quantization step size of key and WZ frames based on their motion activity. Then we apply this model to propose an adaptive algorithm adjusting the quantization step size of key and WZ frames to achieve and maintain a target bit rate. We evaluate the rate distortion performance of the proposed method and compare to a common method in the literature.","PeriodicalId":103073,"journal":{"name":"2012 Visual Communications and Image Processing","volume":"59 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127121167","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An efficient foreground-based surveillance video coding scheme in low bit-rate compression","authors":"Shanghang Zhang, Kaijin Wei, Huizhu Jia, Xiaodong Xie, Wen Gao","doi":"10.1109/VCIP.2012.6410791","DOIUrl":"https://doi.org/10.1109/VCIP.2012.6410791","url":null,"abstract":"Many works have been done in the area of surveillance video compression, while problems still exist. The block-based schemes have blocking artifacts in the edge of foreground, while the object-based coding schemes have excessive bit consumption for coding the object shape. A novel foreground-based (FG-based) coding scheme is presented in this paper to solve these two problems and can gain better video quality at low bit-rate. The improvement comes from: 1) obtaining a foreground frame (FG-frame) by segmentation, in which proper constant value 128 is adopted to represent the luminance and chrominance value of background pixel and thus the residue error is reduced; 2) FG-based motion estimation (ME) and motion compensation (MC), which are more accurate for the foreground prediction and reduce the residue error of edge block in the foreground; 3) a new coding mode (BG-mode) is designed to better code the background when it is falsely segmented as foreground in FG-frames; 4) FG-based rate distortion optimized (RDO) mode decision (MD) is proposed to emphasize the foreground by calculating the distortion in the foreground domain; 5) avoiding shape coding by recovering the shape mask from the reconstructed foreground (REC-FG) frame and the constant background value 128. Our scheme is implemented with AVS encoder platform and the experiment results show the efficiency of the proposed scheme.","PeriodicalId":103073,"journal":{"name":"2012 Visual Communications and Image Processing","volume":"119 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127580772","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"QoE analysis for scalable video adaptation","authors":"M. Li, Zhenzhong Chen, Yap-Peng Tan","doi":"10.1109/VCIP.2012.6410787","DOIUrl":"https://doi.org/10.1109/VCIP.2012.6410787","url":null,"abstract":"Quality of Experience (QoE) serves as a key service goal in video applications. In this paper, we study the QoE issue in scalable video adaptation by constructing a subjective video quality assessment database based on the full scalability of SVC. We derive the optimal scalability adaptation track for individual video and further summarize common scalability adaptation tracks for grouped videos. The common track provides useful guidelines on how to adapt scalable video based on their content characteristics. A rate-QoE model is proposed accordingly for the SVC adaptation. Experimental analyses show that the novel QoE-aware scalability adaptation scheme significantly outperforms the existing ones.","PeriodicalId":103073,"journal":{"name":"2012 Visual Communications and Image Processing","volume":"69 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127271217","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Adaptive protection scheme for MVC-encoded stereoscopic video streaming in IP-based networks","authors":"César Díaz, J. Cabrera, F. Jaureguizar, N. García","doi":"10.1109/VCIP.2012.6410802","DOIUrl":"https://doi.org/10.1109/VCIP.2012.6410802","url":null,"abstract":"We present an adaptive unequal error protection (UEP) strategy built on the 1-D interleaved parity Application Layer Forward Error Correction (AL-FEC) code for protecting the transmission of stereoscopic 3D video content encoded with Multiview Video Coding (MVC) through IP-based networks. Our scheme targets the minimization of quality degradation produced by packet losses during video transmission in time-sensitive application scenarios. To that end, based on a novel packet-level distortion model, it selects in real time the most suitable packets within each Group of Pictures (GOP) to be protected and the most convenient FEC technique parameters, i.e., the size of the FEC generator matrix. In order to make these decisions, it considers the relevance of the packet, the behavior of the channel, and the available bitrate for protection purposes. Simulation results validate both the distortion model introduced to estimate the importance of packets and the optimization of the FEC technique parameter values.","PeriodicalId":103073,"journal":{"name":"2012 Visual Communications and Image Processing","volume":"106 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122029813","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Residual coding of depth map with transform skipping","authors":"Cheon Lee, H. Wey, Jaejoon Lee, Yo-Sung Ho","doi":"10.1109/VCIP.2012.6410814","DOIUrl":"https://doi.org/10.1109/VCIP.2012.6410814","url":null,"abstract":"Since advanced 3D video systems employ depth information to support free-viewpoint navigation and comfortable 3D video viewing, efficient depth map coding is necessary for future 3D video systems. Most residual data in depth map coding are generated along abrupt depth discontinuities, represented by near-zero and high-magnitude values. In this paper, we model the residual data with two representative values calculated by the K-means clustering method and send them to the decoder by skipping transformation. After best mode decision, we applied the proposed method to a block containing residual data, and then we send the quantized representative values to decoder if its coding rate is less than the conventional best mode. By conducting INTRA only coding, -20.32% bit saving was achieved.","PeriodicalId":103073,"journal":{"name":"2012 Visual Communications and Image Processing","volume":"60 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114714145","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}