{"title":"Block-size adaptive transform domain estimation of end-to-end distortion for error-resilient video coding","authors":"Bohan Li, Tejaswi Nanjundaswamy, K. Rose","doi":"10.1109/ICIP.2016.7532727","DOIUrl":"https://doi.org/10.1109/ICIP.2016.7532727","url":null,"abstract":"The accuracy of end-to-end distortion (EED) estimation is crucial to achieving effective error resilient video coding. An established solution, the recursive optimal per-pixel estimate (ROPE), does so by tracking the first and second moments of decoder-reconstructed pixels. An alternative estimation approach, the spectral coefficient-wise optimal recursive estimate (SCORE), tracks instead moments of decoder-reconstructed transform coefficients, which enables accounting for transform domain operations. However, the SCORE formulation relies on a fixed transform block size, which is incompatible with recent standards. This paper proposes a non-trivial generalization of the SCORE framework which, in particular, accounts for arbitrary block size combinations involving the current and reference block partitions. This seemingly intractable objective is achieved by a two-step approach: i) Given the fixed block size moments of a reference frame, estimate moments of transform coefficients for the codec-selected current block partition; ii) Convert the current results to transform coefficient moments corresponding to a regular fixed block size grid, to facilitate EED estimation for the next frame. Experimental results first demonstrate the accuracy of the proposed estimate in conjunction with transform domain temporal prediction. Then the estimate is leveraged to optimize the coding mode and yields considerable gains in rate-distortion performance.","PeriodicalId":6521,"journal":{"name":"2016 IEEE International Conference on Image Processing (ICIP)","volume":"7 1","pages":"2092-2096"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82138938","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Material segmentation in hyperspectral images with minimal region perimeters","authors":"Yu Zhang, C. P. Huynh, N. Habili, K. Ngan","doi":"10.1109/ICIP.2016.7532474","DOIUrl":"https://doi.org/10.1109/ICIP.2016.7532474","url":null,"abstract":"We propose a supervised approach to the classification and segmentation of material regions in hyperspectral imagery. Our algorithm is a two-stage process, combining a pixelwise classification step with a segmentation step aiming to minimise the total perimeters of the resulting regions. Our algorithm is distinctive in its ability to ensure label consistency within local homogeneous areas and to generate material segments with smooth boundaries. Furthermore, we establish a new hyperspectral benchmark dataset to demonstrate the advantages of the proposed approach over several state-of-the-art methods.","PeriodicalId":6521,"journal":{"name":"2016 IEEE International Conference on Image Processing (ICIP)","volume":"213 1","pages":"834-838"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79768473","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A shape feature based bovw method for image classification using N-gram and spatial pyramid coding scheme","authors":"Elham Etemad, Gang Hu, Q. Gao","doi":"10.1109/ICIP.2016.7532408","DOIUrl":"https://doi.org/10.1109/ICIP.2016.7532408","url":null,"abstract":"Image classification is a general visual analysis task based on the image content coded by its representation. In this research, we proposed an image representation method that is based on the perceptual shape features and their spatial distributions. A natural language processing concept, N-gram, is adopted to generate a set of perceptual shape visual words for encoding image features. By combining hierarchical visual words and spatial pyramid, Spatio-Shape Pyramid representation is constructed to reduce the semantic gaps. Experimental results show that the proposed method outperforms other state-of-the-art methods.","PeriodicalId":6521,"journal":{"name":"2016 IEEE International Conference on Image Processing (ICIP)","volume":"51 1","pages":"504-508"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84938960","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Measurement of critical temporal inconsistency for quality assessment of synthesized video","authors":"Hak Gu Kim, Yong Man Ro","doi":"10.1109/ICIP.2016.7532513","DOIUrl":"https://doi.org/10.1109/ICIP.2016.7532513","url":null,"abstract":"This paper proposes a new temporal consistency measure for quality assessment of synthesized video. Disocclusion regions appear hole regions of the synthesized video at virtual viewpoints. Filling hole regions could be problematic when the synthesized video is perceived through multi-view displays. In particular, the temporal inconsistency caused by hole filling process in view synthesis could affect the perceptual quality of the synthesized video. In the proposed method, we extract excessive flicker regions between consecutive frames and quantify the perceptual effects of the temporal inconsistency on them by measuring the structural similarity. We have demonstrated the validity of the proposed quality measure by comparisons of subjective ratings and existing objective metrics. Experimental results have shown that the proposed temporal inconsistency measure is highly correlated with the overall quality of the synthesized video.","PeriodicalId":6521,"journal":{"name":"2016 IEEE International Conference on Image Processing (ICIP)","volume":"41 1","pages":"1027-1031"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81767724","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Weighted regularized ASM for face alignment","authors":"Guillermo Ruiz, Eduard Ramon, J. G. Giraldez, M. Ballester, F. Sukno","doi":"10.1109/ICIP.2016.7532891","DOIUrl":"https://doi.org/10.1109/ICIP.2016.7532891","url":null,"abstract":"Active Shape Models are a powerful and well known method to perform face alignment. In some applications it is common to have shape information available beforehand, such as previously detected landmarks. Introducing this prior knowledge to the statistical model may result of great advantage but it is challenging to maintain this priors unchanged once the statistical model constraints are applied. We propose a new weighted-regularized projection into the parameter space which allows us to obtain shapes that at the same time fulfill the imposed shape constraints and are plausible according to the statistical model. The performed experiments show how using this projection better performance than competing state of the art methods is achieved.","PeriodicalId":6521,"journal":{"name":"2016 IEEE International Conference on Image Processing (ICIP)","volume":"18 1","pages":"2906-2910"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82030978","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Fast multidimensional image processing with OpenCL","authors":"Daniel Oliveira Dantas, H. Leal, Davy Oliveira Barros Sousa","doi":"10.1109/ICIP.2016.7532664","DOIUrl":"https://doi.org/10.1109/ICIP.2016.7532664","url":null,"abstract":"Multidimensional image data, i.e., images with three or more dimensions, are used in many areas of science. Multidimensional image proçessing is supported in Python and MATLAB. VisionGL is an open source library that provides a set of image processing functions and can help the programmer by automatically generating code. The objective of this work is to augment VisionGL by adding multidimensional image processing support with OpenCL for high performance through use of GPUs. Benchmarking experiments were run with window and point operations to compare Python, MATLAB and VisionGL when processing 1D to 5D images. As a result, speedups of up to two orders of magnitude were obtained.","PeriodicalId":6521,"journal":{"name":"2016 IEEE International Conference on Image Processing (ICIP)","volume":"11 1","pages":"1779-1783"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81397215","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Visual tracking with sparse correlation filters","authors":"Yanmei Dong, Min Yang, Mingtao Pei","doi":"10.1109/ICIP.2016.7532395","DOIUrl":"https://doi.org/10.1109/ICIP.2016.7532395","url":null,"abstract":"Correlation filters have recently made significant improvements in visual object tracking on both efficiency and accuracy. In this paper, we propose a sparse correlation filter, which combines the effectiveness of sparse representation and the computational efficiency of correlation filters. The sparse representation is achieved through solving an ℓ0 regularized least squares problem. The obtained sparse correlation filters are able to represent the essential information of the tracked target while being insensitive to noise. During tracking, the appearance of the target is modeled by a sparse correlation filter, and the filter is re-trained after tracking on each frame to adapt to the appearance changes of the target. The experimental results on the CVPR2013 Online Object Tracking Benchmark (OOTB) show the effectiveness of our sparse correlation filter-based tracker.","PeriodicalId":6521,"journal":{"name":"2016 IEEE International Conference on Image Processing (ICIP)","volume":"13 1","pages":"439-443"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82477045","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Print quality assessment for stochastic clustered-dot halftones using compactness measures","authors":"P. Goyal, J. Allebach","doi":"10.1109/ICIP.2016.7533069","DOIUrl":"https://doi.org/10.1109/ICIP.2016.7533069","url":null,"abstract":"Most electro-photographic printers prefer clustered-dot halftone textures for rendering smooth and stable prints. Clustered-dot halftone patterns can be periodic or aperiodic. As periodic clustered-dot halftone can lead to undesirable moiré patterns, stochastic clustered-dot halftone textures are more preferred. There are available different screening methods to generate stochastic clustered-dot halftone textures but there are no standard print quality assessment measures that can be easily used for quantitatively evaluating and comparing different stochastic clustered-dot halftoning methods. We explore the use of compactness measures for this purpose, and also propose a new compactness measure that seems good metric to quantitatively compare and assess the print quality of different stochastic clustered-dot halftoning methods. Using the proposed metric, we compare three different stochastic clustered-dot halftoning methods, and our results are almost in agreement with psychophysical experiments results reported earlier.","PeriodicalId":6521,"journal":{"name":"2016 IEEE International Conference on Image Processing (ICIP)","volume":"73 1","pages":"3792-3796"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86556339","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"GCE-based model for the fusion of multiples color image segmentations","authors":"Lazhar Khelifi, M. Mignotte","doi":"10.1109/ICIP.2016.7532824","DOIUrl":"https://doi.org/10.1109/ICIP.2016.7532824","url":null,"abstract":"In this work, we introduce a new fusion model whose objective is to fuse multiple region-based segmentation maps to get a final better segmentation result. This new fusion model is based on an energy function originated from the global consistency error (GCE), a perceptual measure which takes into account the inherent multiscale nature of an image segmentation by measuring the level of refinement existing between two spatial partitions. Combined with a region merging/splitting prior, this new energy-based fusion model of label fields allows to define an interesting penalized likelihood estimation procedure based on the global consistency error criterion with which the fusion of basic, rapidly-computed segmentation results appears as a relevant alternative compared with other segmentation techniques proposed in the image segmentation field. The performance of our fusion model was evaluated on the Berkeley dataset including various segmentations given by humans.","PeriodicalId":6521,"journal":{"name":"2016 IEEE International Conference on Image Processing (ICIP)","volume":"35 1","pages":"2574-2578"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86574057","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"No-reference document image quality assessment based on high order image statistics","authors":"Jingtao Xu, Peng Ye, Qiaohong Li, Yong Liu, D. Doermann","doi":"10.1109/ICIP.2016.7532968","DOIUrl":"https://doi.org/10.1109/ICIP.2016.7532968","url":null,"abstract":"Document image quality assessment (DIQA) aims to predict the visual quality of degraded document images. Although the definition of “visual quality” can change based on the specific applications, in this paper, we use OCR accuracy as a metric for quality and develop a novel no-reference DIQA method based on high order image statistics for OCR accuracy prediction. The proposed method consists of three steps. First, normalized local image patches are extracted with regular grid and a comprehensive document image codebook is constructed by K-means clustering. Second, local features are softly assigned to several nearest codewords, and the direct differences between high order statistics of local features and codewords are calculated as global quality aware features. Finally, support vector regression (SVR) is utilized to learn the mapping between extracted image features and OCR accuracies. Experimental results on two document image databases show that the proposed method can accurately predict OCR accuracy and outperforms previous algorithms.","PeriodicalId":6521,"journal":{"name":"2016 IEEE International Conference on Image Processing (ICIP)","volume":"49 1","pages":"3289-3293"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83656057","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}