{"title":"A context-based predictive coder for lossless and near-lossless compression of video","authors":"K. Yang, A. Faryar","doi":"10.1109/ICIP.2000.900915","DOIUrl":"https://doi.org/10.1109/ICIP.2000.900915","url":null,"abstract":"We propose a new approach to context-based predictive coding of video, where the interframe or intraframe coding mode is adaptively selected on a pixel basis. We perform the coding mode selection using only the previously reconstructed samples which are also available at the decoder, so that any overhead information on the coding mode selection does not need to be transmitted to the decoder. The proposed coder also provides the lossless concatenated coding property when applied to multigeneration of video sequences since the same coding mode information is available at the second time encoding. The proposed coding mode selection enables the coder to easily incorporate error modeling and context modeling by performing the intraframe coding with one of the existing image coders such as the JPEG-LS standard. Experiments show that the proposed approach in conjunction with the JPEG-LS standard provides significant improvements in compression efficiency.","PeriodicalId":193198,"journal":{"name":"Proceedings 2000 International Conference on Image Processing (Cat. No.00CH37101)","volume":"341 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2000-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115954020","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Variational segmentation by piecewise facet models with application to range imagery","authors":"J. Goldschneider, A. Li","doi":"10.1109/ICIP.2000.901083","DOIUrl":"https://doi.org/10.1109/ICIP.2000.901083","url":null,"abstract":"Unlike conventional photographic images that are characterized only by light intensity or signal energy, laser radar (ladar) range data, or other terrain imagery, contain distance information. Range images are traditionally processed for their three-dimensional content. Previous innovations in partial differential equation (PDE) and total variation based segmentation techniques show good results for conventional images. Fast, efficient variational segmentation techniques that use higher order image models are needed for the preprocessing of such data for applications including target detection and acquisition, compression, and image modeling. We develop a variational segmentation algorithm using higher-order piecewise smooth models for multichannel range imagery such as ladar images. The algorithm is computationally stable, and a fast solution may be found using the fast Cholesky decomposition and a modified binary search tree.","PeriodicalId":193198,"journal":{"name":"Proceedings 2000 International Conference on Image Processing (Cat. No.00CH37101)","volume":"119 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2000-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131371629","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Image segmentation and object recognition by Bayesian grouping","authors":"S. Kalitzin, J. Staal, B. H. Romeny, M. Viergever","doi":"10.1109/ICIP.2000.899518","DOIUrl":"https://doi.org/10.1109/ICIP.2000.899518","url":null,"abstract":"We propose a Bayesian grouping approach for recognition and segmentation of large-scale structures representing objects in images. It is based on detection of local image properties, extraction of simple geometrical primitives, and grouping these primitives according to probability rules and prior models. As opposed to the various template matching techniques, our method does not rely on a fixed set of input data to generate the prior with a maximum likelihood. Instead, it selects a list of subsets of the local primitives and finds the optimum set of model priors that maximizes the likelihood of the model samples representing the selected subsets. In contrast with global recognition methods that classify the whole image, our approach aims at solving the recognition task together with the segmentation task. As an illustration we give a medical data example of feature grouping in 2D images involving vessel detection from local ridges.","PeriodicalId":193198,"journal":{"name":"Proceedings 2000 International Conference on Image Processing (Cat. No.00CH37101)","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2000-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132405398","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"High quality region-based video coding","authors":"A. Pinho","doi":"10.1109/ICIP.2000.901136","DOIUrl":"https://doi.org/10.1109/ICIP.2000.901136","url":null,"abstract":"Traditionally, region-based image and video coding have been addressed only in the context of low and very low bit-rate coding. However, one of the most interesting by-products of region-based image coding is the possibility of manipulation of image content at a reduced additional cost, allowing simple integration of content-based functionalities. In this paper, we address the topic of high quality region-based video coding and present experimental results showing that, using some recent encoding techniques, this goal can be achieved. An important part of the video codec, the one responsible for partition coding, relies on the recently proposed contour coding technique based on transition points, which is characterized by a good performance on high complexity contours.","PeriodicalId":193198,"journal":{"name":"Proceedings 2000 International Conference on Image Processing (Cat. No.00CH37101)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2000-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132495816","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Performance analysis of an H.263 video encoder for VIRAM","authors":"Thinh P. Q. Nguyen, A. Zakhor, K. Yelick","doi":"10.1109/ICIP.2000.899304","DOIUrl":"https://doi.org/10.1109/ICIP.2000.899304","url":null,"abstract":"VIRAM (vector intelligent random access memory) is a vector architecture processor with embedded memory, designed for portable multimedia processing devices. Its vector processing capability results in high performance multimedia processing, while embedded DRAM technology provides high memory bandwidth with low energy consumption. We evaluate and compare the performance of VIRAM to digital signal processors (DSPs) and conventional SIMD (single instruction multiple data) media extensions in the context of video coding. In particular, we examine motion estimation (ME) and the discrete cosine transform (DCT) which have been shown to dominate typical video encoders such as H.263. We show that VIRAM outperforms other architectures by 4.6× to 8.7× in computing ME and by 1.2× to 5.0× in computing DCT.","PeriodicalId":193198,"journal":{"name":"Proceedings 2000 International Conference on Image Processing (Cat. No.00CH37101)","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2000-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130006307","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A visual tracking system for sports video annotation in unconstrained environments","authors":"A. Tomita, T. Echigo, Masato Kurokawa, H. Miyamori, S. Iisaku","doi":"10.1109/ICIP.2000.899340","DOIUrl":"https://doi.org/10.1109/ICIP.2000.899340","url":null,"abstract":"A visual tracking system is presented in which a combination of techniques is used to obtain motion features of objects from a video sequence. Further processing of the motion features gives the spatio-temporal trajectories of the objects, which can be used as cues for annotation. The system solves problems found in tracking objects in unconstrained environments, such as in sports games, where there are multiple objects in motion, the camera performs pan, tilt and zoom movements, and there are objects other than the players in the background. Coarse segmentation is performed with multi-class statistical color models, constructed from samples of the representative colors of each team. Motion vectors are computed to find region correspondence between consecutive frames. Background elements are eliminated by using camera motion parameters, and other false matches are detected by analyzing motion pattern consistency. Finally, objects are registered by placing windows centered in each tracked region. An experimental realization was used to test the system for tracking players in a soccer game, but it could also have been used to generate annotation cues for videos from other sports.","PeriodicalId":193198,"journal":{"name":"Proceedings 2000 International Conference on Image Processing (Cat. No.00CH37101)","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2000-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130166489","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Visual motion estimation via second order cone programming","authors":"Y. Jianchao","doi":"10.1109/ICIP.2000.899526","DOIUrl":"https://doi.org/10.1109/ICIP.2000.899526","url":null,"abstract":"The visual motion, induced by ego-motion of the camera, can be estimated through resection, intersection and transfer processes. Under the assumption of affine camera, the intersection/transfer process can be formulated as a system of 5 linear equations, so that any correspondence of an image point and its affine coordinates can be obtained by solving the equations using least squares (LS) techniques. However it produces sometimes a poor estimation result, due to the singularity of the coefficient matrix. In order to solve the problem, instead of trying to find an exact solution of the equations, we tried to obtain a robust least squares (RLS) solution via a second order cone programming technique. The superiority of RLS over LS is demonstrated by the experimental results.","PeriodicalId":193198,"journal":{"name":"Proceedings 2000 International Conference on Image Processing (Cat. No.00CH37101)","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2000-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134110684","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Intra and inter-band information evaluation in still image coding by means of the wavelet transform","authors":"S. Maldonado-Bascón, F. López-Ferreras, F. J. Acevedo-Rodríguez, H. Gómez-Moreno","doi":"10.1109/ICIP.2000.899324","DOIUrl":"https://doi.org/10.1109/ICIP.2000.899324","url":null,"abstract":"A new method for still image coding that tries to exploit intra and inter-band relations in the wavelet-transformed matrix is presented. An intuitive study of the relation between coefficients is done. The conclusion is that it is more important to exploit intra-band relations than inter-band relations. Results achieved are even better than those obtained with the best embedded published algorithms.","PeriodicalId":193198,"journal":{"name":"Proceedings 2000 International Conference on Image Processing (Cat. No.00CH37101)","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2000-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133998984","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Selecting the neighbourhood size, shape, weights and model order in optical flow estimation","authors":"L. Ng, V. Solo","doi":"10.1109/ICIP.2000.899525","DOIUrl":"https://doi.org/10.1109/ICIP.2000.899525","url":null,"abstract":"Local methods have long been used to estimate optical flow by fitting measurements in a small neighbourhood to a simple model. What is less well known are procedures to choose the neighbourhood size, weights and model order. In this paper, we show that the choice of these local model tuning variables can have a significant effect on the flow estimate. The optimal choice of these variables will depend on the image content, the noise level and the type of motion in the sequence. Hence, the development of a data-driven selection method is an important research goal. This paper presents such a procedure based on Stein's unbiased risk estimators (SURE).","PeriodicalId":193198,"journal":{"name":"Proceedings 2000 International Conference on Image Processing (Cat. No.00CH37101)","volume":"70 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2000-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134277896","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An innovative approach for spatial video noise reduction using a wavelet based frequency decomposition","authors":"A. D. Stefano, P. White, W. Collis","doi":"10.1109/ICIP.2000.899350","DOIUrl":"https://doi.org/10.1109/ICIP.2000.899350","url":null,"abstract":"Many real world images are contaminated by noise. The noise not only degrades image quality but may also hinder further processing operations. Noise reduction techniques aim to both improve image quality and to aid further image processing. Spatial noise reduction techniques based on the discrete wavelet transform have been widely researched. This paper considers an undecimated shift invariant filter bank that has been used to decompose the image into components. The basic filters are derived from a biorthogonal wavelet basis. Reconstruction is obtained by a simple summation of the image components. A new thresholding scheme, which is obtained from Bayesian estimator theory, is used. The threshold parameters for each component are dependent on the noise level and are selected using a preliminary training procedure. The cost function utilised for the training is a weighted version of the mean square error which is designed to reflect human perception. The method compares favourably with other wavelet based noise reduction techniques and demonstrates significant noise reduction and visual quality enhancement.","PeriodicalId":193198,"journal":{"name":"Proceedings 2000 International Conference on Image Processing (Cat. No.00CH37101)","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2000-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134353505","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}