{"title":"Time-Varying Linear Autoregressive Models for Segmentation","authors":"Charles Florin, N. Paragios, G. Funka-Lea, James P. Williams","doi":"10.1109/ICIP.2007.4379003","DOIUrl":"https://doi.org/10.1109/ICIP.2007.4379003","url":null,"abstract":"Tracking highly deforming structures in space and time arises in numerous applications in computer vision. Static Models are often referred to as linear combinations of a mean model and modes of variation learned from training examples. In Dynamic Modeling, the shape is represented as a function of shapes at previous time steps. In this paper, we introduce a novel technique that uses the spatial and the temporal information on the object deformation. We reformulate tracking as a high order time series prediction mechanism that adapts itself on-line to the newest results. Samples (toward dimensionality reduction) are represented in an orthogonal basis, and are introduced in an auto-regressive model that is determined through an optimization process in appropriate metric spaces. Toward capturing evolving deformations as well as cases that have not been part of the learning stage, a process that updates on-line both the orthogonal basis decomposition and the parameters of the autoregressive model is proposed. Experimental results with a nonstationary dynamic system prove adaptive AR models give better results than both stationary models and models learned over the whole sequence.","PeriodicalId":131177,"journal":{"name":"2007 IEEE International Conference on Image Processing","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133630352","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improved Rate Control and Motion Estimation for H.264 Encoder","authors":"Loren Merritt, R. Vanam","doi":"10.1109/ICIP.2007.4379827","DOIUrl":"https://doi.org/10.1109/ICIP.2007.4379827","url":null,"abstract":"In this paper, we describe rate control and motion estimation in x264, an open source H.264/AVC encoder. We compare the rate control methods of x264 with the JM reference encoder and show that our approach performs well in both PSNR and bitrate. In motion estimation, we describe our implementation of initialization and show that it improves PSNR. We also propose an early termination for simplified uneven cross multi hexagon grid search (UMH) in x264 and show that it improves the speed by a factor of 1.5. Finally, we show that x264 performs 50 times faster and provides bitrates within 5% of the JM reference encoder for the same PSNR.","PeriodicalId":131177,"journal":{"name":"2007 IEEE International Conference on Image Processing","volume":"483 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132266478","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Peak Transform - A Nonlinear Transform for Efficient Image Representation and Coding","authors":"Zhihai He","doi":"10.1109/ICIP.2007.4379275","DOIUrl":"https://doi.org/10.1109/ICIP.2007.4379275","url":null,"abstract":"In this work, we introduce a nonlinear geometric transform, called peak transform, for efficient image representation and coding. Coupled with wavelet transform and subband decomposition, the peak transform is able to significantly reduce signal energy in high-frequency subbands and achieve a significant transform coding gain. This has important applications in efficient data representation and compression. Based on peak transform (PT), we design an image encoder, called PT encoder, for efficient image compression. Our extensive experimental results demonstrate that, in wavelet-based subband decomposition, the signal energy in high-frequency subbands can be reduced by up to 60% if a peak transform is applied. The PT image encoder outperforms state-of-the-art JPEG2000 and H.264 (INTRA) encoders by up to 2-3 dB in PSNR (peak signal-to-noise ratio), especially for images with a significant amount of high-frequency components.","PeriodicalId":131177,"journal":{"name":"2007 IEEE International Conference on Image Processing","volume":"39 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132579637","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Image Recognition for Mobile Applications","authors":"J. Lee, K. Yow","doi":"10.1109/ICIP.2007.4379550","DOIUrl":"https://doi.org/10.1109/ICIP.2007.4379550","url":null,"abstract":"Our paper presents a system for efficient recognition of landmarks taken from camera phones. Information such as tutorial rooms within the captured landmarks is returned to user within seconds. The system uses a database of multiple viewpoint's images for matching. Various navigational aids and sensors are used to optimize accuracy and retrieval time by providing complementary information about relative position and viewpoint of each query image. This makes our system less sensitive to orientation, scale and perspective distortion. Multi-scale approach and a reliability score model are proposed in this application. Our system is validated by several experiments in the campus, with images taken from different resolution's camera phones, positions and times of day.","PeriodicalId":131177,"journal":{"name":"2007 IEEE International Conference on Image Processing","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128817442","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Topological-Stabilization Based Threshold Quantization for Robust Change Detection","authors":"Chang Su, A. Amer","doi":"10.1109/ICIP.2007.4379318","DOIUrl":"https://doi.org/10.1109/ICIP.2007.4379318","url":null,"abstract":"A threshold quantization algorithm for robust change detection is proposed in this paper. According to the threshold distribution of difference frames, a 4-level Lloyd-Max quantizer is designed, and then, based on the topological stabilization of video frames, the Lloyd-Max quantizer is refined by a linear adjusting function to form the proposed threshold quantizer. Objective and subjective experiments show that the proposed quantizer greatly improves the robustness of the thresholding methods for change detection thus significantly improves the quality of change masks without increasing computation loads.","PeriodicalId":131177,"journal":{"name":"2007 IEEE International Conference on Image Processing","volume":"102 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128841237","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Weighted Adaptive Lifting-Based Wavelet Transform","authors":"Yu Liu, K. Ngan","doi":"10.1109/ICIP.2007.4379278","DOIUrl":"https://doi.org/10.1109/ICIP.2007.4379278","url":null,"abstract":"In this paper, we propose a new weighted adaptive lifting (WAL)-based wavelet transform that is designed to solve the problems existing in the previous adaptive directional lifting (ADL) approach. The proposed approach uses the weighted function to make sure that the prediction and update stages are consistent, the directional interpolation to improve the orientation property of interpolated image, and adaptive interpolation filter to adjust to statistical property of each image. Experimental results show that the proposed WAL-based wavelet transform for image coding outperforms the conventional lifting-based wavelet transform up to 3.02 dB in PSNR and significant improvement in subjective quality is also observed. Compared with the ADL approach, up to 1.18 dB improvement in PSNR is reported.","PeriodicalId":131177,"journal":{"name":"2007 IEEE International Conference on Image Processing","volume":"71 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131785582","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Robust Object Tracking using Local Kernels and Background Information","authors":"Jaideep Jeyakar, R. Venkatesh Babu, K. Ramakrishnan","doi":"10.1109/ICIP.2007.4379762","DOIUrl":"https://doi.org/10.1109/ICIP.2007.4379762","url":null,"abstract":"The mean shift algorithm has been proved to be efficient for tracking 2D blobs through a video sequence. Even so, this algorithm has certain inherent disadvantages. In this paper, we propose a robust tracking algorithm which overcomes the drawbacks of global color histogram based tracking. We incorporate tracking based only on reliable colors by separating the object from its background. A fast yet robust model update is employed to overcome illumination changes. This algorithm is computationally simple enough to be executed real time and was tested on several complex video sequences. The proposed technique could be easily extended to other tracking algorithms too.","PeriodicalId":131177,"journal":{"name":"2007 IEEE International Conference on Image Processing","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131903795","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Some Techniques for Wow Effect Reduction","authors":"A. Czyżewski, P. Maziewski","doi":"10.1109/ICIP.2007.4379946","DOIUrl":"https://doi.org/10.1109/ICIP.2007.4379946","url":null,"abstract":"Wow distortion reduction has not attracted an adequate scientific attention so far. Only few papers on the subject are available, concerning mostly archive gramophone records, wax cylinders, and magnetic tapes affected by wow. This paper outlines researched wow reduction algorithms concerning archive movie soundtracks, or more generally audio recordings accompanying archival visual contents. The methods presented here are based on the pilot tone tracking, on the spectral analysis of genuine audio components, and on non-uniform resampling. The paper provides only a short overview of the concepts founding those methods; other studied approaches to the wow processing, as well as a more detailed description of the presented ones, can be found in referenced papers.","PeriodicalId":131177,"journal":{"name":"2007 IEEE International Conference on Image Processing","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127432556","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Epipolar Spaces and Optimal Sampling Strategies","authors":"J. Monaco, A. Bovik, L. Cormack","doi":"10.1109/ICIP.2007.4379642","DOIUrl":"https://doi.org/10.1109/ICIP.2007.4379642","url":null,"abstract":"If precise calibration information is unavailable, as is often the case for active binocular vision systems, the determination of epipolar lines becomes untenable. Yet, even without instantaneous knowledge of the geometry, the search for corresponding points can be restricted to areas called epipolar spaces. For each point in one image, we define the corresponding epipolar space in the other image as the union of all associated epipolar lines over all possible system geometries. Epipolar spaces eliminate the need for calibration at the cost of an increased search region. One approach to mitigate this increase is the application of a space variant sampling or foveation strategy. While the application of such strategies to stereo vision tasks is not new, only rarely has a foveation scheme been specifically tailored for a stereo vision task. In this paper we derive a foundation of theorems that provide a means for obtaining optimal sampling schemes for a given set of epipolar spaces. An optimal sampling scheme is defined as a strategy that minimizes the average area per epipolar space.","PeriodicalId":131177,"journal":{"name":"2007 IEEE International Conference on Image Processing","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115342417","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Inter Frame Coding with Template Matching Averaging","authors":"Yoshinori Suzuki, C. Boon, T. K. Tan","doi":"10.1109/ICIP.2007.4379333","DOIUrl":"https://doi.org/10.1109/ICIP.2007.4379333","url":null,"abstract":"A template matching prediction based on a group of reconstructed pixels surrounding a target block enables prediction of pixels in the target block without motion information. The predictor of a target block is produced by minimizing the matching error of the template. Due to the freedom possessed by the template, the residuals of a target block may become large in flat regions. Our previous paper proposed to predictively encode the decimated version of a target block in flat regions to suppress the prediction errors. In this paper, the performance of template matching prediction is further improved. Multiple candidates are created by template matching at decoder. An average of the multiple candidates then forms the final predictor, which can reduce coding noise residing in the reference frames. Simulation results show that the proposed scheme improves coding efficiency of H.264 up to 7.9%.","PeriodicalId":131177,"journal":{"name":"2007 IEEE International Conference on Image Processing","volume":"241 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115586422","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}