{"title":"Real-time level set based tracking with appearance model using Rao-Blackwellized particle filter","authors":"D. Kim, Ehwa Yang, M. Jeon, V. Shin","doi":"10.1109/ICIP.2010.5650026","DOIUrl":"https://doi.org/10.1109/ICIP.2010.5650026","url":null,"abstract":"In this paper, a computationally efficient algorithm for level set based tracking is suggested for near real-time implementation. The problem of computational complexity in level set based tracking is tackled by combining a sparse field level set method (SFLSM) with a Rao-Blackwellized particle filter (RBPF). Under the RBPF framework, affine motion is estimated using appearance-based particle filtering (PF) to provide the initial curves for SFLSM, and the local deformation of contours is analytically estimated through SFLSM. SFLSM is adopted to significantly reduce the computational complexity of the level set method (LSM) implementation. For the initial curve estimation in SFLSM, the estimated position and object scale are provided by the appearance-based PF in order to achieve the desired efficiency. Furthermore, the appearance-based PF alleviates inaccurate segmentation incurred by an incorrect initial curve. Experimental results on real video confirm the promising performance of this method.","PeriodicalId":228308,"journal":{"name":"2010 IEEE International Conference on Image Processing","volume":"46 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-12-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115938321","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Removal of false positive in object detection with contour-based classifiers","authors":"Hongyu Li, Lei Chen","doi":"10.1109/ICIP.2010.5649943","DOIUrl":"https://doi.org/10.1109/ICIP.2010.5649943","url":null,"abstract":"This paper proposes a method of constructing a contour-based classifier to remove false-positive objects after Haar-based detection. The classifier is learned with discrete AdaBoost. During training, the oriented chamfer distance is introduced to construct strong learners. Experimental results demonstrate that the proposed method is feasible and promising for the removal of false positives.","PeriodicalId":228308,"journal":{"name":"2010 IEEE International Conference on Image Processing","volume":"108 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-12-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115856717","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Transferable Belief Model for hair mask segmentation","authors":"C. Rousset, P. Coulon, M. Rombaut","doi":"10.1109/ICIP.2010.5651970","DOIUrl":"https://doi.org/10.1109/ICIP.2010.5651970","url":null,"abstract":"In this paper, we present a study of the Transferable Belief Model for an automatic hair segmentation process. We first recall the Transferable Belief Model. We then define, for the parameters that characterize hair (frequency and color), a basic belief assignment that represents the belief that a pixel is or is not a hair pixel. Next, we introduce a discounting function based on the distance to the face to increase the reliability of our sensors. At the end of this process, we segment the hair with a matting process. We compare this process with logical fusion. Results are evaluated using semi-manual segmentation references.","PeriodicalId":228308,"journal":{"name":"2010 IEEE International Conference on Image Processing","volume":"47 3","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-12-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132476839","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"L1 matting","authors":"P. G. Lee, Ying Wu","doi":"10.1109/ICIP.2010.5652939","DOIUrl":"https://doi.org/10.1109/ICIP.2010.5652939","url":null,"abstract":"Natural image matting continues to play a large role in a wide variety of applications. As an ill-posed problem, matting is very difficult to solve due to its underconstrained nature. Current approaches can require a lot of user input, restrict themselves to a sparse subset of the image, and often make assumptions that are unlikely to hold. In this paper, we propose a way to better satisfy the smoothness assumptions of some of these methods by utilizing the nonlinear median filter, which arises naturally from the L1 norm. The median has the property that it tends to smooth the foreground and background of the image while leaving edges relatively unaltered. We then show that such an image is often more suitable as input than the original image, even when user interaction is minimal, suggesting that our method is more amenable to automatic matting.","PeriodicalId":228308,"journal":{"name":"2010 IEEE International Conference on Image Processing","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-12-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130013138","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Direction-adaptive transforms for coding prediction residuals","authors":"R. Cohen, S. Klomp, A. Vetro, Huifang Sun","doi":"10.1109/ICIP.2010.5651058","DOIUrl":"https://doi.org/10.1109/ICIP.2010.5651058","url":null,"abstract":"In this paper, we present 2-D direction-adaptive transforms for coding prediction residuals of video. These Direction-Adaptive Residual Transforms (DART) are shown to be more effective than the traditional 2-D DCT when coding residual blocks that contain directional features. After presenting the directional transform structures and improvements to their efficiency, we outline how they are used to code both Inter and Intra prediction residuals. For Intra coding, we also demonstrate the relation between the prediction mode and the optimal DART orientation. Experimental results exhibit up to 7% and 9.3% improvements in compression efficiency in JM 16.0 and JM-KTA 2.6r1 respectively, as compared to using only the conventional H.264/AVC transform.","PeriodicalId":228308,"journal":{"name":"2010 IEEE International Conference on Image Processing","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-12-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130197386","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"FTV (Free-viewpoint TV)","authors":"M. Tanimoto","doi":"10.1109/ICIP.2010.5652084","DOIUrl":"https://doi.org/10.1109/ICIP.2010.5652084","url":null,"abstract":"We have developed a new type of television named FTV (Free-viewpoint TV). FTV is the ultimate 3DTV in that it enables us to view a 3D scene while freely changing our viewpoint. At present, FTV is available on a single PC or a mobile player. The international standardization of FTV has been conducted in MPEG. The first phase of FTV was MVC (Multi-view Video Coding) and the second phase is 3DV (3D Video). FDU (FTV Data Unit) is proposed as a data format for 3DV. FDU can compensate for errors in the synthesized views caused by depth errors.","PeriodicalId":228308,"journal":{"name":"2010 IEEE International Conference on Image Processing","volume":"93 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-12-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134033030","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A rotation and scale invariant descriptor for shape recognition","authors":"A. D. Lillo, G. Motta, J. Storer","doi":"10.1109/ICIP.2010.5652671","DOIUrl":"https://doi.org/10.1109/ICIP.2010.5652671","url":null,"abstract":"We address the problem of retrieving the silhouettes of objects from a database of shapes with a translation and rotation invariant feature extractor. We retrieve silhouettes by using a “soft” classification based on the Euclidean distance. Experiments show significant gains in retrieval accuracy over the existing literature. This work extends the use of our previously employed feature extractor and shows that the same descriptor can be used for both texture [3] and shape recognition.","PeriodicalId":228308,"journal":{"name":"2010 IEEE International Conference on Image Processing","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-12-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134118811","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Automatic preview video generation for mesh sequences","authors":"Seung-Ryong Han, T. Yamasaki, K. Aizawa","doi":"10.1109/ICIP.2010.5652185","DOIUrl":"https://doi.org/10.1109/ICIP.2010.5652185","url":null,"abstract":"We present a novel method that automatically generates a preview video of a mesh sequence. To make the preview appealing to users, the important features of the mesh model should be captured in the preview video, while preserving the constraint that the transitions of the camera are as smooth as possible. Our approach models the important features by defining a surface saliency and by measuring the appearance of the mesh sequence. The task of generating the preview video is then formulated as a shortest-path problem and we find an optimal camera path by using Dijkstra's algorithm.","PeriodicalId":228308,"journal":{"name":"2010 IEEE International Conference on Image Processing","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-12-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134197520","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Spread spectrum-based watermarking for Tardos code-based fingerprinting for H.264/AVC video","authors":"Z. Shahid, M. Chaumont, W. Puech","doi":"10.1109/ICIP.2010.5652607","DOIUrl":"https://doi.org/10.1109/ICIP.2010.5652607","url":null,"abstract":"In this paper, we present a novel approach for active fingerprinting of the state-of-the-art video codec H.264/AVC. A Tardos probabilistic fingerprinting code is embedded in H.264/AVC video signals using a spread-spectrum watermarking technique. Different linear and non-linear collusion attacks have been performed in the pixel domain to show the robustness of the proposed approach. The embedding is performed in the non-zero quantized transformed coefficients (QTCs) while taking into account the reconstruction loop.","PeriodicalId":228308,"journal":{"name":"2010 IEEE International Conference on Image Processing","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-12-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134510426","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Preventing re-recording based on difference between sensory perceptions of humans and devices","authors":"Takayuki Yamada, S. Gohshi, I. Echizen","doi":"10.1109/ICIP.2010.5650525","DOIUrl":"https://doi.org/10.1109/ICIP.2010.5650525","url":null,"abstract":"We propose a method for preventing the illegal re-recording of images and videos with digital camcorders. Conventional digital watermarking techniques involve embedding a content ID into images and videos, which helps to identify the place and time where the content was recorded. However, digital watermarking does not prevent the illegal re-recording of digital content with camcorders. The proposed method prevents such re-recording by corrupting the recorded content with a noise signal that is invisible to humans but is picked up by the CCD or CMOS sensors of the recording device. In this way, the re-recorded content becomes unusable. To validate the proposed method, we developed a functional prototype system for preventing illegal re-recording and implemented it on a 100-inch cinema screen.","PeriodicalId":228308,"journal":{"name":"2010 IEEE International Conference on Image Processing","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-12-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133984599","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}