"Monocular Video Foreground/Background Segmentation by Tracking Spatial-Color Gaussian Mixture Models"
Ting Yu, Cha Zhang, Michael F. Cohen, Y. Rui, Ying Wu. 2007 IEEE Workshop on Motion and Video Computing (WMVC'07). DOI: 10.1109/WMVC.2007.27

Abstract: This paper presents a new approach to segmenting monocular videos, captured by static or hand-held cameras, that film large moving non-rigid foreground objects. The foreground and background are modeled using spatial-color Gaussian mixture models (SCGMMs) and segmented with the graph cut algorithm, which minimizes a Markov random field energy function containing the SCGMM models. Because a modeling gap exists between the available SCGMMs and the segmentation task on a new frame, one major contribution of the paper is a novel foreground/background SCGMM joint tracking algorithm that bridges this gap and greatly improves segmentation performance under complex or rapid motion. Specifically, we propose to combine the two SCGMMs into a generative model of the whole image and to maximize the joint data likelihood using a constrained Expectation-Maximization (EM) algorithm. The effectiveness of the proposed algorithm is demonstrated on a variety of sequences.
"Performance of Low-Level Motion Estimation Methods for Confocal Microscopy of Plant Cells in vivo"
T. Roberts, S. McKenna, N. Wuyts, T. Valentine, A. Bengough. 2007 IEEE Workshop on Motion and Video Computing (WMVC'07). DOI: 10.1109/WMVC.2007.32

Abstract: The performance of various low-level motion estimation methods is investigated when applied to fluorescence-labelled growing cellular structures imaged using confocal laser scanning microscopy. This is a challenging and unusual domain for motion estimation methods. A selection of methods is discussed that can be contrasted in terms of how much spatial or temporal contextual information they use. The Lucas-Kanade feature tracker, a spatially and temporally localised method, was, as one would expect, accurate around resolvable structure. It was not able to track the smaller, repetitive cell structure in the root tip and was somewhat prone to identifying spurious features. This approach is improved by developing a full multi-frame, robust, Bayesian method, and it is demonstrated that using extra frames with motion constraints reduces such errors. Next, spatially global methods are discussed, including robust variational smoothing and Markov random field (MRF) modelling. A key conclusion drawn from investigating these methods is that generic low-level (robust) smoothing functions do not give good results in this application, probably because of the large regions with little stable structure. Furthermore, contrary to recently reported successes, graph cuts and loopy belief propagation for MAP estimation of the MRF labels often produced poor and inconsistent estimates. The results suggest the need for greater emphasis on temporal smoothing in generic low-level motion estimation tools, and for more task-specific spatial constraints, perhaps in the form of high-level models, in order to accurately recover motion from such data. Finally, the form of the estimated growth is briefly discussed and related to contemporary biological models. We hope that this paper will assist non-specialists in applying state-of-the-art methods to this form of data.
{"title":"GPU Acceleration of Real-time Feature Based Algorithms","authors":"J. Ready, Clark N. Taylor","doi":"10.1109/WMVC.2007.17","DOIUrl":"https://doi.org/10.1109/WMVC.2007.17","url":null,"abstract":"Feature tracking is one of the most fundamental tasks in computer vision, being used as a preliminary step to many high-level algorithms. In general, however, the number of features tracked (leading to more accurate high-level algorithms) must be balanced against the computational requirements of the feature tracking algorithm. To enable a large number of features to be tracked in real time without degrading the computational performance of high-level computer vision algorithms, we offload the feature tracking algorithm to the the video card (GPU) found in modern personal computers. Using the GPU allows for tracking an order of magnitude more features than a pure software-based algorithm, with minimal increase in CPU usage. We have demonstrated the computational benefits of GPU-based feature tracking within a real-time video stabilization application.","PeriodicalId":177842,"journal":{"name":"2007 IEEE Workshop on Motion and Video Computing (WMVC'07)","volume":"46 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-02-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121808415","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Motion Analysis In Compressed Video - An Hybrid Approach","authors":"M. Ibrahim, S. Rao","doi":"10.1109/WMVC.2007.28","DOIUrl":"https://doi.org/10.1109/WMVC.2007.28","url":null,"abstract":"In this paper, we propose a technique for the automatic extraction of moving objects from compressed domain video. We propose the use of a spatio-temporal filter for filtering the motion vectors and a hybrid approach to exploit both compressed domain processing and spatial domain processing to meet the tradeoff between computation and accuracy. The experimental results shows that our approach can detect objects with size smaller than one macroblock with accurate shape of the object.","PeriodicalId":177842,"journal":{"name":"2007 IEEE Workshop on Motion and Video Computing (WMVC'07)","volume":"51 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-02-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121982105","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
"An Analysis-by-Synthesis Approach to Tracking of Textiles"
N. Hasler, B. Rosenhahn, M. Asbach, J. Ohm, H. Seidel. 2007 IEEE Workshop on Motion and Video Computing (WMVC'07). DOI: 10.1109/WMVC.2007.7

Abstract: Despite strong interest in cloth simulation on the one hand and tracking of deformable objects on the other, little effort has been put into tracking cloth motion by modelling the fabric. Here, an analysis-by-synthesis approach to tracking textiles is proposed which, by fitting a simulated textile to a set of contours, is able to reconstruct the 3D cloth configuration. Fitting is accomplished by optimising the parameters of the mass-spring model used to simulate the textile, as well as the positions of a limited number of constrained points. To improve tracking accuracy and to overcome the inherently chaotic behaviour of the real fabric, several techniques for tracking features on the cloth's surface, and the best way for them to influence the simulation, are evaluated.
{"title":"Shape Background Modeling : The Shape of Things That Came","authors":"Nathan Jacobs, Robert Pless","doi":"10.1109/WMVC.2007.35","DOIUrl":"https://doi.org/10.1109/WMVC.2007.35","url":null,"abstract":"Detecting, isolating, and tracking moving objects in an outdoor scene is a fundamental problem of visual surveillance. A key component of most approaches to this problem is the construction of a background model of intensity values. We propose extending background modeling to include learning a model of the expected shape of foreground objects. This paper describes our approach to shape description, shape space density estimation, and unsupervised model training. A key contribution is a description of properties of the joint distribution of object shape and image location. We show object segmentation and anomalous shape detection results on video captured from road intersections. Our results demonstrate the usefulness of building scene-specific and spatially-localized shape background models.","PeriodicalId":177842,"journal":{"name":"2007 IEEE Workshop on Motion and Video Computing (WMVC'07)","volume":"93 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-02-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126205187","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Two-stage Multi-view Analysis Framework for Human Activity and Interactions","authors":"Sangho Park, M. Trivedi","doi":"10.1109/WMVC.2007.3","DOIUrl":"https://doi.org/10.1109/WMVC.2007.3","url":null,"abstract":"This paper presents a new framework for a multi-stage multi-view approach for human interactions and activity analysis. The analysis is performed in a distributed vision system that synergistically integrate track- and body-level representations across multiple cameras. Our system aims at versatile and easily-deployable system that does not require careful camera calibration. Main contributions of the paper are: (1) context-dependent camera handover for occlusion handling, (2) switching the multi-stage analysis between track- and body-level representations, and (3) a hypothesis-verification paradigm for top-down feedback exploiting spatio-temporal constraints inherent in human interaction. Experimental evaluation shows the efficacy of the proposed system for analyzing multi-person interactions. Current implementation uses two views, but extension to more views is straightforward.","PeriodicalId":177842,"journal":{"name":"2007 IEEE Workshop on Motion and Video Computing (WMVC'07)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-02-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126817692","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Cascaded change detection for foreground segmentation","authors":"L. Teixeira, L. Côrte-Real","doi":"10.1109/WMVC.2007.11","DOIUrl":"https://doi.org/10.1109/WMVC.2007.11","url":null,"abstract":"The extraction of relevant objects (foreground) from a background is an important first step in many applications. We propose a technique that tackles this problem using a cascade of change detection tests, including noise-induced, illumination variation and structural changes. An objective comparison of pixel-wise modellingmethods is first presented. Given its best relation performance/complexity, the mixture of Gaussians was chosen to be used in the proposed method to detect structural changes. Experimental results show that the cascade technique consistently outperforms the commonly used mixture of Gaussians, without additional post-processing and without the expense of processing overheads.","PeriodicalId":177842,"journal":{"name":"2007 IEEE Workshop on Motion and Video Computing (WMVC'07)","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-02-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123270214","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Enhanced Video Mosaicing using Camera Motion Properties","authors":"Pulkit Parikh, C. V. Jawahar","doi":"10.1109/WMVC.2007.14","DOIUrl":"https://doi.org/10.1109/WMVC.2007.14","url":null,"abstract":"We propose a video mosaicing scheme which exploits the motion information, implicitly available in the video. The information about the camera motion is propagated to the homographies used for mosaicing. While some of the recent approaches make use of the information stemming from non-overlapping pairs of frames, the smoothness of the camera motion has gone largely under-capitalized. We present a technique which exploits this useful cue for refining homographies. Moreover, a generic framework which exploits the camera motion model, to relate homographies in a video, is also proposed. The analysis and results of the proposed algorithms demonstrate significant promise, in terms of accuracy and robustness.","PeriodicalId":177842,"journal":{"name":"2007 IEEE Workshop on Motion and Video Computing (WMVC'07)","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-02-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121415579","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multi-frame Approaches To Improve Face Recognition","authors":"D. Thomas, K. Bowyer, P. Flynn","doi":"10.1109/WMVC.2007.29","DOIUrl":"https://doi.org/10.1109/WMVC.2007.29","url":null,"abstract":"Face recognition from video sequences is becoming an important area of biometrics research. In this work, we explore different strategies to improve face recognition performance from video. We develop a good strategy to select the smallest number of frames to achieve a high level of performance. We apply Principal Component Analysis to identify suitable frames to represent the subjects. We demonstrate our approaches on our dataset, which uses three different cameras and is larger than any known research database of video face sequences. Finally, we compare our approach to an existing approach from UCSD [8, 9] and show that it performs slightly better than that approach (99% rank one recognition rate).","PeriodicalId":177842,"journal":{"name":"2007 IEEE Workshop on Motion and Video Computing (WMVC'07)","volume":"42 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-02-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132997233","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}