{"title":"Incremental Object Matching with Bayesian Methods and Particle Filters","authors":"M. Toivanen, J. Lampinen","doi":"10.1109/DICTA.2009.26","DOIUrl":"https://doi.org/10.1109/DICTA.2009.26","url":null,"abstract":"In batch learning, all the training examples have to be available at once to train the model, which often leads to slow performance and large memory requirements. Little work has been done in developing incremental object learners. In this paper, we present an incremental method that finds corresponding points of similar object instances appearing in natural grayscale images with arbitrary location, scale and orientation. The approach is Bayesian and combines the shape and appearance of the corresponding points into the posterior distribution for their locations. The posterior distribution is recursively sampled with particle filters to locate the most probable corresponding point sets in the image being processed. The results indicate that the matched corresponding points can be used in forming a representation of the object, which can be used in detecting instances of the object in novel test images.","PeriodicalId":277395,"journal":{"name":"2009 Digital Image Computing: Techniques and Applications","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133289622","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Real Time Target Tracking with Pan Tilt Zoom Camera","authors":"Pankaj Kumar, A. Dick, Tan Soo Sheng","doi":"10.1109/DICTA.2009.84","DOIUrl":"https://doi.org/10.1109/DICTA.2009.84","url":null,"abstract":"We present an approach for real-time tracking of a non-rigid target with a moving pan-tilt-zoom (PTZ) camera. The tracking of the object and control of the camera are handled by one computer in real time. The main contribution of the paper is a method for target representation, localisation and detection, which takes into account both foreground and background properties, and is more discriminative than the common colour histogram based back-projection. A Bayesian hypothesis test is used to decide whether each pixel is occupied by the target or not. We show that this target representation is suitable for use with a Continuously Adaptive Mean Shift (CAMSHIFT) tracker. Experiments show that this leads to a tracking system that is efficient and accurate enough to guide a PTZ camera to follow a moving target in real time, despite the presence of background clutter and partial occlusion.","PeriodicalId":277395,"journal":{"name":"2009 Digital Image Computing: Techniques and Applications","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133827036","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Context-Based Appearance Descriptor for 3D Human Pose Estimation from Monocular Images","authors":"S. Sedai, Bennamoun, D. Huynh","doi":"10.1109/DICTA.2009.81","DOIUrl":"https://doi.org/10.1109/DICTA.2009.81","url":null,"abstract":"In this paper we propose a novel appearance descriptor for 3D human pose estimation from monocular images using a learning-based technique. Our image-descriptor is based on the intermediate local appearance descriptors that we design to encapsulate local appearance context and to be resilient to noise. We encode the image by the histogram of such local appearance context descriptors computed in an image to obtain the final image-descriptor for pose estimation. We name the final image-descriptor the Histogram of Local Appearance Context (HLAC). We then use Relevance Vector Machine (RVM) regression to learn the direct mapping between the proposed HLAC image-descriptor space and the 3D pose space. Given a test image, we first compute the HLAC descriptor and then input it to the trained regressor to obtain the final output pose in real time. We compared our approach with other methods using a synchronized video and 3D motion dataset. We compared our proposed HLAC image-descriptor with the Histogram of Shape Context and Histogram of SIFT-like descriptors. The evaluation results show that the HLAC descriptor outperforms both of them in the context of 3D human pose estimation.","PeriodicalId":277395,"journal":{"name":"2009 Digital Image Computing: Techniques and Applications","volume":"123 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133938175","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Super Resolution Algorithm for Atmospherically Degraded Images Using Lucky Regions and MAP-uHMT","authors":"Zhiying Wen, Feng Li, D. Fraser, A. Lambert, X. Jia","doi":"10.1109/DICTA.2009.94","DOIUrl":"https://doi.org/10.1109/DICTA.2009.94","url":null,"abstract":"This paper demonstrates the possibility of super resolved image reconstruction for images affected by atmospheric turbulence. A lucky region method using bicoherence is proposed to select image tiles with superior quality or “lucky image regions” from a large number of short exposure images. A super resolved image is then reconstructed by a MAP method based on a Universal Hidden Markov Tree model from the lucky regions. Performance is demonstrated with real data.","PeriodicalId":277395,"journal":{"name":"2009 Digital Image Computing: Techniques and Applications","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121996168","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Wavelet Transform and Fusion of Linear and Non Linear Method for Face Recognition","authors":"M. Mazloom, S. Kasaei, Nourolhoda Alemi Neissi","doi":"10.1109/DICTA.2009.56","DOIUrl":"https://doi.org/10.1109/DICTA.2009.56","url":null,"abstract":"This work presents a method to increase face recognition accuracy using a combination of wavelets, PCA, KPCA, and RBF neural networks. Preprocessing, feature extraction and classification rules are three crucial issues for face recognition. This paper presents a hybrid approach to address these issues. For the preprocessing and feature extraction steps, we apply a combination of wavelet transform, PCA and KPCA. The method first derives a feature vector from a set of downsampled wavelet representations of face images; PCA-based linear features and KPCA-based nonlinear features are then extracted from the wavelet feature vector to reduce its dimensionality. During the classification stage, the Neural Network (RBF) is explored to achieve a robust decision in presence of wide facial variations. The computational load of the proposed method is greatly reduced compared with the original PCA, KPCA, ICA and LDA based methods on the ORL, Yale and AR face databases. Moreover, the accuracy of the proposed method is improved.","PeriodicalId":277395,"journal":{"name":"2009 Digital Image Computing: Techniques and Applications","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125639538","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Context-Based Approach for Detecting Suspicious Behaviours","authors":"A. Wiliem, V. Madasu, W. Boles, P. Yarlagadda","doi":"10.1109/DICTA.2009.31","DOIUrl":"https://doi.org/10.1109/DICTA.2009.31","url":null,"abstract":"A video surveillance system capable of detecting suspicious activities or behaviours is of paramount importance to law enforcement agencies. Such a system will not only reduce the work load of security personnel involved with monitoring the CCTV video feeds but also reduce the time required to respond to any incident. There are two well known models to detect suspicious behaviour: misuse detection models, which are dependent on suspicious behaviour definitions, and anomaly detection models, which measure deviations from defined normal behaviour. However, it is nearly impossible to encapsulate the entire spectrum of either suspicious or normal behaviour. One of the ways to overcome this problem is by developing a system which learns in real time and adapts itself to behaviour which can be considered as common and normal or uncommon and suspicious. We present an approach utilising contextual information. Two contextual features, namely, type of behaviour and the commonality level of each type are extracted from long-term observation. Then, a data stream model which treats the incoming data as a continuous stream of information is used to extract these features. We further propose a clustering algorithm which works in conjunction with the data stream model. Experiments and comparisons are conducted on the well known CAVIAR datasets to show the efficacy of utilising contextual information for detecting suspicious behaviour. The proposed approach is generic in nature and can be applied to any features. However, for the purpose of this study, we have employed pedestrian trajectories to represent the behaviour of people.","PeriodicalId":277395,"journal":{"name":"2009 Digital Image Computing: Techniques and Applications","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130388481","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Texture Description in Local Scale Using Texton Histograms with Universal Dictionary","authors":"J. Rouco, M. G. Penedo, M. Ortega, A. González","doi":"10.1109/DICTA.2009.18","DOIUrl":"https://doi.org/10.1109/DICTA.2009.18","url":null,"abstract":"An inherent property of texture patterns is that they are only meaningful in an appropriate range of scales. Taking this into account, the description of texture patterns should be limited to their meaningful scales. This assumption motivates the research on local scale texture description. In this paper a new method for the extraction and description of texture features using local scale is presented. The method is based on texton histograms using universal prototype dictionaries of extremal filter outputs. Comparison results with state of the art texture description methods demonstrate the higher discriminative power of the proposed method in local scale texture classification.","PeriodicalId":277395,"journal":{"name":"2009 Digital Image Computing: Techniques and Applications","volume":"59 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131502713","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Biological Inspired Visual Landmark Recognition Architecture","authors":"Q. Do, L. Jain","doi":"10.1109/DICTA.2009.61","DOIUrl":"https://doi.org/10.1109/DICTA.2009.61","url":null,"abstract":"An architecture that is inspired by a human’s capability to autonomously navigate an environment based on visual landmark recognition is presented. It consists of pre-attentive and attentive stages that allow visual landmarks to be recognized reliably under both clean and cluttered backgrounds. The pre-attentive stage provides an efficient means for real-time image processing by selectively focusing on regions of interest within input images. The attentive stage has a memory feedback modulation mechanism that allows visual knowledge of landmarks in the memory to interact and guide different stages in the architecture for efficient feature extraction and landmark recognition. The results show that the architecture is able to reliably recognise both occluded and non-occluded visual landmarks in complex backgrounds.","PeriodicalId":277395,"journal":{"name":"2009 Digital Image Computing: Techniques and Applications","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132080227","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Cartoon Motion Capturing and Retargeting by Rigid Shape Manipulation","authors":"Mohammad Rastegari, Mohammad Rouhani, N. Gheissari, M. Pedram","doi":"10.1109/DICTA.2009.85","DOIUrl":"https://doi.org/10.1109/DICTA.2009.85","url":null,"abstract":"Motion capture from live performance has received a lot of attention in the past few years. However, little work has been done on capturing the motions from existing cartoons. This paper presents a novel approach for capturing motion from existing cartoons and retargeting it to new characters in order to animate them. Most current approaches rely on the identification of key shapes and they fail to directly handle articulated shapes or local deformations. In contrast, we propose to use key-points as more efficient descriptors of motion. We use shape context to capture the motion between successive source frames. Then for retargeting the captured motion to a new character, we use an efficient rigid shape manipulation method that handles local deformations. The proposed method relies on user interaction only for the first source frame. Our algorithm has been applied to a set of test cases and the results show improved performance in preserving the target's visual style, particularly for articulated objects.","PeriodicalId":277395,"journal":{"name":"2009 Digital Image Computing: Techniques and Applications","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130605298","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Smooth Approximation of L_infinity-Norm for Multi-view Geometry","authors":"Yuchao Dai, Hongdong Li, Mingyi He, Chunhua Shen","doi":"10.1109/DICTA.2009.64","DOIUrl":"https://doi.org/10.1109/DICTA.2009.64","url":null,"abstract":"Recently, $L_\\infty$-norm optimization has been introduced to multi-view geometry to achieve global optimality. It is solved through a sequence of SOCP (second order cone programming) feasibility problems, which needs sophisticated solvers and is time consuming. This paper presents an efficient smooth approximation of $L_\\infty$-norm optimization in multi-view geometry using log-sum-exp functions. We have proven that the proposed approximation is pseudo-convex with the property of uniform convergence. This allows us to solve the problem using gradient based algorithms such as gradient descent to overcome the non-differentiable property of the $L_\\infty$ norm. Experiments on both synthetic and real image sequences have shown that the proposed algorithm achieves high precision and also significantly speeds up the implementation.","PeriodicalId":277395,"journal":{"name":"2009 Digital Image Computing: Techniques and Applications","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126507083","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}