{"title":"Semantic Segmentation of RGB-D Images Using 3D and Local Neighbouring Features","authors":"F. Fooladgar, S. Kasaei","doi":"10.1109/DICTA.2015.7371307","DOIUrl":"https://doi.org/10.1109/DICTA.2015.7371307","url":null,"abstract":"3D scene understanding is one of the most important problems in the field of computer vision. Although, in the past decades, considerable attention has been devoted on the 2D scene understanding problem, now with the development of the depth sensors (like Microsoft Kinect), the 3D scene understanding has become a very challenging task. Traditionally, the scene understanding problem was considered as the semantic labeling of each image pixel. Semantic labeling of RGB-D images has not attained a comparable success, as the RGB semantic labeling, due to the lack of a challenging dataset. With the introduction of an RGB-D dataset, called NYU-V2, it became possible to propose a novel method to improve the labeling accuracy. In this paper, a semantic segmentation algorithm for RGB-D images is presented. The concentration of the proposed algorithm is on the feature description and classification steps. In the feature description step, the more discriminative features from RGB images and the 3D point cloud data are grouped with local neighboring features to incorporate their context into the classification step. In the classification step, a pairwise multi-class conditional random field framework is utilized in which the unary potential function is considered as the probabilistic output of a random forest classifier. The proposed algorithm is evaluated on the NYU-V2 dataset and the performance is compared to that of other methods presented in the literature. The proposed algorithm achieves the state-of-the-art results on the NYU-V2 dataset.","PeriodicalId":214897,"journal":{"name":"2015 International Conference on Digital Image Computing: Techniques and Applications (DICTA)","volume":"49 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122514214","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Analysis of Human Engagement Behaviour Using Descriptors from Human Feedback, Eye Tracking, and Saliency Modelling","authors":"Pallab Kanti Podder, M. Paul, Tanmoy Debnath, M. Murshed","doi":"10.1109/DICTA.2015.7371227","DOIUrl":"https://doi.org/10.1109/DICTA.2015.7371227","url":null,"abstract":"In this paper an analysis of human engagement behaviour with video is presented based on real life experiments. An engagement model could be employed in classroom education, enhancing programming skills, reading etc. Two groups of people, independent of one another, watched eighteen video clips separately at different times. The first group's participants' eye gaze locations, right and left pupil sizes, and eye blinking patterns are recorded by a state of the art Tobii eye tracker. The second group of people who are video experts opined about the most significant attention points of the videos. A well-known bottom-up visual saliency model, Graph-Based Visual Saliency (GBVS), is also utilized to create salient points for the videos. Taking into consideration all the above mentioned descriptors the introduced behaviour analysis demonstrates the level of participants' concentration with the videos.","PeriodicalId":214897,"journal":{"name":"2015 International Conference on Digital Image Computing: Techniques and Applications (DICTA)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114518307","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Robust Human Tracking to Occlusion in Crowded Scenes","authors":"Hiromasa Takada, K. Hotta","doi":"10.1109/DICTA.2015.7371302","DOIUrl":"https://doi.org/10.1109/DICTA.2015.7371302","url":null,"abstract":"Human tracking in crowded scenes is a challenging problem because occlusion is frequently occurred. In this paper, we propose an online human tracking method which can handle occlusion effectively. Our method automatically changes a learning rate for updating tracking model according to the situation. If the tracking target is under occlusion, the learning rate decreases to reduce the influence of occlusion. However, the similarity score decreases by scale change of a tracking target as well as occlusion. To judge the occlusion or scale change, the similarity score on the Log-Polar coordinate is used. Furthermore, the size of search region is also changed according to the information about occlusion at previous frame. Experiments using the PETS2009 dataset show that our method improves tracking accuracy in crowded scenes.","PeriodicalId":214897,"journal":{"name":"2015 International Conference on Digital Image Computing: Techniques and Applications (DICTA)","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124459011","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Aerial Car Detection and Urban Understanding","authors":"D. Kamenetsky, J. Sherrah","doi":"10.1109/DICTA.2015.7371225","DOIUrl":"https://doi.org/10.1109/DICTA.2015.7371225","url":null,"abstract":"In this work we investigate car detection from aerial imagery and explore how it can be applied to urban understanding. To perform car detection we use the rotationally-invariant Fourier HOG detector. By adding incremental changes we are able to improve its detection probability by 10% for a range of false alarm rates. Further improvements can be made if we filter out cars that are not near known streets or inside car parks. We use the detected cars for automatic urban understanding: street estimation, car park detection and monitoring. In our experiments we were able to detect about half of all car parks in two major cities. Our method for car park monitoring allows us to find simple trends in car park usage, as well as changes in car park structure. We expect this information to be highly useful for future city planning.","PeriodicalId":214897,"journal":{"name":"2015 International Conference on Digital Image Computing: Techniques and Applications (DICTA)","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122581506","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"In Situ Leaf Classification Using Histograms of Oriented Gradients","authors":"A. Olsen, Sung-Ji Han, Brendan Calvert, P. Ridd, Owen Kenny","doi":"10.1109/DICTA.2015.7371274","DOIUrl":"https://doi.org/10.1109/DICTA.2015.7371274","url":null,"abstract":"Histograms of Oriented Gradients (HOGs) have proven to be a robust feature set for many visual object recognition applications. In this paper we investigate a simple but powerful approach to make use of the HOG feature set for in situ leaf classification. The contributions of this work are threefold. Firstly, we present a novel method for segmenting leaves from a textured background. Secondly, we investigate a scale and rotation invariant enhancement of the HOG feature set for texture based leaf classification - whose results compare well with a multi-feature probabilistic neural network classifier on a benchmark data set. And finally, we introduce an in situ data set containing 337 images of Lantana camara - a weed of national significance in the Australian landscape - and neighbouring flora, upon which our proposed classifier achieves high accuracy (86.07%) in reasonable time and is thus viable for real-time detection and control of Lantana camara.","PeriodicalId":214897,"journal":{"name":"2015 International Conference on Digital Image Computing: Techniques and Applications (DICTA)","volume":"471 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116187220","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Direct 6-DoF Pose Estimation from Point-Plane Correspondences","authors":"K. Khoshelham","doi":"10.1109/DICTA.2015.7371253","DOIUrl":"https://doi.org/10.1109/DICTA.2015.7371253","url":null,"abstract":"Localizing a mobile sensor in an indoor environment usually involves obtaining 3D scans of the environment and estimating the sensor pose by matching the successive scans. This can be done effectively by minimizing point-plane distances for which only iterative solutions are available. Iterative solutions are notorious for convergence issues, and are inefficient for long sequences of scans. This paper presents a direct method for estimating 6-dof pose of a sensor by minimizing point-plane distances. Through experimental evaluation it is shown that the direct method gives accurate estimates, and performs robustly in presence of noise. The performance of the direct method is also evaluated with point clouds of different scale, poor plane configurations and large numbers of points and planes. A MATLAB implementation of the direct solution is available at: http://people.eng.unimelb.edu.au/kkhoshelham/research.html#directmotion.","PeriodicalId":214897,"journal":{"name":"2015 International Conference on Digital Image Computing: Techniques and Applications (DICTA)","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121160091","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Track before Detect for Space Situation Awareness","authors":"S. Davey, T. Bessell, B. Cheung, M. Rutten","doi":"10.1109/DICTA.2015.7371316","DOIUrl":"https://doi.org/10.1109/DICTA.2015.7371316","url":null,"abstract":"This article considers the application of multi-target tracking algorithms to Space Situation awareness. The sensor is a telescope fitted with an optical band digital camera. Two different tracking paradigms are demonstrated: the first approach is a detect-then-track method that uses frame-to-frame registration to model the star field and detect a moving satellite, the detections are processed using a point-measurement tracker; the second is a track-before-detect algorithm that uses the telescope images directly as input and jointly tracks the stars and the satellite. The two are compared on experimental imagery collected from a telescope system.","PeriodicalId":214897,"journal":{"name":"2015 International Conference on Digital Image Computing: Techniques and Applications (DICTA)","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128745396","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Camera Network Topology Estimation by Lighting Variation","authors":"M. Zhu, A. Dick, A. Hengel","doi":"10.1109/DICTA.2015.7371245","DOIUrl":"https://doi.org/10.1109/DICTA.2015.7371245","url":null,"abstract":"The goal of this paper is to find connections between cameras in a large surveillance network. As a proxy for camera pairs whose fields of view overlap spatially, we find pairs that are affected by a common light source. We propose multiple measures of lighting variation and show that we can reliably detect nearby cameras even without direct overlap. The relationships discovered by our process can be used for problems such as automated tracking and re-identification across large camera networks. We demonstrate our method on a campus network of 20 and 26 cameras and evaluate its accuracy and performance.","PeriodicalId":214897,"journal":{"name":"2015 International Conference on Digital Image Computing: Techniques and Applications (DICTA)","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129164856","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multi-Camera Tracking of Intelligent Targets with Hidden Reciprocal Chains","authors":"G. Stamatescu, A. Dick, L. White","doi":"10.1109/DICTA.2015.7371287","DOIUrl":"https://doi.org/10.1109/DICTA.2015.7371287","url":null,"abstract":"Real world targets are intelligent and almost always move with a destination in mind. This paper introduces a new target tracking algorithm for multi-camera networks based on a hidden reciprocal chain (HRC), which is able to capture the local dynamics and intention of a real world target in a statistical way. The model is non-causal and therefore fundamentally different to standard Markovian motion models which underpin most trackers, such as the Kalman filter. However it is less computationally expensive than more sophisticated models like Markov decision processes, which can capture complex behaviours but require approximate algorithms for inference. We argue that HRCs are a natural extension to existing Markovian models by presenting exact online inference and detection algorithms which scale well with the number of cameras and targets. Finally we demonstrate the potential benefits by presenting results on synthetic data for the problem of multi-target tracking across multiple cameras.","PeriodicalId":214897,"journal":{"name":"2015 International Conference on Digital Image Computing: Techniques and Applications (DICTA)","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116663737","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Class-Semantic Textons with Superpixel Neighborhoods for Natural Roadside Vegetation Classification","authors":"Ligang Zhang, B. Verma","doi":"10.1109/DICTA.2015.7371246","DOIUrl":"https://doi.org/10.1109/DICTA.2015.7371246","url":null,"abstract":"Accurate classification of roadside vegetation plays a significant role in many practical applications, such as vegetation growth management and fire hazard identification. However, relatively little attention has been paid to this field in previous studies, particularly for natural data. In this paper, a novel approach is proposed for natural roadside vegetation classification, which generates class- sematic color-texture textons at a pixel level and then makes a collective classification decision in a neighborhood of superpixels. It first learns two individual sets of bag-of-word visual dictionaries (i.e. class-semantic textons) from color and filter-bank texture features respectively for each object. The color and texture features of all pixels in each superpixel in a test image are mapped into one of the learnt textons using the nearest Euclidean distance, which are further aggregated into class probabilities for each superpixel. The class probabilities in each superpixel and its neighboring superpixels are combined using a linear weighting mixing, and the classification of this superpixel is finally achieved by assigning it the class with the highest class probability. Our approach shows higher accuracy than four benchmarking approaches on both a cropped region and an image datasets collected by the Department of Transport and Main Roads, Queensland, Australia.","PeriodicalId":214897,"journal":{"name":"2015 International Conference on Digital Image Computing: Techniques and Applications (DICTA)","volume":"147 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128137437","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}