{"title":"Smooth Globally Warp Locally: Video Stabilization Using Homography Fields","authors":"William X. Liu, Tat-Jun Chin","doi":"10.1109/DICTA.2015.7371309","DOIUrl":"https://doi.org/10.1109/DICTA.2015.7371309","url":null,"abstract":"Conceptually, video stabilization is achieved by estimating the camera trajectory throughout the video and then smoothing the trajectory. In practice, the pipeline invariably leads to estimating update transforms that adjust each frame of the video such that the overall sequence appears to be stabilized. Therefore, we argue that estimating good update transforms is more critical to success than accurately modeling and characterizing the motion of the camera. Based on this observation, we propose the usage of homography fields for video stabilization. A homography field is a spatially varying warp that is regularized to be as projective as possible, so as to enable accurate warping while adhering closely to the underlying geometric constraints. We show that homography fields are powerful enough to meet the various warping needs of video stabilization, not just in the core step of stabilization, but also in video inpainting. This enables relatively simple algorithms to be used for motion modeling and smoothing. We demonstrate the merits of our video stabilization pipeline on various public testing videos.","PeriodicalId":214897,"journal":{"name":"2015 International Conference on Digital Image Computing: Techniques and Applications (DICTA)","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121523491","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Multi-Kernel Local Level Set Image Segmentation Algorithm for Fluorescence Microscopy Images","authors":"A. Gharipour, Alan Wee-Chung Liew","doi":"10.1109/DICTA.2015.7371218","DOIUrl":"https://doi.org/10.1109/DICTA.2015.7371218","url":null,"abstract":"Fluorescence microscopy image segmentation is a central task in high-throughput applications such as protein expression quantification and cell function investigation. In this paper, a multiple kernel local level set segmentation algorithm is introduced as a framework for fluorescence microscopy cell image segmentation. In this framework, a new local region-based active contour model is proposed in a variational level set formulation, based on the piecewise constant model and multiple kernel mapping, where a linear combination of multiple kernels is utilized to implicitly map the original local image data into data of a higher dimension. We evaluate the performance of the proposed method using a large number of fluorescence microscopy images. A quantitative comparison is also performed with some state-of-the-art segmentation approaches.","PeriodicalId":214897,"journal":{"name":"2015 International Conference on Digital Image Computing: Techniques and Applications (DICTA)","volume":"03 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127280073","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Video Classification Based on Spatial Gradient and Optical Flow Descriptors","authors":"Xiaolin Tang, A. Bouzerdoum, S. L. Phung","doi":"10.1109/DICTA.2015.7371319","DOIUrl":"https://doi.org/10.1109/DICTA.2015.7371319","url":null,"abstract":"Feature point detection and local feature extraction are the two critical steps in trajectory-based methods for video classification. This paper proposes to detect trajectories by tracking the spatiotemporal feature points in salient regions instead of the entire frame. This strategy significantly reduces noisy feature points in the background region, and leads to lower computational cost and higher discriminative power of the feature set. Two new spatiotemporal descriptors, namely STOH and RISTOH, are proposed to describe the spatiotemporal characteristics of the moving object. The proposed method for feature point detection and local feature extraction is applied to human action recognition. It is evaluated on three video datasets: KTH, YouTube, and Hollywood2. The results show that the proposed method achieves a higher classification rate, even when it uses only half the number of feature points compared to the dense sampling approach. Moreover, features extracted from the curvature of the motion surface are more discriminative than features extracted from the spatial gradient.","PeriodicalId":214897,"journal":{"name":"2015 International Conference on Digital Image Computing: Techniques and Applications (DICTA)","volume":"6 4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116817338","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Face Recognition Using Two-Dimensional Tunable-Q Wavelet Transform","authors":"T. S. Kumar, Vivek Kanhangad","doi":"10.1109/DICTA.2015.7371261","DOIUrl":"https://doi.org/10.1109/DICTA.2015.7371261","url":null,"abstract":"Tunable-Q wavelet transform (TQWT) is a discrete wavelet transform that has been very effective in decomposing signals with an oscillatory nature. In this paper, we develop a new two-dimensional tunable-Q wavelet transform (2D-TQWT) using its 1D prototype and propose an approach for face recognition using 2D-TQWT. The proposed approach decomposes a face image into four sub-bands. This is followed by extraction of local binary pattern based histogram features from the different sub-bands. The extracted information is then combined to obtain the final representation. In order to evaluate the performance of the proposed 2D-TQWT based face recognition approach, experiments are carried out on two datasets, namely the Yale and ORL face datasets. The performance of the proposed approach is also compared with that of other existing wavelets. Experimental results show that the 2D-TQWT yields better recognition accuracy than the other wavelets employed in our experiments for comparison.","PeriodicalId":214897,"journal":{"name":"2015 International Conference on Digital Image Computing: Techniques and Applications (DICTA)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131065037","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improved Classification and Reconstruction by Introducing Independence and Randomization in Deep Neural Networks","authors":"G. Hiranandani, H. Karnick","doi":"10.1109/DICTA.2015.7371270","DOIUrl":"https://doi.org/10.1109/DICTA.2015.7371270","url":null,"abstract":"This paper presents a novel way of improving both classification and reconstructions obtained from deep neural networks. The underlying ideas used throughout are independence and randomization. The aim is to exploit the inherent properties of neural network architectures and to build simpler models that are easy to implement, rather than creating highly fine-tuned and complex architectures. For the most basic type of deep neural network, i.e., the fully connected network, it is shown that dividing the data into independent components and training each component separately not only reduces the number of parameters to be learned but also makes training more efficient. If the predictions are then fused appropriately, the overall accuracy also increases. Using the orthogonality of the LAB colour space, it is shown that L, A, and B components trained separately produce better reconstructions than RGB components taken together, which in turn produce better reconstructions than LAB components taken together. Based on a similar approach, randomization is injected into the networks so as to make different networks as independent as possible; again, fusing predictions appropriately increases accuracy. The best error on MNIST's test set was 1.91%, a drop of 1.05% in comparison to architectures we created similar to [1]. As the technique is architecture-independent, it can be applied to other networks, for example CNNs or RNNs.","PeriodicalId":214897,"journal":{"name":"2015 International Conference on Digital Image Computing: Techniques and Applications (DICTA)","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125744766","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Linear Complexity Approximate Method for Multi-Target Particle Filter Track before Detect","authors":"S. Davey, B. Cheung","doi":"10.1109/DICTA.2015.7371215","DOIUrl":"https://doi.org/10.1109/DICTA.2015.7371215","url":null,"abstract":"The particle filter offers the optimal Bayesian filter for track before detect with a single target. However, direct application to the case of multiple targets can be infeasible because the number of particles required grows exponentially. This paper presents a new method for efficiently implementing track before detect for multiple targets using particles. This method is compared with alternative options on a challenging scenario with up to 20 targets.","PeriodicalId":214897,"journal":{"name":"2015 International Conference on Digital Image Computing: Techniques and Applications (DICTA)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132641398","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Localized Deep Extreme Learning Machines for Efficient RGB-D Object Recognition","authors":"H. F. Zaki, F. Shafait, A. Mian","doi":"10.1109/DICTA.2015.7371280","DOIUrl":"https://doi.org/10.1109/DICTA.2015.7371280","url":null,"abstract":"Existing RGB-D object recognition methods either use channel-specific handcrafted features, or learn features with deep networks. The former lack representation ability while the latter require large amounts of training data and learning time. In real-time robotics applications involving RGB-D sensors, we do not have the luxury of both. In this paper, we propose Localized Deep Extreme Learning Machines (LDELM) that efficiently learn features from RGB-D data. By using localized patches, not only is the problem of data sparsity solved, but the learned features are also robust to occlusions and viewpoint variations. LDELM learns deep localized features in an unsupervised way from random patches of the training data. Each image is then fed forward, patch-wise, through the LDELM to form a cuboid of features. The cuboid is divided into cells and pooled to obtain the final compact image representation, which is then used to train an ELM classifier. Experiments on the benchmark Washington RGB-D and 2D3D datasets show that the proposed algorithm is not only significantly faster to train but also outperforms state-of-the-art methods in terms of accuracy and classification time.","PeriodicalId":214897,"journal":{"name":"2015 International Conference on Digital Image Computing: Techniques and Applications (DICTA)","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133401363","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Automatic Diagnosis Support System Using Nuclear and Luminal Features","authors":"Yuriko Harai, Toshiyuki Tanaka","doi":"10.1109/DICTA.2015.7371235","DOIUrl":"https://doi.org/10.1109/DICTA.2015.7371235","url":null,"abstract":"We present a method of automatic colorectal cancer diagnosis that can quantify cellular and structural tissue information. In this paper, we consider sixteen-dimensional features, consisting of the nuclei-cytoplasm (NC) ratio, connected nuclei area, and atypical lumen ratio. To imitate the conditions of accurate medical diagnosis, we introduce a four-class classification for group 1, group 3 low, group 3 high, and group 5 biopsies (group 5 biopsies include well-, moderately, and poorly differentiated types), in contrast to most previous works in the literature, which classify biopsies into two or three classes. The image set used in this paper consists of 400 images from 123 patients, stained with hematoxylin and eosin (the HE method). We compared the performance of the proposed method with a method using texture features that have been widely used in previous studies. Two classification tests were performed: leave-one-ROI-out cross-validation (CV) and leave-one-specimen-out CV. The proposed method obtained a classification accuracy of 95.0% for ROI-based CV and 78.3% for specimen-based CV.","PeriodicalId":214897,"journal":{"name":"2015 International Conference on Digital Image Computing: Techniques and Applications (DICTA)","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133181870","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Scale Adaptive Filters","authors":"R. Marchant","doi":"10.1109/DICTA.2015.7371304","DOIUrl":"https://doi.org/10.1109/DICTA.2015.7371304","url":null,"abstract":"Image features vary in size and thus feature analysis often requires a multi-scale approach. Typically, this is achieved using a bank of filters centred at discrete scales. We introduce a novel filter bank constructed from Fourier series basis functions in the logarithmic frequency domain. The filter bank responses can be used to obtain a continuous approximation of the response to another filter shifted through scale. Using the Riesz transform of the filter bank, we can create a vector-valued monogenic signal scale response. The amplitude of this response is a phase-invariant distribution of the local energy of the image across scale, from which statistics such as mean scale and variance can be calculated. We demonstrate the usefulness of the filter bank by using principal component analysis to design filters, using k-means clustering to classify pixels by scale response and local structure, and creating novel continuous methods of blob detection and phase congruency.","PeriodicalId":214897,"journal":{"name":"2015 International Conference on Digital Image Computing: Techniques and Applications (DICTA)","volume":"56 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122258486","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Textons for 3D Binary Data with Applications to Classifying Cancellous Bone","authors":"B. Martin, M. Bottema","doi":"10.1109/DICTA.2015.7371312","DOIUrl":"https://doi.org/10.1109/DICTA.2015.7371312","url":null,"abstract":"Changes to bone density (BV/TV) due to ageing or diseases such as osteoporosis are well documented. Changes in the structure of cancellous bone are less well understood. In addition, changes in structure as a function of distance from the growth plate have not received much attention. One obstacle in studying structural changes is quantifying the irregular shapes of the trabeculae that constitute cancellous bone. Here, the method of textons, originally developed for texture analysis in images, is adapted to characterise patterns in three-dimensional binary data and used to characterise the structure of cancellous bone. Analysis of micro-CT scans of tibias from 30 growing rats in three experimental groups indicates that texton-based structure characteristics are able to distinguish the structure of cancellous bone both in the different treatment groups and as a function of distance from the epiphyseal growth plate.","PeriodicalId":214897,"journal":{"name":"2015 International Conference on Digital Image Computing: Techniques and Applications (DICTA)","volume":"216 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122389857","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}