{"title":"Summarizing Surveillance Video by Saliency Transition and Moving Object Information","authors":"M. Salehin, M. Paul","doi":"10.1109/DICTA.2015.7371311","DOIUrl":"https://doi.org/10.1109/DICTA.2015.7371311","url":null,"abstract":"Every day, an enormous amount of video is captured by surveillance systems around the world for various purposes. However, it is almost impossible for humans to analyze the vast majority of this video data. In this paper, a video summarization method is introduced that combines foreground object, motion, and visual attention cues. Foreground objects typically provide important information about video content. Additionally, object motion naturally attracts human attention. Moreover, the visual attention cue indicates the human's attraction label for key frame determination. Using these features, a supervised classifier, a support vector machine (SVM), is applied to obtain the key frames from a surveillance video. Extensive experimental results show that the proposed method outperforms the state-of-the-art method on the publicly available BL-7F surveillance video dataset.","PeriodicalId":214897,"journal":{"name":"2015 International Conference on Digital Image Computing: Techniques and Applications (DICTA)","volume":"58 4","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121013467","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Hierarchical Aggregation Based Deep Aging Feature for Age Prediction","authors":"Jiayan Qiu, Yuchao Dai, Yuhang Zhang, J. Álvarez","doi":"10.1109/DICTA.2015.7371264","DOIUrl":"https://doi.org/10.1109/DICTA.2015.7371264","url":null,"abstract":"We propose a new, hierarchical, aggregation-based deep neural network to learn aging features from facial images. Our deep-aging feature vector is designed to capture both local and global aging cues from facial images. A Convolutional Neural Network (CNN) is employed to extract region-specific features at the lowest level of our hierarchy. These features are then hierarchically aggregated to consecutive higher levels and the resultant aging feature vector, of dimensionality 110, achieves both good discriminative ability and efficiency. Experimental results of age prediction on the MORPH-II databases show that our method outperforms state-of-the-art aging features by a clear margin. Experimental trials of our method across race and gender provide further confidence in its performance and robustness.","PeriodicalId":214897,"journal":{"name":"2015 International Conference on Digital Image Computing: Techniques and Applications (DICTA)","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129404315","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"3-D Modeling from Concept Sketches of Human Characters with Minimal User Interaction","authors":"A. Johnston, G. Carneiro, Ren Ding, L. Velho","doi":"10.1109/DICTA.2015.7371212","DOIUrl":"https://doi.org/10.1109/DICTA.2015.7371212","url":null,"abstract":"We propose a new methodology for creating 3-D models for computer graphics applications from 2-D concept sketches of human characters using minimal user interaction. This methodology will facilitate the fast production of high quality 3-D models by non-expert users involved in the development process of video games and movies. The workflow starts with an image containing the sketch of the human character from a single viewpoint, in which a 2-D body pose detector is run to infer the positions of the skeleton joints of the character. Then the 3-D body pose and camera motion are estimated from the 2-D body pose detected from above, where we take a recently proposed methodology that works with real humans and adapt it to work with concept sketches of human characters. The final step of our methodology consists of an optimization process based on a sampling importance re-sampling method that takes as input the estimated 3-D body pose and camera motion and builds a 3-D mesh of the body shape, which is then matched to the concept sketch image. Our main contributions are: 1) a novel adaptation of the 3-D from 2-D body pose estimation methods to work with sketches of humans that have non-standard body part proportions and constrained camera motion; and 2) a new optimization (that estimates a 3-D body mesh using an underlying low-dimensional linear model of human shape) guided by the quality of the matching between the 3-D mesh of the body shape and the concept sketch. We show qualitative results based on seven 3-D models inferred from 2-D concept sketches, and also quantitative results, where we take seven different 3-D meshes to generate concept sketches, and use our method to infer the 3-D model from these sketches, which allows us to measure the average Euclidean distance between the original and estimated 3-D models. Both qualitative and quantitative results show that our model has potential in the fast production of 3-D models from concept sketches.","PeriodicalId":214897,"journal":{"name":"2015 International Conference on Digital Image Computing: Techniques and Applications (DICTA)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129122106","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Face Recognition Despite Wearing Glasses","authors":"A. Liang, Chathurdara Sri Nadith Pathirage, Chenyu Wang, Wanquan Liu, Ling Li, J. Duan","doi":"10.1109/DICTA.2015.7371260","DOIUrl":"https://doi.org/10.1109/DICTA.2015.7371260","url":null,"abstract":"In this paper we address the challenge of performing face recognition on human faces that are wearing glasses. This is a common problem for face recognition and automatic identity checking at airports, as passengers frequently forget to remove their glasses when passing through customs. In order to solve this problem, we first propose an automatic glasses presence detection model based on the tree-pictorial-structured face detection model and such model can detect the presence of glasses and further assign landmarks on the rim, hinge, and bridge of the glasses on frontal faces. Experimental results show that the glasses detection rate is highly satisfactory for various face databases. Secondly, based on the landmarks on glasses, we apply the non-local colour total variation (CTV) inpainting approach in an attempt to remove the glasses; also, we apply the deep learning technique to further remove the traces of glasses and light reflection on lenses by regarding them as noises. Finally, experiments for face recognition after glasses removal are conducted by using some typical approaches and the results show that our glasses removal framework can improve face recognition accuracy significantly.","PeriodicalId":214897,"journal":{"name":"2015 International Conference on Digital Image Computing: Techniques and Applications (DICTA)","volume":"112 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124153427","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Level Set Based Segmentation of Cell Nucleus in Fluorescence Microscopy Images Using Correntropy-Based K-Means Clustering","authors":"A. Gharipour, Alan Wee-Chung Liew","doi":"10.1109/DICTA.2015.7371279","DOIUrl":"https://doi.org/10.1109/DICTA.2015.7371279","url":null,"abstract":"Fluorescence microscopy image segmentation is a challenging task in fluorescence microscopy image analysis and high-throughput applications such as protein expression quantification and cell function investigation. In this paper, a novel local level set segmentation algorithm in a variational level set formulation via a correntropy-based k-means clustering (LLCK) is introduced for fluorescence microscopy cell image segmentation. The performance of the proposed method is evaluated using a large number of fluorescence microscopy images. A quantitative comparison is also performed with some state-of-the-art segmentation approaches.","PeriodicalId":214897,"journal":{"name":"2015 International Conference on Digital Image Computing: Techniques and Applications (DICTA)","volume":"44 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128530961","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Revisit of Methods for Determining the Fundamental Matrix with Planes","authors":"Yi Zhou, L. Kneip, Hongdong Li","doi":"10.1109/DICTA.2015.7371221","DOIUrl":"https://doi.org/10.1109/DICTA.2015.7371221","url":null,"abstract":"Determining the fundamental matrix from a collection of inter-frame homographies (more than two) is a classical problem. The compatibility relationship between the fundamental matrix and any of the ideally consistent homographies can be used to compute the fundamental matrix. Using the direct linear transformation (DLT), the compatibility equation can be translated into a least squares problem and can be easily solved via SVD decomposition. However, this solution is extremely susceptible to noise and motion inconsistencies, hence rarely used. Inspired by the normalized eight-point algorithm, we show that a relatively simple but non-trivial two-step normalization of the input homographies achieves the desired effect, and the results are at last comparable to the less attractive hallucinated points method. The algorithm is theoretically justified and verified by experiments on both synthetic and real data.","PeriodicalId":214897,"journal":{"name":"2015 International Conference on Digital Image Computing: Techniques and Applications (DICTA)","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128043803","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Non-Rigid Structure from Motion through Estimation of Blend Shapes","authors":"P. Zhang, Y. Hung","doi":"10.1109/DICTA.2015.7371291","DOIUrl":"https://doi.org/10.1109/DICTA.2015.7371291","url":null,"abstract":"In this paper, we propose a prior-free approach to estimate non-rigid object from 2D image trajectories assuming the affine camera model. As mentioned in some recent works [7, 8], most low-rank methods are unable to recover objects with complex motion. We identify the small deformation condition as the condition fundamental to the triple column metric upgrade algorithm commonly used in many low-rank methods, and accordingly modify this algorithm so that it becomes independent of the number of basis. Inspired by the blend shape technique used in computer graphics, we model the non-rigid object as a combination of blend shapes. Unlike many existing methods that estimate an average shape plus a few directions of deformation, we recover each blend shape as a valid 3D shape through the introduction of a pseudo view, which helps to prevent degeneration in the direction of the camera axes. This gives the blend shapes clear physical meaning, and makes the method robust against overfitting. Experiments on synthetic datasets and real tracking datasets show that the proposed method outperforms the existing methods in both 3D error and robustness.","PeriodicalId":214897,"journal":{"name":"2015 International Conference on Digital Image Computing: Techniques and Applications (DICTA)","volume":"142 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116573407","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Towards Fast 3D Reconstruction Using Silhouettes and Sparse Motion","authors":"D. Eason, J. Heather, Gadi Ben-Tal","doi":"10.1109/DICTA.2015.7371315","DOIUrl":"https://doi.org/10.1109/DICTA.2015.7371315","url":null,"abstract":"An efficient new 3D reconstruction algorithm designed for an industrial vision system is presented. The algorithm generates 3D models and motion estimates of rotating objects moving along a conveyor past a set of calibrated cameras. For our application the objects of interest (natural produce) have relatively simple surface geometries and this feature can be exploited in the reconstruction process. The proposed method combines shape-from-silhouette concepts with sparse motion tracking and is potentially fast enough to extend to real-time industrial applications. In addition to being robust and extremely computationally efficient, key differentiators for this work include (a) full exploitation of a priori model knowledge, (b) handling of highly dynamic and unpredictable object motions, and (c) support for objects containing relatively little shape and texture definition. The method is demonstrated and evaluated on a collection of synthetic example image sequences, and surface errors have been quantified as being less than 0.2mm from the ground-truth, providing confidence in the solution.","PeriodicalId":214897,"journal":{"name":"2015 International Conference on Digital Image Computing: Techniques and Applications (DICTA)","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134480449","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Probabilistic Detection of Pointing Directions for Human-Robot Interaction","authors":"Dadhichi Shukla, Ö. Erkent, J. Piater","doi":"10.1109/DICTA.2015.7371296","DOIUrl":"https://doi.org/10.1109/DICTA.2015.7371296","url":null,"abstract":"Deictic gestures - pointing at things in human-human collaborative tasks - constitute a pervasive, non-verbal way of communication, used e.g. to direct attention towards objects of interest. In a human-robot interactive scenario, in order to delegate tasks from a human to a robot, one of the key requirements is to recognize and estimate the pose of the pointing gesture. Standard approaches rely on full-body or partial-body postures to detect the pointing direction. We present a probabilistic, appearance-based object detection framework to detect pointing gestures and robustly estimate the pointing direction. Our method estimates the pointing direction without assuming any human kinematic model. We propose a functional model for pointing which incorporates two types of pointing, finger pointing and tool pointing using an object in hand. We evaluate our method on a new dataset with 9 participants pointing at 10 objects.","PeriodicalId":214897,"journal":{"name":"2015 International Conference on Digital Image Computing: Techniques and Applications (DICTA)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130011181","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Birdcall Retrieval from Environmental Acoustic Recordings Using Image Processing","authors":"Xueyan Dong, Philip Eichinski, M. Towsey, Jinglan Zhang, P. Roe","doi":"10.1109/DICTA.2015.7371242","DOIUrl":"https://doi.org/10.1109/DICTA.2015.7371242","url":null,"abstract":"Acoustic recordings of the environment provide an effective means to monitor bird species diversity. To facilitate exploration of acoustic recordings, we describe a content-based birdcall retrieval algorithm. A query birdcall is a region of spectrogram bounded by frequency and time. Retrieval depends on a similarity measure derived from the orientation and distribution of spectral ridges. The spectral ridge detection method caters for a broad range of birdcall structures. In this paper, we extend previous work by incorporating a spectrogram scaling step in order to improve the detection of spectral ridges. Compared to an existing approach based on MFCC features, our feature representation achieves better retrieval performance for multiple bird species in noisy recordings.","PeriodicalId":214897,"journal":{"name":"2015 International Conference on Digital Image Computing: Techniques and Applications (DICTA)","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134006500","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}