Latest publications from the 2015 International Conference on Digital Image Computing: Techniques and Applications (DICTA)

Robust Automatic Face Clustering in News Video
Kaneswaran Anantharajah, S. Denman, D. Tjondronegoro, S. Sridharan, C. Fookes
DOI: 10.1109/DICTA.2015.7371301 | November 2015
Abstract: Clustering identities in a video is a useful task to aid in video search, annotation and retrieval, and cast identification. However, reliably clustering faces across multiple videos is a challenging task due to variations in the appearance of the faces, as videos are captured in an uncontrolled environment. A person's appearance may vary due to session variations including lighting and background changes, occlusions, changes in expression and make-up. In this paper we propose the novel Local Total Variability Modelling (Local TVM) approach to cluster faces across a news video corpus, and incorporate it into a novel two-stage video clustering system. We first cluster faces within a single video using colour, spatial and temporal cues; we then use face track modelling and hierarchical agglomerative clustering to cluster faces across the entire corpus. We compare different face recognition approaches within this framework. Experiments on a news video database show that the Local TVM technique is able to effectively model the session variation observed in the data, resulting in improved clustering performance with much greater computational efficiency than other methods.
Citations: 1
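The corpus-level stage described in the abstract above relies on hierarchical agglomerative clustering of face-track representations. As a rough illustration of that second stage only (not of the Local TVM modelling itself), the following Python sketch clusters hypothetical per-track feature vectors with SciPy; the feature dimensionality, linkage method and distance threshold are assumptions for the example.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import pdist

# Hypothetical face-track representations: one feature vector per track,
# e.g. obtained by pooling per-frame face descriptors within a track.
rng = np.random.default_rng(0)
track_features = np.vstack([
    rng.normal(loc=0.0, scale=0.1, size=(5, 64)),   # tracks of identity A
    rng.normal(loc=1.0, scale=0.1, size=(4, 64)),   # tracks of identity B
])

# Agglomerative clustering: merge the closest tracks first (average linkage
# over cosine distances), then cut the dendrogram at a distance threshold.
distances = pdist(track_features, metric="cosine")
tree = linkage(distances, method="average")
labels = fcluster(tree, t=0.5, criterion="distance")

print(labels)  # tracks sharing a label are treated as the same identity
```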
Evaluating Spatio-Temporal Parameters in Video Similarity Detection by Global Descriptors
A. Rouhi
DOI: 10.1109/DICTA.2015.7371255 | November 2015
Abstract: The role of partitioned colour-based global descriptors is well known in video similarity detection tasks, owing to their inexpensive yet effective performance compared to local descriptors. They provide robust and discriminative results under content-preserving visual distortions such as strong re-encoding, pattern insertions and photometric effects. The current research evaluates the effectiveness of three spatio-temporal parameters in video similarity detection tasks: colour space, frame partitioning and sampling frame rate. The CRIM method (video only) is selected as the baseline due to its optimum performance on content-preserving visual distortions in TRECVID/CCD (Content-Based Copy Detection) 2011. An amended version of CRIM, based on normalised-average luminance, is introduced and compared with the baseline. The performance comparison is conducted on a subset of the TRECVID/CCD 2011 dataset affected by four types of content-preserving visual distortions: T3, T4, T5 and T6. The experimental results show that the normalised-average luminance descriptors offer more robust and competitive performance. Although they yielded only slightly better performance than the baseline at the highest sampling frame rate (all frames), they offer significantly better performance at lower sampling frame rates. The experimental evidence also shows that the luminance-based descriptors significantly improve the mean processing time, a metric generally regarded as a shortcoming of video processing algorithms. The effect of the number of partitions is also investigated, and it is shown that increasing the number of partitions can severely lower the efficiency of the method without yielding a significant increase in performance.
Citations: 5
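As a rough illustration of a partitioned, luminance-based global descriptor of the kind evaluated above (not the exact CRIM variant), the sketch below divides a frame into a grid, averages luminance per cell and normalises the result; the grid size and normalisation scheme are assumptions for the example.

```python
import numpy as np

def partitioned_luminance_descriptor(frame_rgb, grid=(4, 4)):
    """Average luminance per grid cell, normalised to zero mean and unit norm."""
    # ITU-R BT.601 luma weights for an RGB frame with values in [0, 255].
    luma = frame_rgb @ np.array([0.299, 0.587, 0.114])
    h, w = luma.shape
    gh, gw = grid
    cells = []
    for i in range(gh):
        for j in range(gw):
            block = luma[i * h // gh:(i + 1) * h // gh,
                         j * w // gw:(j + 1) * w // gw]
            cells.append(block.mean())
    desc = np.array(cells)
    desc -= desc.mean()                      # normalise the average luminance
    return desc / (np.linalg.norm(desc) + 1e-12)

# Frames of a video and a photometrically distorted copy should yield nearby descriptors.
frame = np.random.randint(0, 256, size=(240, 320, 3)).astype(float)
copy = np.clip(frame * 0.9 + 10, 0, 255)     # simple brightness/contrast distortion
d1 = partitioned_luminance_descriptor(frame)
d2 = partitioned_luminance_descriptor(copy)
print(np.linalg.norm(d1 - d2))               # small distance suggests a copy
```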
Stacked Face De-Noising Auto Encoders for Expression-Robust Face Recognition
Chathurdara Sri Nadith Pathirage, Ling Li, Wanquan Liu, Min Zhang
DOI: 10.1109/DICTA.2015.7371310 | November 2015
Abstract: Recent advances in unsupervised and transfer learning methods for deep learning networks have driven a paradigm shift in machine learning. Inspired by the recent evolution of deep learning (DL) networks, which demonstrates a proven pathway for addressing challenging dilemmas in various problem domains, we propose a novel DL framework for expression-robust feature acquisition. The framework exploits the contributions of different colour components in different local face regions by recovering the neutral expression from various expressions. Furthermore, the framework progressively de-noises a face with dynamic expressions, and is therefore termed stacked face de-noising auto-encoders (SFDAE). The high-level expression-robust representations learnt via this framework not only yield better reconstruction of neutral-expression faces but also boost the performance of the subsequent LDA [1] classifier. The experimental results reveal the superiority of the proposed method over existing works in terms of generalization ability and recognition accuracy.
Citations: 10
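To make the basic building block behind the framework above concrete (a de-noising auto-encoder trained to map corrupted inputs back to clean targets), here is a minimal PyTorch sketch. It uses generic vectors rather than face regions, treats "expression" simply as additive noise, and all layer sizes and training settings are assumptions; the paper's stacking scheme, colour-component weighting and LDA stage are not reproduced.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Toy data: "neutral" targets and "expressive" inputs modelled as noisy copies.
neutral = torch.rand(256, 128)               # e.g. vectorised face patches
expressive = neutral + 0.2 * torch.randn_like(neutral)

encoder = nn.Sequential(nn.Linear(128, 64), nn.ReLU())
decoder = nn.Sequential(nn.Linear(64, 128), nn.Sigmoid())
model = nn.Sequential(encoder, decoder)

optimiser = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

for epoch in range(200):
    optimiser.zero_grad()
    reconstruction = model(expressive)       # de-noise: expressive -> neutral
    loss = loss_fn(reconstruction, neutral)
    loss.backward()
    optimiser.step()

# The encoder output is the "expression-robust" representation; a stacked
# version would train further auto-encoders on these codes, layer by layer.
codes = encoder(expressive).detach()
print(codes.shape)  # torch.Size([256, 64])
```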
Batch Mode Active Learning for Object Detection Based on Maximum Mean Discrepancy
Yingying Liu, Yang Wang, A. Sowmya
DOI: 10.1109/DICTA.2015.7371240 | November 2015
Abstract: Various active learning methods have been proposed for image classification problems, while very little work addresses object detection. Measuring the informativeness of an image based on its object windows is a key problem in active learning for object detection. In this paper, an image selection method that selects the most representative images is proposed, based on measuring their object window distributions with Maximum Mean Discrepancy (MMD). An active learning method for object detection is then built on this MMD-based image selection. Experimental results show that MMD-based image selection can improve object detection performance compared to random image selection. The proposed active learning method based on MMD image selection also outperforms a classical active learning method and a passive learning method.
Citations: 5
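As a reference for the selection criterion named above, the following numpy sketch computes the (biased) squared Maximum Mean Discrepancy between two sets of object-window features under an RBF kernel; the feature dimensionality and kernel bandwidth are assumptions, and the paper's full batch-selection procedure is not reproduced.

```python
import numpy as np

def rbf_kernel(a, b, gamma=0.5):
    """Gaussian RBF kernel matrix between the rows of a and the rows of b."""
    sq_dists = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-gamma * sq_dists)

def mmd_squared(x, y, gamma=0.5):
    """Biased estimate of the squared MMD between samples x and y."""
    return (rbf_kernel(x, x, gamma).mean()
            + rbf_kernel(y, y, gamma).mean()
            - 2.0 * rbf_kernel(x, y, gamma).mean())

rng = np.random.default_rng(0)
pool_windows = rng.normal(size=(200, 16))    # window features of the unlabelled pool
candidate_batch = rng.normal(size=(20, 16))  # window features of a candidate image batch

# A batch whose windows match the pool distribution (low MMD) is more
# representative and would be preferred for labelling.
print(mmd_squared(candidate_batch, pool_windows))
```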
Mammogram Mass Classification with Temporal Features and Multiple Kernel Learning
Fei Ma, Limin Yu, M. Bajger, M. Bottema
DOI: 10.1109/DICTA.2015.7371282 | November 2015
Abstract: Based on previous work on regional temporal mammogram registration, this study investigates the combination of image features measured from single regions (single features) and image features measured from the matched regions of temporal mammograms (temporal features) for the classification of malignant masses. Three SVM kernels (the multilayer perceptron kernel, the polynomial kernel and the Gaussian radial basis function kernel) and their combination via multiple kernel learning (MKL) were applied to both single and temporal features for mass classification. To combine the two types of features, three combination rules (linear combination, Max and Min) were used to merge the classification results obtained on single and temporal features. The results showed that combining the MKL classification results on single features with the MKL classification results on temporal features using the Min rule produces the best classification results. The experiments indicate that incorporating temporal change information in mammogram mass classification can improve detection performance.
Citations: 2
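A simple way to see the kernel-combination idea described above is to form a weighted sum of base kernel matrices and pass it to an SVM with a precomputed kernel. The sketch below does this in scikit-learn with toy features and fixed equal kernel weights (the paper learns the weights via MKL, which this sketch does not do), then applies the Min rule to fuse the scores from the single-feature and temporal-feature classifiers.

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel, polynomial_kernel, sigmoid_kernel
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X_single = rng.normal(size=(100, 10))     # features from single regions
X_temporal = rng.normal(size=(100, 10))   # features from matched temporal regions
y = rng.integers(0, 2, size=100)          # 1 = malignant mass, 0 = benign

def combined_kernel(a, b):
    # Fixed equal weights stand in for the learned MKL weights.
    return (rbf_kernel(a, b) + polynomial_kernel(a, b, degree=3)
            + sigmoid_kernel(a, b)) / 3.0

def fit_and_score(X):
    clf = SVC(kernel="precomputed", probability=True)
    clf.fit(combined_kernel(X, X), y)
    # Scores on the training set, purely for illustration of the fusion step.
    return clf.predict_proba(combined_kernel(X, X))[:, 1]

scores_single = fit_and_score(X_single)
scores_temporal = fit_and_score(X_temporal)

# Min rule: fuse the two classifiers by taking the smaller malignancy score.
fused = np.minimum(scores_single, scores_temporal)
print(fused[:5])
```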
Violent Scene Detection Using a Super Descriptor Tensor Decomposition
Muhammad Rizwan Khokher, A. Bouzerdoum, S. L. Phung
DOI: 10.1109/DICTA.2015.7371320 | November 2015
Abstract: This article presents a new method for violent scene detection using super descriptor tensor decomposition. Multi-modal local features comprising auditory and visual features are extracted from Mel-frequency cepstral coefficients (including first- and second-order derivatives) and refined dense trajectories. A large number of dense trajectories is usually extracted from a video sequence; some of these trajectories are unnecessary and can affect the accuracy. We propose to refine the dense trajectories by selecting only discriminative trajectories in the region of interest. Visual descriptors consisting of oriented gradient and motion boundary histograms are computed along the refined dense trajectories. In traditional bag-of-visual-words techniques, the feature descriptors are concatenated to form a single large feature vector for classification, which destroys the spatio-temporal interactions among features extracted from multi-modal data. To address this problem, a super descriptor tensor decomposition is proposed. The extracted feature descriptors are first encoded using the super descriptor vector method. The encoded features are then arranged as tensors so as to retain the spatio-temporal structure of the features. To obtain a compact set of features for classification, the TUCKER-3 decomposition is applied to the super descriptor tensors, followed by feature selection using Fisher feature ranking. The resulting features are fed to a support vector machine classifier. Experimental evaluation is performed on the violence detection benchmark dataset MediaEval VSD2014. The proposed method outperforms most of the state-of-the-art methods, achieving MAP2014 scores of 60.2% and 67.8% on two subsets of the dataset.
Citations: 7
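For readers unfamiliar with the Tucker-3 step mentioned above, the numpy sketch below computes a truncated higher-order SVD, one common way of obtaining a Tucker-3 decomposition, though not necessarily the algorithm used in the paper, for a random 3-way tensor; the tensor shape and target ranks are assumptions.

```python
import numpy as np

def unfold(tensor, mode):
    """Mode-n unfolding: move axis `mode` to the front and flatten the rest."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def tucker3_hosvd(tensor, ranks):
    """Truncated HOSVD: factor matrices from each unfolding, then the core tensor."""
    factors = []
    for mode, rank in enumerate(ranks):
        u, _, _ = np.linalg.svd(unfold(tensor, mode), full_matrices=False)
        factors.append(u[:, :rank])
    core = tensor
    for mode, u in enumerate(factors):
        # Mode-n product of the current core with u transposed.
        core = np.moveaxis(np.tensordot(u.T, np.moveaxis(core, mode, 0), axes=1), 0, mode)
    return core, factors

rng = np.random.default_rng(0)
super_descriptor_tensor = rng.normal(size=(32, 24, 16))  # e.g. descriptor x region x time
core, factors = tucker3_hosvd(super_descriptor_tensor, ranks=(8, 6, 4))

print(core.shape)                  # (8, 6, 4): compact representation for classification
print([f.shape for f in factors])  # [(32, 8), (24, 6), (16, 4)]
```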
Illumination Invariant Efficient Face Recognition Using a Single Training Image
B. L. Jangid, K. K. Biswas, M. Hanmandlu, G. Chetty
DOI: 10.1109/DICTA.2015.7371266 | November 2015
Abstract: This paper presents a single-sample face recognition technique which handles illumination variations by applying normalization based on Weber's law. Local Directional Pattern (LDP) features are extracted from the normalized face by examining the prominent edge directions at each pixel. The LDP image is divided into non-overlapping windows and each window is treated as a fuzzy set. Treating the LDP values as the information source values, entropy features, called information set-based features, are extracted from each window. Further, 2DPCA is used to reduce the number of features. These features are augmented with entropy features of the fiducial regions and contour-based features for face recognition. A nearest neighbor classifier based on these features is evaluated on the Extended Yale B and Face94 datasets, and it is shown that, compared with other results based on single and multiple training images, the proposed approach yields better recognition accuracy under wide illumination variations in the test images. The efficiency of the scheme is further demonstrated by comparing the number of features needed for recognition.
Citations: 4
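The Local Directional Pattern step above encodes, at each pixel, which of eight directional (Kirsch) edge responses are strongest. The sketch below is a minimal version of that coding in Python/SciPy, assuming the common choice of keeping the k = 3 largest absolute responses; the Weber-law normalization, fuzzy windows and entropy features of the paper are not included.

```python
import numpy as np
from scipy.ndimage import convolve

def kirsch_masks():
    """Eight 3x3 Kirsch masks obtained by rotating the border of the east mask."""
    border = [-3, -3, 5, 5, 5, -3, -3, -3]   # clockwise, starting at the top-left cell
    positions = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    masks = []
    for shift in range(8):
        m = np.zeros((3, 3))
        for (r, c), v in zip(positions, np.roll(border, shift)):
            m[r, c] = v
        masks.append(m)
    return masks

def ldp_code(image, k=3):
    """Per-pixel 8-bit code with bits set for the k strongest edge directions."""
    responses = np.stack([np.abs(convolve(image, m, mode="nearest"))
                          for m in kirsch_masks()])
    order = np.argsort(responses, axis=0)    # rank the 8 directions at every pixel
    code = np.zeros(image.shape, dtype=np.uint8)
    for i in range(k):
        code |= (1 << order[-(i + 1)]).astype(np.uint8)
    return code

face = np.random.rand(64, 64)                # stand-in for a normalized face image
print(np.unique(ldp_code(face)).size)        # number of distinct LDP codes present
```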
Expectation-Maximization with Image-Weighted Markov Random Fields to Handle Severe Pathology
A. Pagnozzi, N. Dowson, A. Bradley, R. Boyd, P. Bourgeat, S. Rose
DOI: 10.1109/DICTA.2015.7371257 | November 2015
Abstract: This paper describes an automatic tissue segmentation algorithm for brain MRI of children with cerebral palsy (CP) who exhibit severe cortical malformations. Many currently popular brain segmentation techniques rely on registered atlas priors and therefore generalize poorly to severely injured data sets, because of large discrepancies between the target brain and healthy (or injured) atlases. We propose a prior-less approach combined with a modification of the Expectation-Maximization (EM)/Markov Random Field (MRF) segmentation that imposes a continuous weighting scheme to penalize intensity discrepancies between pairs of neighbours within each clique neighbourhood, providing robustness to the unique clinical problem of severe anatomical distortion. This approach was applied to gray matter segmentation in 20 3D T1-weighted MRIs, 17 of which were of CP patients exhibiting severe malformation. We compare our adaptive algorithm to the popular FreeSurfer, NiftySeg, FAST and Atropos segmentations, which collectively represent state-of-the-art surface deformation and EM approaches. Relative to several ground-truth manual segmentations, the algorithm-driven approach yielded improved segmentations of the cerebral cortex compared with the existing approaches (DSC 0.66 vs 0.44 for FreeSurfer, 0.60 for NiftySeg with 100% atlas prior relaxation, 0.59 for FAST and 0.64 for Atropos).
Citations: 5
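To make the EM/MRF machinery referred to above concrete, here is a compact numpy sketch of EM for a K-class Gaussian intensity model on a toy 2D image, with a simple neighbour-smoothing term standing in, very loosely, for the MRF prior; the paper's image-weighted penalty is only indicated in a comment, and the class count, smoothing weight and iteration counts are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "MRI slice": two tissue classes plus Gaussian noise.
truth = np.zeros((64, 64), dtype=int)
truth[:, 32:] = 1
image = rng.normal(loc=truth * 2.0, scale=0.6)

K, beta, n_iter = 2, 1.0, 20
means = np.array([image.min(), image.max()], dtype=float)
variances = np.ones(K)

def neighbour_average(p):
    """Mean of the 4-connected neighbour responsibilities at each pixel, per class."""
    padded = np.pad(p, ((0, 0), (1, 1), (1, 1)), mode="edge")
    return (padded[:, :-2, 1:-1] + padded[:, 2:, 1:-1]
            + padded[:, 1:-1, :-2] + padded[:, 1:-1, 2:]) / 4.0

resp = np.full((K,) + image.shape, 1.0 / K)
for _ in range(n_iter):
    # E-step: Gaussian log-likelihood plus a smoothness term from neighbouring
    # responsibilities (a crude mean-field MRF surrogate). The paper's
    # image-weighted MRF would further scale beta by the intensity
    # difference between neighbouring pixels.
    log_lik = -0.5 * ((image - means[:, None, None]) ** 2 / variances[:, None, None]
                      + np.log(variances[:, None, None]))
    log_post = log_lik + beta * neighbour_average(resp)
    log_post -= log_post.max(axis=0)
    resp = np.exp(log_post)
    resp /= resp.sum(axis=0)
    # M-step: update class means and variances from the soft assignments.
    weights = resp.sum(axis=(1, 2))
    means = (resp * image).sum(axis=(1, 2)) / weights
    variances = (resp * (image - means[:, None, None]) ** 2).sum(axis=(1, 2)) / weights

segmentation = resp.argmax(axis=0)
print((segmentation == truth).mean())  # fraction of correctly labelled pixels
```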
Making Patch Based Descriptors More Distinguishable and Robust for Image Copy Retrieval
Junaid Baber, Erum Fida, Maheen Bakhtyar, Humaira Ashraf
DOI: 10.1109/DICTA.2015.7371281 | November 2015
Abstract: Images have become one of the main sources of information, learning and entertainment, but with advances in multimedia technologies, millions of images are shared daily on the Internet and can be easily duplicated and redistributed. The distribution of these duplicated and transformed images causes many problems and challenges, such as piracy, redundancy, and content-based image indexing and retrieval. To address these problems, copy detection systems based on local features are widely used. Initially, keypoints are detected and represented by robust descriptors. The descriptors are computed over affine patches around the keypoints; these patches should be repeatable under photometric and geometric transformations. However, there are two main challenges with patch-based descriptors: (1) the affine patch around a keypoint can produce similar descriptors in entirely different scenes or contexts, which causes ambiguity (in-distinctiveness), and (2) the descriptors are not sufficiently robust under image noise. In this paper, we present a framework that makes descriptors more distinguishable and robust by influencing them with the texture or gradients in their vicinity, computing them at different and multiple scales. To evaluate robustness, an experiment on keypoint matching under severe transformations is conducted: on average, the robustness of the SIFT descriptor is increased by up to 12.5%, and the robustness of the CSLBP descriptor by up to 31%. Distinctiveness is evaluated in an image copy retrieval experiment where copies of images are retrieved under challenging transformations: on average, the performance of SIFT in retrieving all copies is increased by up to 27.27%, and that of CSLBP by up to 27.02%.
Citations: 4
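The core idea above, letting the surrounding texture at several patch sizes influence a keypoint's descriptor, can be illustrated with a simple gradient-orientation histogram computed over concentric patches of increasing radius and concatenated. This is only a schematic stand-in for the paper's SIFT/CSLBP variants, and the radii and bin count are assumptions.

```python
import numpy as np

def orientation_histogram(patch, bins=8):
    """Histogram of gradient orientations, weighted by gradient magnitude."""
    gy, gx = np.gradient(patch)
    mag = np.hypot(gx, gy)
    ang = np.arctan2(gy, gx)                      # orientations in (-pi, pi]
    hist, _ = np.histogram(ang, bins=bins, range=(-np.pi, np.pi), weights=mag)
    return hist

def multiscale_descriptor(image, x, y, radii=(8, 16, 32), bins=8):
    """Concatenate orientation histograms from patches of increasing size."""
    parts = []
    for r in radii:
        patch = image[max(y - r, 0):y + r, max(x - r, 0):x + r]
        parts.append(orientation_histogram(patch, bins))
    desc = np.concatenate(parts)
    return desc / (np.linalg.norm(desc) + 1e-12)  # L2-normalise

rng = np.random.default_rng(0)
image = rng.random((256, 256))
noisy_copy = np.clip(image + 0.05 * rng.standard_normal(image.shape), 0, 1)

d1 = multiscale_descriptor(image, 128, 128)
d2 = multiscale_descriptor(noisy_copy, 128, 128)
print(float(d1 @ d2))   # cosine similarity stays high for a noisy copy
```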
Dauphin: A Signal Processing Language - Statistical Signal Processing Made Easy
R. Kyprianou, P. Schachte, Bill Moran
DOI: 10.1109/DICTA.2015.7371250 | November 2015
Abstract: Dauphin is a new statistical signal processing language designed for easier formulation of detection, classification and estimation algorithms. This paper demonstrates the ease of developing signal processing algorithms in Dauphin. We illustrate this by providing exemplar code for two classifiers, Bayesian and k-means, and for an estimator, the Kalman filter. In all cases, and especially the last, the code provides a more conceptually defined approach to these problems than other languages such as Matlab. Some Dauphin features under development are also highlighted, for instance an infinite list construct called streams, which is designed to serve as a natural representation of random processes.
Citations: 0
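Since the Dauphin listings themselves are not reproduced here, the following plain Python/numpy sketch shows the estimator named in the abstract, a Kalman filter for a 1D constant-velocity model, as a point of comparison with the kind of formulation the language targets; the model matrices and noise levels are assumptions for the example.

```python
import numpy as np

# Constant-velocity state-space model: state = [position, velocity].
dt = 1.0
F = np.array([[1.0, dt], [0.0, 1.0]])     # state transition
H = np.array([[1.0, 0.0]])                # we observe position only
Q = 0.01 * np.eye(2)                      # process noise covariance
R = np.array([[0.25]])                    # measurement noise covariance

rng = np.random.default_rng(0)
true_positions = 0.5 * np.arange(50)                          # moving at 0.5 units/step
measurements = true_positions + rng.normal(scale=0.5, size=50)

x = np.zeros(2)          # initial state estimate
P = np.eye(2)            # initial estimate covariance
estimates = []
for z in measurements:
    # Predict.
    x = F @ x
    P = F @ P @ F.T + Q
    # Update with the new measurement.
    y = z - H @ x                                  # innovation
    S = H @ P @ H.T + R                            # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)                 # Kalman gain
    x = x + K @ y
    P = (np.eye(2) - K @ H) @ P
    estimates.append(x[0])

print(np.mean(np.abs(np.array(estimates) - true_positions)))  # mean filtering error
```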