{"title":"Calorific Expenditure Estimation Using Deep Convolutional Network Features","authors":"Baodong Wang, L. Tao, T. Burghardt, M. Mirmehdi","doi":"10.1109/WACVW.2018.00014","DOIUrl":"https://doi.org/10.1109/WACVW.2018.00014","url":null,"abstract":"Accurately estimating a person's energy expenditure is an important tool in tracking physical activity levels for healthcare and sports monitoring tasks, amongst other applications. In this paper, we propose a method for deriving calorific expenditure based on deep convolutional neural network features (within a healthcare scenario). Our evaluation shows that the proposed approach gives high accuracy in activity recognition (82.3%) and low normalised root mean square error in calorific expenditure prediction (0.41). It is compared against the current state-ofthe-art calorific expenditure estimation method, based on a classical approach, and exhibits an improvement of 7.8% in the calorific expenditure prediction task. The proposed method is suitable for home monitoring in a controlled environment.","PeriodicalId":301220,"journal":{"name":"2018 IEEE Winter Applications of Computer Vision Workshops (WACVW)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-04-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130732237","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Learning Visual Engagement for Trauma Recovery","authors":"Svati Dhamija, T. Boult","doi":"10.1109/WACVW.2018.00016","DOIUrl":"https://doi.org/10.1109/WACVW.2018.00016","url":null,"abstract":"Applications ranging from human emotion understanding to e-health are exploring methods to effectively understand user behavior from self-reported questionnaires. However, little is understood about non-invasive techniques that involve face-based deep-learning models to predict engagement. Current research in visual engagement poses two key questions: 1) how much time do we need to analyze facial behavior for accurate engagement prediction? and 2) which deep learning approach provides the most accurate predictions? In this paper we compare RNN, GRU and LSTM using different length segments of AUs. Our experiments show no significant difference in prediction accuracy when using anywhere between 15 and 90 seconds of data. Moreover, the results reveal that simpler models of recurrent networks are statistically significantly better suited for capturing engagement from AUs.","PeriodicalId":301220,"journal":{"name":"2018 IEEE Winter Applications of Computer Vision Workshops (WACVW)","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-03-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125188750","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Facial Expression Recognition Using a Large Out-of-Context Dataset","authors":"Elizabeth Tran, Michael B. Mayhew, Hyojin Kim, P. Karande, A. Kaplan","doi":"10.1109/WACVW.2018.00012","DOIUrl":"https://doi.org/10.1109/WACVW.2018.00012","url":null,"abstract":"We develop a method for emotion recognition from facial imagery. This problem is challenging in part because of the subjectivity of ground truth labels and in part because of the relatively small size of existing labeled datasets. We use the FER+ dataset [8], a dataset with multiple emotion labels per image, in order to build an emotion recognition model that encompasses a full range of emotions. Since the amount of data in the FER+ dataset is limited, we explore the use of a much larger face dataset, MS-Celeb-1M [41], in conjunction with the FER+ dataset. Specific layers within an Inception-ResNet-v1 [13, 38] model trained for facial recognition are used for the emotion recognition problem. Thus, we leverage the MS-Celeb-1M dataset in addition to the FER+ dataset and experiment with different architectures to assess the overall performance of neural networks to recognize emotion using facial imagery.","PeriodicalId":301220,"journal":{"name":"2018 IEEE Winter Applications of Computer Vision Workshops (WACVW)","volume":"468 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-03-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132969943","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Evaluating a Convolutional Neural Network on Short-Wave Infra-Red Images","authors":"M. Bihn, Manuel Günther, Daniel Lemmond, T. Boult","doi":"10.1109/WACVW.2018.00008","DOIUrl":"https://doi.org/10.1109/WACVW.2018.00008","url":null,"abstract":"Machine learning algorithms, both traditional and neuralnetwork-based, have been tested against RGB facial images for years, but these algorithms are prone to fail when illumination conditions are insufficient, for example, at night or when images are taken from long distances. Short-Wave Infra-Red (SWIR) illumination provides a much higher intensity and a much more ambient structure than visible light, which makes it better suited for face recognition in different conditions. However, current neural networks require lots of training data, which is not available in the SWIR domain. In this paper, we examine the ability of a convolutional neural network, specifically, the VGG Face network, which was trained on visible spectrum images, to work on SWIR images. Utilizing a dataset containing both RGB and SWIR images, we hypothesize that the VGG Face network will perform well both on facial images taken in RGB and SWIR wavelengths. We expect that the features extracted with VGG Face are independent of the actual wavelengths that the images were taken with. Thus, face recognition with VGG Face is possible between the RGB and SWIR domains. We find that VGG Face performs reasonable on some of the SWIR wavelengths. We can almost reach the same recognition performance when using composite images built from three SWIR wavelengths probing on RGB.","PeriodicalId":301220,"journal":{"name":"2018 IEEE Winter Applications of Computer Vision Workshops (WACVW)","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-03-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114803902","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Facial Attributes Guided Deep Sketch-to-Photo Synthesis","authors":"Hadi Kazemi, S. M. Iranmanesh, Ali Dabouei, Sobhan Soleymani, N. Nasrabadi","doi":"10.1109/WACVW.2018.00006","DOIUrl":"https://doi.org/10.1109/WACVW.2018.00006","url":null,"abstract":"Face sketch-photo synthesis is a critical application in law enforcement and digital entertainment industry. Despite the significant improvements in sketch-to-photo synthesis techniques, existing methods have still serious limitations in practice, such as the need for paired data in the training phase or having no control on enforcing facial attributes over the synthesized image. In this work, we present a new framework, which is a conditional version of Cycle-GAN, conditioned on facial attributes. The proposed network forces facial attributes, such as skin and hair color, on the synthesized photo and does not need a set of aligned face-sketch pairs during its training. We evaluate the proposed network by training on two real and synthetic sketch datasets. The hand-sketch images of the FERET dataset and the color face images from the WVU Multi-modal dataset are used as an unpaired input to the proposed conditional CycleGAN with the skin color as the controlled face attribute. For more attribute guided evaluation, a synthetic sketch dataset is created from the CelebA dataset and used to evaluate the performance of the network by forcing several desired facial attributes on the synthesized faces.","PeriodicalId":301220,"journal":{"name":"2018 IEEE Winter Applications of Computer Vision Workshops (WACVW)","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-03-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122887802","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Multi-task Convolutional Neural Network for Joint Iris Detection and Presentation Attack Detection","authors":"Cunjian Chen, A. Ross","doi":"10.1109/WACVW.2018.00011","DOIUrl":"https://doi.org/10.1109/WACVW.2018.00011","url":null,"abstract":"In this work, we propose a multi-task convolutional neural network learning approach that can simultaneously perform iris localization and presentation attack detection (PAD). The proposed multi-task PAD (MT-PAD) is inspired by an object detection method which directly regresses the parameters of the iris bounding box and computes the probability of presentation attack from the input ocular image. Experiments involving both intra-sensor and cross-sensor scenarios suggest that the proposed method can achieve state-of-the-art results on publicly available datasets. To the best of our knowledge, this is the first work that performs iris detection and iris presentation attack detection simultaneously.","PeriodicalId":301220,"journal":{"name":"2018 IEEE Winter Applications of Computer Vision Workshops (WACVW)","volume":"03 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-03-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131049945","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Generic Object Discrimination for Mobile Assistive Robots Using Projective Light Diffusion","authors":"P. Papadakis, David Filliat","doi":"10.1109/WACVW.2018.00013","DOIUrl":"https://doi.org/10.1109/WACVW.2018.00013","url":null,"abstract":"A number of assistive robot services depend on the classification of objects while dealing with an increased volume of sensory data, scene variability and limited computational resources. We propose using more concise representations via a seamless combination of photometric and geometric features fused by exploiting local photometric/geometric correlation and employing domain transform filtering in order to recover scene structure. This is obtained through a projective light diffusion imaging process (PLDI) which allows capturing surface orientation, image edges and global depth gradients into a single image. Object candidates are finally encoded into a discriminative, wavelet-based descriptor allowing very fast object queries. Experiments with an indoor robot demonstrate improved classification performance compared to alternative methods and an overall superior discriminative power compared to state-of-the-art unsupervised descriptors within ModelNet10 benchmark.","PeriodicalId":301220,"journal":{"name":"2018 IEEE Winter Applications of Computer Vision Workshops (WACVW)","volume":"415 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-03-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132337607","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"ivisX: An Integrated Video Investigation Suite for Forensic Applications","authors":"Chengchao Qu, J. Metzler, Eduardo Monari","doi":"10.1109/WACVW.2018.00007","DOIUrl":"https://doi.org/10.1109/WACVW.2018.00007","url":null,"abstract":"Video data from surveillance cameras are nowadays an important instrument for investigating crimes and identifying the identity of an offender. The analysis of the mass data acquired from numerous cameras poses enormous challenges to police investigation authorities. Supporting softwares and video management tools currently on the market focus either on elaborate visualization and editing of video data, specific image processing or video content analysis tasks. As a result, such a scattered system landscape further exacerbates the complexity and difficulty of a timely analysis of the available data. This work presents our unified framework ivisX, which is an integrated suite to simplify the entire workflow of video data investigation. The algorithmic backbone of ivisX is built upon an effective content-based search algorithm using region covariance for low-resolution (LR) data and a novel 3D face super-resolution (FSR) approach, which can generate high-resolution (HR) 3D face models to render high-quality facial composites with a single blurred and pixelated face image of the LR domain. Moreover, ivisX has a modular design, which allows for flexible incorporation of various extensions ranging from processing and display of video data from multiple cameras to analysis and documentation of the results into a powerful integrated toolkit to assist forensic investigation.","PeriodicalId":301220,"journal":{"name":"2018 IEEE Winter Applications of Computer Vision Workshops (WACVW)","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125853802","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Enhancing Optical Cross-Sensor Fingerprint Matching Using Local Textural Features","authors":"Emanuela Marasco, Alex Feldman, Keleigh Rachel Romine","doi":"10.1109/WACVW.2018.00010","DOIUrl":"https://doi.org/10.1109/WACVW.2018.00010","url":null,"abstract":"Fingerprint systems have been designed to typically operate on images acquired using the same sensor. Existing fingerprint systems are not able to accurately compare images collected using different sensors. In this paper, we propose a learning-based scheme for enhancing interoperability between optical fingerprint sensors by compensating the output of a traditional commercial matcher. Specifically, cross-sensor differences are captured by incorporating Local Binary Patterns (LBP) and Local Phase Quantization (LPQ), while dimensionality reduction is performed by using Reconstruction Independent Component Analysis (RICA). The evaluation is carried out on rolled fingerprints pertaining to 494 users collected atWest Virginia University and acquired using multiple optical sensors and Ten Print cards. In cross-sensor at False Acceptance Rate of 0.01%, the proposed approach achieves a False Rejection Rate of 4.12%.","PeriodicalId":301220,"journal":{"name":"2018 IEEE Winter Applications of Computer Vision Workshops (WACVW)","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116863791","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Automatic Access Control Based on Face and Hand Biometrics in a Non-cooperative Context","authors":"M. N. Jahromi, Morten Bojesen Bonderup, Maryam Asadi-Aghbolaghi, Egils Avots, Kamal Nasrollahi, Sergio Escalera, S. Kasaei, T. Moeslund, G. Anbarjafari","doi":"10.1109/WACVW.2018.00009","DOIUrl":"https://doi.org/10.1109/WACVW.2018.00009","url":null,"abstract":"Automatic access control systems (ACS) based on the human biometrics or physical tokens are widely employed in public and private areas. Yet these systems, in their conventional forms, are restricted to active interaction from the users. In scenarios where users are not cooperating with the system, these systems are challenged. Failure in cooperation with the biometric systems might be intentional or because the users are incapable of handling the interaction procedure with the biometric system or simply forget to cooperate with it, due to for example, illness like dementia. This work introduces a challenging bimodal database, including face and hand information of the users when they approach a door to open it by its handle in a noncooperative context. We have defined two (an easy and a challenging) protocols on how to use the database. We have reported results on many baseline methods, including deep learning techniques as well as conventional methods on the database. The obtained results show the merit of the proposed database and the challenging nature of access control with non-cooperative users.","PeriodicalId":301220,"journal":{"name":"2018 IEEE Winter Applications of Computer Vision Workshops (WACVW)","volume":"376 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115912669","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}