Pre-trained classifiers with One Shot Similarity for context aware face verification and identification
Monika Sharma, R. Hebbalaguppe, L. Vig
2017 IEEE International Conference on Identity, Security and Behavior Analysis (ISBA), 2017-02-01
DOI: 10.1109/ISBA.2017.7947687 (https://doi.org/10.1109/ISBA.2017.7947687)
Citation count: 1
Abstract
Most affect-based systems analyse facial expressions to detect emotion, and rely on face detection and recognition methods for effective affect analysis. Recent work has demonstrated the efficacy of deep architectures for face recognition when trained on very large datasets. Some of these architectures are trained as classifiers, while others directly learn an embedding via a triplet loss function. In this paper, we consider one-shot prediction from a feature space initially learnt via classification: we have a pre-trained model but no access to its training data, and must make predictions on novel faces with just one training image per identity. We use the one-shot similarity metric to compute similarity scores and compare the results with the state of the art on the YouTube Faces dataset (YTF). We also demonstrate the effect of temporal context on frame-wise face recognition, using a probabilistic majority voting scheme over past frames to determine the identity in the current frame. Additionally, we found a number of labelling errors in the YouTube Faces dataset that were not listed in its errata, and have published them online for the benefit of the community.
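The two mechanisms named in the abstract, a one-shot similarity score and voting over past frames, can be illustrated with a minimal sketch. The abstract does not specify the underlying classifier or the exact voting rule, so everything here is an assumption: the score uses a common LDA-based formulation of one-shot similarity (each face in turn is the single positive example against a shared set of negatives, and the two scores are averaged), and the vote simply sums per-identity probabilities over a sliding window. The function names, the `reg` ridge term, and the `window` parameter are hypothetical and not taken from the paper.

```python
import numpy as np


def _lda_score(pos, probe, neg_mean, sw_inv):
    """Signed LDA score of `probe` for a model whose only positive is `pos`."""
    w = sw_inv @ (pos - neg_mean)        # LDA projection direction
    midpoint = 0.5 * (pos + neg_mean)    # threshold halfway between the classes
    return float(w @ (probe - midpoint) / (np.linalg.norm(w) + 1e-12))


def one_shot_similarity(x_i, x_j, negatives, reg=1e-3):
    """Symmetric one-shot similarity between two feature vectors.

    negatives: (n, d) array of features from other identities.
    reg: ridge term (an assumption of this sketch) that keeps the
         covariance estimate invertible.
    """
    x_i = np.asarray(x_i, dtype=float)
    x_j = np.asarray(x_j, dtype=float)
    negatives = np.asarray(negatives, dtype=float)
    mu = negatives.mean(axis=0)
    centered = negatives - mu
    # A single positive sample adds no scatter, so only negatives contribute.
    s_w = centered.T @ centered / len(negatives)
    s_w += reg * np.eye(negatives.shape[1])
    sw_inv = np.linalg.inv(s_w)
    # Average the two directions so the score does not depend on argument order.
    return 0.5 * (_lda_score(x_i, x_j, mu, sw_inv)
                  + _lda_score(x_j, x_i, mu, sw_inv))


def vote_identity(frame_probs, window=5):
    """Choose the current frame's identity from the last `window` frames.

    frame_probs: (t, k) array; each row holds per-identity probabilities
                 for one frame, oldest first.
    Returns the index of the identity with the largest summed probability.
    """
    recent = np.asarray(frame_probs, dtype=float)[-window:]
    return int(recent.sum(axis=0).argmax())
```

Summing probabilities rather than counting per-frame winners lets one confident frame outweigh several marginal ones, which is one plausible reading of "probabilistic" majority voting; the paper's actual scheme may weight frames differently.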