{"title":"Pixel-Level Face Image Quality Assessment for Explainable Face Recognition","authors":"Philipp Terhörst;Marco Huber;Naser Damer;Florian Kirchbuchner;Kiran Raja;Arjan Kuijper","doi":"10.1109/TBIOM.2023.3263186","DOIUrl":"https://doi.org/10.1109/TBIOM.2023.3263186","url":null,"abstract":"In this work, we introduce the concept of pixel-level gface image quality that determines the utility of single pixels in a face image for recognition. We propose a training-free approach to assess the pixel-level qualities of a face image given an arbitrary face recognition network. To achieve this, a model-specific quality value of the input image is estimated and used to build a sample-specific quality regression model. Based on this model, quality-based gradients are back-propagated and converted into pixel-level quality estimates. In the experiments, we qualitatively and quantitatively investigated the meaningfulness of our proposed pixel-level qualities based on real and artificial disturbances and by comparing the explanation maps on faces incompliant with the ICAO standards. In all scenarios, the results demonstrate that the proposed solution produces meaningful pixel-level qualities enhancing the interpretability of the face image and its quality. The code is publicly available.","PeriodicalId":73307,"journal":{"name":"IEEE transactions on biometrics, behavior, and identity science","volume":"5 2","pages":"288-297"},"PeriodicalIF":0.0,"publicationDate":"2023-03-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/iel7/8423754/10124455/10088447.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49989199","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Detecting Double-Identity Fingerprint Attacks","authors":"M. Ferrara;R. Cappelli;D. Maltoni","doi":"10.1109/TBIOM.2023.3279859","DOIUrl":"https://doi.org/10.1109/TBIOM.2023.3279859","url":null,"abstract":"Double-identity biometrics, that is the combination of two subjects’ features into a single template, was demonstrated to be a serious threat against existing biometric systems. In fact, well-synthetized samples can fool state-of-the-art biometric verification systems, leading them to falsely accept both the contributing subjects. This work proposes one of the first techniques to defy existing double-identity fingerprint attacks. The proposed approach inspects the regions where the two aligned fingerprints overlap but minutiae cannot be consistently paired. If the quality of these regions is good enough to minimize the risk of false or miss minutiae detection, then the alarm score is increased. Experimental results carried out on two fingerprint databases, with two different techniques to generate double-identity fingerprints, validate the effectiveness of the proposed approach.","PeriodicalId":73307,"journal":{"name":"IEEE transactions on biometrics, behavior, and identity science","volume":"5 4","pages":"476-485"},"PeriodicalIF":0.0,"publicationDate":"2023-03-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/iel7/8423754/10273758/10138034.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49963984","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Replay Attention and Data Augmentation Network for 3-D Face and Object Reconstruction","authors":"Zhiyuan Zhou;Lei Li;Suping Wu;Xinyu Li;Kehua Ma;Xitie Zhang","doi":"10.1109/TBIOM.2023.3261272","DOIUrl":"https://doi.org/10.1109/TBIOM.2023.3261272","url":null,"abstract":"3D face reconstruction from single-view images plays an important role in the field of biometrics, which is a long-standing challenging problem in the wild. Traditional 3DMM-based methods directly regressed parameters, which probably caused that the network learned the discriminative informative features insufficiently. In this paper, we propose a replay attention and data augmentation network (RADAN) for 3D dense alignment and face reconstruction. Unlike the traditional attention mechanism, our replay attention module aims to increase the sensitivity of the network to informative features by adaptively recalibrating the weight response in the attention, which typically reinforces the distinguishability of the learned feature representation. In this way, the network can further improve the accuracy of face reconstruction and dense alignment in unconstrained environments. Moreover, to improve the generalization performance of the model and the ability of the network to capture local details, we present a data augmentation strategy to preprocess the sample data, which generates the images that contain more local details and occluded face in cropping and pasting manner. Furthermore, we also apply the replay attention to 3D object reconstruction task to verify the commonality of this mechanism. Extensive experimental results on widely-evaluated datasets demonstrate that our approach achieves competitive performance compared to state-of-the-art methods. Code is available at \u0000<uri>https://github.com/zhouzhiyuan1/RADANet</uri>\u0000.","PeriodicalId":73307,"journal":{"name":"IEEE transactions on biometrics, behavior, and identity science","volume":"5 3","pages":"308-320"},"PeriodicalIF":0.0,"publicationDate":"2023-03-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49966610","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Empirical Assessment of End-to-End Iris Recognition System Capacity","authors":"Priyanka Das;Richard Plesh;Veeru Talreja;Natalia A. Schmid;Matthew Valenti;Joseph Skufca;Stephanie Schuckers","doi":"10.1109/TBIOM.2023.3256894","DOIUrl":"https://doi.org/10.1109/TBIOM.2023.3256894","url":null,"abstract":"Iris is an established modality in biometric recognition applications including consumer electronics, e-commerce, border security, forensics, and de-duplication of identity at a national scale. In light of the expanding usage of biometric recognition, identity clash (when templates from two different people match) is an imperative factor of consideration for a system’s deployment. This study explores system capacity estimation by empirically estimating the constrained capacity of an end-to-end iris recognition system (NIR systems with Daugman-based feature extraction) operating at an acceptable error rate, i.e., the number of subjects a system can resolve before encountering an error. We study the impact of six system parameters on an iris recognition system’s constrained capacity- number of enrolled identities, image quality, template dimension, random feature elimination, filter resolution, and system operating point. In our assessment, we analyzed 13.2 million comparisons from 5158 unique identities for each of 24 different system configurations. This work provides a framework to better understand iris recognition system capacity as a function of biometric system configurations beyond the operating point, for large-scale applications.","PeriodicalId":73307,"journal":{"name":"IEEE transactions on biometrics, behavior, and identity science","volume":"5 2","pages":"154-169"},"PeriodicalIF":0.0,"publicationDate":"2023-03-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49964210","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"MVP-Human Dataset for 3-D Clothed Human Avatar Reconstruction From Multiple Frames","authors":"Xiangyu Zhu;Tingting Liao;Xiaomei Zhang;Jiangjing Lyu;Zhiwen Chen;Yunfeng Wang;Kan Guo;Qiong Cao;Stan Z. Li;Zhen Lei","doi":"10.1109/TBIOM.2023.3276901","DOIUrl":"https://doi.org/10.1109/TBIOM.2023.3276901","url":null,"abstract":"In this paper, we consider a novel problem of reconstructing a 3D clothed human avatar from multiple frames, independent of assumptions on camera calibration, capture space, and constrained actions. We contribute a large-scale dataset, Multi-View and multi-Pose 3D human (MVP-Human in short) to help address this problem. The dataset contains 400 subjects, each of which has 15 scans in different poses and 8-view images for each pose, providing \u0000<inline-formula> <tex-math>$6,000 3text{D}$ </tex-math></inline-formula>\u0000 scans and 48,000 images in total. In addition, a baseline method that takes multiple images as inputs, and generates a shape-with-skinning avatar in the canonical space, finished in one feed-forward pass is proposed. It first reconstructs the implicit skinning fields in a multi-level manner, and then the image features from multiple images are aligned and integrated to estimate a pixel-aligned implicit function that represents the clothed shape. With the newly collected dataset and the baseline method, it shows promising performance on 3D clothed avatar reconstruction. We release the MVP-Human dataset and the baseline method in \u0000<uri>https://github.com/TingtingLiao/MVPHuman</uri>\u0000, hoping to promote research and development in this field.","PeriodicalId":73307,"journal":{"name":"IEEE transactions on biometrics, behavior, and identity science","volume":"5 4","pages":"464-475"},"PeriodicalIF":0.0,"publicationDate":"2023-03-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49963985","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"ERR-Net: Facial Expression Removal and Recognition Network With Residual Image","authors":"Baishuang Li;Siyi Mo;Wenming Yang;Guijin Wang;Qingmin Liao","doi":"10.1109/TBIOM.2023.3250832","DOIUrl":"https://doi.org/10.1109/TBIOM.2023.3250832","url":null,"abstract":"Facial expression recognition is an important part of computer vision and has attracted great attention. Although deep learning pushes forward the development of facial expression recognition, it still faces huge challenges due to unrelated factors such as identity, gender, and race. Inspired by decomposing an expression into two parts: neutral component and expression component, we define residual features and propose an end-to-end network framework named Expression Removal and Recognition Network (ERR-Net), which can simultaneously perform expression removal and recognition tasks. The residual features are represented in two ways: pixel level and facial landmark level. Our network focuses on interpreting the encoder’s output and corresponding its segments to expressions to maximize the inter-class distances. We explore the improved generative adversarial network to convert different expressions into neutral expressions (i.e., expression removal), take the residual images as the output, learn the expression components in the process, and realize the classification of expressions. Through sufficient ablation experiments, we have proved that various improvements added on the network have obvious effects. Experimental comparisons on two benchmarks CK+ and MMI demonstrate that our proposed ERR-Net surpasses the state-of-the-art methods in terms of accuracy.","PeriodicalId":73307,"journal":{"name":"IEEE transactions on biometrics, behavior, and identity science","volume":"5 4","pages":"425-434"},"PeriodicalIF":0.0,"publicationDate":"2023-03-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49989200","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Utilizing Alignment Loss to Advance Eye Center Detection and Face Recognition in the LWIR Band","authors":"Suha Reddy Mokalla;Thirimachos Bourlai","doi":"10.1109/TBIOM.2023.3251738","DOIUrl":"https://doi.org/10.1109/TBIOM.2023.3251738","url":null,"abstract":"Geometric normalization is an integral part of most of the face recognition (FR) systems. To geometrically normalize a face, it is essential to detect the eye centers, since one way to align the face images is to make the line joining the eye centers horizontal. This paper proposes a novel approach to detect eye centers in the challenging Long-Wave Infrared (LWIR) spectrum (8-\u0000<inline-formula> <tex-math>$14 ~mu text{m}$ </tex-math></inline-formula>\u0000). While using thermal band images for face recognition is a feasible approach in low-light and nighttime conditions, where visible face images cannot be used, there are not many thermal or dual band (visible and thermal) face datasets available to train and test new eye center detection models. This work takes advantage of the available deep learning based eye center detection algorithms in the visible band to detect the eye centers in thermal face images through image synthesis. While we empirically evaluate different image synthesis models, we determine that StarGAN2 yields the highest eye center detection accuracy, when compared to the other state-of-the-art models. We incorporate alignment loss that captures the normalized error between the detected and actual eye centers as an additional loss term during training (using the generated images during training, ground truth annotations, and an eye center detection model), so that the model learns to align the images to minimize this error. During test phase, visible images are generated from the thermal images using the trained model. Then, the available landmark detection algorithms in the visible band, namely, MT-CNN and HR-Net are used to detect the eye centers. Next, these eye centers are used to geometrically normalize the source thermal face images before performing same-spectral (thermal-to-thermal) face recognition. The proposed method improved the eye center detection accuracy by 60% over the baseline model, and by 14% over training only the StarGAN2 model without the alignment loss. The proposed approach also reports the highest improvement in the face recognition accuracy by 36% and 3% over the baseline and original StarGAN2 models, respectively, when using deep learning based face recognition models, namely, Facenet, ArcFace, and VGG-Face. We also perform experiments by augmenting the train and test datasets with images rotated in-plane to further demonstrate the efficiency of the proposed approach. When CycleGAN (another unpaired image translation network) is used to generate images before incorporating the alignment loss, it failed to preserve the alignment at the slightest, therefore the eye center detection accuracy was extremely low. 
With the alignment loss, the accuracy increased by 20%, 50%, and 80% when the normalized error (e) \u0000<inline-formula> <tex-math>$le0.05$ </tex-math></inline-formula>\u0000, 0.10 and 0.25 respectively.","PeriodicalId":73307,"journal":{"name":"IEEE transactions on biometrics, behavior, and identity science","volume":"5 2","pages":"255-265"},"PeriodicalIF":0.0,"publicationDate":"2023-03-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49989196","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
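A minimal sketch of an alignment loss in the spirit of this abstract: the distance between detected and ground-truth eye centers, normalized here by the interocular distance (a common convention; the paper's exact normalization is not given in the abstract).

```python
# Minimal alignment-loss sketch; the normalization by interocular distance is
# a common convention assumed here, not stated in the abstract.
import torch

def alignment_loss(pred_eyes, gt_eyes):
    """pred_eyes, gt_eyes: (B, 2, 2) tensors of (left, right) eye centers."""
    interocular = torch.norm(gt_eyes[:, 0] - gt_eyes[:, 1], dim=-1)   # (B,)
    err = torch.norm(pred_eyes - gt_eyes, dim=-1).mean(dim=-1)        # (B,)
    return (err / (interocular + 1e-8)).mean()
```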
{"title":"Gait Recognition via Gait Period Set","authors":"Runsheng Wang;Yuxuan Shi;Hefei Ling;Zongyi Li;Ping Li;Boyuan Liu;Hanqing Zheng;Qian Wang","doi":"10.1109/TBIOM.2023.3244206","DOIUrl":"https://doi.org/10.1109/TBIOM.2023.3244206","url":null,"abstract":"Gait recognition has promising application prospects in surveillance applications, with the recently proposed video-based gait recognition methods affording huge progress. However, due to the poor image quality of some gait frames, the original frame-level features extracted from gait silhouettes are not discriminative enough to be aggregated as gait features utilized during the final recognition. Besides, as a type of periodic biometric behavior, periodic gait information is considered efficacious for capturing typical human walking patterns and refining frame-level gait features. Therefore, this paper proposes a novel approach that exploits periodic gait information, named Gait Period Set (GPS), which divides the gait period into several phases and ensembles the gait phase features to refine frame-level features. Then, features from different phases are aggregated into a video-level feature. Moreover, the refined frame-level features are aggregated as the refined gait phase features with higher quality, which can be used to re-refine the frame-level features. Hence, we upgrade the GPS into the Iterative Gait Period Set (IGPS) to iteratively refine the frame-level features. The results of extensive experiments on prevailing gait recognition datasets validate the effectiveness of the GPS and IGPS modules and demonstrate that the proposed method achieves state-of-the-art performance.","PeriodicalId":73307,"journal":{"name":"IEEE transactions on biometrics, behavior, and identity science","volume":"5 2","pages":"183-195"},"PeriodicalIF":0.0,"publicationDate":"2023-02-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49964206","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Masked Face Recognition Dataset and Application","authors":"Zhongyuan Wang;Baojin Huang;Guangcheng Wang;Peng Yi;Kui Jiang","doi":"10.1109/TBIOM.2023.3242085","DOIUrl":"https://doi.org/10.1109/TBIOM.2023.3242085","url":null,"abstract":"During COVID-19 coronavirus epidemic, almost everyone wears a mask to prevent the spread of virus. It raises a problem that the traditional face recognition model basically fails in the scene of face-based identity verification, such as security check, community visit check-in, etc. Therefore, it is imminent to boost the performance of masked face recognition. Most recent advanced face recognition methods are based on deep learning, which heavily depends on a large number of training samples. However, there are presently no publicly available masked face recognition datasets, especially real ones. To this end, this work proposes three types of masked face datasets, including Masked Face Detection Dataset (MFDD), Real-world Masked Face Recognition Dataset (RMFRD) and Synthetic Masked Face Recognition Dataset (SMFRD). Besides, we conduct benchmark experiments on these three datasets for reference. As far as we know, we are the first to publicly release large-scale masked face recognition datasets that can be downloaded for free at \u0000<uri>https://github.com/X-zhangyang/Real-World-Masked-Face-Dataset</uri>\u0000.","PeriodicalId":73307,"journal":{"name":"IEEE transactions on biometrics, behavior, and identity science","volume":"5 2","pages":"298-304"},"PeriodicalIF":0.0,"publicationDate":"2023-02-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49989197","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Protection of Sparse Retinal Templates Using Cohort-Based Dissimilarity Vectors","authors":"Mahshid Sadeghpour;Arathi Arakala;Stephen A. Davis;Kathy J. Horadam","doi":"10.1109/TBIOM.2023.3239866","DOIUrl":"https://doi.org/10.1109/TBIOM.2023.3239866","url":null,"abstract":"Retinal vasculature is a biometric characteristic that is highly accurate for recognition but for which no template protection scheme exists. We propose the first retinal template protection scheme, adapting an existing paradigm of cohort-based modelling to templates containing the node and edge data of retinal graphs. The template protection scheme results in at most 2.3% reduction in accuracy compared to unprotected templates. A common concern with cohort based systems is that the availability of distance scores can be exploited to reconstruct the biometric image or biometric template using inversion attack. On the contrary, we show that using our sparse templates in a cohort-based system results in less than 0.3% success rate for an inverse biometric attack. In addition, rigorous unlinkability analysis shows that the template protection scheme has linkability scores at least as low as or lower than the state-of-the-art template protection schemes.","PeriodicalId":73307,"journal":{"name":"IEEE transactions on biometrics, behavior, and identity science","volume":"5 2","pages":"233-243"},"PeriodicalIF":0.0,"publicationDate":"2023-01-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49989192","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}