{"title":"Pose Impact Estimation on Face Recognition Using 3-D-Aware Synthetic Data With Application to Quality Assessment","authors":"Marcel Grimmer;Christian Rathgeb;Christoph Busch","doi":"10.1109/TBIOM.2024.3361657","DOIUrl":"https://doi.org/10.1109/TBIOM.2024.3361657","url":null,"abstract":"Evaluating the quality of facial images is essential for operating face recognition systems with sufficient accuracy. The recent advances in face quality standardisation (ISO/IEC CD3 29794-5) recommend the usage of component quality measures for breaking down face quality into its individual factors, hence providing valuable feedback for operators to re-capture low-quality images. In light of recent advances in 3D-aware generative adversarial networks, we propose a novel dataset, Syn-YawPitch, comprising 1,000 identities with varying yaw-pitch angle combinations. Utilizing this dataset, we demonstrate that pitch angles beyond 30 degrees have a significant impact on the biometric performance of current face recognition systems. Furthermore, we propose a lightweight and explainable pose quality predictor that adheres to the draft international standard of ISO/IEC CD3 29794–5 and benchmark it against state-of-the-art face image quality assessment algorithms.","PeriodicalId":73307,"journal":{"name":"IEEE transactions on biometrics, behavior, and identity science","volume":"6 2","pages":"209-218"},"PeriodicalIF":0.0,"publicationDate":"2024-02-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140345503","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Implicit Mutual Learning With Dual-Branch Networks for Face Super-Resolution","authors":"Kangli Zeng;Zhongyuan Wang;Tao Lu;Jianyu Chen;Zheng He;Zhen Han","doi":"10.1109/TBIOM.2024.3354333","DOIUrl":"https://doi.org/10.1109/TBIOM.2024.3354333","url":null,"abstract":"Face super-resolution (SR) algorithms have recently made significant progress. However, most existing methods prefer to employ texture and structure information together to promote the generation of high-resolution features, neglecting the mutual encouragement between them, as well as the effective unification of their own low-level and high-level information, thus yielding unsatisfactory results. To address these problems, we propose an implicit mutual learning of dual-branch networks for face super-resolution, which adequately considers both extraction and aggregation of structure and texture information. The proposed approach consists of four essential blocks. First, the deep feature extractor is equipped with a deep feature reinforcement module (DFRM) based on two-stage cross-dimensional attention (TCA), which behaves in the texture enhancement and structure reconstruction branches, respectively. Then, we elaborate two information exchange blocks for two branches, one for the first information exchange block (FIEB) from the texture branch to the structure branch and one for the second information exchange block (SIEB) from the structure branch to the texture branch. These two interaction blocks perform further fusion enhancement of potential features. Finally, a hybrid fusion network (HFNet) based on supervised attention executes adaptive aggregation of the enhanced texture and structure maps. Additionally, we use a joint loss function that modifies the recovery of structure information, diminishes the use of potentially erroneous information, and encourages the generation of realistic face images. Experiments on public datasets show that our method consistently achieves better quantitative and qualitative results than SOTA methods.","PeriodicalId":73307,"journal":{"name":"IEEE transactions on biometrics, behavior, and identity science","volume":"6 2","pages":"182-194"},"PeriodicalIF":0.0,"publicationDate":"2024-01-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140345504","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Learning Joint Local-Global Iris Representations via Spatial Calibration for Generalized Presentation Attack Detection","authors":"Gaurav Jaswal;Aman Verma;Sumantra Dutta Roy;Raghavendra Ramachandra","doi":"10.1109/TBIOM.2024.3355136","DOIUrl":"https://doi.org/10.1109/TBIOM.2024.3355136","url":null,"abstract":"Existing Iris Presentation Attack Detection (IPAD) systems do not generalize well across datasets, sensors and subjects. The main reason for the same is the presence of similarities in bonafide samples and attacks, and intricate iris textures. The proposed DFCANet (Dense Feature Calibration Attention-Assisted Network) uses feature calibration convolution and residual learning to generate domain-specific iris feature representations at local and global scales. DFCANet’s channel attention enables the use of discriminative feature learning across channels. Compared to state-of-the-art methods, DFCANet achieves significant performance gains for the IIITD-CLI, IIITD-WVU, IIIT-CSD, Clarkson-15, Clarkson-17, NDCLD-13, and NDCLD-15 benchmark datasets. Incremental learning in DFCANet overcomes data scarcity issues and cross-domain challenges. This paper also pursues the challenging soft-lens attack scenarios. An additional study conducted over contact lens detection task suggests high domain-specific feature modeling capacities of the proposed network.","PeriodicalId":73307,"journal":{"name":"IEEE transactions on biometrics, behavior, and identity science","volume":"6 2","pages":"195-208"},"PeriodicalIF":0.0,"publicationDate":"2024-01-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140345462","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"PatchHAR: A MLP-Like Architecture for Efficient Activity Recognition Using Wearables","authors":"Shuoyuan Wang;Lei Zhang;Xing Wang;Wenbo Huang;Hao Wu;Aiguo Song","doi":"10.1109/TBIOM.2024.3354261","DOIUrl":"https://doi.org/10.1109/TBIOM.2024.3354261","url":null,"abstract":"To date, convolutional neural networks have played a dominant role in sensor-based human activity recognition (HAR) scenarios. In 2021, researchers from four institutions almost simultaneously released their newest work to arXiv.org, where each of them independently presented new network architectures mainly consisting of linear layers. This arouses a heated debate whether the current research hotspot in deep learning architectures is returning to MLPs. Inspired by the recent success achieved by MLPs, in this paper, we first propose a lightweight network architecture called all-MLP for HAR, which is entirely built on MLP layers with a gating unit. By dividing multi-channel sensor time series into nonoverlapping patches, all linear layers directly process sensor patches to automatically extract local features, which is able to effectively reduce computational cost. Compared with convolutional architectures, it takes fewer FLOPs and parameters but achieves comparable classification score on WISDM, OPPORTUNITY, PAMAP2 and USC-HAD HAR benchmarks. The additional benefit is that all involved computations are matrix multiplication, which can be readily optimized with popular deep learning libraries. This advantage can promote practical HAR deployment in wearable devices. Finally, we evaluate the actual operation of all-MLP model on a Raspberry Pi platform for real-world human activity recognition simulation. We conclude that the new architecture is not a simple reuse of traditional MLPs in HAR scenario, but is a significant advance over them.","PeriodicalId":73307,"journal":{"name":"IEEE transactions on biometrics, behavior, and identity science","volume":"6 2","pages":"169-181"},"PeriodicalIF":0.0,"publicationDate":"2024-01-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140345506","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"EdgeFace: Efficient Face Recognition Model for Edge Devices","authors":"Anjith George;Christophe Ecabert;Hatef Otroshi Shahreza;Ketan Kotwal;Sébastien Marcel","doi":"10.1109/TBIOM.2024.3352164","DOIUrl":"https://doi.org/10.1109/TBIOM.2024.3352164","url":null,"abstract":"In this paper, we present EdgeFace - a lightweight and efficient face recognition network inspired by the hybrid architecture of EdgeNeXt. By effectively combining the strengths of both CNN and Transformer models, and a low rank linear layer, EdgeFace achieves excellent face recognition performance optimized for edge devices. The proposed EdgeFace network not only maintains low computational costs and compact storage, but also achieves high face recognition accuracy, making it suitable for deployment on edge devices. The proposed EdgeFace model achieved the top ranking among models with fewer than 2M parameters in the IJCB 2023 Efficient Face Recognition Competition. Extensive experiments on challenging benchmark face datasets demonstrate the effectiveness and efficiency of EdgeFace in comparison to state-of-the-art lightweight models and deep face recognition models. Our EdgeFace model with 1.77M parameters achieves state of the art results on LFW (99.73%), IJB-B (92.67%), and IJB-C (94.85%), outperforming other efficient models with larger computational complexities. The code to replicate the experiments will be made available publicly.","PeriodicalId":73307,"journal":{"name":"IEEE transactions on biometrics, behavior, and identity science","volume":"6 2","pages":"158-168"},"PeriodicalIF":0.0,"publicationDate":"2024-01-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140345474","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Leveraging Diffusion for Strong and High Quality Face Morphing Attacks","authors":"Zander W. Blasingame;Chen Liu","doi":"10.1109/TBIOM.2024.3349857","DOIUrl":"https://doi.org/10.1109/TBIOM.2024.3349857","url":null,"abstract":"Face morphing attacks seek to deceive a Face Recognition (FR) system by presenting a morphed image consisting of the biometric qualities from two different identities with the aim of triggering a false acceptance with one of the two identities, thereby presenting a significant threat to biometric systems. The success of a morphing attack is dependent on the ability of the morphed image to represent the biometric characteristics of both identities that were used to create the image. We present a novel morphing attack that uses a Diffusion-based architecture to improve the visual fidelity of the image and the ability of the morphing attack to represent characteristics from both identities. We demonstrate the effectiveness of the proposed attack by evaluating its visual fidelity via Fréchet Inception Distance (FID). Also, extensive experiments are conducted to measure the vulnerability of FR systems to the proposed attack. The ability of a morphing attack detector to detect the proposed attack is measured and compared against two state-of-the-art GAN-based morphing attacks along with two Landmark-based attacks. Additionally, a novel metric to measure the relative strength between different morphing attacks is introduced and evaluated.","PeriodicalId":73307,"journal":{"name":"IEEE transactions on biometrics, behavior, and identity science","volume":"6 1","pages":"118-131"},"PeriodicalIF":0.0,"publicationDate":"2024-01-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140063545","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"CATFace: Cross-Attribute-Guided Transformer With Self-Attention Distillation for Low-Quality Face Recognition","authors":"Niloufar Alipour Talemi;Hossein Kashiani;Nasser M. Nasrabadi","doi":"10.1109/TBIOM.2023.3349218","DOIUrl":"10.1109/TBIOM.2023.3349218","url":null,"abstract":"Although face recognition (FR) has achieved great success in recent years, it is still challenging to accurately recognize faces in low-quality images due to the obscured facial details. Nevertheless, it is often feasible to make predictions about specific soft biometric (SB) attributes, such as gender, and baldness even in dealing with low-quality images. In this paper, we propose a novel multi-branch neural network that leverages SB attribute information to boost the performance of FR. To this end, we propose a cross-attribute-guided transformer fusion (CATF) module that effectively captures the long-range dependencies and relationships between FR and SB feature representations. The synergy created by the reciprocal flow of information in the dual cross-attention operations of the proposed CATF module enhances the performance of FR. Furthermore, we introduce a novel self-attention distillation framework that effectively highlights crucial facial regions, such as landmarks by aligning low-quality images with those of their high-quality counterparts in the feature space. The proposed self-attention distillation regularizes our network to learn a unified qualityinvariant feature representation in unconstrained environments. We conduct extensive experiments on various FR benchmarks varying in quality. Experimental results demonstrate the superiority of our FR method compared to state-of-the-art FR studies.","PeriodicalId":73307,"journal":{"name":"IEEE transactions on biometrics, behavior, and identity science","volume":"6 1","pages":"132-146"},"PeriodicalIF":0.0,"publicationDate":"2024-01-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139449706","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"IEEE Transactions on Biometrics, Behavior, and Identity Science Information for Authors","authors":"","doi":"10.1109/TBIOM.2023.3337966","DOIUrl":"https://doi.org/10.1109/TBIOM.2023.3337966","url":null,"abstract":"","PeriodicalId":73307,"journal":{"name":"IEEE transactions on biometrics, behavior, and identity science","volume":"6 1","pages":"C3-C3"},"PeriodicalIF":0.0,"publicationDate":"2024-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10462648","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140063505","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"IEEE Transactions on Biometrics, Behavior, and Identity Science Publication Information","authors":"","doi":"10.1109/TBIOM.2023.3337965","DOIUrl":"https://doi.org/10.1109/TBIOM.2023.3337965","url":null,"abstract":"","PeriodicalId":73307,"journal":{"name":"IEEE transactions on biometrics, behavior, and identity science","volume":"6 1","pages":"C2-C2"},"PeriodicalIF":0.0,"publicationDate":"2024-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10462640","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140063504","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multimodal Person Verification With Generative Thermal Data Augmentation","authors":"Madina Abdrakhmanova;Timur Unaspekov;Huseyin Atakan Varol","doi":"10.1109/TBIOM.2023.3346938","DOIUrl":"https://doi.org/10.1109/TBIOM.2023.3346938","url":null,"abstract":"The fusion of audio, visual, and thermal modalities has proven effective in developing reliable person verification systems. In this study, we enhanced multimodal person verification performance by augmenting training data using domain transfer methods. Specifically, we enriched the audio-visual-thermal SpeakingFaces dataset with a combination of real audio-visual data and synthetic thermal data from the VoxCeleb dataset. We adapted visual images in VoxCeleb to the thermal domain using CycleGAN, trained on SpeakingFaces. Our results demonstrate the positive impact of augmented training data on all unimodal and multimodal models. The score fusion of unimodal audio, unimodal visual, bimodal, and trimodal systems trained on the combined data achieved the best results on both datasets and exhibited robustness in low-illumination and noisy conditions. Our findings emphasize the importance of utilizing synthetic data, produced by generative methods, to improve deep learning model performance. To facilitate reproducibility and further research in multimodal person verification, we have made our code, pretrained models, and preprocessed dataset freely available in our GitHub repository.","PeriodicalId":73307,"journal":{"name":"IEEE transactions on biometrics, behavior, and identity science","volume":"6 1","pages":"43-53"},"PeriodicalIF":0.0,"publicationDate":"2023-12-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140063600","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}