{"title":"Zero-Shot Demographically Unbiased Image Generation From an Existing Biased StyleGAN","authors":"Anubhav Jain;Rishit Dholakia;Nasir Memon;Julian Togelius","doi":"10.1109/TBIOM.2024.3416403","DOIUrl":"https://doi.org/10.1109/TBIOM.2024.3416403","url":null,"abstract":"Face recognition systems have made significant strides thanks to data-heavy deep learning models, but these models rely on large privacy-sensitive datasets. Recent work in facial analysis and recognition have thus started making use of synthetic datasets generated from GANs and diffusion based generative models. These models, however, lack fairness in terms of demographic representation and can introduce the same biases in the trained downstream tasks. This can have serious societal and security implications. To address this issue, we propose a methodology that generates unbiased data from a biased generative model using an evolutionary algorithm. We show results for StyleGAN2 model trained on the Flicker Faces High Quality dataset to generate data for singular and combinations of demographic attributes such as Black and Woman. We generate a large racially balanced dataset of 13.5 million images, and show that it boosts the performance of facial recognition and analysis systems whilst reducing their biases. We have made our code-base (\u0000<uri>https://github.com/anubhav1997/youneednodataset</uri>\u0000) public to allow researchers to reproduce our work.","PeriodicalId":73307,"journal":{"name":"IEEE transactions on biometrics, behavior, and identity science","volume":"6 4","pages":"498-514"},"PeriodicalIF":0.0,"publicationDate":"2024-06-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142713945","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Sclera-TransFuse: Fusing Vision Transformer and CNN for Accurate Sclera Segmentation and Recognition","authors":"Caiyong Wang;Haiqing Li;Yixin Zhang;Guangzhe Zhao;Yunlong Wang;Zhenan Sun","doi":"10.1109/TBIOM.2024.3415484","DOIUrl":"https://doi.org/10.1109/TBIOM.2024.3415484","url":null,"abstract":"This paper investigates a deep learning based unified framework for accurate sclera segmentation and recognition, named Sclera-TransFuse. Unlike previous CNN-based methods, our framework incorporates Vision Transformer and CNN to extract complementary feature representations, which are beneficial to both subtasks. Specifically, for sclera segmentation, a novel two-stream hybrid model, referred to as Sclera-TransFuse-Seg, is developed to integrate classical ResNet-34 and recently emerging Swin Transformer encoders in parallel. The dual-encoders firstly extract coarse- and fine-grained feature representations at hierarchical stages, separately. Then a Cross-Domain Fusion (CDF) module based on information interaction and self-attention mechanism is introduced to efficiently fuse the multi-scale features extracted from dual-encoders. Finally, the fused features are progressively upsampled and aggregated to predict the sclera masks in the decoder meanwhile deep supervision strategies are employed to learn intermediate feature representations better and faster. With the results of sclera segmentation, the sclera ROI image is generated for sclera feature extraction. Additionally, a new sclera recognition model, termed as Sclera-TransFuse-Rec, is proposed by combining lightweight EfficientNet B0 and multi-scale Vision Transformer in sequential to encode local and global sclera vasculature feature representations. Extensive experiments on several publicly available databases suggest that our framework consistently achieves state-of-the-art performance on various sclera segmentation and recognition benchmarks, including the 8th Sclera Segmentation and Recognition Benchmarking Competition (SSRBC 2023). A UBIRIS.v2 subset of 683 eye images with manually labeled sclera masks, and our codes are publicly available to the community through \u0000<uri>https://github.com/lhqqq/Sclera-TransFuse</uri>\u0000.","PeriodicalId":73307,"journal":{"name":"IEEE transactions on biometrics, behavior, and identity science","volume":"6 4","pages":"575-590"},"PeriodicalIF":0.0,"publicationDate":"2024-06-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142713911","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"CoNAN: Conditional Neural Aggregation Network for Unconstrained Long Range Biometric Feature Fusion","authors":"Bhavin Jawade;Deen Dayal Mohan;Prajwal Shetty;Dennis Fedorishin;Srirangaraj Setlur;Venu Govindaraju","doi":"10.1109/TBIOM.2024.3410311","DOIUrl":"https://doi.org/10.1109/TBIOM.2024.3410311","url":null,"abstract":"Person recognition from image sets acquired under unregulated and uncontrolled settings, such as at large distances, low resolutions, varying viewpoints, illumination, pose, and atmospheric conditions, is challenging. Feature aggregation, which involves aggregating a set of N feature representations present in a template into a single global representation, plays a pivotal role in such recognition systems. Existing works in traditional face feature aggregation either utilize metadata or high-dimensional intermediate feature representations to estimate feature quality for aggregation. However, generating high-quality metadata or style information is not feasible for extremely low-resolution faces captured in long-range and high altitude settings. To overcome these limitations, we propose a feature distribution conditioning approach called CoNAN for template aggregation. Specifically, our method aims to learn a context vector conditioned over the distribution information of the incoming feature set, which is utilized to weigh the features based on their estimated informativeness. The proposed method produces state-of-the-art results on long-range unconstrained face recognition datasets such as BTS, and DroneSURF, validating the advantages of such an aggregation strategy. We show that CoNAN generalizes present CoNAN’s results on other modalities such as body features and gait. We also produce extensive qualitative and quantitative experiments on different components of CoNAN.","PeriodicalId":73307,"journal":{"name":"IEEE transactions on biometrics, behavior, and identity science","volume":"6 4","pages":"602-612"},"PeriodicalIF":0.0,"publicationDate":"2024-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142713949","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Spatio-Temporal Dual-Attention Transformer for Time-Series Behavioral Biometrics","authors":"Kim-Ngan Nguyen;Sanka Rasnayaka;Sandareka Wickramanayake;Dulani Meedeniya;Sanjay Saha;Terence Sim","doi":"10.1109/TBIOM.2024.3394875","DOIUrl":"https://doi.org/10.1109/TBIOM.2024.3394875","url":null,"abstract":"Continuous Authentication (CA) using behavioral biometrics is a type of biometric identification that recognizes individuals based on their unique behavioral characteristics. Many behavioral biometrics can be captured through multiple sensors, each providing multichannel time-series data. Utilizing this multichannel data effectively can enhance the accuracy of behavioral biometrics-based CA. This paper extends BehaveFormer, a new framework that effectively combines time series data from multiple sensors to provide higher security in behavioral biometrics. BehaveFormer includes two Spatio-Temporal Dual Attention Transformers (STDAT), a novel transformer we introduce to extract more discriminative features from multichannel time-series data. Experimental results on two behavioral biometrics, Keystroke Dynamics and Swipe Dynamics with Inertial Measurement Unit (IMU), have shown State-of-the-art performance. For Keystroke, on three publicly available datasets (Aalto DB, HMOG DB, and HuMIdb), BehaveFormer outperforms the SOTA. For instance, BehaveFormer achieved an EER of 2.95% on the HuMIdb. For Swipe, on two publicly available datasets (HuMIdb and FETA) BehaveFormer outperforms the SOTA, for instance, BehaveFormer achieved an EER of 3.67% on the HuMIdb. Additionally, the BehaveFormer model shows superior performance in various CA-specific evaluation metrics. The proposed STDAT-based BehaveFormer architecture can also be effectively used for transfer learning. The model weights and reproducible experimental results are available at: \u0000<uri>https://github.com/nganntk/BehaveFormer</uri>","PeriodicalId":73307,"journal":{"name":"IEEE transactions on biometrics, behavior, and identity science","volume":"6 4","pages":"591-601"},"PeriodicalIF":0.0,"publicationDate":"2024-04-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142713946","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Template Inversion Attack Using Synthetic Face Images Against Real Face Recognition Systems","authors":"Hatef Otroshi Shahreza;Sébastien Marcel","doi":"10.1109/TBIOM.2024.3391759","DOIUrl":"https://doi.org/10.1109/TBIOM.2024.3391759","url":null,"abstract":"In this paper, we use synthetic data and propose a new method for template inversion attacks against face recognition systems. We use synthetic data to train a face reconstruction model to generate high-resolution (i.e., \u0000<inline-formula> <tex-math>$1024times 1024$ </tex-math></inline-formula>\u0000) face images from facial templates. To this end, we use a face generator network to generate synthetic face images and extract their facial templates using the face recognition model as our training set. Then, we use the synthesized dataset to learn a mapping from facial templates to the intermediate latent space of the same face generator network. We propose our method for both whitebox and blackbox TI attacks. Our experiments show that the trained model with synthetic data can be used to reconstruct face images from templates extracted from real face images. In our experiments, we compare our method with previous methods in the literature in attacks against different state-of-the-art face recognition models on four different face datasets, including the MOBIO, LFW, AgeDB, and IJB-C datasets, demonstrating the effectiveness of our proposed method on real face recognition datasets. Experimental results show our method outperforms previous methods on high-resolution 2D face reconstruction from facial templates and achieve competitive results with SOTA face reconstruction methods. Furthermore, we conduct practical presentation attacks using the generated face images in digital replay attacks against real face recognition systems, showing the vulnerability of face recognition systems to presentation attacks based on our TI attack (with synthetic train data) on real face datasets.","PeriodicalId":73307,"journal":{"name":"IEEE transactions on biometrics, behavior, and identity science","volume":"6 3","pages":"374-384"},"PeriodicalIF":0.0,"publicationDate":"2024-04-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141725606","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Identity-Aware Facial Age Editing Using Latent Diffusion","authors":"Sudipta Banerjee;Govind Mittal;Ameya Joshi;Sai Pranaswi Mullangi;Chinmay Hegde;Nasir Memon","doi":"10.1109/TBIOM.2024.3390570","DOIUrl":"https://doi.org/10.1109/TBIOM.2024.3390570","url":null,"abstract":"Aging in face images is a type of intra-class variation that has a stronger impact on the performance of biometric recognition systems than other modalities (such as iris scans and fingerprints). Improving the robustness of automated face recognition systems with respect to aging requires high quality longitudinal datasets that should contain images belonging to a large number of individuals collected across a long time span, ideally decades apart. Unfortunately, there is a dearth of such good operational quality longitudinal datasets. Synthesizing longitudinal data that meet these requirements can be achieved using modern generative models. However, these tools may produce unrealistic artifacts or compromise the biometric quality of the age-edited images. In this work, we simulate facial aging and de-aging by leveraging text-to-image diffusion models with the aid of few-shot fine-tuning and intuitive textual prompting. Our method is supervised using identity-preserving loss functions that ensure biometric utility preservation while imparting a high degree of visual realism. We ablate our method using different datasets, state-of-the art face matchers and age classification networks. Our empirical analysis validates the success of the proposed method compared to existing schemes. Our code is available at \u0000<uri>https://github.com/sudban3089/ID-Preserving-Facial-Aging.git</uri>","PeriodicalId":73307,"journal":{"name":"IEEE transactions on biometrics, behavior, and identity science","volume":"6 4","pages":"443-457"},"PeriodicalIF":0.0,"publicationDate":"2024-04-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142713950","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"GaitSTR: Gait Recognition With Sequential Two-Stream Refinement","authors":"Wanrong Zheng;Haidong Zhu;Zhaoheng Zheng;Ram Nevatia","doi":"10.1109/TBIOM.2024.3390626","DOIUrl":"10.1109/TBIOM.2024.3390626","url":null,"abstract":"Gait recognition aims to identify a person based on their walking sequences, serving as a useful biometric modality as it can be observed from long distances without requiring cooperation from the subject. In representing a person’s walking sequence, silhouettes and skeletons are the two primary modalities used. Silhouette sequences lack detailed part information when overlapping occurs between different body segments and are affected by carried objects and clothing. Skeletons, comprising joints and bones connecting the joints, provide more accurate part information for different segments; however, they are sensitive to occlusions and low-quality images, causing inconsistencies in frame-wise results within a sequence. In this paper, we explore the use of a two-stream representation of skeletons for gait recognition, alongside silhouettes. By fusing the combined data of silhouettes and skeletons, we refine the two-stream skeletons, joints, and bones through self-correction in graph convolution, along with cross-modal correction with temporal consistency from silhouettes. We demonstrate that with refined skeletons, the performance of the gait recognition model can achieve further improvement on public gait recognition datasets compared with state-of-the-art methods without extra annotations.","PeriodicalId":73307,"journal":{"name":"IEEE transactions on biometrics, behavior, and identity science","volume":"6 4","pages":"528-538"},"PeriodicalIF":0.0,"publicationDate":"2024-04-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140752555","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Gender Privacy Angular Constraints for Face Recognition","authors":"Zohra Rezgui;Nicola Strisciuglio;Raymond Veldhuis","doi":"10.1109/TBIOM.2024.3390586","DOIUrl":"https://doi.org/10.1109/TBIOM.2024.3390586","url":null,"abstract":"Deep learning-based face recognition systems produce templates that encode sensitive information next to identity, such as gender and ethnicity. This poses legal and ethical problems as the collection of biometric data should be minimized and only specific to a designated task. We propose two privacy constraints to hide the gender attribute that can be added to a recognition loss. The first constraint relies on the minimization of the angle between gender-centroid embeddings. The second constraint relies on the minimization of the angle between gender specific embeddings and their opposing gender-centroid weight vectors. Both constraints enforce the overlapping of the gender specific distributions of the embeddings. Furthermore, they have a direct interpretation in the embedding space and do not require a large number of trainable parameters as two fully connected layers are sufficient to achieve satisfactory results. We also provide extensive evaluation results across several datasets and face recognition networks, and we compare our method to three state-of-the-art methods. Our method is capable of maintaining high verification performances while significantly improving privacy in a cross-database setting, without increasing the computational load for template comparison. We also show that different training data can result in varying levels of effectiveness of privacy-enhancing methods that implement data minimization.","PeriodicalId":73307,"journal":{"name":"IEEE transactions on biometrics, behavior, and identity science","volume":"6 3","pages":"352-363"},"PeriodicalIF":0.0,"publicationDate":"2024-04-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10504554","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141725585","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Trans-FD: Transformer-Based Representation Interaction for Face De-Morphing","authors":"Min Long;Qiangqiang Duan;Le-Bing Zhang;Fei Peng;Dengyong Zhang","doi":"10.1109/TBIOM.2024.3390056","DOIUrl":"https://doi.org/10.1109/TBIOM.2024.3390056","url":null,"abstract":"Face morphing attacks aim to deceive face recognition systems by using a facial image that contains multiple biometric information. It has been demonstrated to pose a significant threat to commercial face recognition systems and human experts. Although a large number of face morphing detection methods have been proposed in recent years to enhance the security of face recognition systems, little attention has been paid to restoring the identity of the accomplice from a morphed image. In this paper, Trans-FD, a novel model that uses Transformer representation interaction to restore the identity of the accomplice, is proposed. To effectively separate the identity of an accomplice, Trans-FD applies Transformer to perform representation interaction in the separation network. Additionally, it utilizes CNN encoders to extract multi-scale features, and it establishes skip connections between the encoder and generator through the Transformer-based separation network to provide detailed information for the generator. Experiments demonstrate that Trans-FD can effectively restore the accomplice’s face and outperforms previous works in terms of restoration accuracy and image quality.","PeriodicalId":73307,"journal":{"name":"IEEE transactions on biometrics, behavior, and identity science","volume":"6 3","pages":"385-397"},"PeriodicalIF":0.0,"publicationDate":"2024-04-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141725526","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Face Super-Resolution Quality Assessment Based on Identity and Recognizability","authors":"Weiling Chen;Weitao Lin;Xiaoyi Xu;Liqun Lin;Tiesong Zhao","doi":"10.1109/TBIOM.2024.3389982","DOIUrl":"https://doi.org/10.1109/TBIOM.2024.3389982","url":null,"abstract":"Face Super-Resolution (FSR) plays a crucial role in enhancing low-resolution face images, which is essential for various face-related tasks. However, FSR may alter individuals’ identities or introduce artifacts that affect recognizability. This problem has not been well assessed by existing Image Quality Assessment (IQA) methods. In this paper, we present both subjective and objective evaluations for FSR-IQA, resulting in a benchmark dataset and a reduced reference quality metrics, respectively. First, we incorporate a novel criterion of identity preservation and recognizability to develop our Face Super-resolution Quality Dataset (FSQD). Second, we analyze the correlation between identity preservation and recognizability, and investigate effective feature extractions for both of them. Third, we propose a training-free IQA framework called Face Identity and Recognizability Evaluation of Super-resolution (FIRES). Experimental results using FSQD demonstrate that FIRES achieves competitive performance.","PeriodicalId":73307,"journal":{"name":"IEEE transactions on biometrics, behavior, and identity science","volume":"6 3","pages":"364-373"},"PeriodicalIF":0.0,"publicationDate":"2024-04-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141725601","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}