{"title":"PCGattnNet: A 3-D Point Cloud Dynamic Graph Attention for Generalizable Face Presentation Attack Detection","authors":"Raghavendra Ramachandra;Narayan Vetrekar;Sushrut Patwardhan;Sushma Venkatesh;Gauresh Naik;Rajendra S. Gad","doi":"10.1109/TBIOM.2025.3534641","DOIUrl":"https://doi.org/10.1109/TBIOM.2025.3534641","url":null,"abstract":"Face recognition systems that are commonly used in access control settings are vulnerable to presentation attacks, which pose a significant security risk. Therefore, it is crucial to develop a robust and reliable face presentation attack detection system that can automatically detect these types of attacks. In this paper, we present a novel technique called Point Cloud Graph Attention Network (PCGattnNet) to detect face presentation attacks using 3D point clouds captured from a smartphone. The innovative nature of the proposed technique lies in its ability to dynamically represent point clouds as graphs that effectively capture discriminant information, thereby facilitating the detection of robust presentation attacks. To evaluate the efficacy of the proposed method effectively, we introduced newly collected 3D face point clouds using two different smartphones. The newly collected dataset comprised bona fide samples from 100 unique data subjects and six different 3D face presentation attack instruments. Extensive experiments were conducted to evaluate the generalizability of the proposed and existing methods to unknown attack instruments. The outcomes of these experiments demonstrate the reliability of the proposed method for detecting unknown attack instruments.","PeriodicalId":73307,"journal":{"name":"IEEE transactions on biometrics, behavior, and identity science","volume":"7 4","pages":"924-939"},"PeriodicalIF":5.0,"publicationDate":"2025-01-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10854497","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145134938","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Joint Coarse to Fine-Grained Spatio-Temporal Modeling for Video Action Recognition","authors":"Chunlei Li;Can Cheng;Miao Yu;Zhoufeng Liu;Di Huang","doi":"10.1109/TBIOM.2025.3532416","DOIUrl":"https://doi.org/10.1109/TBIOM.2025.3532416","url":null,"abstract":"The action recognition task involves analyzing video content and temporal relationships between frames to identify actions. Crucial to this process are action representations that effectively capture varying temporal scales and spatial motion variations. To address these challenges, we propose the Joint Coarse to Fine-Grained Spatio-Temporal Modeling (JCFG-STM) approach, which is designed to capture robust spatio-temporal representations through three key components: the Temporal-enhanced Spatio-Temporal Perception (TSTP) module, the Positional-enhanced Spatio-Temporal Perception (PSTP) module, and the Fine-grained Spatio-Temporal Perception (FSTP) module. Specifically, TSTP is designed to fuse temporal information across both local and global spatial scales, while PSTP emphasizes the integration of spatial coordinate directions, both horizontal and vertical, with temporal dynamics. Meanwhile, FSTP focuses on combining spatial coordinate information with short-term temporal data by differentiating neighboring frames, enabling fine-grained spatio-temporal modeling. JCFG-STM effectively focuses on multi-granularity and complementary motion patterns associated with actions. Extensive experiments conducted on large-scale action recognition datasets, including Kinetics-400, Something-Something V2, Jester, and EgoGesture, demonstrate the effectiveness of our approach and its superiority over state-of-the-art methods.","PeriodicalId":73307,"journal":{"name":"IEEE transactions on biometrics, behavior, and identity science","volume":"7 3","pages":"444-457"},"PeriodicalIF":0.0,"publicationDate":"2025-01-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144492273","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Adversarial Samples Generated by Self-Forgery for Face Forgery Detection","authors":"Hanxian Duan;Qian Jiang;Xiaoyuan Xu;Yu Wang;Huasong Yi;Shaowen Yao;Xin Jin","doi":"10.1109/TBIOM.2025.3529026","DOIUrl":"https://doi.org/10.1109/TBIOM.2025.3529026","url":null,"abstract":"As deep learning techniques continue to advance making face synthesis realistic and indistinguishable. Algorithms need to be continuously improved to cope with increasingly sophisticated forgery techniques. Current face forgery detectors achieve excellent results when detecting training and testing from the same dataset. However, the detector performance degrades when generalized to unknown forgery methods. One of the most effective ways to address this problem is to train the model using synthetic data. This helps the model learn a generic representation for deep forgery detection. In this article, we propose a new strategy for synthesis of training data. To improve the quality and sensitivity to forgeries, we include a Multi-scale Feature Aggregation Module and a Forgery Identification Module in the generator and discriminator. The Multi-scale Feature Aggregation Module captures finer details and textures while reducing forgery traces. The Forgery Identification Module more acutely detects traces and irregularities in the forgery images. It can better distinguish between real and fake images and improve overall detection accuracy. In addition, we employ an adversarial training strategy to dynamically construct the detector. This effectively explores the enhancement space of forgery samples. Through extensive experiments, we demonstrate the effectiveness of the proposed synthesis strategy. Our code can be found at: <uri>https://github.com/1241128239/ASG-SF</uri>.","PeriodicalId":73307,"journal":{"name":"IEEE transactions on biometrics, behavior, and identity science","volume":"7 3","pages":"432-443"},"PeriodicalIF":0.0,"publicationDate":"2025-01-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144492396","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"AIM-Bone: Texture Discrepancy Generation and Localization for Generalized Deepfake Detection","authors":"Boyuan Liu;Xin Zhang;Hefei Ling;Zongyi Li;Runsheng Wang;Hanyuan Zhang;Ping Li","doi":"10.1109/TBIOM.2025.3526655","DOIUrl":"https://doi.org/10.1109/TBIOM.2025.3526655","url":null,"abstract":"Deep synthesis multimedia content, especially human face manipulation poses a risk of visual and auditory confusion, highlighting the call for generalized face forgery detection methods. In this paper, we propose a novel method for fake sample synthesis, along with a dual auto-encoder network for generalized deepfake detection. First, we delve into the texture discrepancy between tampered and unperturbed regions within forged images and impose models to learn such features by adopting Augmentation Inside Masks (AIM). It is capable of sabotaging the texture consistency within a single real image and generating textures that are commonly seen in fake images. It is realized by exhibiting forgery clues of discrepancy in noise patterns, colors, resolutions, and especially the existence of GAN (Generative Adversarial Network) features, including GAN textures, deconvolution traces, GAN distribution, etc. To the best of our knowledge, this work is the first to incorporate GAN features in fake sample synthesizing. The second is that we design a Bone-shaped dual auto-encoder with a powerful image texture filter bridged in between to aid forgery detection and localization in two streams. Reconstruction learning in the color stream avoids over-fitting in specific textures and imposes learning color-related features. Moreover, the GAN fingerprints harbored within the output image can be in furtherance of AIM and produce texture-discrepant samples for further training. The noise stream takes input processed by the proposed texture filter to focus on noise perspective and predict forgery region localization, subjecting to the constraint of mask label produced by AIM. We conduct extensive experiments on multiple benchmark datasets and the superior performance has proven the effectiveness of AIM-Bone and its advantage against current state-of-the-art methods. Our source code is available at <monospace><uri>https://github.com/heart74/AIM-Bone.git</uri></monospace>.","PeriodicalId":73307,"journal":{"name":"IEEE transactions on biometrics, behavior, and identity science","volume":"7 3","pages":"422-431"},"PeriodicalIF":0.0,"publicationDate":"2025-01-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144492398","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Unbiased-Diff: Analyzing and Mitigating Biases in Diffusion Model-Based Face Image Generation","authors":"Malsha V. Perera;Vishal M. Patel","doi":"10.1109/TBIOM.2024.3525037","DOIUrl":"https://doi.org/10.1109/TBIOM.2024.3525037","url":null,"abstract":"Diffusion-based generative models have become increasingly popular in applications such as synthetic data generation and image editing, due to their ability to generate realistic, high-quality images. However, these models can exacerbate existing social biases, particularly regarding attributes like gender and race, potentially impacting downstream applications. In this paper, we analyze the presence of social biases in diffusion-based face generations and propose a novel sampling process guidance algorithm to mitigate these biases. Specifically, during the diffusion sampling process, we guide the generation to produce samples with attribute distributions that align with a balanced or desired attribute distribution. Our experiments demonstrate that diffusion models exhibit biases across multiple datasets in terms of gender and race. Moreover, our proposed method effectively mitigates these biases, making diffusion-based face generation more fair and inclusive.","PeriodicalId":73307,"journal":{"name":"IEEE transactions on biometrics, behavior, and identity science","volume":"7 3","pages":"384-395"},"PeriodicalIF":0.0,"publicationDate":"2025-01-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144492493","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"IEEE Transactions on Biometrics, Behavior, and Identity Science Information for Authors","authors":"","doi":"10.1109/TBIOM.2024.3513762","DOIUrl":"https://doi.org/10.1109/TBIOM.2024.3513762","url":null,"abstract":"","PeriodicalId":73307,"journal":{"name":"IEEE transactions on biometrics, behavior, and identity science","volume":"7 1","pages":"C3-C3"},"PeriodicalIF":0.0,"publicationDate":"2024-12-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10816732","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142890362","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"IEEE Transactions on Biometrics, Behavior, and Identity Science Publication Information","authors":"","doi":"10.1109/TBIOM.2024.3513761","DOIUrl":"https://doi.org/10.1109/TBIOM.2024.3513761","url":null,"abstract":"","PeriodicalId":73307,"journal":{"name":"IEEE transactions on biometrics, behavior, and identity science","volume":"7 1","pages":"C2-C2"},"PeriodicalIF":0.0,"publicationDate":"2024-12-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10816704","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142890360","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"BIAS: A Body-Based Interpretable Active Speaker Approach","authors":"Tiago Roxo;Joana Cabral Costa;Pedro R. M. Inácio;Hugo Proença","doi":"10.1109/TBIOM.2024.3520030","DOIUrl":"https://doi.org/10.1109/TBIOM.2024.3520030","url":null,"abstract":"State-of-the-art Active Speaker Detection (ASD) approaches heavily rely on audio and facial features to perform, which is not a sustainable approach in wild scenarios. Although these methods achieve good results in the standard AVA-ActiveSpeaker set, a recent wilder ASD dataset (WASD) showed the limitations of such models and raised the need for new approaches. As such, we propose BIAS, a model that, for the first time, combines audio, face, and body information, to accurately predict active speakers in varying/challenging conditions. Additionally, we design BIAS to provide interpretability by proposing a novel use for Squeeze-and-Excitation blocks, namely in attention heatmaps creation and feature importance assessment. For a full interpretability setup, we annotate an ASD-related actions dataset (ASD-Text) to finetune a ViT-GPT2 for text scene description to complement BIAS interpretability. The results show that BIAS is state-of-the-art in challenging conditions where body-based features are of utmost importance (Columbia, open-settings, and WASD), and yields competitive results in AVA-ActiveSpeaker, where face is more influential than body for ASD. BIAS interpretability also shows the features/aspects more relevant towards ASD prediction in varying settings, making it a strong baseline for further developments in interpretable ASD models, and is available at <uri>https://github.com/Tiago-Roxo/BIAS</uri>.","PeriodicalId":73307,"journal":{"name":"IEEE transactions on biometrics, behavior, and identity science","volume":"7 3","pages":"410-421"},"PeriodicalIF":0.0,"publicationDate":"2024-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144492434","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Inpainting Diffusion Synthetic and Data Augment With Feature Keypoints for Tiny Partial Fingerprints","authors":"Mao-Hsiu Hsu;Yung-Ching Hsu;Ching-Te Chiu","doi":"10.1109/TBIOM.2024.3517330","DOIUrl":"https://doi.org/10.1109/TBIOM.2024.3517330","url":null,"abstract":"The advancement of fingerprint research within public academic circles has been trailing behind facial recognition, primarily due to the scarcity of extensive publicly available datasets, despite fingerprints being widely used across various domains. Recent progress has seen the application of deep learning techniques to synthesize fingerprints, predominantly focusing on large-area fingerprints within existing datasets. However, with the emergence of AIoT and edge devices, the importance of tiny partial fingerprints has been underscored for their faster and more cost-effective properties. Yet, there remains a lack of publicly accessible datasets for such fingerprints. To address this issue, we introduce publicly available datasets tailored for tiny partial fingerprints. Using advanced generative deep learning, we pioneer diffusion methods for fingerprint synthesis. By combining random sampling with inpainting diffusion guided by feature keypoints masks, we enhance data augmentation while preserving key features, achieving up to 99.1% recognition matching rate. To demonstrate the usefulness of our fingerprint images generated using our approach, we conducted experiments involving model training for various tasks, including denoising, deblurring, and deep forgery detection. The results showed that models trained with our generated datasets outperformed those trained without our datasets or with other synthetic datasets. This indicates that our approach not only produces diverse fingerprints but also improves the model’s generalization capabilities. Furthermore, our approach ensures confidentiality without compromise by partially transforming randomly sampled synthetic fingerprints, which reduces the likelihood of real fingerprints being leaked. The total number of generated fingerprints published in this article amounts to 818,077. Moving forward, we are ongoing updates and releases to contribute to the advancement of the tiny partial fingerprint field. The code and our generated tiny partial fingerprint dataset can be accessed at <uri>https://github.com/Hsu0623/Inpainting-Diffusion-Synthetic-and-Data-Augment-with-Feature-Keypoints-for-Tiny-Partial-Fingerprints.git</uri>","PeriodicalId":73307,"journal":{"name":"IEEE transactions on biometrics, behavior, and identity science","volume":"7 3","pages":"396-409"},"PeriodicalIF":0.0,"publicationDate":"2024-12-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144492305","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}