{"title":"Learning Deep Embedding with Acoustic and Phoneme Features for Speaker Recognition in FM Broadcasting","authors":"Xiao Li, Xiao Chen, Rui Fu, Xiao Hu, Mintong Chen, Kun Niu","doi":"10.1049/2024/6694481","DOIUrl":"10.1049/2024/6694481","url":null,"abstract":"<div>\u0000 <p>Text-independent speaker verification (TI-SV) is a crucial task in speaker recognition, as it involves verifying an individual’s claimed identity from speech of arbitrary content without any human intervention. The target for TI-SV is to design a discriminative network to learn deep speaker embedding for speaker idiosyncrasy. In this paper, we propose a deep speaker embedding learning approach of a hybrid deep neural network (DNN) for TI-SV in FM broadcasting. Not only acoustic features are utilized, but also phoneme features are introduced as prior knowledge to collectively learn deep speaker embedding. The hybrid DNN consists of a convolutional neural network architecture for generating acoustic features and a multilayer perceptron architecture for extracting phoneme features sequentially, which represent significant pronunciation attributes. The extracted acoustic and phoneme features are concatenated to form deep embedding descriptors for speaker identity. The hybrid DNN demonstrates not only the complementarity between acoustic and phoneme features but also the temporality of phoneme features in a sequence. Our experiments show that the hybrid DNN outperforms existing methods and delivers a remarkable performance in FM broadcasting TI-SV.</p>\u0000 </div>","PeriodicalId":48821,"journal":{"name":"IET Biometrics","volume":"2024 1","pages":""},"PeriodicalIF":2.0,"publicationDate":"2024-03-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/2024/6694481","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140220402","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
IET BiometricsPub Date : 2024-02-23DOI: 10.1049/2024/1808587
Jascha Kolberg, Yannik Schäfer, Christian Rathgeb, Christoph Busch
{"title":"On the Potential of Algorithm Fusion for Demographic Bias Mitigation in Face Recognition","authors":"Jascha Kolberg, Yannik Schäfer, Christian Rathgeb, Christoph Busch","doi":"10.1049/2024/1808587","DOIUrl":"10.1049/2024/1808587","url":null,"abstract":"<div>\u0000 <p>With the rise of deep neural networks, the performance of biometric systems has increased tremendously. Biometric systems for face recognition are now used in everyday life, e.g., border control, crime prevention, or personal device access control. Although the accuracy of face recognition systems is generally high, they are not without flaws. Many biometric systems have been found to exhibit demographic bias, resulting in different demographic groups being not recognized with the same accuracy. This is especially true for facial recognition due to demographic factors, e.g., gender and skin color. While many previous works already reported demographic bias, this work aims to reduce demographic bias for biometric face recognition applications. In this regard, 12 face recognition systems are benchmarked regarding biometric recognition performance as well as demographic differentials, i.e., fairness. Subsequently, multiple fusion techniques are applied with the goal to improve the fairness in contrast to single systems. The experimental results show that it is possible to improve the fairness regarding single demographics, e.g., skin color or gender, while improving fairness for demographic subgroups turns out to be more challenging.</p>\u0000 </div>","PeriodicalId":48821,"journal":{"name":"IET Biometrics","volume":"2024 1","pages":""},"PeriodicalIF":2.0,"publicationDate":"2024-02-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/2024/1808587","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140436576","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
IET BiometricsPub Date : 2024-02-05DOI: 10.1049/2024/6523854
Yi Zhao, Xin Jin, Song Gao, Liwen Wu, Shaowen Yao, Qian Jiang
{"title":"Face Forgery Detection with Long-Range Noise Features and Multilevel Frequency-Aware Clues","authors":"Yi Zhao, Xin Jin, Song Gao, Liwen Wu, Shaowen Yao, Qian Jiang","doi":"10.1049/2024/6523854","DOIUrl":"10.1049/2024/6523854","url":null,"abstract":"<div>\u0000 <p>The widespread dissemination of high-fidelity fake faces created by face forgery techniques has caused serious trust concerns and ethical issues in modern society. Consequently, face forgery detection has emerged as a prominent topic of research to prevent technology abuse. Although, most existing face forgery detectors demonstrate success when evaluating high-quality faces under intra-dataset scenarios, they often overfit manipulation-specific artifacts and lack robustness to postprocessing operations. In this work, we design an innovative dual-branch collaboration framework that leverages the strengths of the transformer and CNN to thoroughly dig into the multimodal forgery artifacts from both a global and local perspective. Specifically, a novel adaptive noise trace enhancement module (ANTEM) is proposed to remove high-level face content while amplifying more generalized forgery artifacts in the noise domain. Then, the transformer-based branch can track long-range noise features. Meanwhile, considering that subtle forgery artifacts could be described in the frequency domain even in a compression scenario, a multilevel frequency-aware module (MFAM) is developed and further applied to the CNN-based branch to extract complementary frequency-aware clues. Besides, we incorporate a collaboration strategy involving cross-entropy loss and single center loss to enhance the learning of more generalized representations by optimizing the fusion features of the dual branch. Extensive experiments on various benchmark datasets substantiate the superior generalization and robustness of our framework when compared to the competing approaches.</p>\u0000 </div>","PeriodicalId":48821,"journal":{"name":"IET Biometrics","volume":"2024 1","pages":""},"PeriodicalIF":2.0,"publicationDate":"2024-02-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/2024/6523854","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139862462","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
IET BiometricsPub Date : 2024-02-03DOI: 10.1049/2024/4413655
Pesigrihastamadya Normakristagaluh, Geert J. Laanstra, Luuk J. Spreeuwers, Raymond N. J. Veldhuis
{"title":"The Impact of Illumination on Finger Vascular Pattern Recognition","authors":"Pesigrihastamadya Normakristagaluh, Geert J. Laanstra, Luuk J. Spreeuwers, Raymond N. J. Veldhuis","doi":"10.1049/2024/4413655","DOIUrl":"10.1049/2024/4413655","url":null,"abstract":"<div>\u0000 <p>This paper studies the impact of illumination direction and bundle width on finger vascular pattern imaging and recognition performance. A qualitative theoretical model is presented to explain the projection of finger blood vessels on the skin. A series of experiments were conducted using a scanner of our design with illumination from the top, a single-direction side (left or right), and narrow or wide beams. A new dataset was collected for the experiments, containing 4,428 NIR images of finger vein patterns captured under well-controlled conditions to minimize position and rotation angle differences between different sessions. Top illumination performs well because of more homogenous, which enhances a larger number of visible veins. Narrower bundles of light do not affect which veins are visible, but they reduce the overexposure at finger boundaries and increase the quality of vascular pattern images. The narrow beam achieves the best performance with 0% of [email protected]%, and the wide beam consistently results in a higher false nonmatch rate. The comparison of left- and right-side illumination has the highest error rates because only the veins in the middle of the finger are visible in both images. Different directional illumination may be interoperable since they produce the same vascular pattern and principally are the projected shadows on the finger surface. Score and image fusion for right- and left-side result in recognition performance similar to that obtained with top illumination, indicating the vein patterns are independent of illumination direction. All results of these experiments support the proposed model.</p>\u0000 </div>","PeriodicalId":48821,"journal":{"name":"IET Biometrics","volume":"2024 1","pages":""},"PeriodicalIF":2.0,"publicationDate":"2024-02-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/2024/4413655","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139807886","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
IET BiometricsPub Date : 2024-01-27DOI: 10.1049/2024/8526857
Claudio Yáñez, Juan E. Tapia, Claudio A. Perez, Christoph Busch
{"title":"Impact of Occlusion Masks on Gender Classification from Iris Texture","authors":"Claudio Yáñez, Juan E. Tapia, Claudio A. Perez, Christoph Busch","doi":"10.1049/2024/8526857","DOIUrl":"10.1049/2024/8526857","url":null,"abstract":"<div>\u0000 <p>Gender classification on normalized iris images has been previously attempted with varying degrees of success. In these previous studies, it has been shown that occlusion masks may introduce gender information; occlusion masks are used in iris recognition to remove non-iris elements. When, the goal is to classify the gender using exclusively the iris texture, the presence of gender information in the masks may result in apparently higher accuracy, thereby not reflecting the actual gender information present in the iris. However, no measures have been taken to eliminate this information while preserving as much iris information as possible. We propose a novel method to assess the gender information present in the iris more accurately by eliminating gender information in the masks. This consists of pairing iris with similar masks and different gender, generating a paired mask using the OR operator, and applying this mask to the iris. Additionally, we manually fix iris segmentation errors to study their impact on the gender classification. Our results show that occlusion masks can account for 6.92% of the gender classification accuracy on average. Therefore, works aiming to perform gender classification using the iris texture from normalized iris images should eliminate this correlation.</p>\u0000 </div>","PeriodicalId":48821,"journal":{"name":"IET Biometrics","volume":"2024 1","pages":""},"PeriodicalIF":2.0,"publicationDate":"2024-01-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/2024/8526857","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140492836","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
IET BiometricsPub Date : 2024-01-17DOI: 10.1049/2024/4924184
Fen Dai, Ziyang Wang, Xiangqun Zou, Rongwen Zhang, Xiaoling Deng
{"title":"Noncontact Palm Vein ROI Extraction Based on Improved Lightweight HRnet in Complex Backgrounds","authors":"Fen Dai, Ziyang Wang, Xiangqun Zou, Rongwen Zhang, Xiaoling Deng","doi":"10.1049/2024/4924184","DOIUrl":"10.1049/2024/4924184","url":null,"abstract":"<div>\u0000 <p>The extraction of ROI (region of interest) was a key step in noncontact palm vein recognition, which was crucial for the subsequent feature extraction and feature matching. A noncontact palm vein ROI extraction algorithm based on the improved HRnet for keypoints localization was proposed for dealing with hand gesture irregularities, translation, scaling, and rotation in complex backgrounds. To reduce the computation time and model size for ultimate deploying in low-cost embedded systems, this improved HRnet was designed to be lightweight by reconstructing the residual block structure and adopting depth-separable convolution, which greatly reduced the model size and improved the inference speed of network forward propagation. Next, the palm vein ROI localization and palm vein recognition are processed in self-built dataset and two public datasets (CASIA and TJU-PV). The proposed improved HRnet algorithm achieved 97.36% accuracy for keypoints detection on self-built palm vein dataset and 98.23% and 98.74% accuracy for keypoints detection on two public palm vein datasets (CASIA and TJU-PV), respectively. The model size was only 0.45 M, and on a CPU with a clock speed of 3 GHz, the average running time of ROI extraction for one image was 0.029 s. Based on the keypoints and corresponding ROI extraction, the equal error rate (EER) of palm vein recognition was 0.000362%, 0.014541%, and 0.005951% and the false nonmatch rate was 0.000001%, 11.034725%, and 4.613714% (false match rate: 0.01%) in the self-built dataset, TJU-PV, and CASIA, respectively. The experimental result showed that the proposed algorithm was feasible and effective and provided a reliable experimental basis for the research of palm vein recognition technology.</p>\u0000 </div>","PeriodicalId":48821,"journal":{"name":"IET Biometrics","volume":"2024 1","pages":""},"PeriodicalIF":2.0,"publicationDate":"2024-01-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/2024/4924184","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139526814","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
IET BiometricsPub Date : 2023-12-18DOI: 10.1049/2023/7519499
L. Ruzicka, Dominik Söllinger, Bernhard Kohn, Clemens Heitzinger, Andreas Uhl, Bernhard Strobl
{"title":"Improving Sensor Interoperability between Contactless and Contact-Based Fingerprints Using Pose Correction and Unwarping","authors":"L. Ruzicka, Dominik Söllinger, Bernhard Kohn, Clemens Heitzinger, Andreas Uhl, Bernhard Strobl","doi":"10.1049/2023/7519499","DOIUrl":"https://doi.org/10.1049/2023/7519499","url":null,"abstract":"Current fingerprint identification systems face significant challenges in achieving interoperability between contact-based and contactless fingerprint sensors. In contrast to existing literature, we propose a novel approach that can combine pose correction with further enhancement operations. It uses deep learning models to steer the correction of the viewing angle, therefore enhancing the matching features of contactless fingerprints. The proposed approach was tested on real data of 78 participants (37,162 contactless fingerprints) acquired by national police officers using both contact-based and contactless sensors. The study found that the effectiveness of pose correction and unwarping varied significantly based on the individual characteristics of each fingerprint. However, when the various extension methods were combined on a finger-wise basis, an average decrease of 36.9% in equal error rates (EERs) was observed. Additionally, the combined impact of pose correction and bidirectional unwarping led to an average increase of 3.72% in NFIQ 2 scores across all fingers, coupled with a 6.4% decrease in EERs relative to the baseline. The addition of deep learning techniques presents a promising approach for achieving high-quality fingerprint acquisition using contactless sensors, enhancing recognition accuracy in various domains.","PeriodicalId":48821,"journal":{"name":"IET Biometrics","volume":"19 2","pages":""},"PeriodicalIF":2.0,"publicationDate":"2023-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139175263","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
IET BiometricsPub Date : 2023-12-06DOI: 10.1049/2023/6636386
Jingwen Li, Jiuzhen Liang, Hao Liu, Zhenjie Hou
{"title":"Adaptive Weighted Face Alignment by Multi-Scale Feature and Offset Prediction","authors":"Jingwen Li, Jiuzhen Liang, Hao Liu, Zhenjie Hou","doi":"10.1049/2023/6636386","DOIUrl":"https://doi.org/10.1049/2023/6636386","url":null,"abstract":"Traditional heatmap regression methods have some problems such as the lower limit of theoretical error and the lack of global constraints, which may lead to the collapse of the results in practical application. In this paper, we develop a facial landmark detection model aided by offset prediction to constrain the global shape. First, the hybrid detection model is used to roughly locate the initial coordinates predicted by the backbone network. At the same time, the head rotation attitude prediction module is added to the backbone network, and the Euler angle is used as the adaptive weight to modify the loss function so that the model has better robustness to the large pose image. Then, we introduce an offset prediction network. It uses the heatmap corresponding to the initial coordinates as an attention mask to fuze with the features, so the network can focus on the area around landmarks. This model shares the global features and regresses the offset relative to the real coordinates based on the initial coordinates to further enhance the continuity. In addition, we also add a multi-scale feature pre-extraction module to preprocess features so that we can increase feature scales and receptive fields. Experiments on several challenging public datasets show that our method gets better performance than the existing detection methods, confirming the effectiveness of our method.","PeriodicalId":48821,"journal":{"name":"IET Biometrics","volume":"56 5","pages":""},"PeriodicalIF":2.0,"publicationDate":"2023-12-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138596728","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
IET BiometricsPub Date : 2023-11-14DOI: 10.1049/2023/5087083
Sameera Khan, Dileep Kumar Singh, Mahesh Singh, Desta Faltaso Mena
{"title":"Automatic Signature Verifier Using Gaussian Gated Recurrent Unit Neural Network","authors":"Sameera Khan, Dileep Kumar Singh, Mahesh Singh, Desta Faltaso Mena","doi":"10.1049/2023/5087083","DOIUrl":"https://doi.org/10.1049/2023/5087083","url":null,"abstract":"Handwritten signatures are one of the most extensively utilized biometrics used for authentication, and forgeries of this behavioral biometric are quite widespread. Biometric databases are also difficult to access for training purposes due to privacy issues. The efficiency of automated authentication systems has been severely harmed as a result of this. Verification of static handwritten signatures with high efficiency remains an open research problem to date. This paper proposes an innovative introselect median filter for preprocessing and a novel Gaussian gated recurrent unit neural network (2GRUNN) as a classifier for designing an automatic verifier for handwritten signatures. The proposed classifier has achieved an FPR of 1.82 and an FNR of 3.03. The efficacy of the proposed method has been compared with the various existing neural network-based verifiers.","PeriodicalId":48821,"journal":{"name":"IET Biometrics","volume":"29 11","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-11-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134957429","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
IET BiometricsPub Date : 2023-11-10DOI: 10.1049/2023/9353816
U. M. Kelly, M. Nauta, L. Liu, L. J. Spreeuwers, R. N. J. Veldhuis
{"title":"Worst-Case Morphs Using Wasserstein ALI and Improved MIPGAN","authors":"U. M. Kelly, M. Nauta, L. Liu, L. J. Spreeuwers, R. N. J. Veldhuis","doi":"10.1049/2023/9353816","DOIUrl":"https://doi.org/10.1049/2023/9353816","url":null,"abstract":"A morph is a combination of two separate facial images and contains the identity information of two different people. When used in an identity document, both people can be authenticated by a biometric face recognition (FR) system. Morphs can be generated using either a landmark-based approach or approaches based on deep learning, such as generative adversarial networks (GANs). In a recent paper, we introduced a worst-case upper bound on how challenging morphing attacks can be for an FR system. The closer morphs are to this upper bound, the bigger the challenge they pose to FR. We introduced an approach with which it was possible to generate morphs that approximate this upper bound for a known FR system (white box) but not for unknown (black box) FR systems. In this paper, we introduce a morph generation method that can approximate worst-case morphs even when the FR system is not known. A key contribution is that we include the goal of generating difficult morphs during training. Our method is based on adversarially learned inference (ALI) and uses concepts from Wasserstein GANs trained with gradient penalty, which were introduced to stabilise the training of GANs. We include these concepts to achieve a similar improvement in training stability and call the resulting method Wasserstein ALI (WALI). We finetune WALI using loss functions designed specifically to improve the ability to manipulate identity information in facial images and show how it can generate morphs that are more challenging for FR systems than landmark- or GAN-based morphs. We also show how our findings can be used to improve MIPGAN, an existing StyleGAN-based morph generator.","PeriodicalId":48821,"journal":{"name":"IET Biometrics","volume":"81 4","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135091763","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}