IEEE Transactions on Biometrics, Behavior, and Identity Science: Latest Publications

Domain Adaptation for Head Pose Estimation Using Relative Pose Consistency
IEEE Transactions on Biometrics, Behavior, and Identity Science | Pub Date: 2023-01-19 | DOI: 10.1109/TBIOM.2023.3237039
Felix Kuhnke; Jörn Ostermann
Abstract: Head pose estimation plays a vital role in biometric systems related to facial and human behavior analysis. Typically, neural networks are trained on head pose datasets. Unfortunately, manual or sensor-based annotation of head pose is impractical. A solution is synthetic training data generated from 3D face models, which can provide an infinite number of perfect labels. However, computer-generated images only provide an approximation of real-world images, leading to a performance gap between the training and application domains. Therefore, there is a need for strategies that allow simultaneous learning on labeled synthetic data and unlabeled real-world data to overcome the domain gap. In this work we propose relative pose consistency, a semi-supervised learning strategy for head pose estimation based on consistency regularization. Consistency regularization enforces consistent network predictions under random image augmentations, including pose-preserving and pose-altering augmentations. We propose a strategy to exploit the relative pose introduced by pose-altering augmentations between augmented image pairs, allowing the network to benefit from relative pose labels during training on unlabeled data. We evaluate our approach in a domain-adaptation scenario and in a commonly used cross-dataset scenario. Furthermore, we reproduce related works to enforce consistent evaluation protocols and show that we outperform the state of the art in both scenarios.
Volume 5, Issue 3, pp. 348-359. Open-access PDF: https://ieeexplore.ieee.org/iel7/8423754/10210132/10021684.pdf
Citations: 1
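
To make the consistency idea above concrete, here is a minimal, hedged sketch of a relative-pose consistency term on unlabeled images. The `backbone` regressor, the in-plane rotation range, the roll-angle sign convention, and the L1 penalty are illustrative assumptions, not the authors' implementation.

```python
# Hedged sketch of a relative-pose consistency term on unlabeled images.
# `backbone` is any head-pose regressor returning (yaw, pitch, roll) in degrees;
# the rotation range and sign convention are assumptions for illustration.
import torch
import torch.nn.functional as F
import torchvision.transforms.functional as TF

def relative_pose_consistency_loss(backbone, images, max_roll=30.0):
    # Pose-preserving augmentation (approximated here by light pixel noise).
    view_a = images + 0.01 * torch.randn_like(images)

    # Pose-altering augmentation: in-plane rotation by a known angle, which
    # changes the roll component by that angle (assumed sign convention).
    angles = (torch.rand(images.size(0)) * 2 - 1) * max_roll
    view_b = torch.stack([TF.rotate(img, float(a)) for img, a in zip(images, angles)])

    pose_a = backbone(view_a)  # (B, 3): yaw, pitch, roll predictions
    pose_b = backbone(view_b)

    # The relative pose between the two views is known from the augmentation,
    # so it can act as a label on unlabeled data: only roll should change.
    target_delta = torch.zeros_like(pose_a)
    target_delta[:, 2] = angles.to(pose_a.device)
    return F.l1_loss(pose_b - pose_a, target_delta)
```
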
RGB-D Face Recognition With Identity-Style Disentanglement and Depth Augmentation
IEEE Transactions on Biometrics, Behavior, and Identity Science | Pub Date: 2023-01-09 | DOI: 10.1109/TBIOM.2022.3233769
Meng-Tzu Chiu; Hsun-Ying Cheng; Chien-Yi Wang; Shang-Hong Lai
Abstract: Deep learning approaches achieve highly accurate face recognition by training models on huge face image datasets. Unlike 2D face image datasets, there is a lack of large 3D face datasets available to the public. Existing public 3D face datasets were usually collected with few subjects, leading to over-fitting. This paper proposes two CNN models to improve the RGB-D face recognition task. The first is a segmentation-aware depth estimation network, called DepthNet, which estimates depth maps from RGB face images by exploiting semantic segmentation for more accurate face region localization. The other is a novel segmentation-guided RGB-D face recognition model that contains an RGB recognition branch, a depth map recognition branch, and an auxiliary segmentation mask branch. In our multi-modality face recognition model, a feature disentanglement scheme is employed to factorize the feature representation into identity-related and style-related components. DepthNet is applied to augment a large 2D face image dataset into a large RGB-D face dataset, which is used for training our RGB-D face recognition model. Our experimental results show that DepthNet can produce more reliable depth maps from face images with the segmentation mask. Our multi-modality face recognition model fully exploits the depth map and outperforms state-of-the-art methods on several public 3D face datasets with challenging variations.
Volume 5, Issue 3, pp. 334-347.
Citations: 0
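
The identity/style factorization described in the abstract can be sketched as follows; the even split of the feature vector and the decorrelation penalty are illustrative assumptions, not the paper's actual disentanglement scheme.

```python
# Hedged sketch of identity/style feature disentanglement.
# The split into two halves and the decorrelation penalty are illustrative
# assumptions; the paper's actual disentanglement scheme may differ.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DisentangledHead(nn.Module):
    def __init__(self, feat_dim=512, num_ids=1000):
        super().__init__()
        self.id_classifier = nn.Linear(feat_dim // 2, num_ids)

    def forward(self, features, labels):
        # Factorize the backbone feature into identity- and style-related parts.
        id_feat, style_feat = features.chunk(2, dim=1)

        # Identity part is trained with the usual classification loss.
        id_loss = F.cross_entropy(self.id_classifier(id_feat), labels)

        # Encourage the two parts to carry different information
        # (a simple decorrelation surrogate, not the paper's exact objective).
        id_n = F.normalize(id_feat, dim=1)
        style_n = F.normalize(style_feat, dim=1)
        decorrelation = (id_n * style_n).sum(dim=1).pow(2).mean()

        return id_loss + 0.1 * decorrelation
```
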
Audio–Visual Fusion for Emotion Recognition in the Valence–Arousal Space Using Joint Cross-Attention
IEEE Transactions on Biometrics, Behavior, and Identity Science | Pub Date: 2023-01-04 | DOI: 10.1109/TBIOM.2022.3233083
R. Gnana Praveen; Patrick Cardinal; Eric Granger
Abstract: Automatic emotion recognition (ER) has recently gained much interest due to its potential in many real-world applications. In this context, multimodal approaches have been shown to improve performance over unimodal approaches by combining diverse and complementary sources of information, providing some robustness to noisy and missing modalities. In this paper, we focus on dimensional ER based on the fusion of facial and vocal modalities extracted from videos, where complementary audio-visual (A-V) relationships are explored to predict an individual's emotional states in valence-arousal space. Most state-of-the-art fusion techniques rely on recurrent networks or conventional attention mechanisms that do not effectively leverage the complementary nature of A-V modalities. To address this problem, we introduce a joint cross-attentional model for A-V fusion that extracts the salient features across A-V modalities and effectively leverages the inter-modal relationships while retaining the intra-modal relationships. In particular, it computes the cross-attention weights based on the correlation between the joint feature representation and that of the individual modalities. Deploying the joint A-V feature representation in the cross-attention module helps to simultaneously leverage both the intra- and inter-modal relationships, thereby significantly improving the performance of the system over the vanilla cross-attention module. The effectiveness of our proposed approach is validated experimentally on challenging videos from the RECOLA and AffWild2 datasets. Results indicate that our joint cross-attentional A-V fusion model provides a cost-effective solution that can outperform state-of-the-art approaches, even when the modalities are noisy or absent. Code is available at https://github.com/praveena2j/Joint-Cross-Attention-for-Audio-Visual-Fusion.
Volume 5, Issue 3, pp. 360-373.
Citations: 9
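
A minimal sketch of the joint cross-attention idea is given below, assuming aligned per-frame audio and visual features; the learnable correlation projections and residual connections are illustrative, and the authors' repository linked above contains the actual model.

```python
# Hedged sketch of joint cross-attention for audio-visual (A-V) fusion.
# Dimensions and the learnable correlation weights are illustrative.
import torch
import torch.nn as nn

class JointCrossAttention(nn.Module):
    def __init__(self, dim_a, dim_v):
        super().__init__()
        joint = dim_a + dim_v
        self.corr_a = nn.Linear(joint, dim_a, bias=False)  # joint -> audio space
        self.corr_v = nn.Linear(joint, dim_v, bias=False)  # joint -> visual space

    def forward(self, x_a, x_v):
        # x_a: (B, T, dim_a) audio features; x_v: (B, T, dim_v) visual features.
        joint = torch.cat([x_a, x_v], dim=-1)  # joint A-V representation

        # Cross-attention weights from the correlation between the joint
        # representation and each individual modality.
        att_a = torch.softmax(torch.matmul(self.corr_a(joint), x_a.transpose(1, 2)), dim=-1)
        att_v = torch.softmax(torch.matmul(self.corr_v(joint), x_v.transpose(1, 2)), dim=-1)

        # Attended features keep intra-modal structure while mixing in
        # inter-modal information carried by the joint representation.
        out_a = torch.matmul(att_a, x_a) + x_a
        out_v = torch.matmul(att_v, x_v) + x_v
        return out_a, out_v
```
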
A Multi-Scale Spatio-Temporal Network for Violence Behavior Detection
IEEE Transactions on Biometrics, Behavior, and Identity Science | Pub Date: 2023-01-02 | DOI: 10.1109/TBIOM.2022.3233399
Wei Zhou; Xuanlin Min; Yiheng Zhao; Yiran Pang; Jun Yi
Abstract: Violence behavior detection plays an important role in computer vision and is widely used in unmanned security monitoring systems, Internet video filtering, etc. However, automatically detecting violent behavior from surveillance cameras has long been a challenging issue because of real-time and detection-accuracy requirements. In this brief, a novel multi-scale spatio-temporal network, termed MSTN, is proposed to detect violent behavior from video streams. To begin with, a spatio-temporal feature extraction module (STM) is developed to extract the key features between the foreground and background of the original video. Then, temporal pooling and cross-channel pooling are designed to obtain a short frame rate and a long frame rate from the STM, respectively. Furthermore, a short-time building (STB) branch and a long-time building (LTB) branch are presented to extract violence features at different spatio-temporal scales, where the STB module captures spatial features and the LTB module extracts useful temporal features for video recognition. Finally, a Trans module is presented to fuse the features of STB and LTB through a lateral connection operation, where the LTB feature is compressed into STB to improve accuracy. Experimental results show the effectiveness and superiority of the proposed method in terms of computational efficiency and detection accuracy.
Volume 5, Issue 2, pp. 266-276.
Citations: 0
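
The two-branch design can be illustrated with a short, hedged sketch of a two-frame-rate network with a lateral connection; the channel widths, temporal stride, and single 3D-conv blocks are placeholders rather than the MSTN architecture.

```python
# Hedged sketch of a two-rate spatio-temporal design with a lateral connection,
# in the spirit of the STB/LTB branches described above. Channel sizes, the
# temporal stride, and the 3D-conv blocks are illustrative assumptions.
import torch
import torch.nn as nn

class TwoRateFusion(nn.Module):
    def __init__(self, in_ch=3, short_ch=64, long_ch=16, temporal_stride=4):
        super().__init__()
        self.stride = temporal_stride
        self.short_branch = nn.Conv3d(in_ch, short_ch, kernel_size=3, padding=1)  # spatial detail
        self.long_branch = nn.Conv3d(in_ch, long_ch, kernel_size=3, padding=1)    # temporal context
        # Lateral connection: compress the long-rate feature into the short branch.
        self.lateral = nn.Conv3d(long_ch, short_ch, kernel_size=1)

    def forward(self, clip):
        # clip: (B, C, T, H, W) video tensor.
        short = self.short_branch(clip[:, :, :: self.stride])  # subsampled frames, spatial branch
        long = self.long_branch(clip)                          # all frames, temporal branch

        # Match temporal lengths, then fuse via the lateral connection.
        long_ds = long[:, :, :: self.stride]
        return short + self.lateral(long_ds)
```
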
IEEE Transactions on Biometrics, Behavior, and Identity Science Publication Information
IEEE Transactions on Biometrics, Behavior, and Identity Science | Pub Date: 2022-12-23 | DOI: 10.1109/TBIOM.2022.3226338
Abstract: Presents a listing of the editorial board, board of governors, current staff, committee members, and/or society editors for this issue of the publication.
Volume 5, Issue 1, p. C2. Open-access PDF: https://ieeexplore.ieee.org/iel7/8423754/9997805/09997808.pdf
Citations: 0

IEEE Transactions on Biometrics, Behavior, and Identity Science Information for Authors
IEEE Transactions on Biometrics, Behavior, and Identity Science | Pub Date: 2022-12-23 | DOI: 10.1109/TBIOM.2022.3226339
Abstract: These instructions give guidelines for preparing papers for this publication. Presents information for authors publishing in this journal.
Volume 5, Issue 1, p. C3. Open-access PDF: https://ieeexplore.ieee.org/iel7/8423754/9997805/09997806.pdf
Citations: 0

Multi-Context Grouped Attention for Unsupervised Person Re-Identification
IEEE Transactions on Biometrics, Behavior, and Identity Science | Pub Date: 2022-12-09 | DOI: 10.1109/TBIOM.2022.3226678
Kshitij Nikhal; Benjamin S. Riggan
Abstract: Recent advancements like multiple contextual analysis, attention mechanisms, distance-aware optimization, and multi-task guidance have been widely used for supervised person re-identification (ReID), but the implementation and effects of such methods in unsupervised ReID frameworks are non-trivial and unclear, respectively. Moreover, with the increasing size and complexity of image- and video-based ReID datasets, manual or semi-automated annotation procedures for supervised ReID are becoming labor intensive and cost prohibitive, which is undesirable especially since the likelihood of annotation errors increases with the scale and complexity of data collections. Therefore, we propose a new iterative clustering framework that is insensitive to annotation errors and to over-fitting of ReID annotations (i.e., labels). Our proposed unsupervised framework incorporates (a) a novel multi-context group attention architecture that learns a holistic attention map from multiple local and global contexts, (b) an unsupervised clustering loss function that down-weights easily discriminative identities, and (c) a background diversity term that helps cluster persons across different cross-camera views without leveraging any identification or camera labels. We perform extensive analysis using the DukeMTMC-VideoReID and MARS video-based ReID datasets and the MSMT17 image-based ReID dataset. Our approach is shown to provide a new state-of-the-art performance for unsupervised ReID, reducing the rank-1 performance gap between supervised and unsupervised ReID to 1.1%, 12.1%, and 21.9% from 6.1%, 17.9%, and 22.6% for the DukeMTMC, MARS, and MSMT17 datasets, respectively.
Volume 5, Issue 2, pp. 170-182.
Citations: 4
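
The clustering loss that down-weights easily discriminative identities can be approximated with a focal-style weighting, sketched below under the assumption that pseudo-labels come from an offline clustering step; the gamma value and the weighting rule are illustrative, not the paper's exact objective.

```python
# Hedged sketch of a clustering loss that down-weights easily discriminated
# (high-confidence) pseudo-identities, focal-loss style.
import torch
import torch.nn.functional as F

def down_weighted_cluster_loss(logits, pseudo_labels, gamma=2.0):
    # logits: (B, num_clusters) similarity scores to cluster prototypes;
    # pseudo_labels: (B,) cluster assignments from an offline clustering step.
    log_prob = F.log_softmax(logits, dim=1)
    prob = log_prob.exp()
    p_true = prob.gather(1, pseudo_labels.unsqueeze(1)).squeeze(1)

    # Easily discriminative identities (p_true close to 1) contribute less,
    # so training focuses on harder, less separable clusters.
    weight = (1.0 - p_true).pow(gamma)
    nll = -log_prob.gather(1, pseudo_labels.unsqueeze(1)).squeeze(1)
    return (weight * nll).mean()
```
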
Single-Sample Finger Vein Recognition via Competitive and Progressive Sparse Representation
IEEE Transactions on Biometrics, Behavior, and Identity Science | Pub Date: 2022-12-07 | DOI: 10.1109/TBIOM.2022.3226270
Pengyang Zhao; Zhiquan Chen; Jing-Hao Xue; Jianjiang Feng; Wenming Yang; Qingmin Liao; Jie Zhou
Abstract: As an emerging biometric technology, finger vein recognition has attracted much attention in recent years. However, single-sample recognition is a practical and longstanding challenge in this field, referring to settings with only one finger vein image per class in the training set. In single-sample finger vein recognition, illumination variations under low contrast and the lack of information about intra-class variations severely affect recognition performance. Despite its high robustness against noise and illumination variations, sparse representation has rarely been explored for single-sample finger vein recognition. Therefore, in this paper, we focus on developing a new approach called Progressive Sparse Representation Classification (PSRC) to address the challenging issue of single-sample finger vein recognition. Firstly, as the residual may become too large in the single-sample scenario, we propose a progressive strategy for representation refinement of SRC. Secondly, to adaptively optimize progressions, a progressive index called the Max Energy Residual Index (MERI) is defined as the guidance. Furthermore, we extend PSRC to bimodal biometrics and propose a Competitive PSRC (C-PSRC) fusion approach. C-PSRC creates a more discriminative fused sample and fusion dictionary by comparing the residual errors of different modalities. By comparing with several state-of-the-art methods on three finger vein benchmarks, the superiority of the proposed PSRC and C-PSRC is clearly demonstrated.
Volume 5, Issue 2, pp. 209-220.
Citations: 2
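
For readers unfamiliar with SRC, here is a minimal sketch of plain sparse representation classification by class-wise residual; the progressive refinement and the MERI index proposed in the paper are not reproduced, and the Lasso solver and its alpha are illustrative choices.

```python
# Hedged sketch of basic sparse representation classification (SRC) by
# class-wise reconstruction residual.
import numpy as np
from sklearn.linear_model import Lasso

def src_classify(dictionary, dict_labels, probe, alpha=0.01):
    # dictionary: (d, n) matrix whose columns are gallery feature vectors,
    # dict_labels: (n,) class label of each column, probe: (d,) query vector.
    coder = Lasso(alpha=alpha, fit_intercept=False, max_iter=5000)
    coder.fit(dictionary, probe)
    codes = coder.coef_  # sparse coefficients over the gallery atoms

    residuals = {}
    for c in np.unique(dict_labels):
        mask = (dict_labels == c)
        part = dictionary[:, mask] @ codes[mask]  # reconstruction using class c only
        residuals[c] = np.linalg.norm(probe - part)

    # The predicted class is the one whose atoms reconstruct the probe best.
    return min(residuals, key=residuals.get)
```
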
iGROWL: Improved Group Detection With Link Prediction
IEEE Transactions on Biometrics, Behavior, and Identity Science | Pub Date: 2022-12-02 | DOI: 10.1109/TBIOM.2022.3225654
Viktor Schmuck; Oya Celiktutan
Abstract: One of the main challenges robots need to overcome is crowd analysis. Crowd analysis deals with the detection of individuals and interaction groups as well as the recognition of their activities. This paper focuses on the detection of conversational groups, a problem that has been addressed by a number of approaches in both supervised and unsupervised ways. Supervised bottom-up approaches have primarily relied on pairwise affinity matrices and were limited to static, third-person views. In this work, we present our approach to interaction group detection based on Graph Neural Networks (GNNs), called improved Group Detection With Link Prediction (iGROWL). iGROWL exploits the fact that interaction groups exist in certain inherent spatial configurations and improves on its predecessor, GROWL, by introducing an ensemble learning-based sample balancing technique. Our results show that iGROWL outperforms other state-of-the-art methods by 16.7% and 26.4% in terms of F1-score when evaluated on the Salsa Poster Session and Cocktail Party datasets, respectively. Moreover, we show that sample balancing with GNNs is not trivial, but consistent results can be achieved by employing ensemble learning.
Volume 5, Issue 3, pp. 400-410.
Citations: 0
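
The link-prediction view of group detection can be sketched as a pairwise classifier followed by connected components; the hand-crafted pair features and the small MLP below stand in for iGROWL's GNN and ensemble-based sample balancing, and the model would need to be trained on annotated pairs before use.

```python
# Hedged sketch of group detection as link prediction over pairwise spatial
# features: predict a link for each pair of people, then read groups off the
# connected components of the resulting graph.
import itertools
import torch
import torch.nn as nn
import networkx as nx

link_mlp = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 1))

def detect_groups(people, threshold=0.5):
    # people: (N, 4) tensor of [x, y, cos(theta), sin(theta)] per person.
    graph = nx.Graph()
    graph.add_nodes_from(range(len(people)))
    for i, j in itertools.combinations(range(len(people)), 2):
        pair = torch.cat([people[i], people[j]]).unsqueeze(0)  # (1, 8) pair feature
        if torch.sigmoid(link_mlp(pair)).item() > threshold:
            graph.add_edge(i, j)  # predicted conversational link
    return list(nx.connected_components(graph))
```
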
Generating 2-D and 3-D Master Faces for Dictionary Attacks With a Network-Assisted Latent Space Evolution
IEEE Transactions on Biometrics, Behavior, and Identity Science | Pub Date: 2022-11-24 | DOI: 10.1109/TBIOM.2022.3223738
Tomer Friedlander; Ron Shmelkin; Lior Wolf
Abstract: A master face is a face image that passes face-based identity authentication for a high percentage of the population. These faces can be used to impersonate, with a high probability of success, any user, without having access to any user information. We optimize these faces for 2D and 3D face verification models by using an evolutionary algorithm in the latent embedding space of the StyleGAN face generator. For 2D face verification, multiple evolutionary strategies are compared, and we propose a novel approach that employs a neural network to direct the search toward promising samples, without adding fitness evaluations. The results we present demonstrate that it is possible to obtain considerable coverage of the identities in the LFW or RFW datasets with fewer than 10 master faces, for six leading deep face recognition systems. In 3D, we generate faces using the 2D StyleGAN2 generator and predict a 3D structure using a deep 3D face reconstruction network. When employing two different 3D face recognition systems, we are able to obtain a coverage of 40%-50%. Additionally, we present the generation of paired 2D RGB and 3D master faces, which simultaneously match 2D and 3D models with high impersonation rates.
Volume 5, Issue 3, pp. 385-399.
Citations: 1
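
A minimal sketch of a latent-space evolution loop is given below; `generator`, `face_matcher`, the gallery of embeddings, and the similarity threshold are placeholders, and the network-assisted ranking of candidates described in the abstract is not reproduced.

```python
# Hedged sketch of a latent-space evolution loop for a master-face style search.
# `generator` and `face_matcher` stand in for a StyleGAN generator and a face
# verification model; values and thresholds are illustrative only.
import numpy as np

def coverage(latent, generator, face_matcher, gallery, thresh=0.4):
    emb = face_matcher(generator(latent))  # embedding of the generated face
    sims = gallery @ emb / (np.linalg.norm(gallery, axis=1) * np.linalg.norm(emb))
    return float((sims > thresh).mean())   # fraction of gallery identities matched

def evolve_master_face(generator, face_matcher, gallery, dim=512, pop=32, iters=100, sigma=0.3):
    best = np.random.randn(dim)
    best_fit = coverage(best, generator, face_matcher, gallery)
    for _ in range(iters):
        candidates = best + sigma * np.random.randn(pop, dim)  # Gaussian mutations around the elite
        fits = [coverage(c, generator, face_matcher, gallery) for c in candidates]
        if max(fits) > best_fit:
            best, best_fit = candidates[int(np.argmax(fits))], max(fits)
    return best, best_fit
```
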