Sketch face recognition based on light semantic Transformer network

IF 1.5 · Zone 4, Computer Science · JCR Q4: COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

IET Computer Vision Pub Date : 2023-05-30 DOI:10.1049/cvi2.12209

Lin Cao, Jianqiang Yin, Yanan Guo, Kangning Du, Fan Zhang

IET Computer Vision, volume 17, issue 8, pages 962-976. Publication type: Journal Article.

Publisher page: https://onlinelibrary.wiley.com/doi/10.1049/cvi2.12209

PDF: https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/cvi2.12209

Citations: 0

Abstract

Sketch face recognition is widely used in criminal investigation, but it remains challenging because training samples are scarce and cross-modality differences between sketches and photos cause semantic deficiencies. The authors propose a light semantic Transformer network to extract and model the semantic information of cross-modality images. First, they employ a meta-learning training strategy to obtain task-related training samples, which addresses the small-sample problem. Second, to reconcile the high complexity of the Transformer with the small-sample nature of sketch face recognition, they build the light semantic Transformer network by proposing a hierarchical group linear transformation and introducing parameter sharing, so that the network can extract highly discriminative semantic features from small-scale datasets. Finally, they propose a domain-adaptive focal loss to reduce the cross-modality differences between sketches and photos and to improve the training of the light semantic Transformer network. Extensive experiments show that the features extracted by the proposed method are highly discriminative. The method improves the recognition rate by 7.6% on the UoM-SGFSv2 dataset and reaches a recognition rate of 92.59% on the CUFSF dataset.
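The abstract names two building blocks without giving their formulas, so the following is only an illustrative sketch of their generic, textbook forms and not the authors' implementation. It assumes a plain (non-hierarchical) group linear transformation, in which the input vector is split into independent groups, and the standard focal loss rather than the domain-adaptive variant proposed in the paper. All function names, sizes, and the default `gamma` are hypothetical choices made for illustration.

```python
import math


def dense_param_count(d_in, d_out):
    """Weights in an ordinary (ungrouped) linear map, bias excluded."""
    return d_in * d_out


def group_param_count(d_in, d_out, groups):
    """Weights when input and output are split into `groups` independent blocks."""
    if d_in % groups or d_out % groups:
        raise ValueError("d_in and d_out must be divisible by groups")
    return groups * (d_in // groups) * (d_out // groups)


def group_linear(x, weights):
    """Apply a group linear map to the vector x.

    weights[g] is a (d_out/groups) x (d_in/groups) matrix given as a list
    of rows; group g only sees its own slice of x.
    """
    groups = len(weights)
    chunk = len(x) // groups
    out = []
    for g, w in enumerate(weights):
        x_g = x[g * chunk:(g + 1) * chunk]
        for row in w:
            out.append(sum(a * b for a, b in zip(row, x_g)))
    return out


def focal_loss(p_true, gamma=2.0):
    """Standard focal loss for the probability assigned to the correct class.

    With gamma = 0 this reduces to ordinary cross-entropy, -log(p_true).
    """
    return -((1.0 - p_true) ** gamma) * math.log(p_true)


if __name__ == "__main__":
    # Splitting a 256 -> 256 map into 4 groups cuts the weight count by 4x.
    print(dense_param_count(256, 256), group_param_count(256, 256, 4))
    # Two groups, each mapping a 2-element slice to 1 output.
    print(group_linear([1.0, 2.0, 3.0, 4.0], [[[1.0, 0.0]], [[0.0, 1.0]]]))
    # Easy examples (high p_true) are down-weighted relative to hard ones.
    print(focal_loss(0.9), focal_loss(0.1))
```

Under these assumptions, grouping divides the weight count of a linear layer by the number of groups, which is one generic way to lighten a Transformer for small datasets, and the focal term `(1 - p_true) ** gamma` shifts the training signal toward hard examples. How the paper arranges groups hierarchically, shares parameters, or adapts the loss across the sketch and photo domains is not stated in the abstract and is not reproduced here.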




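Source journal

IET Computer Vision (Engineering & Technology: Engineering, Electrical & Electronic)

CiteScore

3.30

Self-citation rate

11.80%

Number of articles published

Review time

3.4 months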

Journal description: IET Computer Vision seeks original research papers in a wide range of areas of computer vision. The vision of the journal is to publish the highest quality research work that is relevant and topical to the field, without forgetting work that aims to introduce new horizons and set the agenda for future avenues of research in computer vision.

IET Computer Vision welcomes submissions on the following topics: Biologically and perceptually motivated approaches to low level vision (feature detection, etc.); Perceptual grouping and organisation; Representation, analysis and matching of 2D and 3D shape; Shape-from-X; Object recognition; Image understanding; Learning with visual inputs; Motion analysis and object tracking; Multiview scene analysis; Cognitive approaches in low, mid and high level vision; Control in visual systems; Colour, reflectance and light; Statistical and probabilistic models; Face and gesture; Surveillance; Biometrics and security; Robotics; Vehicle guidance; Automatic model acquisition; Medical image analysis and understanding; Aerial scene analysis and remote sensing; Deep learning models in computer vision.

Both methodological and applications-orientated papers are welcome. Submitted manuscripts are expected to include a detailed and analytical review of the literature, a state-of-the-art exposition of the proposed research and its methodology, a thorough experimental evaluation, and a comparative evaluation against relevant state-of-the-art methods. Submissions that do not meet these minimum requirements may be returned to authors without being sent to review.

Special Issues, current calls for papers: Computer Vision for Smart Cameras and Camera Networks - https://digital-library.theiet.org/files/IET_CVI_SC.pdf; Computer Vision for the Creative Industries - https://digital-library.theiet.org/files/IET_CVI_CVCI.pdf