Hierarchical Multivariate Representation Learning for Face Sketch Recognition

IF 5.3; Zone 3, Computer Science; Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

IEEE Transactions on Emerging Topics in Computational Intelligence Pub Date : 2024-02-12 DOI:10.1109/TETCI.2024.3359090

Jiahao Zheng;Yu Tang;Anthony Huang;Dapeng Wu



Abstract

Face Sketch Recognition (FSR) is extremely challenging because of the heterogeneous gap between sketches and images. Relying on the capability of generative models, prior generation-based works have long dominated FSR by decomposing it into two steps: heterogeneous data synthesis and homogeneous data matching. However, decomposing FSR into two steps introduces noise and uncertainty, and the first step, heterogeneous data synthesis, is an even more general and challenging problem. Requiring the solution of a more general problem in order to solve a specific one puts the cart before the horse. To solve FSR directly and circumvent these problems of generation-based methods, we propose a multi-view representation learning (MRL) framework based on Multivariate Loss and Hierarchical Loss (MvHi). Specifically, by using triplet loss as a bridge to connect the augmented representations generated by InfoNCE, we propose Multivariate Loss (Mv) to construct a more robust common feature subspace between sketches and images and solve FSR directly in this subspace. Moreover, Hierarchical Loss (Hi) is proposed to improve training stability by utilizing the hidden states of the feature extractor. Comprehensive experiments on two commonly used datasets, CUFS and CUFSF, show that the proposed approach outperforms state-of-the-art methods by more than 7%. In addition, visualization experiments show that, compared to the baseline methods, the proposed approach can extract the common representations among multi-view data.
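The abstract names two standard building blocks, InfoNCE and triplet loss, without giving their formulas. The following is a minimal sketch of how these two generic losses behave on sketch and photo embeddings. It is not the paper's Multivariate Loss: the function names, the use of cosine similarity, the temperature, the margin, the hardest-negative choice, and the simple weighted sum in `combined_loss` are all assumptions made for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE: negative log-softmax score of the positive among all candidates."""
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]
    m = max(logits)  # subtract the max for numerical stability
    log_sum = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_sum - logits[0]

def triplet(anchor, positive, negative, margin=0.2):
    """Triplet loss on cosine distance (1 - cosine similarity)."""
    d_pos = 1.0 - cosine(anchor, positive)
    d_neg = 1.0 - cosine(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)

def combined_loss(sketch, photo, other_photos, weight=1.0):
    """Hypothetical combination (an assumption, not the paper's formula):
    cross-modal InfoNCE plus a triplet term on the hardest negative photo."""
    hardest = max(other_photos, key=lambda p: cosine(sketch, p))
    return info_nce(sketch, photo, other_photos) + weight * triplet(sketch, photo, hardest)

# Toy embeddings: one sketch, the photo of the same identity, two other photos.
sketch = [1.0, 0.1, 0.0]
photo_same = [0.9, 0.2, 0.1]
photos_other = [[0.0, 1.0, 0.0], [0.1, 0.0, 1.0]]

aligned = combined_loss(sketch, photo_same, photos_other)
misaligned = combined_loss(sketch, photos_other[0], [photo_same, photos_other[1]])
print(aligned < misaligned)
```

In this toy setup the loss is small when the sketch is paired with the photo of the same identity and large when it is paired with a different one, which is the behavior a shared sketch-photo feature subspace relies on for direct matching.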


Source journal

IEEE Transactions on Emerging Topics in Computational Intelligence (Mathematics-Control and Optimization)

CiteScore: 10.30

Self-citation rate: 7.50%

Articles published: 147

Journal description: The IEEE Transactions on Emerging Topics in Computational Intelligence (TETCI) publishes original articles on emerging aspects of computational intelligence, including theory, applications, and surveys. TETCI is an electronic-only publication and publishes six issues per year. Authors are encouraged to submit manuscripts on any emerging topic in computational intelligence, especially nature-inspired computing topics not covered by other IEEE Computational Intelligence Society journals. Illustrative examples include glial cell networks, computational neuroscience, brain-computer interfaces, ambient intelligence, non-fuzzy computing with words, artificial life, cultural learning, artificial endocrine networks, social reasoning, artificial hormone networks, and computational intelligence for the IoT and Smart-X technologies.
