Semantic Consistent Embedding for Domain Adaptive Zero-Shot Learning

Authors: Jianyang Zhang, Guowu Yang, Ping Hu, Guosheng Lin, Fengmao Lv
Journal: IEEE Transactions on Image Processing, vol. 32, pp. 4024-4035
DOI: 10.1109/TIP.2023.3293769
Published: 2023-07-13
Impact factor: 13.7
Citations: 0

Abstract

Unsupervised domain adaptation has limitations when the source and target domains have discrepant label sets. While open-set domain adaptation approaches can handle situations where the target domain contains additional categories, these methods can only detect such categories, not classify them further. In this paper, we focus on a more challenging setting dubbed Domain Adaptive Zero-Shot Learning (DAZSL), which uses semantic embeddings of class tags as the bridge between seen and unseen classes to learn a classifier that recognizes all categories in the target domain when only supervision for seen categories in the source domain is available. The main challenge of DAZSL is to perform knowledge transfer across categories and domain styles simultaneously. To this end, we propose a novel end-to-end learning mechanism dubbed Three-way Semantic Consistent Embedding (TSCE) to embed the source domain, target domain, and semantic space into a shared space. Specifically, TSCE learns domain-irrelevant categorical prototypes from the semantic embeddings of class tags and uses them as the pivots of the shared space. Source domain features are aligned with the prototypes via their supervised information. Meanwhile, a mutual information maximization mechanism is introduced to push the target domain features and the prototypes towards each other. In this way, our approach both bridges the domain differences between source and target images and promotes knowledge transfer towards unseen classes. Moreover, as there is no supervision in the target domain, the shared space may suffer from the catastrophic forgetting problem. Hence, we further propose a ranking-based embedding alignment mechanism to maintain consistency between the semantic space and the shared space. Experimental results on both I2AwA and I2WebV clearly validate the effectiveness of our method. Code is available at https://github.com/tiggers23/TSCE-Domain-Adaptive-Zero-Shot-Learning.
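The abstract describes three coupled objectives: supervised alignment of source features to prototypes derived from class semantics, mutual-information maximization that pulls unlabeled target features towards the prototypes, and a ranking-based consistency term. Below is a minimal PyTorch-style sketch of the first two components only, under assumptions of ours: the module and loss names, the MLP prototype projector, and the temperature `tau` are illustrative and are not taken from the paper or its released code (see the linked repository for the actual TSCE implementation).

```python
# Hypothetical sketch of prototype-based alignment and MI maximization;
# not the authors' TSCE implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class PrototypeSpace(nn.Module):
    """Maps class semantic embeddings (e.g., word vectors of class tags)
    to categorical prototypes used as pivots of the shared space."""
    def __init__(self, sem_dim: int, feat_dim: int):
        super().__init__()
        self.proj = nn.Sequential(
            nn.Linear(sem_dim, feat_dim), nn.ReLU(), nn.Linear(feat_dim, feat_dim)
        )

    def forward(self, class_semantics: torch.Tensor) -> torch.Tensor:
        # class_semantics: (num_classes, sem_dim) -> prototypes: (num_classes, feat_dim)
        return F.normalize(self.proj(class_semantics), dim=-1)

def source_alignment_loss(src_feat, src_labels, prototypes, tau=0.1):
    """Supervised alignment: cosine-similarity logits against all prototypes,
    trained with cross-entropy on the seen-class labels of source images."""
    logits = F.normalize(src_feat, dim=-1) @ prototypes.t() / tau
    return F.cross_entropy(logits, src_labels)

def target_mi_loss(tgt_feat, prototypes, tau=0.1):
    """Unsupervised mutual-information maximization: encourage confident
    per-sample assignments (low conditional entropy) while keeping the
    marginal class distribution diverse (high marginal entropy)."""
    p = F.softmax(F.normalize(tgt_feat, dim=-1) @ prototypes.t() / tau, dim=-1)
    cond_ent = -(p * torch.log(p + 1e-8)).sum(dim=1).mean()
    marginal = p.mean(dim=0)
    marg_ent = -(marginal * torch.log(marginal + 1e-8)).sum()
    return cond_ent - marg_ent  # minimizing this maximizes the MI estimate
```

In such a setup, the two losses would be summed with the ranking-based consistency term described in the paper and optimized end-to-end over source batches (labeled) and target batches (unlabeled).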