Prototype-based Selective Knowledge Distillation for Zero-Shot Sketch Based Image Retrieval

Kai Wang, Yifan Wang, Xing Xu, Xin Liu, Weihua Ou, Huimin Lu
DOI: 10.1145/3503161.3548382
Published in: Proceedings of the 30th ACM International Conference on Multimedia
Publication date: 2022-10-10
Citations: 7

Abstract

Zero-Shot Sketch-Based Image Retrieval (ZS-SBIR) is an emerging research task that aims to retrieve data of new classes across sketches and images. It is challenging due to the heterogeneous distributions and the inconsistent semantics of the cross-modal sketch and image data across seen and unseen classes. To realize knowledge transfer, the latest approaches introduce knowledge distillation, which optimizes the student network through the teacher signal distilled from a teacher network pre-trained on large-scale datasets. However, these methods often ignore mispredictions in the teacher signal, which may make the model vulnerable when disturbed by wrong outputs of the teacher network. To tackle these issues, we propose a novel method termed Prototype-based Selective Knowledge Distillation (PSKD) for ZS-SBIR. Our PSKD method first learns a set of prototypes to represent categories and then utilizes an instance-level adaptive learning strategy to strengthen the semantic relations between categories. Afterwards, a correlation matrix targeted at the downstream task is established through the prototypes. With the learned correlation matrix, the teacher signal, given by transformers pre-trained on ImageNet and fine-tuned on the downstream dataset, can be reconstructed to weaken the impact of mispredictions and selectively distill knowledge into the student network. Extensive experiments conducted on three widely-used datasets demonstrate that the proposed PSKD method establishes new state-of-the-art performance on all datasets for ZS-SBIR.
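The core mechanism described in the abstract, smoothing a possibly wrong teacher prediction through a class correlation matrix derived from learned prototypes before distilling it into the student, can be sketched as follows. This is an illustrative NumPy sketch under assumptions, not the authors' implementation: the function names, the cosine-similarity-plus-softmax construction of the correlation matrix, and the cross-entropy-style distillation loss are choices made here for clarity; the paper's actual prototype learning and instance-level adaptive strategy are not reproduced.

```python
import numpy as np

def softmax(z, axis=-1):
    """Numerically stable softmax."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def prototype_correlation(prototypes):
    """Build a row-stochastic class correlation matrix from class prototypes.

    Row i holds the (softmax-normalized) cosine similarities of class i
    to every class, so semantically close classes get high weight.
    """
    p = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
    return softmax(p @ p.T, axis=1)

def reconstruct_teacher_signal(teacher_probs, corr):
    """Redistribute teacher probability mass along class correlations.

    If the teacher confidently predicts a wrong class, part of that mass
    flows to semantically related classes, weakening the misprediction.
    """
    mixed = teacher_probs @ corr
    return mixed / mixed.sum(axis=1, keepdims=True)

def selective_kd_loss(student_logits, teacher_probs, corr, T=2.0):
    """Cross-entropy between the reconstructed teacher signal and the
    temperature-softened student predictions (a common KD loss form)."""
    target = reconstruct_teacher_signal(teacher_probs, corr)
    log_q = np.log(softmax(student_logits / T, axis=1) + 1e-12)
    return -(target * log_q).sum(axis=1).mean()
```

In this toy form, a one-hot but mistaken teacher prediction no longer dominates the distillation target: the correlation matrix reassigns most of its mass to the classes whose prototypes lie nearby, which is the "selective" behavior the abstract describes.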