Cross-domain Prototype Contrastive Loss for Few-shot 2D Image-Based 3D Model Retrieval
Yaqian Zhou, Yu Liu, Dan Song, Jiayu Li, Xuanya Li, Anjin Liu
2023 IEEE International Conference on Multimedia and Expo (ICME), July 2023. DOI: 10.1109/ICME55011.2023.00492
2D image-based 3D model retrieval (IBMR) usually relies on abundant explicit supervision of 2D images, together with unlabeled 3D models, to learn domain-aligned yet class-discriminative features for the retrieval task. However, collecting large-scale 2D labels is costly and time-consuming. We therefore explore a challenging IBMR task in which only few-shot labeled 2D images are available, while the remaining 2D and 3D samples stay unlabeled. The limited annotation of 2D images further increases the difficulty of learning domain-aligned yet discriminative features. To address this, we propose a cross-domain prototype contrastive loss (CPCL) for the few-shot IBMR task. Specifically, we capture semantic information and learn class-discriminative features in each domain by minimizing an intra-domain prototype contrastive loss. In addition, we perform inter-domain transferable contrastive learning to align instances with prototypes of the same class across domains. Comprehensive experiments on the popular MI3DOR and MI3DOR-2 benchmarks validate the superiority of CPCL.
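To make the prototype-based loss concrete, below is a minimal PyTorch sketch of a prototype contrastive loss of the kind the abstract describes: instance features are scored against L2-normalized class prototypes and trained with a cross-entropy over those similarities. This is an illustration only, not the authors' implementation; the function names, the temperature value, and the feature dimensions are assumptions. In CPCL, the intra-domain term would use prototypes from the instance's own domain, while the inter-domain term would instead pair instances with same-class prototypes of the other domain.

```python
# Minimal sketch (not the authors' released code) of a prototype contrastive loss:
# each feature is pulled toward the prototype (class mean) of its class and pushed
# away from the other prototypes via a softmax over feature-prototype similarities.
import torch
import torch.nn.functional as F


def class_prototypes(features, labels, num_classes):
    """Compute L2-normalized class-mean prototypes from labeled (or pseudo-labeled) features."""
    protos = torch.zeros(num_classes, features.size(1), device=features.device)
    protos.index_add_(0, labels, features)                      # sum features per class
    counts = torch.bincount(labels, minlength=num_classes).clamp(min=1).unsqueeze(1)
    return F.normalize(protos / counts, dim=1)


def prototype_contrastive_loss(features, labels, prototypes, temperature=0.1):
    """features:   (N, D) L2-normalized instance embeddings from one domain.
    labels:     (N,)   class indices (ground-truth or pseudo-labels).
    prototypes: (C, D) L2-normalized per-class prototype vectors.
    Returns the mean cross-entropy between each instance and its class prototype.
    """
    logits = features @ prototypes.t() / temperature            # (N, C) similarity scores
    return F.cross_entropy(logits, labels)


# Toy usage: features scored against prototypes built from the same set
# (intra-domain term); for the inter-domain term, the prototypes would come
# from the other domain while the labels stay the instance's own class.
if __name__ == "__main__":
    feats = F.normalize(torch.randn(8, 128), dim=1)             # 8 instances, 128-d features (assumed sizes)
    labels = torch.randint(0, 5, (8,))                          # 5 classes (assumed)
    protos = class_prototypes(feats, labels, num_classes=5)
    print(prototype_contrastive_loss(feats, labels, protos))
```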