A Compositional Feature Embedding and Similarity Metric for Ultra-Fine-Grained Visual Categorization

2021 Digital Image Computing: Techniques and Applications (DICTA) | Pub Date: 2021-09-25 | DOI: 10.1109/DICTA52665.2021.9647081

Yajie Sun, Miaohua Zhang, Xiaohan Yu, Yi Liao, Yongsheng Gao


Citations: 2

Abstract



Fine-grained visual categorization (FGVC), which aims to classify objects with small inter-class variance, has advanced significantly in recent years. However, ultra-fine-grained visual categorization (ultra-FGVC), which targets subclasses with extremely similar patterns, has received little attention. In ultra-FGVC datasets, samples per category are scarce as the granularity becomes finer, which leads to overfitting. Moreover, the differences between categories are often too subtle for even professional experts to distinguish. Motivated by these issues, this paper proposes a novel compositional feature embedding and similarity metric (CECS). Specifically, in the compositional feature embedding module, we randomly select patches in the original input image, and these patches are then either replaced by patches from images of different categories or masked out. The replaced and masked images are used to augment the original input images, which provides more diverse samples and thus largely alleviates the overfitting caused by limited training samples. In addition, learning from diverse samples forces the model to learn not only the most discriminative features but also other informative features in the remaining regions, which enhances the model's generalization and robustness. In the compositional similarity metric module, a new similarity metric is developed to improve classification performance by narrowing the intra-category distance and enlarging the inter-category distance. Experimental results on two ultra-FGVC datasets and one FGVC dataset, compared against recent benchmark methods, consistently demonstrate that the proposed CECS method achieves state-of-the-art performance.
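The patch replacement and masking step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the patch size, the number of patches, or where replacement patches are taken from. The sketch assumes a regular grid of cells, takes the replacement patch from the same location in a donor image, and masks with zeros. The function name `compose_patches` and its parameters are hypothetical, and plain nested lists stand in for image arrays so the example runs without third-party packages.

```python
import copy
import random


def compose_patches(image, donor, grid=4, n_patches=3, mode="replace", seed=None):
    """Return an augmented copy of `image` and the grid cells that were changed.

    image, donor: 2D lists (rows of grayscale pixel values) of identical size.
    `donor` is assumed to come from a different category than `image`.
    mode: "replace" copies the donor's pixels into the chosen cells,
          "mask" sets the chosen cells to 0.
    All defaults here are illustrative assumptions, not values from the paper.
    """
    if mode not in ("replace", "mask"):
        raise ValueError("mode must be 'replace' or 'mask'")
    height, width = len(image), len(image[0])
    if len(donor) != height or len(donor[0]) != width:
        raise ValueError("image and donor must have the same size")

    rng = random.Random(seed)
    patch_h, patch_w = height // grid, width // grid
    out = copy.deepcopy(image)  # leave the original image untouched

    # Pick distinct grid cells at random.
    cells = rng.sample(range(grid * grid), n_patches)
    for cell in cells:
        row, col = divmod(cell, grid)
        for y in range(row * patch_h, (row + 1) * patch_h):
            for x in range(col * patch_w, (col + 1) * patch_w):
                out[y][x] = donor[y][x] if mode == "replace" else 0
    return out, sorted(cells)


# Toy 8x8 "images": every pixel is 1 in the input and 2 in the donor.
image = [[1] * 8 for _ in range(8)]
donor = [[2] * 8 for _ in range(8)]

replaced, cells_replaced = compose_patches(image, donor, mode="replace", seed=0)
masked, cells_masked = compose_patches(image, donor, mode="mask", seed=0)
```

In a training loop, `replaced` and `masked` would be added to the batch alongside the original image. How labels are assigned to the composed images is not stated in the abstract, so that part is left out here.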

