Cross-batch Reference Learning for Deep Classification and Retrieval

Huei-Fang Yang, Kevin Lin, Chu-Song Chen
{"title":"深度分类与检索的跨批参考学习","authors":"Huei-Fang Yang, Kevin Lin, Chu-Song Chen","doi":"10.1145/2964284.2964324","DOIUrl":null,"url":null,"abstract":"Learning feature representations for image retrieval is essential to multimedia search and mining applications. Recently, deep convolutional networks (CNNs) have gained much attention due to their impressive performance on object detection and image classification, and the feature representations learned from a large-scale generic dataset (e.g., ImageNet) can be transferred to or fine-tuned on the datasets of other domains. However, when the feature representations learned with a deep CNN are applied to image retrieval, the performance is still not as good as they are used for classification, which restricts their applicability to relevant image search. To ensure the retrieval capability of the learned feature space, we introduce a new idea called cross-batch reference (CBR) to enhance the stochastic-gradient-descent (SGD) training of CNNs. In each iteration of our training process, the network adjustment relies not only on the training samples in a single batch, but also on the information passed by the samples in the other batches. This inter-batches communication mechanism is formulated as a cross-batch retrieval process based on the mean average precision (MAP) criterion, where the relevant and irrelevant samples are encouraged to be placed on top and rear of the retrieval list, respectively. The learned feature space is not only discriminative to different classes, but the samples that are relevant to each other or of the same class are also enforced to be centralized. To maximize the cross-batch MAP, we design a loss function that is an approximated lower bound of the MAP on the feature layer of the network, which is differentiable and easier for optimization. By combining the intra-batch classification and inter-batch cross-reference losses, the learned features are effective for both classification and retrieval tasks. Experimental results on various benchmarks demonstrate the effectiveness of our approach.","PeriodicalId":140670,"journal":{"name":"Proceedings of the 24th ACM international conference on Multimedia","volume":"126 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2016-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"15","resultStr":"{\"title\":\"Cross-batch Reference Learning for Deep Classification and Retrieval\",\"authors\":\"Huei-Fang Yang, Kevin Lin, Chu-Song Chen\",\"doi\":\"10.1145/2964284.2964324\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Learning feature representations for image retrieval is essential to multimedia search and mining applications. Recently, deep convolutional networks (CNNs) have gained much attention due to their impressive performance on object detection and image classification, and the feature representations learned from a large-scale generic dataset (e.g., ImageNet) can be transferred to or fine-tuned on the datasets of other domains. However, when the feature representations learned with a deep CNN are applied to image retrieval, the performance is still not as good as they are used for classification, which restricts their applicability to relevant image search. To ensure the retrieval capability of the learned feature space, we introduce a new idea called cross-batch reference (CBR) to enhance the stochastic-gradient-descent (SGD) training of CNNs. 
In each iteration of our training process, the network adjustment relies not only on the training samples in a single batch, but also on the information passed by the samples in the other batches. This inter-batches communication mechanism is formulated as a cross-batch retrieval process based on the mean average precision (MAP) criterion, where the relevant and irrelevant samples are encouraged to be placed on top and rear of the retrieval list, respectively. The learned feature space is not only discriminative to different classes, but the samples that are relevant to each other or of the same class are also enforced to be centralized. To maximize the cross-batch MAP, we design a loss function that is an approximated lower bound of the MAP on the feature layer of the network, which is differentiable and easier for optimization. By combining the intra-batch classification and inter-batch cross-reference losses, the learned features are effective for both classification and retrieval tasks. Experimental results on various benchmarks demonstrate the effectiveness of our approach.\",\"PeriodicalId\":140670,\"journal\":{\"name\":\"Proceedings of the 24th ACM international conference on Multimedia\",\"volume\":\"126 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2016-10-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"15\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 24th ACM international conference on Multimedia\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/2964284.2964324\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 24th ACM international conference on Multimedia","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2964284.2964324","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 15

Abstract

Learning feature representations for image retrieval is essential to multimedia search and mining applications. Recently, deep convolutional networks (CNNs) have gained much attention due to their impressive performance on object detection and image classification, and the feature representations learned from a large-scale generic dataset (e.g., ImageNet) can be transferred to, or fine-tuned on, the datasets of other domains. However, when the feature representations learned with a deep CNN are applied to image retrieval, the performance is still not as good as when they are used for classification, which restricts their applicability to relevant image search. To ensure the retrieval capability of the learned feature space, we introduce a new idea called cross-batch reference (CBR) to enhance the stochastic-gradient-descent (SGD) training of CNNs. In each iteration of our training process, the network adjustment relies not only on the training samples in a single batch, but also on the information passed by the samples in the other batches. This inter-batch communication mechanism is formulated as a cross-batch retrieval process based on the mean average precision (MAP) criterion, in which relevant and irrelevant samples are encouraged to be placed at the top and the rear of the retrieval list, respectively. The learned feature space is thus not only discriminative across classes; samples that are relevant to each other or belong to the same class are also drawn close together. To maximize the cross-batch MAP, we design a loss function that is an approximate lower bound of the MAP on the feature layer of the network, and is therefore differentiable and easier to optimize. By combining the intra-batch classification loss and the inter-batch cross-reference loss, the learned features are effective for both classification and retrieval tasks. Experimental results on various benchmarks demonstrate the effectiveness of our approach.
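To make the training mechanism concrete, the following is a minimal PyTorch sketch of the cross-batch idea. Recall that for one query, average precision is AP = (1/R) * sum_k P(k) * rel(k), the precision at each rank where a relevant item appears averaged over the R relevant items, and MAP averages AP over all queries. In the sketch, the CrossBatchReference class, the margin and the 0.1 loss weight, and the smooth pairwise hinge surrogate used in place of the paper's approximated MAP lower bound are all illustrative assumptions, not the authors' implementation.

import torch
import torch.nn.functional as F

class CrossBatchReference:
    """Holds the features and labels of the previous batch and scores the
    current batch against them with a differentiable retrieval loss.
    (Illustrative sketch; the paper's exact MAP lower bound differs.)"""

    def __init__(self, margin=0.5):
        self.margin = margin
        self.ref_feats = None    # detached features from the previous batch
        self.ref_labels = None

    def loss(self, feats, labels):
        if self.ref_feats is None:
            cbr = feats.new_zeros(())          # no reference batch yet
        else:
            # Cosine similarity of each current sample (query) against every
            # sample carried over from the previous batch (retrieval set).
            sim = F.normalize(feats, dim=1) @ F.normalize(self.ref_feats, dim=1).t()
            rel = labels[:, None] == self.ref_labels[None, :]
            pos, neg = sim[rel], sim[~rel]     # relevant / irrelevant pairs
            if pos.numel() == 0 or neg.numel() == 0:
                cbr = feats.new_zeros(())
            else:
                # Smooth pairwise surrogate: every relevant pair should score
                # at least `margin` above every irrelevant pair, pushing
                # relevant samples toward the top of the retrieval list and
                # irrelevant ones toward the rear.
                cbr = F.relu(self.margin - pos[:, None] + neg[None, :]).mean()
        # The current batch becomes the reference for the next iteration.
        self.ref_feats = feats.detach()
        self.ref_labels = labels.detach()
        return cbr

# Toy usage on random data (hypothetical shapes: 8-sample batches of 64-D
# features, 4 classes, and a stand-in linear classifier head).
cbr = CrossBatchReference()
for _ in range(2):
    feats = torch.randn(8, 64, requires_grad=True)
    labels = torch.randint(0, 4, (8,))
    logits = feats @ torch.randn(64, 4)
    total = F.cross_entropy(logits, labels) + 0.1 * cbr.loss(feats, labels)
    total.backward()

Note that in this sketch the stored batch is detached, so gradients flow only through the current batch while the previous one serves as a fixed retrieval set; this keeps the inter-batch communication cheap, but it is a design choice of the sketch rather than a detail confirmed by the abstract.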