Learning Flexible Binary Code for Linear Projection Based Hashing with Random Forest

2014 22nd International Conference on Pattern Recognition. Pub Date: 2014-12-08. DOI: 10.1109/ICPR.2014.464

Shuze Du, Wei Zhang, Shifeng Chen, Y. Wen


Citations: 3

Abstract

Existing hashing methods based on linear projection have made considerable progress in finding the approximate nearest neighbor(s) of a given query, and they perform well with short codes. However, their code length depends on the original data dimension, so for low-dimensional data their performance cannot be improved further by using more bits. In addition, for high-dimensional data, producing each bit with a sign function is not a good choice. In this paper, we propose a novel random forest based approach to cope with these shortcomings. The bits are obtained by recording the path a point takes as it traverses each tree in the forest. We then propose a new metric for computing the similarity between any two codes. Experimental results on two large benchmark datasets show that our approach outperforms existing state-of-the-art hashing methods for descriptor retrieval.
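
The path-recording idea in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not specify how the trees are trained or how the proposed similarity metric is defined. The sketch assumes each internal node splits on a random Gaussian projection at the median, and it uses plain Hamming distance as a stand-in for the paper's metric. All names (`build_tree`, `encode`, `hamming`) are hypothetical.

```python
import random


def project(w, x):
    """Dot product of a projection vector w with a point x."""
    return sum(wi * xi for wi, xi in zip(w, x))


def build_tree(points, dim, depth, rng):
    """Build a full binary tree of the given depth.

    Assumption: each node splits on a random linear projection,
    thresholded at the median of the projected training points.
    """
    if depth == 0:
        return None
    w = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    values = sorted(project(w, p) for p in points)
    t = values[len(values) // 2] if values else 0.0
    left = [p for p in points if project(w, p) < t]
    right = [p for p in points if project(w, p) >= t]
    return {
        "w": w,
        "t": t,
        "left": build_tree(left, dim, depth - 1, rng),
        "right": build_tree(right, dim, depth - 1, rng),
    }


def encode(forest, x):
    """Record the left/right decision at every level of every tree."""
    bits = []
    for tree in forest:
        node = tree
        while node is not None:
            go_right = project(node["w"], x) >= node["t"]
            bits.append(1 if go_right else 0)
            node = node["right"] if go_right else node["left"]
    return bits


def hamming(a, b):
    """Stand-in distance; the paper proposes its own metric instead."""
    return sum(u != v for u, v in zip(a, b))


rng = random.Random(0)
# 200 two-dimensional points: a low-dimensional dataset.
data = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(200)]
forest = [build_tree(data, 2, 4, rng) for _ in range(8)]
code = encode(forest, data[0])
print(len(code))  # 8 trees x 4 levels = 32 bits
```

In this sketch the code length is the number of trees times the tree depth, so it can exceed the data dimension (32 bits for 2-dimensional points here), which is the flexibility the abstract argues for.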
