Query-Directed Probing LSH for Cosine Similarity
Shengyingjie Liu, Jianwen Sun, Zhi Liu, Xian Peng, Sanya Liu
International Conference on Network, Communication and Computing, 2016-12-17
DOI: 10.1145/3033288.3033318 (https://doi.org/10.1145/3033288.3033318)
Locality-sensitive hashing (LSH), an efficient algorithm for large-scale similarity search, has become increasingly popular, and many of its variants are now widely applied to high-dimensional similarity search. To overcome the need for a large number of hash tables, researchers proposed Multi-Probe LSH (MP-LSH), which improves the utilization of each hash table. MP-LSH describes two major probing sequences: the Step-Wise Probing (SWP) sequence and the Query-Directed Probing (QDP) sequence. The QDP sequence has been verified to outperform the SWP sequence in both number of probes and query time. However, the original QDP sequence is based on E2LSH, so it applies only to Euclidean distance. For cosine similarity, the SWP sequence has remained the only feasible way to perform Multi-Probe LSH.
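To make the step-wise baseline concrete, here is a minimal sketch of sign random projection hashing (a standard LSH family for cosine similarity) with a step-wise probing sequence that visits the home bucket and then all buckets at Hamming distance 1, 2, and so on. The abstract does not give implementation details, so the function names, the Gaussian hyperplanes, and the `max_flips` parameter are all assumptions made for illustration.

```python
import itertools
import random


def make_hyperplanes(num_bits, dim, seed=0):
    """Draw num_bits random Gaussian hyperplane normals (hypothetical helper)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def projections(planes, vec):
    """Signed projection of vec onto each hyperplane normal."""
    return [sum(p * x for p, x in zip(plane, vec)) for plane in planes]


def hash_bits(planes, vec):
    """Sign random projection: one bit per hyperplane."""
    return tuple(1 if s >= 0 else 0 for s in projections(planes, vec))


def step_wise_probes(bits, max_flips):
    """Yield the home bucket, then every bucket at Hamming distance 1, 2, ..."""
    yield bits
    for k in range(1, max_flips + 1):
        for idx in itertools.combinations(range(len(bits)), k):
            probe = list(bits)
            for i in idx:
                probe[i] ^= 1  # flip the chosen bit
            yield tuple(probe)
```

Note that this ordering ignores the query itself: every bucket at the same Hamming distance is treated as equally promising, which is the inefficiency a query-directed sequence aims to remove.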
This paper proposes a QDP-based approach for cosine similarity search, together with a complete theoretical treatment and the corresponding proofs. Experiments on two types of open data sets demonstrate that, for cosine similarity, our algorithm needs fewer probes and less time than the SWP sequence to achieve high query quality.
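The abstract does not state the paper's scoring rule, so the following is only a sketch of one plausible query-directed ordering for sign-based hashes, not the authors' method. It assumes that a bit whose projection lies close to zero is the most likely to differ for a near neighbor, and therefore ranks each candidate bucket by the sum of `|projection|` over its flipped bits. The function name, the score, and the brute-force enumeration are illustrative assumptions.

```python
import itertools


def query_directed_probes(projs, max_flips):
    """Yield (bucket, score) pairs in ascending score order.

    projs: signed projections of the query onto each hyperplane normal.
    score: assumed heuristic, the sum of |projection| over flipped bits.
    """
    home = tuple(1 if s >= 0 else 0 for s in projs)
    candidates = []
    for k in range(0, max_flips + 1):
        for idx in itertools.combinations(range(len(projs)), k):
            score = sum(abs(projs[i]) for i in idx)
            candidates.append((score, idx))
    candidates.sort()  # cheapest perturbations first
    for score, idx in candidates:
        probe = list(home)
        for i in idx:
            probe[i] ^= 1  # flip the chosen bit
        yield tuple(probe), score
```

For example, with projections `[0.9, -0.05, 0.4, -0.6]` the first bucket after the home bucket flips the second bit, whose projection is nearest to zero. A production version would generate candidates lazily with a heap rather than enumerating and sorting every subset.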