Locality Sensitive Hashing for Set-Queries, Motivated by Group Recommendations
Authors: Haim Kaplan, J. Tenenbaum
Venue: Scandinavian Workshop on Algorithm Theory
Publication date: 2020-04-15
DOI: https://doi.org/10.4230/LIPIcs.SWAT.2020.28
Citation count: 1
Abstract
Locality Sensitive Hashing (LSH) is an effective method for indexing a set of points so that the nearest neighbors of a query point can be found efficiently. We extend this method to our novel Set-query LSH (SLSH), which finds the nearest neighbors of a set of points given as a query.
Let $s(x,y)$ be the similarity between two points $x$ and $y$. We define a similarity between a set $Q$ and a point $x$ by aggregating the similarities $s(p,x)$ over all $p\in Q$. For example, we can take $s(p,x)$ to be the angular similarity between $p$ and $x$ (i.e., $1-{\angle (x,p)}/{\pi}$), and aggregate by arithmetic or geometric averaging, or by taking the minimum similarity.
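As an illustration of the definition above, the following minimal sketch computes the set-to-point similarity directly, assuming points are given as tuples of floats. The function names `angular_similarity` and `set_similarity` and the `how` parameter are hypothetical and do not come from the paper; this is a brute-force evaluation of the aggregated similarity, not the SLSH hash family itself.

```python
import math


def angular_similarity(x, p):
    """Angular similarity 1 - angle(x, p) / pi for two nonzero vectors."""
    dot = sum(a * b for a, b in zip(x, p))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_p = math.sqrt(sum(b * b for b in p))
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cos_angle = max(-1.0, min(1.0, dot / (norm_x * norm_p)))
    return 1.0 - math.acos(cos_angle) / math.pi


def set_similarity(Q, x, how="arithmetic"):
    """Aggregate s(p, x) over all p in Q (illustrative helper, not from the paper)."""
    sims = [angular_similarity(x, p) for p in Q]
    if how == "arithmetic":
        return sum(sims) / len(sims)
    if how == "geometric":
        return math.prod(sims) ** (1.0 / len(sims))
    if how == "min":
        return min(sims)
    raise ValueError(f"unknown aggregation: {how}")


# Example: a two-point query set and a candidate point aligned with the first one.
Q = [(1.0, 0.0), (0.0, 1.0)]
x = (1.0, 0.0)
arithmetic = set_similarity(Q, x, "arithmetic")
geometric = set_similarity(Q, x, "geometric")
lowest = set_similarity(Q, x, "min")
```

Here the individual similarities are 1 (zero angle) and 1/2 (right angle), so the three aggregates differ: the minimum is the most pessimistic, and the geometric mean lies between the minimum and the arithmetic mean.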
We develop locality sensitive hash families and data structures for a large set of such arithmetic and geometric averaging similarities, and analyze their collision probabilities. We also establish an analogous framework and hash families for distance functions. Specifically, we give a structure for the Euclidean distance aggregated by either averaging or taking the maximum.
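For the distance variant, a linear-scan baseline makes the objective concrete. The sketch below assumes points are tuples of floats and uses hypothetical names (`set_distance`, `nearest_to_set`); it evaluates the aggregated Euclidean distance exhaustively and is not the data structure developed in the paper, which is designed to avoid scanning every point.

```python
import math


def set_distance(Q, x, how="average"):
    """Euclidean distance from x to the set Q, aggregated by average or maximum."""
    dists = [math.dist(x, p) for p in Q]
    if how == "average":
        return sum(dists) / len(dists)
    if how == "max":
        return max(dists)
    raise ValueError(f"unknown aggregation: {how}")


def nearest_to_set(points, Q, how="average"):
    """Brute-force baseline: scan all indexed points and keep the closest to Q."""
    return min(points, key=lambda x: set_distance(Q, x, how))


# Example: two query points on the x-axis and three candidate points.
Q = [(0.0, 0.0), (4.0, 0.0)]
points = [(2.0, 0.0), (0.0, 3.0), (10.0, 0.0)]
avg_dist = set_distance(Q, (0.0, 3.0), "average")
max_dist = set_distance(Q, (0.0, 3.0), "max")
best_avg = nearest_to_set(points, Q, "average")
best_max = nearest_to_set(points, Q, "max")
```

The point (0, 3) is at distances 3 and 5 from the two query points, giving an average of 4 and a maximum of 5, while the midpoint (2, 0) is nearest under both aggregations.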
We leverage SLSH to solve a geometric extension of the approximate near neighbors problem. In this version, we consider a metric for which the unit ball is an ellipsoid and its orientation is specified with the query.
An important application that motivates our work is group recommendation systems. Such a system embeds movies and users in the same feature space, and the task of recommending a movie for a group to watch together translates to a set-query $Q$ using an appropriate similarity.