P. Indyk, S. Mahabadi, Mohammad Mahdian, V. Mirrokni
{"title":"Composable core-sets for diversity and coverage maximization","authors":"P. Indyk, S. Mahabadi, Mohammad Mahdian, V. Mirrokni","doi":"10.1145/2594538.2594560","DOIUrl":null,"url":null,"abstract":"In this paper we consider efficient construction of \"composable core-sets\" for basic diversity and coverage maximization problems. A core-set for a point-set in a metric space is a subset of the point-set with the property that an approximate solution to the whole point-set can be obtained given the core-set alone. A composable core-set has the property that for a collection of sets, the approximate solution to the union of the sets in the collection can be obtained given the union of the composable core-sets for the point sets in the collection. Using composable core-sets one can obtain efficient solutions to a wide variety of massive data processing applications, including nearest neighbor search, streaming algorithms and map-reduce computation. Our main results are algorithms for constructing composable core-sets for several notions of \"diversity objective functions\", a topic that attracted a significant amount of research over the last few years. The composable core-sets we construct are small and accurate: their approximation factor almost matches that of the best \"off-line\" algorithms for the relevant optimization problems (up to a constant factor). Moreover, we also show applications of our results to diverse nearest neighbor search, streaming algorithms and map-reduce computation. Finally, we show that for an alternative notion of diversity maximization based on the maximum coverage problem small composable core-sets do not exist.","PeriodicalId":302451,"journal":{"name":"Proceedings of the 33rd ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems","volume":"407 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2014-06-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"135","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 33rd ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2594538.2594560","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 135
Abstract
In this paper we consider efficient construction of "composable core-sets" for basic diversity and coverage maximization problems. A core-set for a point-set in a metric space is a subset of the point-set with the property that an approximate solution to the whole point-set can be obtained given the core-set alone. A composable core-set has the property that for a collection of sets, the approximate solution to the union of the sets in the collection can be obtained given the union of the composable core-sets for the point sets in the collection. Using composable core-sets one can obtain efficient solutions to a wide variety of massive data processing applications, including nearest neighbor search, streaming algorithms and map-reduce computation. Our main results are algorithms for constructing composable core-sets for several notions of "diversity objective functions", a topic that attracted a significant amount of research over the last few years. The composable core-sets we construct are small and accurate: their approximation factor almost matches that of the best "off-line" algorithms for the relevant optimization problems (up to a constant factor). Moreover, we also show applications of our results to diverse nearest neighbor search, streaming algorithms and map-reduce computation. Finally, we show that for an alternative notion of diversity maximization based on the maximum coverage problem small composable core-sets do not exist.