Composable core-sets for diversity and coverage maximization

P. Indyk, S. Mahabadi, Mohammad Mahdian, V. Mirrokni
Published in: Proceedings of the 33rd ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems, 2014-06-18
DOI: 10.1145/2594538.2594560 (https://doi.org/10.1145/2594538.2594560)
Citations: 135

Abstract

In this paper we consider the efficient construction of "composable core-sets" for basic diversity and coverage maximization problems. A core-set for a point-set in a metric space is a subset of the point-set with the property that an approximate solution for the whole point-set can be obtained from the core-set alone. A composable core-set has the further property that, for a collection of point-sets, an approximate solution for the union of the sets in the collection can be obtained from the union of their composable core-sets. Using composable core-sets, one can obtain efficient solutions for a wide variety of massive data processing applications, including nearest neighbor search, streaming algorithms and map-reduce computation. Our main results are algorithms for constructing composable core-sets for several notions of "diversity objective functions", a topic that has attracted a significant amount of research over the last few years. The composable core-sets we construct are small and accurate: their approximation factor matches that of the best "off-line" algorithms for the relevant optimization problems up to a constant factor. We also show applications of our results to diverse nearest neighbor search, streaming algorithms and map-reduce computation. Finally, we show that for an alternative notion of diversity maximization, based on the maximum coverage problem, small composable core-sets do not exist.
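
The composability property can be illustrated with a small sketch. It is not taken from the paper: it assumes one particular diversity objective (choose k points that maximize the minimum pairwise distance) and uses the greedy farthest-point heuristic as the per-partition core-set routine, neither of which the abstract specifies. All function names are hypothetical. Each partition is summarized independently, and the final selection is run only on the union of the summaries.

```python
import math
import random
from itertools import combinations


def farthest_point_greedy(points, k):
    """Pick up to k points, each time taking the point farthest from those chosen so far."""
    if not points or k <= 0:
        return []
    chosen = [points[0]]
    # nearest[i] = distance from points[i] to its closest already-chosen point
    nearest = [math.dist(p, chosen[0]) for p in points]
    while len(chosen) < min(k, len(points)):
        idx = max(range(len(points)), key=nearest.__getitem__)
        chosen.append(points[idx])
        nearest = [min(d, math.dist(p, points[idx])) for d, p in zip(nearest, points)]
    return chosen


def min_pairwise_distance(subset):
    """Assumed diversity objective: smallest distance between any two selected points."""
    return min(math.dist(a, b) for a, b in combinations(subset, 2))


def solve_via_core_sets(partitions, k):
    """Summarize each partition on its own, then solve on the union of the summaries."""
    core_union = [p for part in partitions for p in farthest_point_greedy(part, k)]
    return farthest_point_greedy(core_union, k), core_union


if __name__ == "__main__":
    random.seed(0)
    # Five partitions of 200 random 2-D points each (synthetic data for illustration only).
    partitions = [[(random.random(), random.random()) for _ in range(200)] for _ in range(5)]
    solution, core_union = solve_via_core_sets(partitions, k=4)
    print(len(core_union), "summary points;",
          "diversity of final selection:", min_pairwise_distance(solution))
```

In this sketch each partition contributes at most k points, so the final step works on at most k times the number of partitions, regardless of how large the partitions are. That is the kind of saving that makes the approach attractive for streaming and map-reduce settings.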