Max-Sum diversification, monotone submodular functions and dynamic updates

Proceedings of the ... ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems. Pub Date: 2012-05-21. DOI: 10.1145/2213556.2213580

A. Borodin, Hyun Chul Lee, Yuli Ye

Pages: 155-166

Citations: 39

Abstract

Result diversification has many important applications in databases, operations research, information retrieval, and finance. In this paper, we study and extend a particular version of result diversification, known as max-sum diversification. More specifically, we consider the setting where we are given a set of elements in a metric space and a set valuation function f defined on every subset. For any given subset S, the overall objective is a linear combination of f(S) and the sum of the distances induced by S. The goal is to find a subset S that satisfies some constraints and maximizes the overall objective.

This problem was first studied by Gollapudi and Sharma in [17] for modular set functions and for sets satisfying a cardinality constraint (uniform matroids). In their paper, they give a 2-approximation algorithm by reducing to an earlier result in [20]. The first part of this paper considers an extension of the modular case to the monotone submodular case, for which the algorithm in [17] no longer applies. Interestingly, we are able to maintain the same 2-approximation using a natural but different greedy algorithm. We then further extend the problem by considering any matroid constraint and show that a natural single-swap local search algorithm provides a 2-approximation in this more general setting. This extends the Nemhauser, Wolsey and Fisher approximation result [20] for the problem of submodular function maximization subject to a matroid constraint (without the distance function component).

The second part of the paper focuses on dynamic updates for the modular case. Suppose we have a good initial approximate solution and then there is a single weight perturbation, either on the valuation of an element or on the distance between two elements. Given that users expect some stability in the results they see, we ask how easy it is to maintain a good approximation without significantly changing the initial set. We measure this by the number of updates, where each update is a swap of a single element in the current solution with a single element outside the current solution. We show that we can maintain an approximation ratio of 3 with just a single update if the perturbation is not too large.
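To make the objective and the swap operation concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes the linear combination takes the form f(S) + lam * (sum of pairwise distances in S), restricts the constraint to a cardinality bound k (a uniform matroid), and uses a modular f built from per-element weights. The names `objective`, `single_swap_local_search`, `lam`, `points`, and `weights` are illustrative and do not come from the paper, and the sketch makes no claim about the approximation ratio or running time.

```python
from itertools import combinations


def objective(S, f, d, lam):
    """Assumed objective: f(S) plus lam times the sum of pairwise distances in S."""
    return f(S) + lam * sum(d(u, v) for u, v in combinations(sorted(S), 2))


def single_swap_local_search(ground, k, f, d, lam):
    """Swap one element in S for one outside S while the objective strictly improves.

    The feasible sets are all subsets of size k (a uniform matroid); the
    starting set is an arbitrary choice made for this sketch.
    """
    S = set(sorted(ground)[:k])
    improved = True
    while improved:
        improved = False
        current = objective(S, f, d, lam)
        for out in sorted(S):
            for inn in sorted(ground - S):
                T = (S - {out}) | {inn}
                if objective(T, f, d, lam) > current:
                    S, improved = T, True
                    break
            if improved:
                break
    return S


# Toy instance (hypothetical data): four points on a line with unit weights.
points = [0.0, 1.0, 2.0, 10.0]
weights = [1.0, 1.0, 1.0, 1.0]


def f(S):
    # Modular valuation: the sum of the selected elements' weights.
    return sum(weights[i] for i in S)


def d(u, v):
    # Metric: absolute difference of positions on the line.
    return abs(points[u] - points[v])


best = single_swap_local_search(set(range(len(points))), k=2, f=f, d=d, lam=1.0)
print(best, objective(best, f, d, 1.0))
```

On this toy instance the search ends at the two extreme points. The same swap step is the unit of change counted in the dynamic-update setting, where a single swap is applied after a weight or distance perturbation.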

