Being Happy with the Least: Achieving α-happiness with Minimum Number of Tuples

2020 IEEE 36th International Conference on Data Engineering (ICDE). Pub Date: 2020-04-01. DOI: 10.1109/ICDE48307.2020.00092

Min Xie, R. C. Wong, Peng Peng, V. Tsotras


Citations: 15

Abstract

When faced with a database containing millions of products, a user may be interested only in a (typically much) smaller representative subset. Various approaches have been proposed to create a good representative subset that fits the user’s needs, which are expressed in the form of a utility function (e.g., the top-k and diversification queries). Recently, a regret minimization query was proposed: it does not require users to provide their utility functions, and it returns a small set of tuples such that any user’s favorite tuple in this subset is guaranteed to be not much worse than his/her favorite tuple in the whole database. In a sense, this query finds a small set of tuples that makes the user happy (i.e., not regretful) even if s/he gets the best tuple in the selected set rather than the best tuple among all tuples in the database.

In this paper, we study the min-size version of the regret minimization query; that is, we want to determine the fewest tuples needed to keep users happy at a given level. We term this problem the α-happiness query: we quantify the user’s happiness level by a criterion called the happiness ratio, and guarantee that each user is at least α happy with the set returned (i.e., the happiness ratio is at least α), where α is a real number from 0 to 1. As this is an NP-hard problem, we derive an approximate solution with a theoretical guarantee by considering the problem from a geometric perspective. Since in practical scenarios users are interested in achieving higher happiness levels (i.e., α is closer to 1), we performed extensive experiments for these scenarios, using both real and synthetic datasets. Our evaluations show that our algorithm outperforms the best-known previous approaches in two ways: (i) it answers the α-happiness query by returning fewer tuples to users, and (ii) it answers much faster (up to two orders of magnitude improvement for large α).
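To make the α-happiness query concrete, the sketch below illustrates it under assumptions that the abstract does not spell out. It assumes that users have linear utility functions with positive weights, and that the happiness ratio of a subset for one user is the best utility in the subset divided by the best utility in the whole database. The function names, the sampling of users, and the greedy set-cover selection are all hypothetical illustrations; this is a naive baseline over sampled users, not the geometric approximation algorithm derived in the paper, and it carries no guarantee for users outside the sample.

```python
import random


def utility(weights, point):
    """Assumed linear utility: weighted sum of a tuple's attribute values."""
    return sum(w * x for w, x in zip(weights, point))


def happiness_ratio(subset, database, weights):
    """Best utility in `subset` divided by the best utility in `database`."""
    best_in_subset = max(utility(weights, p) for p in subset)
    best_in_database = max(utility(weights, p) for p in database)
    return best_in_subset / best_in_database


def sample_utilities(dim, count, seed=0):
    """Axis-aligned users plus `count` random positive weight vectors."""
    rng = random.Random(seed)
    axes = [[1.0 if i == j else 0.0 for j in range(dim)] for i in range(dim)]
    sampled = [[rng.uniform(0.01, 1.0) for _ in range(dim)] for _ in range(count)]
    return axes + sampled


def min_happiness_ratio(subset, database, utilities):
    """Worst-case happiness ratio over the sampled users only."""
    return min(happiness_ratio(subset, database, w) for w in utilities)


def greedy_alpha_happy(database, alpha, utilities):
    """Greedy set cover over sampled users (a baseline, NOT the paper's method).

    Assumes every attribute value is non-negative and every sampled user has
    a positive best utility in `database`.
    """
    # A user is alpha-happy once some selected tuple reaches this target.
    targets = [alpha * max(utility(w, p) for p in database) for w in utilities]
    unhappy = set(range(len(utilities)))
    selected = []

    def satisfies(i, p):
        return utility(utilities[i], p) >= targets[i]

    while unhappy:
        # Pick the tuple that makes the most still-unhappy users alpha-happy.
        best = max(database, key=lambda p: sum(satisfies(i, p) for i in unhappy))
        selected.append(best)
        unhappy = {i for i in unhappy if not satisfies(i, best)}
    return selected


if __name__ == "__main__":
    products = [(1.0, 0.0), (0.0, 1.0), (0.7, 0.7), (0.2, 0.2)]
    users = sample_utilities(dim=2, count=200)
    for alpha in (0.6, 1.0):
        chosen = greedy_alpha_happy(products, alpha, users)
        print(alpha, chosen, min_happiness_ratio(chosen, products, users))
```

The loop always terminates for α between 0 and 1, because each unhappy user's favorite tuple in the database meets that user's target. Raising α toward 1 generally forces more tuples into the answer, which is the regime the abstract reports experiments on.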
