Being Happy with the Least: Achieving α-happiness with Minimum Number of Tuples

2020 IEEE 36th International Conference on Data Engineering (ICDE), pp. 1009-1020. Pub Date: 2020-04-01. DOI: 10.1109/ICDE48307.2020.00092

Min Xie, R. C. Wong, Peng Peng, V. Tsotras


Citations: 15

Abstract


When faced with a database containing millions of products, a user may be interested only in a (typically much) smaller representative subset. Various approaches were proposed to create a good representative subset that fits the user’s needs, which are expressed in the form of a utility function (e.g., the top-k and diversification queries). Recently, a regret minimization query was proposed: it does not require users to provide their utility functions, and it returns a small set of tuples such that any user’s favorite tuple in this subset is guaranteed to be not much worse than his/her favorite tuple in the whole database. In a sense, this query finds a small set of tuples that makes the user happy (i.e., not regretful) even if s/he gets the best tuple in the selected set rather than the best tuple among all tuples in the database.

In this paper, we study the min-size version of the regret minimization query; that is, we want to determine the fewest tuples needed to keep users happy at a given level. We term this problem the α-happiness query: we quantify the user’s happiness level by a criterion called the happiness ratio, and guarantee that each user is at least α happy with the set returned (i.e., the happiness ratio is at least α), where α is a real number from 0 to 1. As this is an NP-hard problem, we derive an approximate solution with a theoretical guarantee by considering the problem from a geometric perspective. Since in practical scenarios users are interested in achieving higher happiness levels (i.e., α is closer to 1), we performed extensive experiments for these scenarios, using both real and synthetic datasets. Our evaluations show that our algorithm outperforms the best-known previous approaches in two ways: (i) it answers the α-happiness query by returning fewer tuples to users, and (ii) it answers much faster (up to two orders of magnitude faster for large α).
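To make the happiness ratio concrete, the following is a minimal sketch under several assumptions that the abstract does not state: utility functions are taken to be linear (a weighted sum of attribute values), the happiness ratio of a user is taken to be the best utility available in the returned set divided by the best utility in the whole database, and the set of all users is replaced by a small finite sample. The function names are hypothetical, and `greedy_alpha_happy` is a naive greedy baseline for illustration only, not the geometric approximation algorithm of the paper.

```python
def utility(weights, tup):
    """Linear utility (an assumed form): weighted sum of attribute values."""
    return sum(w * x for w, x in zip(weights, tup))


def happiness_ratio(subset, database, weights):
    """Best utility in `subset` divided by the best utility in `database`.

    Assumes the best utility in the database is positive.
    """
    best_in_db = max(utility(weights, p) for p in database)
    best_in_subset = max(utility(weights, p) for p in subset)
    return best_in_subset / best_in_db


def min_happiness_ratio(subset, database, users):
    """Worst-case happiness ratio over a finite sample of users."""
    return min(happiness_ratio(subset, database, w) for w in users)


def greedy_alpha_happy(database, users, alpha):
    """Naive baseline (not the paper's algorithm): repeatedly add the
    favorite tuple of the least-happy sampled user until every sampled
    user is at least alpha-happy."""
    # Seed with the first sampled user's favorite tuple.
    chosen = [max(database, key=lambda p: utility(users[0], p))]
    while min_happiness_ratio(chosen, database, users) < alpha:
        unhappiest = min(
            users, key=lambda w: happiness_ratio(chosen, database, w)
        )
        # This user's favorite is not yet chosen (their ratio is below 1),
        # so every iteration adds a new tuple and the loop terminates.
        chosen.append(max(database, key=lambda p: utility(unhappiest, p)))
    return chosen


if __name__ == "__main__":
    # Toy two-attribute database and three sampled users (made-up values).
    database = [(1.0, 0.0), (0.0, 1.0), (0.6, 0.6), (0.3, 0.3)]
    users = [(1.0, 0.0), (0.0, 1.0), (0.5, 0.5)]
    for alpha in (0.8, 0.9):
        result = greedy_alpha_happy(database, users, alpha)
        print(alpha, len(result), result)
```

In this toy setting, raising α forces the returned set to grow, which is the trade-off the α-happiness query minimizes over. A guarantee over a finite sample of users is weaker than a guarantee over all utility functions, which is what the query itself requires.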
