Evaluating top-k queries over Web-accessible databases

Proceedings 18th International Conference on Data Engineering. Pub Date: 2002-02-26. DOI: 10.1109/ICDE.2002.994751

Nicolas Bruno, L. Gravano, A. Marian

Citations: 559

Abstract

A query to a Web search engine usually consists of a list of keywords, to which the search engine responds with the best or "top" k pages for the query. This top-k query model is prevalent over multimedia collections in general, but also over plain relational data for certain applications. For example, consider a relation with information on available restaurants, including their location, price range for one diner, and overall food rating. A user who queries such a relation might simply specify the user's location and target price range, and expect in return the best 10 restaurants in terms of some combination of proximity to the user, closeness of match to the target price range, and overall food rating. Processing such top-k queries efficiently is challenging for a number of reasons. One critical reason is that, in many Web applications, the relation attributes might not be available other than through external Web-accessible form interfaces, which we will have to query repeatedly for a potentially large set of candidate objects. In this paper, we study how to process top-k queries efficiently in this setting, where the attributes for which users specify target values might be handled by external, autonomous sources with a variety of access interfaces. We present several algorithms for processing such queries, and evaluate them thoroughly using both synthetic and real Web-accessible data.
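To make the restaurant example concrete, the following is a minimal sketch of a top-k query over a small in-memory relation. It is not one of the paper's algorithms: it is a naive baseline that reads every attribute of every object, which is exactly the cost that becomes prohibitive when attributes sit behind external Web sources. The `Restaurant` fields, the `closeness` function, the scale constants, and the weights are all illustrative assumptions and do not come from the paper.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class Restaurant:
    name: str
    distance_km: float  # distance from the user's location (assumed precomputed)
    price: float        # typical price for one diner
    rating: float       # overall food rating on an assumed 0-5 scale


def closeness(value: float, target: float, scale: float) -> float:
    """Map |value - target| to a score in [0, 1]; 1 means an exact match."""
    return max(0.0, 1.0 - abs(value - target) / scale)


def score(r: Restaurant, target_price: float,
          weights: tuple = (0.4, 0.3, 0.3)) -> float:
    """Weighted sum of proximity, price match, and rating (hypothetical weights)."""
    w_dist, w_price, w_rating = weights
    return (w_dist * closeness(r.distance_km, 0.0, 10.0)
            + w_price * closeness(r.price, target_price, 50.0)
            + w_rating * r.rating / 5.0)


def top_k(restaurants, k: int, target_price: float):
    """Return the k highest-scoring restaurants, best first."""
    return heapq.nlargest(k, restaurants, key=lambda r: score(r, target_price))


# Made-up sample data for illustration only.
restaurants = [
    Restaurant("A", 1.0, 30.0, 4.5),
    Restaurant("B", 8.0, 25.0, 5.0),
    Restaurant("C", 0.5, 60.0, 3.0),
]
best = top_k(restaurants, 2, target_price=30.0)
print([r.name for r in best])
```

In this sketch every score is computed from locally available values. In the setting the abstract describes, each attribute lookup could instead be a request to a remote form interface, so an efficient strategy would aim to avoid probing attributes of objects that cannot reach the top k.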
