Ranking objects based on relationships

Proceedings of the 2006 ACM SIGMOD International Conference on Management of Data. Pub Date: 2006-06-27. DOI: 10.1145/1142473.1142516

K. Chakrabarti, Venkatesh Ganti, Jiawei Han, Dong Xin


Citations: 63

Abstract

In many document collections, documents are related to objects such as document authors, products described in the document, or persons referred to in the document. In many applications, the goal is to find the objects that best match a set of keywords. However, the keywords may not necessarily occur in the target objects; they occur only in the documents. For example, in a product review database, a user might search for names of products (say, laptops) using keywords like "lightweight" and "business use" that occur only in the reviews but not in the names of laptops. In order to answer these queries, we need to exploit relationships between documents containing the keywords and the target objects related to those documents. Current keyword query paradigms do not exploit these relationships effectively and hence are inefficient for these queries.

In this paper, we consider a class of queries called "object finder" queries. Our main intuition is to exploit the relationships between searchable documents and related objects and further "aggregate" the document scores from these relationships in order to find the best-ranking target objects. Building upon existing keyword search engines such as full-text search, we design efficient algorithms that exploit the requirement of only the best k target objects to terminate early. The main challenge here is to push early termination through blocking operators such as group-by and aggregation. Our experiments with real datasets and workloads demonstrate the effectiveness of our techniques. Although we present our techniques in the context of keyword search, they apply to other types of ranked searches (e.g., multimedia search) as well.
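To make the idea of aggregating document scores and terminating early more concrete, here is a minimal Python sketch. It is not the algorithm from the paper, whose details the abstract does not give. It assumes that the aggregate is a plain sum, that document scores are non-negative, that the search engine returns matching documents in descending score order, and that the number of documents related to each object is known in advance. The function names `aggregate_all` and `top_k_early` and the sample data are hypothetical.

```python
from collections import defaultdict


def aggregate_all(ranked_docs, doc_to_objects, k):
    """Baseline: read every matching document and sum its score per object."""
    totals = defaultdict(float)
    for doc_id, score in ranked_docs:
        for obj in doc_to_objects.get(doc_id, ()):
            totals[obj] += score
    ranked = sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:k]


def top_k_early(ranked_docs, doc_to_objects, k):
    """Read documents in descending score order and stop as soon as the
    top-k set of objects can no longer change.

    Returns (top-k object ids, number of documents read).
    Assumes sum aggregation and non-negative scores.
    """
    # unseen[obj] = number of related documents that have not been read yet.
    # In practice this count would be precomputed; here it is derived inline.
    unseen = defaultdict(int)
    for objs in doc_to_objects.values():
        for obj in objs:
            unseen[obj] += 1

    partial = defaultdict(float)  # score accumulated so far (a lower bound)
    docs_read = 0
    for doc_id, score in ranked_docs:
        docs_read += 1
        for obj in doc_to_objects.get(doc_id, ()):
            partial[obj] += score
            unseen[obj] -= 1

        # Order objects by their lower bound; ties are broken by id.
        by_lower = sorted(unseen, key=lambda o: (-partial.get(o, 0.0), o))
        top, rest = by_lower[:k], by_lower[k:]
        if len(top) < k:
            continue
        kth_lower = partial.get(top[-1], 0.0)
        # Every unread document scores at most `score`, which gives each
        # remaining object the upper bound partial + unseen * score.
        best_rest_upper = max(
            (partial.get(o, 0.0) + unseen[o] * score for o in rest),
            default=0.0,
        )
        if kth_lower >= best_rest_upper:
            return top, docs_read  # no other object can overtake the top k

    by_lower = sorted(unseen, key=lambda o: (-partial.get(o, 0.0), o))
    return by_lower[:k], docs_read


# Hypothetical data: reviews r1..r6, each related to one laptop.
doc_to_objects = {
    "r1": ["laptopA"], "r2": ["laptopA"], "r3": ["laptopB"],
    "r4": ["laptopC"], "r5": ["laptopB"], "r6": ["laptopC"],
}
# Keyword-search result, already sorted by descending score.
ranked_docs = [("r1", 0.9), ("r2", 0.8), ("r3", 0.3),
               ("r4", 0.2), ("r5", 0.1), ("r6", 0.05)]

print(top_k_early(ranked_docs, doc_to_objects, 1))  # (['laptopA'], 2)
```

In this example the best object is settled after reading two of the six reviews, while the baseline reads all of them before grouping. The stopping check rescans every object after each document, which keeps the sketch short but would be too slow at scale. The returned objects are ordered by partial score, so exact final scores would need a further pass.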

