Ranking Methods for Query Relaxation in Book Search

2018 IEEE/WIC/ACM International Conference on Web Intelligence (WI). Pub Date: 2018-12-01. DOI: 10.1109/WI.2018.00-51

Momo Kyozuka, Keishi Tajima


Citations: 2

Abstract

We propose a method to support book search tasks in which users query a database of brief book descriptions with their own description of a book's story. Such a query may include extraneous words that do not appear in the database's description of the book. Queries from users who have only vague memories of the story may even include wrong keywords. Finding books with such queries requires a query relaxation scheme. In our scheme, we classify the words in a user query into four types based on their roles in the description and, for each type, estimate the probability that a word of that type appears in the description in the database. The estimates are based on statistics obtained by analyzing an archive of past queries and answers. We then generate relaxed queries from every subset of the words in the user query and rank them by the expected ranking of the target book in their results. This expected ranking is estimated from the appearance probabilities of the words in the query and the number of books matching the query. We conducted an experiment comparing various ranking schemes by their mean reciprocal rank (MRR); our scheme, which uses both the word appearance probabilities and the number of matching books, performed well.
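The subset-and-rank idea in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the paper's actual formula, type names, or probability values, so everything specific here is an assumption made for illustration: the type labels `type_1` to `type_4`, the numbers in `APPEARANCE_PROB`, the independence assumption behind the product, the AND-matching of words against descriptions, and the scoring rule (probability that the target matches, divided by an expected rank of (n + 1) / 2 among n matching books).

```python
from itertools import combinations
from math import prod

# Hypothetical appearance probability per word type. The paper estimates
# such values from an archive of past queries; these numbers are made up.
APPEARANCE_PROB = {"type_1": 0.8, "type_2": 0.5, "type_3": 0.3, "type_4": 0.1}


def relaxed_queries(words):
    """Yield every non-empty subset of the query words as a tuple."""
    for size in range(1, len(words) + 1):
        yield from combinations(words, size)


def count_matches(query, descriptions):
    """Count descriptions (sets of words) that contain all query words."""
    return sum(all(w in d for w in query) for d in descriptions)


def score(query, word_types, descriptions):
    """Assumed scoring rule, not the paper's: P(target matches) / expected rank."""
    # Assumption: words appear independently, so probabilities multiply.
    p_match = prod(APPEARANCE_PROB[word_types[w]] for w in query)
    n = count_matches(query, descriptions)
    if n == 0:
        return 0.0  # a query with no results cannot return the target book
    # Assumption: the target is equally likely at any position among n results.
    expected_rank = (n + 1) / 2
    return p_match / expected_rank


def rank_relaxed_queries(words, word_types, descriptions):
    """Return all relaxed queries, best score first (ties keep subset order)."""
    return sorted(
        relaxed_queries(words),
        key=lambda q: score(q, word_types, descriptions),
        reverse=True,
    )


if __name__ == "__main__":
    descriptions = [{"wizard", "school", "owl"}, {"wizard", "ring"}, {"school", "detective"}]
    word_types = {"wizard": "type_1", "school": "type_2", "dragon": "type_4"}
    for q in rank_relaxed_queries(["wizard", "school", "dragon"], word_types, descriptions):
        print(q, round(score(q, word_types, descriptions), 3))
```

The sketch shows the trade-off the abstract describes: adding a word narrows the result set, which improves the expected rank, but lowers the chance that the target book still matches. Note that enumerating every subset grows exponentially with query length, so this form is only practical for short queries.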



