Evaluation of Pseudo-Relevance Feedback using Wikipedia

Murtadha Aljubran
Published in: Proceedings of the 2019 3rd International Conference on Natural Language Processing and Information Retrieval
Publication date: 2019-06-28
DOI: 10.1145/3342827.3342845 (https://doi.org/10.1145/3342827.3342845)

Abstract

Users express specific information needs as queries to information retrieval systems. These queries are unstructured and, in most cases, short and ambiguous. Retrieval models based on shallow language statistics, such as the probabilistic model BM25 or the language-model approach of Indri, can improve retrieval metrics like Mean Average Precision (MAP). However, such methods define relevance by the presence of query terms in the retrieved document. Query expansion is a technique that can overcome this problem by expanding the query with terms from the top few documents of an initial retrieval, which are treated as relevant. The question we try to answer is whether the quality of the corpus used for expansion produces a significant improvement in MAP and in precision at the top 30 retrieved documents. We show that the quality and the selection criteria of expansion documents are important factors in query expansion performance.
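As a rough illustration of the ideas named in the abstract, the following minimal sketch shows one way pseudo-relevance feedback and the two reported metrics could be computed. It is not the paper's implementation: the abstract does not say how expansion terms are chosen, so picking the most frequent non-query terms is an assumption, and all function names and parameters (expand_query, top_docs, num_terms, and so on) are hypothetical. Documents are represented as plain lists of tokens.

```python
from collections import Counter


def expand_query(query_terms, ranked_docs, top_docs=3, num_terms=2):
    """Append the most frequent non-query terms from the top-ranked documents.

    Assumption: raw term frequency is the selection criterion. The top
    documents are treated as relevant without any human judgment.
    """
    counts = Counter()
    for doc in ranked_docs[:top_docs]:
        counts.update(t for t in doc if t not in query_terms)
    # Sort by descending count, then alphabetically, so ties are deterministic.
    best = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:num_terms]
    return list(query_terms) + [term for term, _ in best]


def precision_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the top-k retrieved documents that are relevant."""
    hits = sum(1 for doc_id in ranked_ids[:k] if doc_id in relevant_ids)
    return hits / k


def average_precision(ranked_ids, relevant_ids):
    """Average of the precision values at each rank holding a relevant document."""
    if not relevant_ids:
        return 0.0
    hits, total = 0, 0.0
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant_ids:
            hits += 1
            total += hits / rank
    return total / len(relevant_ids)


def mean_average_precision(runs):
    """MAP over a list of (ranked_ids, relevant_ids) pairs, one pair per query."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


if __name__ == "__main__":
    # Toy initial ranking for the ambiguous query "jaguar".
    initial_ranking = [
        ["jaguar", "cat", "habitat"],
        ["jaguar", "cat", "prey"],
        ["jaguar", "car", "engine"],
    ]
    print(expand_query(["jaguar"], initial_ranking, top_docs=2, num_terms=2))
    print(precision_at_k(["d1", "d2", "d3", "d4"], {"d1", "d3"}, k=2))
    print(average_precision(["d1", "d2", "d3", "d4"], {"d1", "d3"}))
```

Precision at the top 30 retrieved documents corresponds to calling precision_at_k with k=30 on a full ranking. In this sketch, swapping the source of ranked_docs (for example, documents retrieved from Wikipedia instead of the target collection) is where the choice of expansion corpus would enter.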