Pseudo test collections for training and tuning microblog rankers

Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval Pub Date : 2013-07-28 DOI:10.1145/2484028.2484063

R. Berendsen, M. Tsagkias, W. Weerkamp, M. de Rijke

{"title":"Pseudo test collections for training and tuning microblog rankers","authors":"R. Berendsen, M. Tsagkias, W. Weerkamp, M. de Rijke","doi":"10.1145/2484028.2484063","DOIUrl":null,"url":null,"abstract":"Recent years have witnessed a persistent interest in generating pseudo test collections, both for training and evaluation purposes. We describe a method for generating queries and relevance judgments for microblog search in an unsupervised way. Our starting point is this intuition: tweets with a hashtag are relevant to the topic covered by the hashtag and hence to a suitable query derived from the hashtag. Our baseline method selects all commonly used hashtags, and all associated tweets as relevance judgments; we then generate a query from these tweets. Next, we generate a timestamp for each query, allowing us to use temporal information in the training process. We then enrich the generation process with knowledge derived from an editorial test collection for microblog search. We use our pseudo test collections in two ways. First, we tune parameters of a variety of well known retrieval methods on them. Correlations with parameter sweeps on an editorial test collection are high on average, with a large variance over retrieval algorithms. Second, we use the pseudo test collections as training sets in a learning to rank scenario. Performance close to training on an editorial test collection is achieved in many cases. Our results demonstrate the utility of tuning and training microblog search algorithms on automatically generated training material.","PeriodicalId":178818,"journal":{"name":"Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval","volume":"58 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2013-07-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"34","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2484028.2484063","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 34

Abstract

Recent years have witnessed a persistent interest in generating pseudo test collections, both for training and evaluation purposes. We describe a method for generating queries and relevance judgments for microblog search in an unsupervised way. Our starting point is this intuition: tweets with a hashtag are relevant to the topic covered by the hashtag and hence to a suitable query derived from the hashtag. Our baseline method selects all commonly used hashtags, and all associated tweets as relevance judgments; we then generate a query from these tweets. Next, we generate a timestamp for each query, allowing us to use temporal information in the training process. We then enrich the generation process with knowledge derived from an editorial test collection for microblog search. We use our pseudo test collections in two ways. First, we tune parameters of a variety of well known retrieval methods on them. Correlations with parameter sweeps on an editorial test collection are high on average, with a large variance over retrieval algorithms. Second, we use the pseudo test collections as training sets in a learning to rank scenario. Performance close to training on an editorial test collection is achieved in many cases. Our results demonstrate the utility of tuning and training microblog search algorithms on automatically generated training material.

查看原文本刊更多论文

用于训练和调优微博排名的伪测试集合

近年来，为了训练和评估的目的，人们一直对生成伪测试集合很感兴趣。本文描述了一种无监督的微博搜索查询和相关性判断生成方法。我们的出发点是这样的直觉:带有hashtag的tweet与该hashtag所涵盖的主题相关，因此与从该hashtag派生的合适查询相关。我们的基线方法选择所有常用的标签，并将所有相关的推文作为相关性判断;然后，我们从这些tweet生成一个查询。接下来，我们为每个查询生成时间戳，允许我们在训练过程中使用时间信息。然后，我们使用来自微博搜索的编辑测试集的知识来丰富生成过程。我们以两种方式使用伪测试集合。首先，我们对各种已知检索方法的参数进行了调优。与编辑测试集合上的参数扫描的相关性平均很高，在检索算法上有很大的差异。其次，我们使用伪测试集合作为学习排序场景中的训练集。在许多情况下，在编辑测试集合上实现接近训练的性能。我们的结果证明了在自动生成的训练材料上调整和训练微博搜索算法的实用性。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文求助全文

来源期刊

Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval

自引率

0.00%

发文量