Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval最新文献

User model-based metrics for offline query suggestion evaluation 用于离线查询建议评估的基于用户模型的度量

Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval Pub Date : 2013-07-28 DOI: 10.1145/2484028.2484041

E. Kharitonov, C. Macdonald, P. Serdyukov, I. Ounis

{"title":"User model-based metrics for offline query suggestion evaluation","authors":"E. Kharitonov, C. Macdonald, P. Serdyukov, I. Ounis","doi":"10.1145/2484028.2484041","DOIUrl":"https://doi.org/10.1145/2484028.2484041","url":null,"abstract":"Query suggestion or auto-completion mechanisms are widely used by search engines and are increasingly attracting interest from the research community. However, the lack of commonly accepted evaluation methodology and metrics means that it is not possible to compare results and approaches from the literature. Moreover, often the metrics used to evaluate query suggestions tend to be an adaptation from other domains without a proper justification. Hence, it is not necessarily clear if the improvements reported in the literature would result in an actual improvement in the users' experience. Inspired by the cascade user models and state-of-the-art evaluation metrics in the web search domain, we address the query suggestion evaluation, by first studying the users behaviour from a search engine's query log and thereby deriving a new family of user models describing the users interaction with a query suggestion mechanism. Next, assuming a query log-based evaluation approach, we propose two new metrics to evaluate query suggestions, pSaved and eSaved. Both metrics are parameterised by a user model. pSaved is defined as the probability of using the query suggestions while submitting a query. eSaved equates to the expected relative amount of effort (keypresses) a user can avoid due to the deployed query suggestion mechanism. Finally, we experiment with both metrics using four user model instantiations as well as metrics previously used in the literature on a dataset of 6.1M sessions. Our results demonstrate that pSaved and eSaved show the best alignment with the users satisfaction amongst the considered metrics.","PeriodicalId":178818,"journal":{"name":"Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-07-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120987264","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 25

An incremental approach to efficient pseudo-relevance feedback 一种有效伪相关反馈的增量方法

Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval Pub Date : 2013-07-28 DOI: 10.1145/2484028.2484051

Hao Wu, Hui Fang

{"title":"An incremental approach to efficient pseudo-relevance feedback","authors":"Hao Wu, Hui Fang","doi":"10.1145/2484028.2484051","DOIUrl":"https://doi.org/10.1145/2484028.2484051","url":null,"abstract":"Pseudo-relevance feedback is an important strategy to improve search accuracy. It is often implemented as a two-round retrieval process: the first round is to retrieve an initial set of documents relevant to an original query, and the second round is to retrieve final retrieval results using the original query expanded with terms selected from the previously retrieved documents. This two-round retrieval process is clearly time consuming, which could arguably be one of main reasons that hinder the wide adaptation of the pseudo-relevance feedback methods in real-world IR systems. In this paper, we study how to improve the efficiency of pseudo-relevance feedback methods. The basic idea is to reduce the time needed for the second round of retrieval by leveraging the query processing results of the first round. Specifically, instead of processing the expand query as a newly submitted query, we propose an incremental approach, which resumes the query processing results (i.e. document accumulators) for the first round of retrieval and process the second round of retrieval mainly as a step of adjusting the scores in the accumulators. Experimental results on TREC Terabyte collections show that the proposed incremental approach can improve the efficiency of pseudo-relevance feedback methods by a factor of two without sacrificing their effectiveness.","PeriodicalId":178818,"journal":{"name":"Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-07-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121235197","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 23

Sopra: a new social personalized ranking function for improving web search Sopra:一个新的社会个性化排名功能，用于改善网络搜索

Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval Pub Date : 2013-07-28 DOI: 10.1145/2484028.2484131

Mohamed Reda Bouadjenek, Hakim Hacid, M. Bouzeghoub

引用次数: 43

An effective implicit relevance feedback technique using affective, physiological and behavioural features 使用情感、生理和行为特征的有效内隐关联反馈技术

Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval Pub Date : 2013-07-28 DOI: 10.1145/2484028.2484074

Yashar Moshfeghi, J. Jose

{"title":"An effective implicit relevance feedback technique using affective, physiological and behavioural features","authors":"Yashar Moshfeghi, J. Jose","doi":"10.1145/2484028.2484074","DOIUrl":"https://doi.org/10.1145/2484028.2484074","url":null,"abstract":"The effectiveness of various behavioural signals for implicit relevance feedback models has been exhaustively studied. Despite the advantages of such techniques for a real time information retrieval system, most of the behavioural signals are noisy and therefore not reliable enough to be employed. Among many, a combination of dwell time and task information has been shown to be effective for relevance judgement prediction. However, the task information might not be available to the system at all times. Thus, there is a need for other sources of information which can be used as a substitute for task information. Recently, affective and physiological signals have shown promise as a potential source of information for relevance judgement prediction. However, their accuracy is not high enough to be applicable on their own. This paper investigates whether affective and physiological signals can be used as a complementary source of information for behavioural signals (i.e. dwell time) to create a reliable signal for relevance judgement prediction. Using a video retrieval system as a use case, we study and compare the effectiveness of the affective and physiological signals on their own, as well as in combination with behavioural signals for the relevance judgment prediction task across four different search intentions: seeking information, re-finding a particular information object, and two different entertainment intentions (i.e. entertainment by adjusting arousal level, and entertainment by adjusting mood). Our experimental results show that the effectiveness of studied signals varies across different search intentions, and when affective and physiological signals are combined with dwell time, a significant improvement can be achieved. Overall, these findings will help to implement better search engines in the future.","PeriodicalId":178818,"journal":{"name":"Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-07-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122076137","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 58

Pseudo test collections for training and tuning microblog rankers 用于训练和调优微博排名的伪测试集合

Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval Pub Date : 2013-07-28 DOI: 10.1145/2484028.2484063

R. Berendsen, M. Tsagkias, W. Weerkamp, M. de Rijke

{"title":"Pseudo test collections for training and tuning microblog rankers","authors":"R. Berendsen, M. Tsagkias, W. Weerkamp, M. de Rijke","doi":"10.1145/2484028.2484063","DOIUrl":"https://doi.org/10.1145/2484028.2484063","url":null,"abstract":"Recent years have witnessed a persistent interest in generating pseudo test collections, both for training and evaluation purposes. We describe a method for generating queries and relevance judgments for microblog search in an unsupervised way. Our starting point is this intuition: tweets with a hashtag are relevant to the topic covered by the hashtag and hence to a suitable query derived from the hashtag. Our baseline method selects all commonly used hashtags, and all associated tweets as relevance judgments; we then generate a query from these tweets. Next, we generate a timestamp for each query, allowing us to use temporal information in the training process. We then enrich the generation process with knowledge derived from an editorial test collection for microblog search. We use our pseudo test collections in two ways. First, we tune parameters of a variety of well known retrieval methods on them. Correlations with parameter sweeps on an editorial test collection are high on average, with a large variance over retrieval algorithms. Second, we use the pseudo test collections as training sets in a learning to rank scenario. Performance close to training on an editorial test collection is achieved in many cases. Our results demonstrate the utility of tuning and training microblog search algorithms on automatically generated training material.","PeriodicalId":178818,"journal":{"name":"Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval","volume":"58 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-07-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129546994","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 34

Hybrid retrieval approaches to geospatial music recommendation 地理空间音乐推荐的混合检索方法

Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval Pub Date : 2013-07-28 DOI: 10.1145/2484028.2484146

M. Schedl, Dominik Schnitzer

引用次数: 38

Competition-based networks for expert finding 基于竞争的专家寻找网络

Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval Pub Date : 2013-07-28 DOI: 10.1145/2484028.2484183

Çigdem Aslay, Neil O'Hare, L. Aiello, A. Jaimes

引用次数: 43

An information-theoretic account of static index pruning 静态索引修剪的信息论描述

Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval Pub Date : 2013-07-28 DOI: 10.1145/2484028.2484061

Ruey-Cheng Chen, Chia-Jung Lee

引用次数: 12

Indexing and querying overlapping structures 索引和查询重叠的结构

Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval Pub Date : 2013-07-28 DOI: 10.1145/2484028.2484234

Faegheh Hasibi

{"title":"Indexing and querying overlapping structures","authors":"Faegheh Hasibi","doi":"10.1145/2484028.2484234","DOIUrl":"https://doi.org/10.1145/2484028.2484234","url":null,"abstract":"Structural information retrieval is mostly based on hierarchy. However, in real life information is not purely hierarchical and structural elements may overlap each other. The most common example is a document with two distinct structural views, where the logical view is section/ subsection/ paragraph and the physical view is page/ line. Each single structural view of this document is a hierarchy and the components are either disjoint or nested inside each other. The overlapping issue arises when one structural element cannot be neatly nested into others. For instance, when a paragraph starts in one page and terminates in the next page. Similar situations can appear in videos and other multimedia contents, where temporal or spatial constituents of a media file may overlap each other. Querying over overlapping structures is one of the challenges of large scale search engines. For instance, FSIS (FAST Search for Internet Sites) [1] is a Microsoft search platform, which encounters overlaps while analysing content of textual data. FSIS uses a pipeline process to extract structure and semantic information of documents. The pipeline contains several components, where each component writes annotations to the input data. These annotations consist of structural elements and some of them may overlap each other. Handling overlapping structures in search engines will add a novel capability of searching, where users can ask queries such as \"Find all the words that overlap two lines\" or \"Find the music played during Intro scene of Avatar movie\". There are also other use cases, where the user of the search engine is not a person, but is a specific program with complex, non-traditional information retrieval needs. This research attempts to index overlapping structures and provide efficient query processing for large-scale search engines. The current research on overlapping structures revolves around encoding and modelling data, while indexing and query processing methods need investigations. Moreover, due to intrinsic complexity of overlaps, XML indexing and query processing techniques cannot be used for overlapping structures. Hence, my research on overlapping structures comprises three main parts: (1) an indexing method that supports both hierarchies and overlaps; (2) a query processing method based on the indexing technique and (3) a query language that is close to natural language and supports both full text and structural queries. Our approach for indexing overlaps is to adapt the PrePost [3] XML indexing method to overlapping structures. This method labels each node with its start and end positions and requires modest storage space. However, PrePost indexing cannot be used for overlapping nodes. To overcome this issue, we need to define a data model for overlapping structures. Since hierarchies are not sufficient to describe overlapping components, several data structures have been introduced by scholars. One of the most interesting data models is GODDAG [","PeriodicalId":178818,"journal":{"name":"Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval","volume":"375 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-07-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115786422","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 1

Kernel-based learning to rank with syntactic and semantic structures 根据句法和语义结构进行排序的基于核的学习

Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval Pub Date : 2013-07-28 DOI: 10.1145/2484028.2484196

Alessandro Moschitti

引用次数: 2