{"title":"Revisiting Optimal Rank Aggregation: A Dynamic Programming Approach","authors":"Shayan A. Tabrizi, J. Dadashkarimi, Mostafa Dehghani, H. Esfahani, A. Shakery","doi":"10.1145/2808194.2809490","DOIUrl":"https://doi.org/10.1145/2808194.2809490","url":null,"abstract":"Rank aggregation, that is, merging multiple ranked lists, is a pivotal challenge in many information retrieval (IR) systems, especially in distributed IR and multilingual IR. From the evaluation point of view, being able to calculate the upper bound of performance of the final aggregated list lays the ground for evaluating different aggregation strategies independently. In this paper, we propose an algorithm based on dynamic programming which, using relevancy information, obtains the aggregated list with the maximum performance that could possibly be achieved by any aggregation strategy. We also provide a detailed proof for the optimality of the result of the algorithm. Furthermore, we demonstrate that the previously proposed algorithm fails to reach the optimal result in many circumstances, due to its greedy essence.","PeriodicalId":440325,"journal":{"name":"Proceedings of the 2015 International Conference on The Theory of Information Retrieval","volume":"68 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-09-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134639934","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Session Search by Direct Policy Learning","authors":"Jiyun Luo, Xuchu Dong, G. Yang","doi":"10.1145/2808194.2809461","DOIUrl":"https://doi.org/10.1145/2808194.2809461","url":null,"abstract":"This paper proposes a novel retrieval model for session search. Through gradient descent, the model finds optimal policies for the best search engine actions from what is observed in the user and search engine interactions. The proposed framework applies direct policy learning to session search such that it greatly reduces the model complexity compared with prior work. It is also a flexible design, which includes a wide range of features describing the rich interactions in session search. The framework is shown to be highly effective when evaluated on the recent TREC Session Tracks. As part of the efforts to bring reinforcement learning to information retrieval, this paper makes a novel contribution in theoretical modeling for session search.","PeriodicalId":440325,"journal":{"name":"Proceedings of the 2015 International Conference on The Theory of Information Retrieval","volume":"337 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-09-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115608860","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On the Behavior of PRES Using Incomplete Judgment Sets","authors":"E. Voorhees","doi":"10.1145/2808194.2809484","DOIUrl":"https://doi.org/10.1145/2808194.2809484","url":null,"abstract":"PRES, the Patent Retrieval Evaluation Score, is a family of retrieval evaluation measures that combines recall and user effort to reflect the quality of a retrieval run with respect to recall-oriented search tasks. Previous analysis of the measure was done using the test collection for the CLEF-IP 2009 track, a collection that contains a limited range of number of relevant documents, making it difficult to assess the behavior of PRES for varying recall contexts. This paper examines the effect of incomplete judgments on PRES scores using the well-studied TREC-8 ad hoc test collection, a collection with a much more varied number-of-relevants profile. Experiments with small judgment sets created through a typical collection-building process show the PRES measures are resilient to incomplete judgment sets.","PeriodicalId":440325,"journal":{"name":"Proceedings of the 2015 International Conference on The Theory of Information Retrieval","volume":"83 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-09-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123519799","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Balancing Aspects in Retrieved Search Results","authors":"David Wemhoener, James Allan","doi":"10.1145/2808194.2809492","DOIUrl":"https://doi.org/10.1145/2808194.2809492","url":null,"abstract":"Many queries contain explicit aspects which must be balanced in any retrieved result in order to meet a user's information need: if aspects of the query are missing or disproportionately represented in documents, the results will be of lower quality than desired. This balancing thus needs to occur both within the retrieved documents individually and across the entire set. We introduce the concept of query-aspect balance and describe a new evaluation measure, β-NDCG, that allows the evaluation of query-aspect balance on multivalued query-aspect judgments. We apply β-NDCG to a small test collection and explore its utility. We show that β-NDCG captures problems of query-aspect balance within and across documents in the ranked list.","PeriodicalId":440325,"journal":{"name":"Proceedings of the 2015 International Conference on The Theory of Information Retrieval","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-09-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122121060","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improving Pseudo Relevance Feedback in the Divergence from Randomness Model","authors":"Dipasree Pal, Mandar Mitra, S. Bhattacharya","doi":"10.1145/2808194.2809494","DOIUrl":"https://doi.org/10.1145/2808194.2809494","url":null,"abstract":"In an earlier analysis of Pseudo Relevance Feedback (PRF) models by Clinchant and Gaussier (2013), five desirable properties that PRF models should satisfy were formalised. Also, modifications to two PRF models were proposed in order to improve compliance with the desirable properties. These resulted in improved retrieval effectiveness. In this study, we introduce a sixth property that we believe PRF models should satisfy. We also extend the earlier exercise to Bo1, a standard PRF model. Experimental results on the robust, wt10g and gov2 datasets show that the proposed modifications yield improvements in effectiveness.","PeriodicalId":440325,"journal":{"name":"Proceedings of the 2015 International Conference on The Theory of Information Retrieval","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-09-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125129136","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Predicting Relevance Feedback Effectiveness with the Help of the Principle of Polyrepresentation in MIR","authors":"David Zellhöfer","doi":"10.1145/2808194.2809485","DOIUrl":"https://doi.org/10.1145/2808194.2809485","url":null,"abstract":"The principle of polyrepresentation - a representative of the cognitive viewpoint on IR, takes a holistic perspective on interactive IR research. One of the principle's core hypotheses is that a document is described by different representations such as visual low-level features, textual content, or relational metadata. The conjunctive combination of these representations, the so-called cognitive overlap, is assumed to compensate the inherent insecurity in relevance assessments of documents w.r.t. an information need. Recently, the cognitively motivated principle of polyrepresentation has been shown to correlate with quantum mechanics-inspired IR models. However, the principle's effectiveness has not been examined in relevance feedback-based interactive MIR. In this work, the principle's utility is studied in interactive MIR in order to investigate whether its main hypothesis can serve as a predictor of retrieval performance during relevance feedback. In order to obtain resilient results all experiments have been carried out with 6 different standard test sets that provide evidence of the utility of the presented approach and the underlying polyrepresentative hypothesis.","PeriodicalId":440325,"journal":{"name":"Proceedings of the 2015 International Conference on The Theory of Information Retrieval","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-09-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128037330","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Signaling Game Approach to Databases Querying and Interaction","authors":"Arash Termehchy, B. Touri","doi":"10.1145/2808194.2809487","DOIUrl":"https://doi.org/10.1145/2808194.2809487","url":null,"abstract":"As most database users cannot precisely express their information needs, it is challenging for database querying and exploration interfaces to understand them. We propose a novel formal framework for representing and understanding information needs in database querying and exploration. Our framework considers querying as a collaboration between the user and the database system to establish a mutual language for representing information needs. We formalize this collaboration as a signaling game, where each mutual language is an equilibrium for the game. A query interface is more effective if it establishes a less ambiguous mutual language faster. We discuss some equilibria, strategies, and the convergence rates in this game. In particular, we propose a reinforcement learning mechanism and analyze it within our framework. We prove that this adaptation mechanism for the query interface improves the effectiveness of answering queries stochastically speaking, and converges almost surely.","PeriodicalId":440325,"journal":{"name":"Proceedings of the 2015 International Conference on The Theory of Information Retrieval","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-09-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128813785","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Verboseness Fission for BM25 Document Length Normalization","authors":"Aldo Lipani, M. Lupu, A. Hanbury, Akiko Aizawa","doi":"10.1145/2808194.2809486","DOIUrl":"https://doi.org/10.1145/2808194.2809486","url":null,"abstract":"BM25 is probably the most well known term weighting model in Information Retrieval. It has, depending on the formula variant at hand, 2 or 3 parameters (k1, b, and k3). This paper addresses b - the document length normalization parameter. Based on the observation that the two cases previously discussed for length normalization (multi-topicality and verboseness) are actually three: multi-topicality, verboseness with word repetition (repetitiveness) and verboseness with synonyms, we propose and test a new length normalization method that removes the need for a b parameter in BM25. Testing the new method on a set of purposefully varied test collections, we observe that we can obtain results statistically indistinguishable from the optimal results, therefore removing the need for ground-truth based optimization.","PeriodicalId":440325,"journal":{"name":"Proceedings of the 2015 International Conference on The Theory of Information Retrieval","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-09-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126551449","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Towards a Formal Framework for Utility-oriented Measurements of Retrieval Effectiveness","authors":"M. Ferrante, N. Ferro, Maria Maistro","doi":"10.1145/2808194.2809452","DOIUrl":"https://doi.org/10.1145/2808194.2809452","url":null,"abstract":"In this paper we present a formal framework to define and study the properties of utility-oriented measurements of retrieval effectiveness, like AP, RBP, ERR and many other popular IR evaluation measures. The proposed framework is laid in the wake of the representational theory of measurement, which provides the foundations of the modern theory of measurement in both physical and social sciences, thus contributing to explicitly link IR evaluation to a broader context. The proposed framework is minimal, in the sense that it relies on just one axiom, from which other properties are derived. Finally, it contributes to a better understanding and a clear separation of what issues are due to the inherent problems in comparing systems in terms of retrieval effectiveness and what others are due to the expected numerical properties of a measurement.","PeriodicalId":440325,"journal":{"name":"Proceedings of the 2015 International Conference on The Theory of Information Retrieval","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-09-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128314795","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Context Retrieval for Web Tables","authors":"Hong Wang, Anqi Liu, Jing Wang, Brian D. Ziebart, Clement T. Yu, Warren Shen","doi":"10.1145/2808194.2809453","DOIUrl":"https://doi.org/10.1145/2808194.2809453","url":null,"abstract":"Many modern knowledge bases are built by extracting information from millions of web pages. Though existing extraction methods primarily focus on web pages' main text, a huge amount of information is embedded within other web structures, such as web tables. Previous studies have shown that linking web page tables and textual context is beneficial for extracting more information from web pages. However, using the text surrounding each table without carefully assessing its relevance introduces noise in the extracted information, degrading its accuracy. To the best of our knowledge, we provide the first systematic study of the problem of table-related context retrieval: given a table and the sentences within the same web page, determine for each sentence whether it is relevant to the table. We define the concept of relevance and introduce a Table-Related Context Retrieval system (TRCR) in this paper. We experiment with different machine learning algorithms, including a recently developed algorithm that is robust to biases in the training data, and show that our system retrieves table-related context with F1=0.735.","PeriodicalId":440325,"journal":{"name":"Proceedings of the 2015 International Conference on The Theory of Information Retrieval","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-09-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131128206","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}