{"title":"An Assumption-Free Approach to the Dynamic Truncation of Ranked Lists","authors":"Yen-Chieh Lien, Daniel Cohen, W. Bruce Croft","doi":"10.1145/3341981.3344234","DOIUrl":null,"url":null,"abstract":"In traditional retrieval environments, a ranked list of candidate documents is produced without regard to the number of documents. With the rise in interactive IR as well as professional searches such as legal retrieval, this results in a substantial ranked list which is scanned by a user until their information need is satisfied. Determining the point at which the ranking model has low confidence in the relevance score is a challenging, but potentially very useful, task. Truncation of the ranked list must balance the needs of the user with the confidence of the retrieval model. Unlike query performance prediction where the task is to estimate the performance of a model based on an initial query and a given set documents, dynamic truncation minimizes the risk of viewing a non-relevant document given an external metric by estimating the confidence of the retrieval model using a distribution over its already calculated output scores, and subsequently truncating the ranking at that position. In this paper, we propose an assumption-free approach to learning a non-parametric score distribution over any retrieval model and demonstrate the efficacy of our method on Robust04, significantly improving user defined metrics compared to previous approaches.","PeriodicalId":173154,"journal":{"name":"Proceedings of the 2019 ACM SIGIR International Conference on Theory of Information Retrieval","volume":"149 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"15","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 2019 ACM SIGIR International Conference on Theory of Information Retrieval","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3341981.3344234","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 15
Abstract
In traditional retrieval environments, a ranked list of candidate documents is produced without regard to the number of documents. With the rise of interactive IR as well as professional search tasks such as legal retrieval, this results in a substantial ranked list that is scanned by a user until their information need is satisfied. Determining the point at which the ranking model has low confidence in the relevance score is a challenging, but potentially very useful, task. Truncation of the ranked list must balance the needs of the user with the confidence of the retrieval model. Unlike query performance prediction, where the task is to estimate the performance of a model based on an initial query and a given set of documents, dynamic truncation minimizes the risk of viewing a non-relevant document under an external metric: the confidence of the retrieval model is estimated from a distribution over its already computed output scores, and the ranking is truncated at the corresponding position. In this paper, we propose an assumption-free approach to learning a non-parametric score distribution over any retrieval model and demonstrate the efficacy of our method on Robust04, significantly improving user-defined metrics compared to previous approaches.
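To make the truncation setup concrete, the sketch below illustrates score-based dynamic truncation of a single ranked list. It is not the authors' model: the per-document relevance probabilities come from a placeholder logistic mapping of standardized raw scores (an assumption for illustration only), and the user-defined metric is F1, approximated by plugging expected precision and recall at each cutoff into the F1 formula.

```python
# Minimal sketch of dynamic truncation, assuming per-document relevance
# probabilities are available. This is NOT the paper's non-parametric model.
import numpy as np

def expected_f1_at_cutoffs(rel_prob):
    """Approximate expected F1 at every cutoff k from relevance probabilities.
    Expected precision/recall are exact by linearity of expectation; plugging
    them into the F1 formula is only an approximation of expected F1."""
    rel_prob = np.asarray(rel_prob, dtype=float)
    cum_rel = np.cumsum(rel_prob)            # expected relevant docs in top-k
    total_rel = cum_rel[-1] + 1e-9            # expected relevant docs overall
    k = np.arange(1, len(rel_prob) + 1)
    precision = cum_rel / k
    recall = cum_rel / total_rel
    return 2 * precision * recall / (precision + recall + 1e-9)

def truncate(scores):
    """Return the 1-based cutoff maximizing the approximated expected F1."""
    scores = np.asarray(scores, dtype=float)
    # Placeholder probability model (assumption): logistic squash of z-scores.
    z = (scores - scores.mean()) / (scores.std() + 1e-9)
    rel_prob = 1.0 / (1.0 + np.exp(-z))
    return int(np.argmax(expected_f1_at_cutoffs(rel_prob))) + 1

if __name__ == "__main__":
    ranked_scores = [12.4, 11.8, 11.7, 9.2, 5.1, 4.8, 4.7, 4.6]
    print("truncate ranked list at position", truncate(ranked_scores))
```

The key design point is that the cutoff is chosen per query from the distribution of the model's own output scores, rather than using a fixed rank threshold shared across all queries.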