Dynamic Cutoff Prediction in Multi-Stage Retrieval Systems

J. Culpepper, C. Clarke, Jimmy J. Lin
Proceedings of the 21st Australasian Document Computing Symposium, 2016-12-05
DOI: 10.1145/3015022.3015026 (https://doi.org/10.1145/3015022.3015026)
Citations: 46

Abstract

Modern multi-stage retrieval systems consist of a candidate generation stage followed by one or more reranking stages. In such an architecture, the quality of the final ranked list may not be sensitive to the quality of the initial candidate pool, especially in terms of early precision. This provides several opportunities to increase retrieval efficiency without significantly sacrificing effectiveness. In this paper, we explore a new approach to dynamically predicting the size of the initial result set in the candidate generation stage, which can directly affect the overall efficiency and effectiveness of the entire system. Previous work exploring this tradeoff has focused on global parameter settings that apply to all queries, even though optimal settings vary across queries. In contrast, we propose a technique that predicts this parameter on a per-query basis, using only static pre-retrieval features, to maximize efficiency within an effectiveness envelope. Experimental results show that substantial efficiency gains are achievable. In addition, our framework provides a versatile tool that can be used to estimate the effectiveness-efficiency tradeoffs that are possible before selecting and tuning algorithms to make machine-learned predictions.
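The idea of a per-query cutoff chosen within an effectiveness envelope can be illustrated with a minimal Python sketch. It is not the authors' procedure: the function name, the candidate cutoffs, the tolerance, and the scores are all hypothetical. It assumes that a final-stage effectiveness score is already known for each query at each candidate cutoff, and it picks the smallest cutoff whose score stays within a fixed tolerance of the score at the largest cutoff. A predictor based on pre-retrieval features would then be trained to approximate such choices.

```python
from typing import Dict


def smallest_cutoff_within_envelope(
    scores_by_cutoff: Dict[int, float], tolerance: float
) -> int:
    """Return the smallest cutoff whose score is within `tolerance`
    of the score obtained at the largest candidate cutoff.

    `scores_by_cutoff` maps a candidate pool size k to a hypothetical
    final-stage effectiveness score for a single query.
    """
    if not scores_by_cutoff:
        raise ValueError("at least one candidate cutoff is required")
    largest = max(scores_by_cutoff)
    reference = scores_by_cutoff[largest]
    # Scan cutoffs from cheapest to most expensive.
    for k in sorted(scores_by_cutoff):
        if reference - scores_by_cutoff[k] <= tolerance:
            return k
    return largest  # not reached: the largest cutoff always qualifies


# Hypothetical per-query scores at four candidate cutoffs.
per_query = {
    "q1": {100: 0.40, 500: 0.52, 1000: 0.55, 5000: 0.55},
    "q2": {100: 0.30, 500: 0.30, 1000: 0.31, 5000: 0.31},
    "q3": {100: 0.10, 500: 0.20, 1000: 0.35, 5000: 0.60},
}

chosen = {
    q: smallest_cutoff_within_envelope(scores, tolerance=0.05)
    for q, scores in per_query.items()
}
mean_cutoff = sum(chosen.values()) / len(chosen)
print(chosen, mean_cutoff)
```

In this made-up data the three queries need very different pool sizes, so the mean chosen cutoff is well below the single global cutoff of 5000 that would be required to keep every query inside the same tolerance.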