A Comparison of Document-at-a-Time and Score-at-a-Time Query Evaluation

Proceedings of the Tenth ACM International Conference on Web Search and Data Mining. Pub Date: 2017-02-02. DOI: 10.1145/3018661.3018726

Matt Crane, J. Culpepper, Jimmy J. Lin, J. Mackenzie, A. Trotman


Citations: 56

Abstract

We present an empirical comparison between document-at-a-time (DaaT) and score-at-a-time (SaaT) document ranking strategies within a common framework. Although both strategies have been extensively explored, the literature lacks a fair, direct comparison: such a study has been difficult due to vastly different query evaluation mechanics and index organizations. Our work controls for score quantization, document processing, compression, implementation language, implementation effort, and a number of details, arriving at an empirical evaluation that fairly characterizes the performance of three specific techniques: WAND (DaaT), BMW (DaaT), and JASS (SaaT). Experiments reveal a number of interesting findings. The performance gap between WAND and BMW is not as clear as the literature suggests, and both methods are susceptible to tail queries that may take orders of magnitude longer than the median query to execute. Surprisingly, approximate query evaluation in WAND and BMW does not significantly reduce the risk of these tail queries. Overall, JASS is slightly slower than either WAND or BMW, but exhibits much lower variance in query latencies and is much less susceptible to tail query effects. Furthermore, JASS query latency is not particularly sensitive to the retrieval depth, making it an appealing solution for performance-sensitive applications where bounds on query latencies are desirable.
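To make the two strategies concrete, the following is a minimal sketch of the general idea behind each, not the paper's implementation. It assumes a toy in-memory index (`POSTINGS`, hypothetical data) with pre-quantized integer impact scores. The DaaT function is exhaustive and does not implement the pruning used by WAND or BMW. The SaaT function sorts postings at query time for brevity, whereas an impact-ordered index would store them in that order already. The `max_postings` budget is an illustrative assumption showing how SaaT processing can be cut off early, which is one way latency can be bounded.

```python
import heapq
from collections import defaultdict

# Hypothetical toy index: term -> list of (doc_id, quantized_impact),
# sorted by doc_id. Not data from the paper.
POSTINGS = {
    "web": [(1, 3), (2, 1), (4, 2)],
    "search": [(1, 2), (3, 4), (4, 1)],
}


def daat_top_k(terms, k):
    """Exhaustive document-at-a-time: fully score one document before the next."""
    lists = [POSTINGS[t] for t in terms if t in POSTINGS]
    cursors = [0] * len(lists)
    heap = []  # min-heap of (score, doc_id), holds at most k entries
    while True:
        heads = [lst[c][0] for lst, c in zip(lists, cursors) if c < len(lst)]
        if not heads:
            break
        doc = min(heads)  # next unscored document across all lists
        score = 0
        for i, lst in enumerate(lists):
            c = cursors[i]
            if c < len(lst) and lst[c][0] == doc:
                score += lst[c][1]
                cursors[i] = c + 1
        heapq.heappush(heap, (score, doc))
        if len(heap) > k:
            heapq.heappop(heap)  # drop the current lowest score
    return sorted(heap, key=lambda x: (-x[0], x[1]))


def saat_top_k(terms, k, max_postings=None):
    """Score-at-a-time: process postings in decreasing impact order."""
    postings = [
        (impact, doc)
        for t in terms if t in POSTINGS
        for doc, impact in POSTINGS[t]
    ]
    # Simplification: a real impact-ordered index avoids this query-time sort.
    postings.sort(key=lambda p: -p[0])
    if max_postings is not None:
        postings = postings[:max_postings]  # assumed early-termination budget
    acc = defaultdict(int)  # accumulator: doc_id -> partial score
    for impact, doc in postings:
        acc[doc] += impact
    scored = ((s, d) for d, s in acc.items())
    return heapq.nsmallest(k, scored, key=lambda x: (-x[0], x[1]))


if __name__ == "__main__":
    query = ["web", "search"]
    print(daat_top_k(query, 2))
    print(saat_top_k(query, 2))
    print(saat_top_k(query, 2, max_postings=3))
```

With no budget, both functions return the same ranking on this toy index. With a small budget, the SaaT result is approximate but the work done is capped by the number of postings processed rather than by the shape of the query, which is consistent with the lower latency variance the abstract reports for JASS.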

