Griffin: uniting CPU and GPU in information retrieval systems for intra-query parallelism

Proceedings of the 23rd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming. Pub Date: 2018-02-10. DOI: 10.1145/3178487.3178512

Yang Liu, Jianguo Wang, S. Swanson

Citations: 14

Abstract

Interactive information retrieval services, such as enterprise search and document search, must provide relevant results with consistent, low response times in the face of rapidly growing data sets and query loads. These growing demands have led researchers to consider a wide range of optimizations to reduce response latency, including query processing parallelization and acceleration with co-processors such as GPUs. However, previous work runs each query either on the GPU or on the CPU, ignoring the fact that the best processor for a given query depends on the query's characteristics, which may change as processing proceeds. We present Griffin, an IR system that dynamically combines GPU- and CPU-based algorithms to process individual queries according to their characteristics. Griffin uses state-of-the-art CPU-based query processing techniques and incorporates a novel approach to GPU-based query evaluation. Our GPU-based approach, as far as we know, achieves the best available GPU search performance by leveraging a new compression scheme and exploiting an advanced merge-based intersection algorithm. We evaluate Griffin with real-world queries and datasets, and show that it improves query performance by 10x compared to a highly optimized CPU-only implementation, and by 1.5x compared to our GPU approach running alone. We also find that Griffin helps reduce the 95th-, 99th-, and 99.9th-percentile query response times by 10.4x, 16.1x, and 26.8x, respectively.
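The idea that a query's characteristics "may change as the processing proceeds" can be illustrated with a small CPU-only sketch. It intersects sorted posting lists shortest-first and re-chooses the intersection strategy at every step, because the intermediate result shrinks as terms are processed. This is not Griffin's scheduler: the abstract does not state its decision rule, so the length-ratio test, the name `RATIO_THRESHOLD`, and its value are hypothetical. The two strategies stand in for the kind of per-step choice a hybrid system could make between a merge-based intersection (described in the abstract as the GPU-side algorithm) and a search-based one.

```python
from bisect import bisect_left
from typing import List, Sequence, Tuple


def intersect_merge(a: Sequence[int], b: Sequence[int]) -> List[int]:
    """Two-pointer merge of two sorted docID lists: O(len(a) + len(b))."""
    out: List[int] = []
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out


def intersect_search(short: Sequence[int], long: Sequence[int]) -> List[int]:
    """Binary-search each docID of `short` in `long`: O(len(short) * log len(long))."""
    out: List[int] = []
    lo = 0
    for doc_id in short:
        # Both lists are sorted, so the search can resume from the last position.
        lo = bisect_left(long, doc_id, lo)
        if lo == len(long):
            break
        if long[lo] == doc_id:
            out.append(doc_id)
    return out


# Hypothetical cut-off (an assumption, not a value from the paper): use the
# search-based strategy once the next list is this many times longer than the
# current intermediate result.
RATIO_THRESHOLD = 32


def intersect_query(posting_lists: Sequence[Sequence[int]]) -> Tuple[List[int], List[str]]:
    """Intersect all lists shortest-first, re-choosing the strategy at each step."""
    if not posting_lists:
        return [], []
    lists = sorted(posting_lists, key=len)
    result = list(lists[0])
    trace: List[str] = []  # which strategy each step used
    for nxt in lists[1:]:
        if not result:
            break  # an empty intermediate result cannot grow again
        if len(nxt) >= RATIO_THRESHOLD * len(result):
            trace.append("search")
            result = intersect_search(result, nxt)
        else:
            trace.append("merge")
            result = intersect_merge(result, nxt)
    return result, trace
```

Calling `intersect_query` on three lists of very different lengths returns both the matching docIDs and the per-step trace, which shows the chosen strategy switching within a single query as the intermediate result shrinks.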

