Shijie Cao, Lanshun Nie, D. Zhan, Wenqiang Wang, Ningyi Xu, Ramashis Das, Ming Wu, Lintao Zhang, Derek Chiou
FlexSaaS
ACM Transactions on Reconfigurable Technology and Systems (TRETS), published 2019-02-17
DOI: 10.1145/3301409 (https://doi.org/10.1145/3301409)
Abstract
Web search engines deploy large-scale selection services on CPUs to identify the set of web pages that match a user query. An FPGA-based accelerator can exploit several levels of parallelism and deliver lower latency, higher throughput, and better energy efficiency than commodity CPUs. However, maintaining such a customized accelerator in a commercial search engine is challenging because selection services change often. This article presents our design for FlexSaaS (Flexible Selection as a Service), an FPGA-based accelerator for web search selection. To address both efficiency and flexibility, FlexSaaS abstracts computing models and separates memory access from computation. Specifically, FlexSaaS (i) contains a reconfigurable number of matching processors that can handle various possible query plans, (ii) decouples index stream reading, which fetches and decodes index files, from matching computation, and (iii) includes a universal memory accessor that hides the complex memory hierarchy and reduces host data access latency. Evaluated on FPGAs in the selection service of a commercial web search engine (Bing), FlexSaaS can be evolved quickly to adapt to new updates. Compared to the software baseline, FlexSaaS on Arria 10 reduces average latency by 30% and increases throughput by 1.5×.
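As a rough software illustration of point (ii) in the abstract, the sketch below separates a stream-reading stage from a matching stage. It is not the FlexSaaS hardware design: the delta-encoded index format, the function names `decode_deltas` and `match_and`, and the choice of a pure AND query plan are all assumptions made for this example.

```python
from typing import Iterable, Iterator, List


def decode_deltas(deltas: Iterable[int]) -> Iterator[int]:
    """Stream-reading stage (hypothetical format): turn positive gaps
    into strictly ascending document IDs, one at a time."""
    doc_id = 0
    for gap in deltas:
        doc_id += gap
        yield doc_id


def match_and(streams: List[Iterator[int]]) -> List[int]:
    """Matching stage: intersect ascending doc-ID streams, i.e. an AND
    query plan. It only pulls IDs and never sees the encoding."""
    results: List[int] = []
    if not streams:
        return results
    try:
        # Current head of each stream.
        heads = [next(s) for s in streams]
        while True:
            target = max(heads)
            if all(h == target for h in heads):
                # Every term contains this document: it is a match.
                results.append(target)
                heads = [next(s) for s in streams]
            else:
                # Advance each lagging stream until it reaches the maximum.
                for i, s in enumerate(streams):
                    while heads[i] < target:
                        heads[i] = next(s)
    except StopIteration:
        # One stream is exhausted, so no further intersection is possible.
        pass
    return results


if __name__ == "__main__":
    term_a = decode_deltas([2, 3, 1, 4])   # IDs 2, 5, 6, 10
    term_b = decode_deltas([5, 1, 4, 2])   # IDs 5, 6, 10, 12
    print(match_and([term_a, term_b]))     # [5, 6, 10]
```

Because the matcher consumes plain iterators, the decoding stage could be swapped for a different index format without touching the matching logic, which is the kind of flexibility the decoupling is meant to provide.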