Characterization and analysis of a web search benchmark

2015 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS) Pub Date : 2015-03-29 DOI:10.1109/ISPASS.2015.7095818

Zacharias Hadjilambrou, Marios Kleanthous, Yiannakis Sazeides

{"title":"Characterization and analysis of a web search benchmark","authors":"Zacharias Hadjilambrou, Marios Kleanthous, Yiannakis Sazeides","doi":"10.1109/ISPASS.2015.7095818","DOIUrl":null,"url":null,"abstract":"Web search as a service is very impressive. Web search runs on thousands of servers which perform search on an index of billions of web pages. The search results must be both relevant to the user queries and reach the user in a fraction of a second. A web search service must guarantee the same QoS at all times even at the peak incoming traffic load. Not unjustifiably the web search service has attracted a lot of research attention. Despite the high research interest web search has gained, there are still plenty unknown about the functionality and the architecture of web search benchmarks. Much research has been done using commercial web search engines, like Bing or Google, but many details of these search engines are, of course, not disclosed to the public. We take an academically accepted web search benchmark and we perform a thorough characterization and analysis of it. We shed light in to the architecture and the functionality of the benchmark. We also investigate some prominent web search research issues. In particular, we study how intra-server index partitioning affects the response time and throughput and we also explore the potential use of low power servers for web search. Our results show that intra-server partitioning can reduce tail latencies and that low power servers given enough partitioning can provide same response times as conventional high performance servers.","PeriodicalId":189378,"journal":{"name":"2015 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS)","volume":"19 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2015-03-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"5","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2015 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ISPASS.2015.7095818","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 5

Abstract

Web search as a service is very impressive. Web search runs on thousands of servers which perform search on an index of billions of web pages. The search results must be both relevant to the user queries and reach the user in a fraction of a second. A web search service must guarantee the same QoS at all times even at the peak incoming traffic load. Not unjustifiably the web search service has attracted a lot of research attention. Despite the high research interest web search has gained, there are still plenty unknown about the functionality and the architecture of web search benchmarks. Much research has been done using commercial web search engines, like Bing or Google, but many details of these search engines are, of course, not disclosed to the public. We take an academically accepted web search benchmark and we perform a thorough characterization and analysis of it. We shed light in to the architecture and the functionality of the benchmark. We also investigate some prominent web search research issues. In particular, we study how intra-server index partitioning affects the response time and throughput and we also explore the potential use of low power servers for web search. Our results show that intra-server partitioning can reduce tail latencies and that low power servers given enough partitioning can provide same response times as conventional high performance servers.

查看原文本刊更多论文

网络搜索基准的特征和分析

Web搜索作为一种服务是非常令人印象深刻的。网络搜索在数千台服务器上运行，这些服务器对数十亿个网页的索引进行搜索。搜索结果必须与用户查询相关，并在几分之一秒内到达用户。web搜索服务必须在任何时候都保证相同的QoS，即使是在最高的传入流量负载下。毫无疑问，网络搜索服务吸引了大量的研究关注。尽管网络搜索已经获得了很高的研究兴趣，但关于网络搜索基准的功能和架构仍然有很多未知的东西。使用商业网络搜索引擎(如Bing或Google)进行了大量研究，但这些搜索引擎的许多细节当然没有向公众披露。我们采用学术上公认的网络搜索基准，并对其进行彻底的表征和分析。我们介绍了基准的架构和功能。我们还研究了一些突出的网络搜索研究问题。特别是，我们研究了服务器内部索引分区如何影响响应时间和吞吐量，我们还探索了低功耗服务器用于web搜索的潜在用途。我们的研究结果表明，服务器内部分区可以减少尾部延迟，并且给予足够分区的低功耗服务器可以提供与传统高性能服务器相同的响应时间。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文求助全文

来源期刊

2015 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS)

自引率

0.00%

发文量