Rhythm: harnessing data parallel hardware for server workloads

Sandeep R. Agrawal, Valentin Pistol, Jun Pang, J. Tran, D. Tarjan, A. Lebeck
Published in: Proceedings of the 19th international conference on Architectural support for programming languages and operating systems
Publication date: 2014-02-24
DOI: 10.1145/2541940.2541956 (https://doi.org/10.1145/2541940.2541956)
Citation count: 37

Abstract

Trends in increasing web traffic demand an increase in server throughput while preserving energy efficiency and total cost of ownership. Present work in optimizing data center efficiency primarily focuses on the data center as a whole, using off-the-shelf hardware for individual servers. Server capacity is typically increased by adding more machines, which is cheap, though inefficient in the long run in terms of energy and area. Our work builds on the observation that server workload execution patterns are not completely unique across multiple requests. We present a framework, called Rhythm, for high-throughput servers that can exploit similarity across requests to improve server performance and power/energy efficiency by launching data-parallel executions for request cohorts. An implementation of the SPECWeb Banking workload using Rhythm on NVIDIA GPUs provides a basis for evaluating both software and hardware for future cohort-based servers. Our evaluation of Rhythm on future server platforms shows that it achieves 4x the throughput (reqs/sec) of a Core i7 at efficiencies (reqs/Joule) comparable to a dual-core ARM Cortex A9. A Rhythm implementation that generates transposed responses achieves 8x the i7 throughput while processing 2.5x more requests/Joule compared to the A9.
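
The idea of grouping similar requests into cohorts and running one handler across each cohort can be illustrated with a minimal Python sketch. This is not the Rhythm implementation: the names `form_cohorts`, `run_cohort`, and `COHORT_SIZE` are hypothetical, the cohort size is arbitrary, and a list comprehension stands in for what would be a data-parallel launch on a GPU.

```python
from collections import defaultdict

COHORT_SIZE = 4  # hypothetical value, chosen only for illustration


def form_cohorts(requests, cohort_size=COHORT_SIZE):
    """Group (request_type, payload) pairs into cohorts of the same type.

    A cohort is emitted as soon as it holds `cohort_size` requests;
    leftover requests are flushed at the end as partial cohorts.
    """
    pending = defaultdict(list)
    cohorts = []
    for req_type, payload in requests:
        pending[req_type].append(payload)
        if len(pending[req_type]) == cohort_size:
            cohorts.append((req_type, pending.pop(req_type)))
    # Flush partially filled cohorts so that no request is dropped.
    for req_type, payloads in pending.items():
        cohorts.append((req_type, payloads))
    return cohorts


def run_cohort(handler, payloads):
    """Apply one handler to every request in a cohort.

    On data-parallel hardware each payload would map to one thread or
    lane; here the loop is sequential and only models the structure.
    """
    return [handler(p) for p in payloads]
```

For example, a stream of mostly "login" requests yields one full "login" cohort plus partial cohorts for whatever remains, and each cohort is then processed by a single handler call. A real server would also need a policy for how long to wait before launching a partially filled cohort, which this sketch leaves out.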