肾上腺素:精确定位和控制尾部查询与快速电压提升

2015 IEEE 21st International Symposium on High Performance Computer Architecture (HPCA) Pub Date : 2015-03-09 DOI:10.1109/HPCA.2015.7056039

Chang-Hong Hsu, Yunqi Zhang, M. Laurenzano, David Meisner, T. Wenisch, Jason Mars, Lingjia Tang, R. Dreslinski

{"title":"肾上腺素:精确定位和控制尾部查询与快速电压提升","authors":"Chang-Hong Hsu, Yunqi Zhang, M. Laurenzano, David Meisner, T. Wenisch, Jason Mars, Lingjia Tang, R. Dreslinski","doi":"10.1109/HPCA.2015.7056039","DOIUrl":null,"url":null,"abstract":"Reducing the long tail of the query latency distribution in modern warehouse scale computers is critical for improving performance and quality of service of workloads such as Web Search and Memcached. Traditional turbo boost increases a processor's voltage and frequency during a coarse-grain sliding window, boosting all queries that are processed during that window. However, the inability of such a technique to pinpoint tail queries for boosting limits its tail reduction benefit. In this work, we propose Adrenaline, an approach to leverage finer granularity, 10's of nanoseconds, voltage boosting to effectively rein in the tail latency with query-level precision. Two key insights underlie this work. First, emerging finer granularity voltage/frequency boosting is an enabling mechanism for intelligent allocation of the power budget to precisely boost only the queries that contribute to the tail latency; and second, per-query characteristics can be used to design indicators for proactively pinpointing these queries, triggering boosting accordingly. Based on these insights, Adrenaline effectively pinpoints and boosts queries that are likely to increase the tail distribution and can reap more benefit from the voltage/frequency boost. By evaluating under various workload configurations, we demonstrate the effectiveness of our methodology. We achieve up to a 2.50x tail latency improvement for Memcached and up to a 3.03x for Web Search over coarse-grained DVFS given a fixed boosting power budget. When optimizing for energy reduction, Adrenaline achieves up to a 1.81x improvement for Memcached and up to a 1.99x for Web Search over coarse-grained DVFS.","PeriodicalId":6593,"journal":{"name":"2015 IEEE 21st International Symposium on High Performance Computer Architecture (HPCA)","volume":"15 1","pages":"271-282"},"PeriodicalIF":0.0000,"publicationDate":"2015-03-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"109","resultStr":"{\"title\":\"Adrenaline: Pinpointing and reining in tail queries with quick voltage boosting\",\"authors\":\"Chang-Hong Hsu, Yunqi Zhang, M. Laurenzano, David Meisner, T. Wenisch, Jason Mars, Lingjia Tang, R. Dreslinski\",\"doi\":\"10.1109/HPCA.2015.7056039\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Reducing the long tail of the query latency distribution in modern warehouse scale computers is critical for improving performance and quality of service of workloads such as Web Search and Memcached. Traditional turbo boost increases a processor's voltage and frequency during a coarse-grain sliding window, boosting all queries that are processed during that window. However, the inability of such a technique to pinpoint tail queries for boosting limits its tail reduction benefit. In this work, we propose Adrenaline, an approach to leverage finer granularity, 10's of nanoseconds, voltage boosting to effectively rein in the tail latency with query-level precision. Two key insights underlie this work. First, emerging finer granularity voltage/frequency boosting is an enabling mechanism for intelligent allocation of the power budget to precisely boost only the queries that contribute to the tail latency; and second, per-query characteristics can be used to design indicators for proactively pinpointing these queries, triggering boosting accordingly. Based on these insights, Adrenaline effectively pinpoints and boosts queries that are likely to increase the tail distribution and can reap more benefit from the voltage/frequency boost. By evaluating under various workload configurations, we demonstrate the effectiveness of our methodology. We achieve up to a 2.50x tail latency improvement for Memcached and up to a 3.03x for Web Search over coarse-grained DVFS given a fixed boosting power budget. When optimizing for energy reduction, Adrenaline achieves up to a 1.81x improvement for Memcached and up to a 1.99x for Web Search over coarse-grained DVFS.\",\"PeriodicalId\":6593,\"journal\":{\"name\":\"2015 IEEE 21st International Symposium on High Performance Computer Architecture (HPCA)\",\"volume\":\"15 1\",\"pages\":\"271-282\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2015-03-09\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"109\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2015 IEEE 21st International Symposium on High Performance Computer Architecture (HPCA)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/HPCA.2015.7056039\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2015 IEEE 21st International Symposium on High Performance Computer Architecture (HPCA)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/HPCA.2015.7056039","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 109

摘要

在现代仓库规模的计算机中，减少查询延迟分布的长尾对于提高Web Search和Memcached等工作负载的性能和服务质量至关重要。传统的turbo boost在粗粒度滑动窗口期间增加处理器的电压和频率，从而提高在该窗口期间处理的所有查询。然而，这种技术无法精确定位尾部查询以进行提升，这限制了它的尾部减少效益。在这项工作中，我们提出了Adrenaline，这是一种利用更细粒度(10纳秒)的电压提升方法，以查询级精度有效地控制尾部延迟。这项工作的基础是两个关键的见解。首先，新兴的更细粒度电压/频率提升是一种智能分配功率预算的启用机制，可以精确地只提升导致尾部延迟的查询;其次，每个查询的特征可以用来设计指示器，以主动定位这些查询，从而触发相应的提升。基于这些见解，Adrenaline能够有效地定位并提升那些可能会增加尾部分布的查询，并能够从电压/频率提升中获得更多收益。通过在各种工作负载配置下进行评估，我们证明了我们的方法的有效性。对于Memcached，我们实现了高达2.50倍的尾部延迟改进，对于粗粒度DVFS，我们实现了高达3.03倍的尾部延迟改进。在优化能量减少时，Adrenaline在Memcached上实现了1.81倍的改进，在粗粒度DVFS上实现了1.99倍的Web搜索改进。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Adrenaline: Pinpointing and reining in tail queries with quick voltage boosting

Reducing the long tail of the query latency distribution in modern warehouse scale computers is critical for improving performance and quality of service of workloads such as Web Search and Memcached. Traditional turbo boost increases a processor's voltage and frequency during a coarse-grain sliding window, boosting all queries that are processed during that window. However, the inability of such a technique to pinpoint tail queries for boosting limits its tail reduction benefit. In this work, we propose Adrenaline, an approach to leverage finer granularity, 10's of nanoseconds, voltage boosting to effectively rein in the tail latency with query-level precision. Two key insights underlie this work. First, emerging finer granularity voltage/frequency boosting is an enabling mechanism for intelligent allocation of the power budget to precisely boost only the queries that contribute to the tail latency; and second, per-query characteristics can be used to design indicators for proactively pinpointing these queries, triggering boosting accordingly. Based on these insights, Adrenaline effectively pinpoints and boosts queries that are likely to increase the tail distribution and can reap more benefit from the voltage/frequency boost. By evaluating under various workload configurations, we demonstrate the effectiveness of our methodology. We achieve up to a 2.50x tail latency improvement for Memcached and up to a 3.03x for Web Search over coarse-grained DVFS given a fixed boosting power budget. When optimizing for energy reduction, Adrenaline achieves up to a 1.81x improvement for Memcached and up to a 1.99x for Web Search over coarse-grained DVFS.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

2015 IEEE 21st International Symposium on High Performance Computer Architecture (HPCA)

自引率

0.00%

发文量