CoreNap: Energy Efficient Core Allocation for Latency-Critical Workloads

IF 1.4 3区 计算机科学 Q4 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE
Gyeongseo Park;Ki-Dong Kang;Minho Kim;Daehoon Kim
{"title":"CoreNap: Energy Efficient Core Allocation for Latency-Critical Workloads","authors":"Gyeongseo Park;Ki-Dong Kang;Minho Kim;Daehoon Kim","doi":"10.1109/LCA.2022.3227629","DOIUrl":null,"url":null,"abstract":"In data-center servers, the dynamic core allocation for Latency-Critical (LC) applications can play a crucial role in improving energy efficiency under Service Level Objective (SLO) constraints, allowing cores to enter idle states (i.e., C-states) that consume less power by turning off a part of hardware components of a processor. However, prior studies focus on the core allocation for application threads while not considering cores involved in network packet processing, even though packet processing affects not only response latency but also energy consumption considerably. In this paper, we first investigate the impacts of the explicit core allocation for network packet processing on the tail response latency and energy consumption while running LC applications. We observe that co-adjusting the number of cores for network packet processing along with the number of cores for LC application threads can improve energy efficiency substantially, compared with adjusting the number of cores only for application threads, as prior studies do. In addition, we propose a dynamic core allocation, called \n<monospace>CoreNap</monospace>\n, which allocates/de-allocates cores for both LC application threads and packet processing. \n<monospace>CoreNap</monospace>\n measures the CPU-utilization by application threads and packet processing individually, and predicts response latency and power consumption when the combination of core allocation is enforced via a lightweight prediction model. Based on the prediction, \n<monospace>CoreNap</monospace>\n chooses/enforces the energy-efficient combination of core allocation. Our experimental results show that \n<monospace>CoreNap</monospace>\n reduces energy consumption by up to 18.6% compared with state-of-the-art study that adjusts cores only for LC application in parallel packet processing environments.","PeriodicalId":51248,"journal":{"name":"IEEE Computer Architecture Letters","volume":"22 1","pages":"1-4"},"PeriodicalIF":1.4000,"publicationDate":"2022-12-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Computer Architecture Letters","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/9976222/","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q4","JCRName":"COMPUTER SCIENCE, HARDWARE & ARCHITECTURE","Score":null,"Total":0}
引用次数: 0

Abstract

In data-center servers, the dynamic core allocation for Latency-Critical (LC) applications can play a crucial role in improving energy efficiency under Service Level Objective (SLO) constraints, allowing cores to enter idle states (i.e., C-states) that consume less power by turning off a part of hardware components of a processor. However, prior studies focus on the core allocation for application threads while not considering cores involved in network packet processing, even though packet processing affects not only response latency but also energy consumption considerably. In this paper, we first investigate the impacts of the explicit core allocation for network packet processing on the tail response latency and energy consumption while running LC applications. We observe that co-adjusting the number of cores for network packet processing along with the number of cores for LC application threads can improve energy efficiency substantially, compared with adjusting the number of cores only for application threads, as prior studies do. In addition, we propose a dynamic core allocation, called CoreNap , which allocates/de-allocates cores for both LC application threads and packet processing. CoreNap measures the CPU-utilization by application threads and packet processing individually, and predicts response latency and power consumption when the combination of core allocation is enforced via a lightweight prediction model. Based on the prediction, CoreNap chooses/enforces the energy-efficient combination of core allocation. Our experimental results show that CoreNap reduces energy consumption by up to 18.6% compared with state-of-the-art study that adjusts cores only for LC application in parallel packet processing environments.
CoreNap:用于潜在关键工作负载的节能核心分配
在数据中心服务器中,在服务级别目标(SLO)约束下,关键延迟(LC)应用程序的动态内核分配可以在提高能效方面发挥关键作用,允许内核通过关闭处理器的一部分硬件组件进入功耗较低的空闲状态(即C状态)。然而,先前的研究集中在应用程序线程的核心分配上,而没有考虑网络数据包处理中涉及的核心,尽管数据包处理不仅会显著影响响应延迟,还会显著影响能耗。在本文中,我们首先研究了网络数据包处理的显式核心分配对运行LC应用程序时的尾部响应延迟和能耗的影响。我们观察到,与之前的研究一样,只为应用线程调整内核数量相比,共同调整网络数据包处理的内核数量和LC应用线程的内核数量可以显著提高能效。此外,我们提出了一种动态内核分配,称为CoreNap,其为LC应用线程和分组处理两者分配/取消分配核心。CoreNap分别测量应用程序线程和数据包处理的CPU利用率,并预测通过轻量级预测模型强制执行核心分配组合时的响应延迟和功耗。基于预测,CoreNap选择/强制执行核心分配的节能组合。我们的实验结果表明,与仅针对并行数据包处理环境中的LC应用调整内核的最先进研究相比,CoreNap可将能耗降低18.6%。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
IEEE Computer Architecture Letters
IEEE Computer Architecture Letters COMPUTER SCIENCE, HARDWARE & ARCHITECTURE-
CiteScore
4.60
自引率
4.30%
发文量
29
期刊介绍: IEEE Computer Architecture Letters is a rigorously peer-reviewed forum for publishing early, high-impact results in the areas of uni- and multiprocessor computer systems, computer architecture, microarchitecture, workload characterization, performance evaluation and simulation techniques, and power-aware computing. Submissions are welcomed on any topic in computer architecture, especially but not limited to: microprocessor and multiprocessor systems, microarchitecture and ILP processors, workload characterization, performance evaluation and simulation techniques, compiler-hardware and operating system-hardware interactions, interconnect architectures, memory and cache systems, power and thermal issues at the architecture level, I/O architectures and techniques, independent validation of previously published results, analysis of unsuccessful techniques, domain-specific processor architectures (e.g., embedded, graphics, network, etc.), real-time and high-availability architectures, reconfigurable systems.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信