Energy-Efficient Cache Coherence Protocols in Chip-Multiprocessors for Server Consolidation

Antonio García-Guirado, Ricardo Fernández Pascual, Alberto Ros, José M. García
Published in: 2011 International Conference on Parallel Processing, 2011-09-13
DOI: 10.1109/ICPP.2011.44 (https://doi.org/10.1109/ICPP.2011.44)
Citations: 8

Abstract

As the number of cores per chip grows, power consumption is becoming a major constraint in the design of chip multiprocessors. At the same time, server consolidation is gaining importance as a way to take advantage of so many cores. Our goal is to alleviate the power constraint by reducing, through the cache coherence protocol, the power consumption of chip multiprocessors that run consolidated workloads. To this end, we statically divide the chip into areas, which reduces both the directory overhead needed to support coherence and the network traffic. This translates into lower power consumption without performance degradation. Cache coherence is maintained per area, and pointers link the areas, which isolates virtual machines from one another and reduces memory requirements. Additionally, the coherence protocol dynamically selects one node per area to be responsible for providing the data on a cache miss, which lowers the average cache miss latency and the traffic among areas. Compared with a highly optimized directory implementation, leakage power consumption is reduced by 54% and the dynamic power consumption of the caches and the network-on-chip by up to 38% for a 64-tile chip multiprocessor running 4 virtual machines, with no performance degradation.
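To give a rough feel for why per-area coherence can shrink the directory, the sketch below compares the sharing-information bits per directory entry of a full-map bit vector against a hypothetical area-based encoding. The encoding is an assumption made for illustration only (one bit per tile of the local area plus one pointer per remote area); the abstract does not specify the actual entry format. The 8x8 mesh layout, the square areas, and the function names `area_of_tile`, `full_map_bits`, and `area_scheme_bits` are likewise assumptions, and the resulting numbers are not results from the paper.

```python
import math


def area_of_tile(tile_id: int, mesh_width: int = 8, area_width: int = 4) -> int:
    """Map a tile of a square mesh to the square area that contains it.

    Assumption: tiles are numbered row by row and areas are
    area_width x area_width blocks (a static partition).
    """
    row, col = divmod(tile_id, mesh_width)
    areas_per_row = mesh_width // area_width
    return (row // area_width) * areas_per_row + (col // area_width)


def full_map_bits(num_tiles: int) -> int:
    """Sharing bits per entry for a full-map bit vector: one bit per tile."""
    return num_tiles


def area_scheme_bits(num_tiles: int, num_areas: int) -> int:
    """Sharing bits per entry for the hypothetical area-based encoding.

    Assumption: a bit vector covers only the tiles of the local area, and
    one pointer per remote area names a single tile inside that area.
    """
    tiles_per_area = num_tiles // num_areas
    pointer_bits = math.ceil(math.log2(tiles_per_area))
    return tiles_per_area + (num_areas - 1) * pointer_bits


if __name__ == "__main__":
    # 64 tiles split into 4 areas, matching the configuration in the abstract.
    print(full_map_bits(64))        # 64
    print(area_scheme_bits(64, 4))  # 28
    print(area_of_tile(36))         # 3
```

Under these assumptions each entry needs 28 bits instead of 64. A single pointer per remote area also fits the idea of one designated node per area supplying data on a miss, although how the protocol actually selects and tracks that node is not described in the abstract.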