ESP-NUCA: A low-cost adaptive Non-Uniform Cache Architecture

Javier Merino, Valentin Puente, J. Gregorio
{"title":"ESP-NUCA: A low-cost adaptive Non-Uniform Cache Architecture","authors":"Javier Merino, Valentin Puente, J. Gregorio","doi":"10.1109/HPCA.2010.5416641","DOIUrl":null,"url":null,"abstract":"This paper introduces a cost effective cache architecture called Enhanced Shared-Private Non-Uniform Cache Architecture (ESP-NUCA), which is suitable for highperformance Chip MultiProcessors (CMPs). This architecture enhances system stability by combining the advantages of private and shared caches. Starting from a shared NUCA, ESP-NUCA introduces a low-cost mechanism to dynamically allocate private cache blocks closer to their owner processor. In this way, average on-chip access latency is reduced and inter-core interference minimized. ESP-NUCA synergistically integrates victims and replicas thus making it possible to take advantage of multiple-readers for shared data, and to maximize cache usage under unbalanced core utilization. This architecture leads to stable behavior within the whole system across a broad spectrum of working scenarios. ESP-NUCA not only outperforms architectures with similar implementation costs such as private and shared caches by up to 20% and 40% respectively, but even outperforms much costlier architectures such as D-NUCA [13] by up to 28%, Adaptive Selective Replication [3] by up to 19%, and Cooperative Caching [5] by up to 15%. Moreover, performance variance throughout the set of benchmarks is 37% lower than with ASR, 87% lower than with D-NUCA, and 43% lower than with Cooperative Caching.","PeriodicalId":368621,"journal":{"name":"HPCA - 16 2010 The Sixteenth International Symposium on High-Performance Computer Architecture","volume":"131 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2010-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"53","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"HPCA - 16 2010 The Sixteenth International Symposium on High-Performance Computer Architecture","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/HPCA.2010.5416641","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 53

Abstract

This paper introduces a cost effective cache architecture called Enhanced Shared-Private Non-Uniform Cache Architecture (ESP-NUCA), which is suitable for highperformance Chip MultiProcessors (CMPs). This architecture enhances system stability by combining the advantages of private and shared caches. Starting from a shared NUCA, ESP-NUCA introduces a low-cost mechanism to dynamically allocate private cache blocks closer to their owner processor. In this way, average on-chip access latency is reduced and inter-core interference minimized. ESP-NUCA synergistically integrates victims and replicas thus making it possible to take advantage of multiple-readers for shared data, and to maximize cache usage under unbalanced core utilization. This architecture leads to stable behavior within the whole system across a broad spectrum of working scenarios. ESP-NUCA not only outperforms architectures with similar implementation costs such as private and shared caches by up to 20% and 40% respectively, but even outperforms much costlier architectures such as D-NUCA [13] by up to 28%, Adaptive Selective Replication [3] by up to 19%, and Cooperative Caching [5] by up to 15%. Moreover, performance variance throughout the set of benchmarks is 37% lower than with ASR, 87% lower than with D-NUCA, and 43% lower than with Cooperative Caching.
ESP-NUCA:一种低成本自适应非统一缓存架构
本文介绍了一种适用于高性能芯片多处理器(cmp)的经济高效的增强型共享-私有非统一缓存体系结构(ESP-NUCA)。这种架构结合了私有和共享缓存的优点,增强了系统的稳定性。从共享的NUCA开始,ESP-NUCA引入了一种低成本的机制来动态分配更靠近其所有者处理器的私有缓存块。通过这种方式,平均片上访问延迟减少,核间干扰最小化。ESP-NUCA协同集成了受害者和副本,从而可以利用多个读取器共享数据,并在不平衡的核心利用率下最大化缓存使用。这种体系结构导致整个系统在广泛的工作场景范围内的稳定行为。ESP-NUCA不仅比具有类似实现成本的架构(如私有和共享缓存)分别高出20%和40%,而且甚至比成本更高的架构(如D-NUCA[13])高出28%,自适应选择性复制[3]高出19%,合作缓存[5]高出15%。此外,整个基准测试集的性能差异比ASR低37%,比D-NUCA低87%,比协同缓存低43%。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信