芯片多处理器的自适应共享/私有NUCA缓存分区方案

2007 IEEE 13th International Symposium on High Performance Computer Architecture Pub Date : 2007-02-10 DOI:10.1109/HPCA.2007.346180

H. Dybdahl, P. Stenström

{"title":"芯片多处理器的自适应共享/私有NUCA缓存分区方案","authors":"H. Dybdahl, P. Stenström","doi":"10.1109/HPCA.2007.346180","DOIUrl":null,"url":null,"abstract":"The significant speed-gap between processor and memory and the limited chip memory bandwidth make last-level cache performance crucial for future chip multiprocessors. To use the capacity of shared last-level caches efficiently and to allow for a short access time, proposed non-uniform cache architectures (NUCAs) are organized into per-core partitions. If a core runs out of cache space, blocks are typically relocated to nearby partitions, thus managing the cache as a shared cache. This uncontrolled sharing of all resources may unfortunately result in pollution that degrades performance. We propose a novel non-uniform cache architecture in which the amount of cache space that can be shared among the cores is controlled dynamically. The adaptive scheme estimates, continuously, the effect of increasing/decreasing the shared partition size on the overall performance. We show that our scheme outperforms a private and shared cache organization as well as a hybrid NUCA organization in which blocks in a local partition can spill over to neighbor core partitions","PeriodicalId":177324,"journal":{"name":"2007 IEEE 13th International Symposium on High Performance Computer Architecture","volume":"32 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2007-02-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"140","resultStr":"{\"title\":\"An Adaptive Shared/Private NUCA Cache Partitioning Scheme for Chip Multiprocessors\",\"authors\":\"H. Dybdahl, P. Stenström\",\"doi\":\"10.1109/HPCA.2007.346180\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The significant speed-gap between processor and memory and the limited chip memory bandwidth make last-level cache performance crucial for future chip multiprocessors. To use the capacity of shared last-level caches efficiently and to allow for a short access time, proposed non-uniform cache architectures (NUCAs) are organized into per-core partitions. If a core runs out of cache space, blocks are typically relocated to nearby partitions, thus managing the cache as a shared cache. This uncontrolled sharing of all resources may unfortunately result in pollution that degrades performance. We propose a novel non-uniform cache architecture in which the amount of cache space that can be shared among the cores is controlled dynamically. The adaptive scheme estimates, continuously, the effect of increasing/decreasing the shared partition size on the overall performance. We show that our scheme outperforms a private and shared cache organization as well as a hybrid NUCA organization in which blocks in a local partition can spill over to neighbor core partitions\",\"PeriodicalId\":177324,\"journal\":{\"name\":\"2007 IEEE 13th International Symposium on High Performance Computer Architecture\",\"volume\":\"32 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2007-02-10\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"140\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2007 IEEE 13th International Symposium on High Performance Computer Architecture\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/HPCA.2007.346180\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2007 IEEE 13th International Symposium on High Performance Computer Architecture","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/HPCA.2007.346180","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 140

摘要

处理器和存储器之间的显著速度差距以及有限的芯片存储器带宽使得最后一级缓存性能对未来的芯片多处理器至关重要。为了有效地使用共享的最后一级缓存的容量并允许较短的访问时间，建议将非统一缓存架构(nuca)组织到每核分区中。如果一个核心耗尽了缓存空间，块通常会被重新定位到附近的分区，从而将缓存作为共享缓存进行管理。不幸的是，这种不受控制的资源共享可能导致污染，从而降低性能。我们提出了一种新的非统一缓存架构，其中可以在内核之间共享的缓存空间量是动态控制的。自适应方案持续地估计增加/减少共享分区大小对整体性能的影响。我们表明，我们的方案优于私有和共享缓存组织以及混合NUCA组织，在混合NUCA组织中，本地分区中的块可以溢出到相邻的核心分区

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

An Adaptive Shared/Private NUCA Cache Partitioning Scheme for Chip Multiprocessors

The significant speed-gap between processor and memory and the limited chip memory bandwidth make last-level cache performance crucial for future chip multiprocessors. To use the capacity of shared last-level caches efficiently and to allow for a short access time, proposed non-uniform cache architectures (NUCAs) are organized into per-core partitions. If a core runs out of cache space, blocks are typically relocated to nearby partitions, thus managing the cache as a shared cache. This uncontrolled sharing of all resources may unfortunately result in pollution that degrades performance. We propose a novel non-uniform cache architecture in which the amount of cache space that can be shared among the cores is controlled dynamically. The adaptive scheme estimates, continuously, the effect of increasing/decreasing the shared partition size on the overall performance. We show that our scheme outperforms a private and shared cache organization as well as a hybrid NUCA organization in which blocks in a local partition can spill over to neighbor core partitions

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

2007 IEEE 13th International Symposium on High Performance Computer Architecture

自引率

0.00%

发文量