Cache-Poll: Containing Pollution in Non-Inclusive Caches Through Cache Partitioning

Proceedings of the 51st International Conference on Parallel Processing. Pub Date: 2022-08-29. DOI: 10.1145/3545008.3545083

Lucía Pons, J. Sahuquillo, S. Petit, Julio Pons



Abstract

Current server processors have redistributed cache hierarchy space relative to previous generations. The private L2 cache has been made larger and the shared last-level cache (LLC) smaller, with the LLC designed as non-inclusive to reduce the number of replicated blocks. As a result, the new organization shrinks the per-core cache area. Cache management in this organization becomes more critical than in inclusive caches for two main reasons: there is less storage capacity per core, both in the L3 alone and when considering the sum of the L2 and L3 cache sizes, and there is higher L2-L3 traffic, especially when running highly cache-demanding applications. This paper focuses on minimizing L3 cache pollution to make more efficient use of the limited space. Three main types of pollution are identified and measured: useless prefetches, badly speculated loads, and poor locality. This paper proposes Cache-Poll, a pollution-aware management policy that concentrates on limiting the cache space available to polluting and L3-insensitive applications, allowing critical applications to occupy more space. Unlike state-of-the-art work on non-inclusive caches, Cache-Poll is able to improve performance on an Intel Xeon Scalable processor even when running heavily cache-demanding 12-application workloads, that is, with as many applications as cores in the processor. Results show that Cache-Poll improves fairness and turnaround time by 44% and 24%, respectively, over the Linux OS, while also improving performance by up to 3.5%.
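The general idea of confining polluting and L3-insensitive applications to a small cache partition can be illustrated with a minimal sketch. This is not the authors' algorithm: the metrics, thresholds, way counts, and the names `AppStats`, `is_restricted`, and `assign_masks` are all hypothetical and chosen only for illustration. The sketch assumes a way-partitioning mechanism that takes a contiguous bitmask of cache ways per application.

```python
from dataclasses import dataclass

NUM_WAYS = 11        # assumption: number of LLC ways available for partitioning
RESTRICTED_WAYS = 2  # assumption: ways shared by all restricted applications


@dataclass
class AppStats:
    name: str
    useless_prefetch_ratio: float  # hypothetical metric in [0, 1]
    l3_hit_ratio: float            # hypothetical metric in [0, 1]
    ipc_gain_with_more_l3: float   # hypothetical L3-sensitivity estimate


def is_restricted(app, prefetch_thr=0.5, hit_thr=0.1, gain_thr=0.02):
    """Return True if the app looks polluting or L3-insensitive.

    All thresholds are illustrative assumptions, not values from the paper.
    """
    polluting = (app.useless_prefetch_ratio > prefetch_thr
                 or app.l3_hit_ratio < hit_thr)
    insensitive = app.ipc_gain_with_more_l3 < gain_thr
    return polluting or insensitive


def contiguous_mask(first_way, n_ways):
    """Build a bitmask with n_ways consecutive bits set, starting at first_way."""
    return ((1 << n_ways) - 1) << first_way


def assign_masks(apps):
    """Give restricted apps a small shared partition; others may use all ways."""
    restricted_mask = contiguous_mask(0, RESTRICTED_WAYS)
    full_mask = contiguous_mask(0, NUM_WAYS)
    return {a.name: restricted_mask if is_restricted(a) else full_mask
            for a in apps}


if __name__ == "__main__":
    apps = [
        AppStats("streamer", 0.8, 0.05, 0.00),  # many useless prefetches
        AppStats("critical", 0.1, 0.60, 0.20),  # benefits from L3 space
    ]
    for name, mask in assign_masks(apps).items():
        print(f"{name}: {mask:#05x}")
```

In this sketch, restricted applications share two ways while the remaining applications keep access to the whole cache, so pollution is confined without reserving space that critical applications could otherwise use. A disjoint split would be an equally plausible design choice.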

