dCat: dynamic cache management for efficient, performance-sensitive infrastructure-as-a-service

Proceedings of the Thirteenth EuroSys Conference. Pub Date: 2018-04-23. DOI: 10.1145/3190508.3190555

Cong Xu, K. Rajamani, Alexandre Ferreira, Wes Felter, J. Rubio, Y. Li


Citations: 48

Abstract

dCat: dynamic cache management for efficient, performance-sensitive infrastructure-as-a-service

In the modern multi-tenant cloud, resource sharing increases utilization but causes performance interference between tenants. More generally, performance isolation is relevant in any multi-workload scenario involving shared resources. On x86 processors, the last-level cache (LLC) is shared by all CPU cores on a socket, so cloud tenants inevitably have their cache contents flushed by noisy neighbors running on the same socket. Intel Cache Allocation Technology (CAT) provides a mechanism to assign cache ways to cores to enable cache isolation, but its static configuration can leave cache underutilized when a workload cannot benefit from its allocated capacity, and can lead to sub-optimal performance for workloads whose assigned capacity is too small to fit their working set. In this work, we propose a new dynamic cache management technology (dCat) that provides strong cache isolation with better performance. For each workload, we target a consistent minimum performance bound that is independent of the other workloads on the socket and depends only on the workload's rightful share of the LLC capacity. In addition, when there is spare capacity on the socket, or when some workloads are not benefiting from their cache allocation, dCat dynamically reallocates cache space to cache-intensive workloads. We have implemented dCat in Linux on top of CAT to dynamically adjust cache mappings. dCat requires no modifications to applications, so it can be applied to all cloud workloads. In our evaluation, we see an average improvement of 25% over a shared cache and 15.7% over static CAT for selected memory-intensive SPEC CPU2006 workloads. For typical cloud workloads, Redis improves by 57.6% over a shared LLC and 26.6% over a static partition, and ElasticSearch improves by 11.9% over both.
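The reallocation idea in the abstract can be illustrated with a small sketch. This is not the authors' algorithm: the abstract does not say how dCat decides which workloads benefit from cache or how it sizes allocations. The `Workload` class, the `benefits` flag, and the functions `allocate_ways` and `contiguous_masks` are hypothetical names introduced here. The sketch assumes every workload has a baseline share measured in cache ways, that a workload not benefiting from its cache can be shrunk to a single way, and that the freed and spare ways are handed round-robin to the workloads that do benefit. The only hardware fact relied on is that CAT capacity bitmasks must be contiguous runs of set bits.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    name: str
    baseline_ways: int  # guaranteed share of the LLC, in cache ways
    benefits: bool      # hypothetical signal: does extra cache help it?


def allocate_ways(workloads, total_ways):
    """Return {name: ways} under the assumptions stated in the lead-in."""
    if any(w.baseline_ways < 1 for w in workloads):
        raise ValueError("every workload needs at least one way")
    reserved = sum(w.baseline_ways for w in workloads)
    if reserved > total_ways:
        raise ValueError("baseline shares exceed the cache")

    # Start from the baseline so every workload has its guaranteed share.
    alloc = {w.name: w.baseline_ways for w in workloads}
    receivers = [w for w in workloads if w.benefits]
    if not receivers:
        return alloc  # nobody can use extra cache; keep baselines

    # Pool = ways nobody reserved + ways donated by non-benefiting workloads.
    pool = total_ways - reserved
    for w in workloads:
        if not w.benefits:
            pool += w.baseline_ways - 1
            alloc[w.name] = 1  # a CAT mask cannot be empty

    # Hand out the pool one way at a time, round-robin over receivers.
    for i in range(pool):
        alloc[receivers[i % len(receivers)].name] += 1
    return alloc


def contiguous_masks(alloc):
    """Pack allocations into non-overlapping, contiguous way bitmasks."""
    masks, offset = {}, 0
    for name, ways in alloc.items():
        masks[name] = ((1 << ways) - 1) << offset
        offset += ways
    return masks
```

For example, with a 16-way cache and three workloads that each reserve 4 ways, where only the first and third benefit from cache, `allocate_ways` gives them 8 and 7 ways and leaves the second with 1. A real controller would rerun such a decision periodically and return the baseline share to a donor as soon as its behavior changes, which this sketch does not model.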
