Authors: Mariana Carmin, L. A. Ensina, M. Alves
Venue: 2022 XII Brazilian Symposium on Computing Systems Engineering (SBESC)
Published: 2022-11-21
DOI: 10.1109/SBESC56799.2022.9965178 (https://doi.org/10.1109/SBESC56799.2022.9965178)
Citations: 0
Comparison of Different Adaptable Cache Bypassing Approaches
Most modern microprocessors rely on a deep cache hierarchy to hide the latency of main-memory accesses. As the number of cores increases, the shared Last-Level Cache (LLC) also grows and consumes a large portion of the chip's total power and area. For applications with poor temporal and spatial locality, however, the same cache hierarchy can add latency rather than hide it. Sophisticated mechanisms are therefore needed to use cache resources efficiently and mitigate these problems. In this scenario, an adaptive cache mechanism can benefit such applications, improving overall system performance and reducing energy consumption. When multiple programs run concurrently, adapting the use of the LLC for each application avoids cache conflicts and cache pollution, further improving system performance. In this paper, we assess two approaches, based on regression and classification models, that adapt the use of the LLC at run-time, both using hardware counters. Evaluating the efficiency and overhead of each model on SPEC CPU 2006 and 2017, we observe better performance for the classification model based on the Random Forest algorithm, for both single-program and multi-program workloads.
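To make the classification idea concrete, the following is a minimal sketch of a counter-driven bypass decision. It is not the authors' model: it uses a toy ensemble of bagged decision stumps as a much-simplified stand-in for a Random Forest (no feature subsampling, depth-1 trees). The two features (`llc_miss_rate`, `llc_accesses_per_kilo_instr`), the synthetic labeling rule, and all function names are assumptions introduced only for illustration.

```python
import random
from collections import Counter

# ASSUMPTION: each sample is one execution interval summarized by two
# hypothetical counter-derived features:
#   x = (llc_miss_rate, llc_accesses_per_kilo_instr)
# Label 1 means "bypass the LLC", label 0 means "use the LLC".

def make_synthetic_samples(n, rng):
    """Synthetic data only: an interval is labeled 'bypass' when its
    LLC miss rate exceeds 0.7 (an arbitrary rule chosen for this sketch)."""
    samples = []
    for _ in range(n):
        miss_rate = rng.random()
        apki = rng.uniform(0.0, 50.0)
        samples.append(((miss_rate, apki), 1 if miss_rate > 0.7 else 0))
    return samples

def train_stump(samples):
    """Fit a depth-1 tree: pick the (feature, threshold, polarity) with the
    fewest misclassifications on the given samples."""
    best = None  # (errors, feature, threshold, label_above)
    for feature in range(len(samples[0][0])):
        for x, _ in samples:
            thr = x[feature]
            for label_above in (0, 1):
                errors = sum(
                    1 for xi, yi in samples
                    if (label_above if xi[feature] > thr else 1 - label_above) != yi
                )
                if best is None or errors < best[0]:
                    best = (errors, feature, thr, label_above)
    return best[1:]

def stump_predict(stump, x):
    feature, thr, label_above = stump
    return label_above if x[feature] > thr else 1 - label_above

def train_forest(samples, n_trees, rng):
    """Bagging: train each stump on a bootstrap resample of the data."""
    forest = []
    for _ in range(n_trees):
        bootstrap = [rng.choice(samples) for _ in samples]
        forest.append(train_stump(bootstrap))
    return forest

def should_bypass(forest, x):
    """Majority vote over the ensemble for one interval's counter features."""
    votes = Counter(stump_predict(stump, x) for stump in forest)
    return votes[1] > votes[0]

rng = random.Random(0)
forest = train_forest(make_synthetic_samples(120, rng), n_trees=11, rng=rng)
print(should_bypass(forest, (0.95, 10.0)))  # interval with a very high miss rate
print(should_bypass(forest, (0.10, 10.0)))  # interval with a low miss rate
```

In a real system the features would come from hardware performance counters sampled at run-time and the labels from measured performance with and without bypassing; the inference cost of the model would also count toward the overhead the abstract mentions.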