F. Qiao, B. Yu, Jianliang Ma, Tianzhou Chen, Tongsen Hu
Published in: 2011 Fourth International Conference on Intelligent Computation Technology and Automation
Publication date: 2011-03-28
DOI: 10.1109/ICICTA.2011.582 (https://doi.org/10.1109/ICICTA.2011.582)
Citations: 0
Abstract
SLRF: A High-efficiency Shared Less Reused Filter in Chip Multiprocessors
The Least Recently Used (LRU) policy is commonly employed to manage the shared L2 cache in chip multiprocessors. However, previous studies have shown that LRU has some deficiencies. In particular, LRU may perform considerably worse when the application's workload is larger than the L2 cache, because the cache then usually holds a large number of less reused lines, that is, lines that are never reused or are reused only a few times. Cache performance can be improved significantly if, for a time quantum, the cache keeps the lines that are not less reused rather than the less reused ones. This paper proposes a new architecture called the Shared Less Reused Filter (SLRF), which applies a less reused filter in the context of chip multiprocessors to filter out less reused lines rather than only never-reused lines. Our experiments on a set of 11 multithreaded SPLASH-2 benchmarks show that augmenting a 2 MB LRU-managed L2 cache with an SLRF that has a 256 KB filter buffer improves IPC by 13.43% compared with the context of the uniprocessor and reduces the average MPKI by 18.20%.
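The abstract does not describe how the filter is organized, so the following is only a toy sketch of the general idea of keeping less reused lines out of an LRU-managed cache, not the authors' SLRF design. It assumes a small filter buffer in front of the main cache, a hypothetical `threshold` parameter for the number of reuses a line needs before it is promoted, and that a hit in the filter counts as a cache hit. The class names, the `make_trace` workload, and all sizes are invented for illustration, and the trace is deliberately contrived (a few hot lines plus a stream of never-reused lines).

```python
from collections import OrderedDict


class LRUCache:
    """Fully associative cache with LRU replacement (capacity in lines)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()  # ordered from least to most recently used

    def lookup(self, addr):
        """Return True on a hit and refresh the line's recency."""
        if addr in self.lines:
            self.lines.move_to_end(addr)
            return True
        return False

    def insert(self, addr):
        """Insert a line, evicting the least recently used one if full."""
        if len(self.lines) >= self.capacity:
            self.lines.popitem(last=False)
        self.lines[addr] = True

    def access(self, addr):
        """Plain LRU behavior: every missing line is inserted."""
        if self.lookup(addr):
            return True
        self.insert(addr)
        return False


class FilteredCache:
    """LRU cache fronted by a small filter buffer (illustrative assumption).

    New lines enter the filter first and are promoted to the main cache
    only after being reused `threshold` times while still in the filter.
    """

    def __init__(self, capacity, filter_capacity, threshold=1):
        self.main = LRUCache(capacity)
        self.filter = OrderedDict()  # addr -> reuse count seen in the filter
        self.filter_capacity = filter_capacity
        self.threshold = threshold

    def access(self, addr):
        if self.main.lookup(addr):
            return True
        if addr in self.filter:
            count = self.filter.pop(addr) + 1
            if count >= self.threshold:
                self.main.insert(addr)     # reused enough: promote
            else:
                self.filter[addr] = count  # stay in filter, now most recent
            return True
        # Miss everywhere: the line goes to the filter, not the main cache.
        if len(self.filter) >= self.filter_capacity:
            self.filter.popitem(last=False)
        self.filter[addr] = 0
        return False


def make_trace(rounds=5, hot=4, scan=6):
    """Hot lines touched twice up front, then rounds of hot lines + a scan."""
    trace = []
    for h in range(hot):
        trace += [("hot", h), ("hot", h)]
    next_scan = 0
    for _ in range(rounds):
        trace += [("hot", h) for h in range(hot)]
        for _ in range(scan):
            trace.append(("scan", next_scan))  # each scan line is unique
            next_scan += 1
    return trace


def count_misses(cache, trace):
    return sum(0 if cache.access(addr) else 1 for addr in trace)


# Same total storage in both cases: 6 lines vs. 4 lines + 2-line filter.
lru_misses = count_misses(LRUCache(6), make_trace())
filtered_misses = count_misses(FilteredCache(4, 2), make_trace())
print(lru_misses, filtered_misses)  # prints "50 34"
```

On this trace the plain LRU cache lets each scan evict the hot lines, while the filtered cache keeps them resident because the never-reused scan lines never leave the filter. The numbers say nothing about the paper's reported IPC or MPKI results; they only illustrate why filtering less reused lines can help when the workload exceeds the cache.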