Fine Grain Cache Partitioning Using Per-Instruction Working Blocks

Jason Jong Kyu Park, Yongjun Park, S. Mahlke
{"title":"Fine Grain Cache Partitioning Using Per-Instruction Working Blocks","authors":"Jason Jong Kyu Park, Yongjun Park, S. Mahlke","doi":"10.1109/PACT.2015.11","DOIUrl":null,"url":null,"abstract":"A traditional least-recently used (LRU) cache replacement policy fails to achieve the performance of the optimal replacement policy when cache blocks with diverse reuse characteristics interfere with each other. When multiple applications share a cache, it is often partitioned among the applications because cache blocks show similar reuse characteristics within each application. In this paper, we extend the idea to a single application by viewing a cache as a shared resource between individual memory instructions. To that end, we propose Instruction-based LRU (ILRU), a fine grain cache partitioning that way-partitions individual cache sets based on per-instruction working blocks, which are cache blocks required by an instruction to satisfy all the reuses within a set. In ILRU, a memory instruction steals a block from another only when it requires more blocks than it currently has. Otherwise, a memory instruction victimizes among the cache blocks inserted by itself. Experiments show that ILRU can improve the cache performance in all levels of cache, reducing the number of misses by an average of 7.0% for L1, 9.1% for L2, and 8.7% for L3, which results in a geometric mean performance improvement of 5.3%. ILRU for a three-level cache hierarchy imposes a modest 1.3% storage overhead over the total cache size.","PeriodicalId":385398,"journal":{"name":"2015 International Conference on Parallel Architecture and Compilation (PACT)","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2015-10-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"4","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2015 International Conference on Parallel Architecture and Compilation (PACT)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/PACT.2015.11","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 4

Abstract

A traditional least-recently used (LRU) cache replacement policy fails to achieve the performance of the optimal replacement policy when cache blocks with diverse reuse characteristics interfere with each other. When multiple applications share a cache, it is often partitioned among the applications because cache blocks show similar reuse characteristics within each application. In this paper, we extend the idea to a single application by viewing a cache as a shared resource between individual memory instructions. To that end, we propose Instruction-based LRU (ILRU), a fine grain cache partitioning that way-partitions individual cache sets based on per-instruction working blocks, which are cache blocks required by an instruction to satisfy all the reuses within a set. In ILRU, a memory instruction steals a block from another only when it requires more blocks than it currently has. Otherwise, a memory instruction victimizes among the cache blocks inserted by itself. Experiments show that ILRU can improve the cache performance in all levels of cache, reducing the number of misses by an average of 7.0% for L1, 9.1% for L2, and 8.7% for L3, which results in a geometric mean performance improvement of 5.3%. ILRU for a three-level cache hierarchy imposes a modest 1.3% storage overhead over the total cache size.
使用每指令工作块的细粒度缓存分区
当具有不同重用特征的缓存块相互干扰时,传统的LRU (least-recently used)缓存替换策略无法达到最优替换策略的性能。当多个应用程序共享一个缓存时,它通常在应用程序之间进行分区,因为缓存块在每个应用程序中显示相似的重用特征。在本文中,我们通过将缓存视为单个内存指令之间的共享资源,将该思想扩展到单个应用程序。为此,我们提出了基于指令的LRU (ILRU),这是一种细粒度缓存分区,它基于每条指令的工作块对单个缓存集进行分区,这些工作块是指令满足集合内所有重用所需的缓存块。在ILRU中,只有当内存指令需要比当前拥有更多的块时,它才会从另一个内存指令那里窃取一个块。否则,内存指令会在自己插入的缓存块中受害。实验表明,ILRU可以在所有级别的缓存中提高缓存性能,平均减少了L1的7.0%,L2的9.1%和L3的8.7%的失误次数,从而使几何平均性能提高了5.3%。对于三层缓存层次结构,ILRU在总缓存大小上施加了适度的1.3%的存储开销。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
文献相关原料
公司名称 产品信息 采购帮参考价格
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信