块大小和获取策略对性能的影响

S. Przybylski
{"title":"块大小和获取策略对性能的影响","authors":"S. Przybylski","doi":"10.1145/325164.325135","DOIUrl":null,"url":null,"abstract":"The interactions between a cache's block size, fetch size, and fetch policy from the perspective of maximizing system-level performance are explored. It has been previously noted that, given a simple fetch strategy, the performance optimal block size is almost always four or eight words. If there is even a small cycle time penalty associated with either longer blocks or fetches, then the performance optimal size is noticeably reduced. In split cache organizations, where the fetch and block sizes of instruction and data caches are all independent design variables, instruction cache block size and fetch size should be the same. For the workload and write-back write policy used in this trace-driven simulation study, the instruction cache block size should be about a factor of 2 greater than the data cache fetch size, which in turn should be equal to or double the data cache block size. The simplest fetch strategy of fetching only on a miss and stalling the CPU until the fetch is complete works well. Complicated fetch strategies do not produce the performance improvements indicated by the accompanying reductions in miss ratios because of limited memory resources and a strong temporal clustering of cache misses. For the environments simulated, the most effective fetch strategy improved performance by between 1.7% and 4.5% over the simplest strategy described above.<<ETX>>","PeriodicalId":297046,"journal":{"name":"[1990] Proceedings. The 17th Annual International Symposium on Computer Architecture","volume":"26 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"1990-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"108","resultStr":"{\"title\":\"The performance impact of block sizes and fetch strategies\",\"authors\":\"S. Przybylski\",\"doi\":\"10.1145/325164.325135\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The interactions between a cache's block size, fetch size, and fetch policy from the perspective of maximizing system-level performance are explored. It has been previously noted that, given a simple fetch strategy, the performance optimal block size is almost always four or eight words. If there is even a small cycle time penalty associated with either longer blocks or fetches, then the performance optimal size is noticeably reduced. In split cache organizations, where the fetch and block sizes of instruction and data caches are all independent design variables, instruction cache block size and fetch size should be the same. For the workload and write-back write policy used in this trace-driven simulation study, the instruction cache block size should be about a factor of 2 greater than the data cache fetch size, which in turn should be equal to or double the data cache block size. The simplest fetch strategy of fetching only on a miss and stalling the CPU until the fetch is complete works well. Complicated fetch strategies do not produce the performance improvements indicated by the accompanying reductions in miss ratios because of limited memory resources and a strong temporal clustering of cache misses. For the environments simulated, the most effective fetch strategy improved performance by between 1.7% and 4.5% over the simplest strategy described above.<<ETX>>\",\"PeriodicalId\":297046,\"journal\":{\"name\":\"[1990] Proceedings. The 17th Annual International Symposium on Computer Architecture\",\"volume\":\"26 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"1990-05-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"108\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"[1990] Proceedings. The 17th Annual International Symposium on Computer Architecture\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/325164.325135\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"[1990] Proceedings. The 17th Annual International Symposium on Computer Architecture","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/325164.325135","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 108

摘要

从最大化系统级性能的角度探讨了缓存的块大小、取大小和取策略之间的相互作用。前面已经注意到,给定一个简单的获取策略,性能最优的块大小几乎总是四个或八个字。如果与更长的块或读取相关的周期时间损失很小,那么性能最佳大小就会明显降低。在分割缓存组织中,指令和数据缓存的取值和块大小都是独立的设计变量,指令缓存块大小和取值大小应该相同。对于本跟踪驱动模拟研究中使用的工作负载和回写策略,指令缓存块大小应该是数据缓存读取大小的2倍左右,而数据缓存读取大小又应该等于或两倍于数据缓存块大小。最简单的取操作策略是只在未取的情况下取操作,并暂停CPU直到取操作完成。由于有限的内存资源和缓存缺失的强时间聚类,复杂的读取策略并不能产生相应的缺失率降低所表明的性能改进。对于模拟的环境,最有效的获取策略比上面描述的最简单的策略提高了1.7%到4.5%的性能。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
The performance impact of block sizes and fetch strategies
The interactions between a cache's block size, fetch size, and fetch policy from the perspective of maximizing system-level performance are explored. It has been previously noted that, given a simple fetch strategy, the performance optimal block size is almost always four or eight words. If there is even a small cycle time penalty associated with either longer blocks or fetches, then the performance optimal size is noticeably reduced. In split cache organizations, where the fetch and block sizes of instruction and data caches are all independent design variables, instruction cache block size and fetch size should be the same. For the workload and write-back write policy used in this trace-driven simulation study, the instruction cache block size should be about a factor of 2 greater than the data cache fetch size, which in turn should be equal to or double the data cache block size. The simplest fetch strategy of fetching only on a miss and stalling the CPU until the fetch is complete works well. Complicated fetch strategies do not produce the performance improvements indicated by the accompanying reductions in miss ratios because of limited memory resources and a strong temporal clustering of cache misses. For the environments simulated, the most effective fetch strategy improved performance by between 1.7% and 4.5% over the simplest strategy described above.<>
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信