Studying the impact of multicore processor scaling on directory techniques via reuse distance analysis

Minshu Zhao, D. Yeung
Published in: 2015 IEEE 21st International Symposium on High Performance Computer Architecture (HPCA), pp. 590-602
Publication date: 2015-02-01
DOI: 10.1109/HPCA.2015.7056065 (https://doi.org/10.1109/HPCA.2015.7056065)
Citations: 6

Abstract

Researchers have proposed numerous directory techniques to address multicore scalability, and the behavior of these techniques depends on the CPU's particular configuration, e.g., core count and cache size. As CPUs continue to scale, it is essential to explore the directory's architecture dependences. However, doing so with detailed simulation is challenging given the large number of possible CPU configurations. This paper proposes using multicore reuse distance (RD) analysis to study coherence directories. We develop a framework that extracts the directory access stream from parallel LRU stacks, enabling rapid analysis of the directory's accesses and contents across both core count and cache size scaling. We also implement our framework in a profiler and apply it to gain insights into the impact of multicore scaling on the directory. Our profiling results show that directory accesses decrease by 3.5x across data cache size scaling, suggesting that techniques which trade off access latency for reduced capacity or conflicts become increasingly effective as cache size scales. We also show that the portion of on-chip memory devoted to the directory cache can be reduced by 53.3% across data cache size scaling, thus lowering the over-provisioning needed at large cache sizes. Finally, we validate our RD-based directory analyses and find that they are within 13% of cache simulations in terms of access count, on average.
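
To illustrate the general idea behind reuse distance analysis mentioned in the abstract, here is a minimal sketch of how a single LRU stack yields miss counts for many cache sizes from one pass over a trace. It is a single-threaded simplification written for illustration only, not the authors' framework, which works on parallel LRU stacks and extracts a directory access stream. The names `reuse_distances` and `misses_for_capacity` and the toy trace are hypothetical.

```python
from typing import Hashable, Iterable, List, Optional


def reuse_distances(trace: Iterable[Hashable]) -> List[Optional[int]]:
    """Return the reuse distance of every access in `trace`.

    The reuse distance is the number of distinct blocks touched since the
    previous access to the same block; None marks a first-time (cold) access.
    """
    stack: List[Hashable] = []  # LRU stack, most recently used block at index 0
    distances: List[Optional[int]] = []
    for block in trace:
        if block in stack:
            depth = stack.index(block)  # stack depth equals the reuse distance
            stack.pop(depth)
            distances.append(depth)
        else:
            distances.append(None)
        stack.insert(0, block)  # the block becomes most recently used
    return distances


def misses_for_capacity(distances: List[Optional[int]], capacity: int) -> int:
    """Count misses in a fully associative LRU cache holding `capacity` blocks."""
    return sum(1 for d in distances if d is None or d >= capacity)


# Toy trace of block addresses (assumed data, not taken from the paper).
trace = ["A", "B", "C", "A", "B", "D", "A"]
rd = reuse_distances(trace)
# One profiling pass answers the miss count for every cache size of interest.
misses = {c: misses_for_capacity(rd, c) for c in (1, 2, 3, 4)}
print(rd)      # [None, None, None, 2, 2, None, 2]
print(misses)  # {1: 7, 2: 7, 3: 4, 4: 4}
```

In this simplified model, accesses that miss in the data cache are the ones that would travel further into the memory system, so a larger cache (here, 3 blocks instead of 2) filters out more of them. That is the intuition behind studying how directory traffic changes with cache size without re-simulating each configuration.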