重用基于距离的性能建模和工作负载映射

Sai Prashanth Muralidhara, M. Kandemir, Orhan Kislal
{"title":"重用基于距离的性能建模和工作负载映射","authors":"Sai Prashanth Muralidhara, M. Kandemir, Orhan Kislal","doi":"10.1145/2212908.2212936","DOIUrl":null,"url":null,"abstract":"Modern multicore architectures have multiple cores connected to a hierarchical cache structure resulting in heterogeneity in cache sharing across different subsets of cores. In these systems, overall throughput and efficiency depends heavily on a careful mapping of applications to available cores. In this paper, we study the problem of application-to-core mapping with the goal of trying to improve the overall cache performance in the presence of a hierarchical multi-level cache structure. We propose to sample the memory access patterns of individual applications and build their reuse distance distributions. Further, we propose to use these reuse distance distributions to compute an application-to-core mapping that tries to improve the overall cache performance, and consequently, the overall throughput. We show that our proposed mapping scheme is very effective in practice yielding throughput benefits of about 39% over the worst case mapping and about 30% over the default operating system based mapping. We believe, as larger chip multiprocessors with deeper cache hierarchies are projected to be the norm in the future, efficient mapping of applications to cores will become a vital requirement to extract the maximum possible performance from these systems.","PeriodicalId":430420,"journal":{"name":"ACM International Conference on Computing Frontiers","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2012-05-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":"{\"title\":\"Reuse distance based performance modeling and workload mapping\",\"authors\":\"Sai Prashanth Muralidhara, M. Kandemir, Orhan Kislal\",\"doi\":\"10.1145/2212908.2212936\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Modern multicore architectures have multiple cores connected to a hierarchical cache structure resulting in heterogeneity in cache sharing across different subsets of cores. In these systems, overall throughput and efficiency depends heavily on a careful mapping of applications to available cores. In this paper, we study the problem of application-to-core mapping with the goal of trying to improve the overall cache performance in the presence of a hierarchical multi-level cache structure. We propose to sample the memory access patterns of individual applications and build their reuse distance distributions. Further, we propose to use these reuse distance distributions to compute an application-to-core mapping that tries to improve the overall cache performance, and consequently, the overall throughput. We show that our proposed mapping scheme is very effective in practice yielding throughput benefits of about 39% over the worst case mapping and about 30% over the default operating system based mapping. We believe, as larger chip multiprocessors with deeper cache hierarchies are projected to be the norm in the future, efficient mapping of applications to cores will become a vital requirement to extract the maximum possible performance from these systems.\",\"PeriodicalId\":430420,\"journal\":{\"name\":\"ACM International Conference on Computing Frontiers\",\"volume\":\"1 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2012-05-15\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"3\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"ACM International Conference on Computing Frontiers\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/2212908.2212936\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACM International Conference on Computing Frontiers","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2212908.2212936","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 3

摘要

现代多核体系结构将多个核连接到分层缓存结构,从而导致不同核子集之间缓存共享的异构性。在这些系统中,总体吞吐量和效率在很大程度上取决于应用程序到可用核心的仔细映射。在本文中,我们研究了应用到核心的映射问题,目的是试图在分层的多级缓存结构中提高整体缓存性能。我们建议对单个应用程序的内存访问模式进行采样,并构建它们的重用距离分布。此外,我们建议使用这些重用距离分布来计算应用程序到核心的映射,该映射试图提高总体缓存性能,从而提高总体吞吐量。我们表明,我们提出的映射方案在实践中非常有效,比最坏情况下的映射提高了39%的吞吐量,比基于默认操作系统的映射提高了30%。我们相信,随着具有更深缓存层次的更大的芯片多处理器预计将成为未来的标准,将应用程序有效地映射到核心将成为从这些系统中提取最大可能性能的重要要求。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Reuse distance based performance modeling and workload mapping
Modern multicore architectures have multiple cores connected to a hierarchical cache structure resulting in heterogeneity in cache sharing across different subsets of cores. In these systems, overall throughput and efficiency depends heavily on a careful mapping of applications to available cores. In this paper, we study the problem of application-to-core mapping with the goal of trying to improve the overall cache performance in the presence of a hierarchical multi-level cache structure. We propose to sample the memory access patterns of individual applications and build their reuse distance distributions. Further, we propose to use these reuse distance distributions to compute an application-to-core mapping that tries to improve the overall cache performance, and consequently, the overall throughput. We show that our proposed mapping scheme is very effective in practice yielding throughput benefits of about 39% over the worst case mapping and about 30% over the default operating system based mapping. We believe, as larger chip multiprocessors with deeper cache hierarchies are projected to be the norm in the future, efficient mapping of applications to cores will become a vital requirement to extract the maximum possible performance from these systems.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信