Locality vs. Balance: Exploring Data Mapping Policies on NUMA Systems

M. Diener, E. Cruz, P. Navaux
{"title":"Locality vs. Balance: Exploring Data Mapping Policies on NUMA Systems","authors":"M. Diener, E. Cruz, P. Navaux","doi":"10.1109/PDP.2015.11","DOIUrl":null,"url":null,"abstract":"In parallel architectures that have a Non-Uniform Memory Access (NUMA) behavior, the mapping of memory pages to NUMA nodes influences the performance of parallel applications. In order to improve traditional data mapping policies, two basic strategies can be employed: optimizing locality or balance of memory accesses. In a locality-based policy, memory pages are mapped to nodes that access the page the most. In a balance-based policy, memory pages are mapped such that the number of memory accesses resolved by each memory controller is similar. In this paper, we perform an in-depth exploration of these data mapping policies on the performance of parallel applications. We introduce metrics that describe their memory access behavior and evaluate their suitability for data mapping. We also present new mapping policies that focus on locality, balance or both. These policies were evaluated on three different NUMA architectures with applications from the NAS-OMP and PARSEC benchmark suites. Results show that the performance improvements of each policy depend on the characteristics of the applications and machines. Choosing the wrong policy can actually hurt the performance compared to the default first-touch mapping. Compared to traditional mapping policies and to policies that only focus on either locality or balance, taking into account both locality and balance results in the highest improvements. Furthermore, it avoids the performance reduction caused by the wrong data mapping.","PeriodicalId":285111,"journal":{"name":"2015 23rd Euromicro International Conference on Parallel, Distributed, and Network-Based Processing","volume":"35 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2015-03-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"29","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2015 23rd Euromicro International Conference on Parallel, Distributed, and Network-Based Processing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/PDP.2015.11","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 29

Abstract

In parallel architectures that have a Non-Uniform Memory Access (NUMA) behavior, the mapping of memory pages to NUMA nodes influences the performance of parallel applications. In order to improve traditional data mapping policies, two basic strategies can be employed: optimizing locality or balance of memory accesses. In a locality-based policy, memory pages are mapped to nodes that access the page the most. In a balance-based policy, memory pages are mapped such that the number of memory accesses resolved by each memory controller is similar. In this paper, we perform an in-depth exploration of these data mapping policies on the performance of parallel applications. We introduce metrics that describe their memory access behavior and evaluate their suitability for data mapping. We also present new mapping policies that focus on locality, balance or both. These policies were evaluated on three different NUMA architectures with applications from the NAS-OMP and PARSEC benchmark suites. Results show that the performance improvements of each policy depend on the characteristics of the applications and machines. Choosing the wrong policy can actually hurt the performance compared to the default first-touch mapping. Compared to traditional mapping policies and to policies that only focus on either locality or balance, taking into account both locality and balance results in the highest improvements. Furthermore, it avoids the performance reduction caused by the wrong data mapping.
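To make the two basic strategies in the abstract concrete, the following is a minimal sketch (not the authors' implementation) of a locality-based and a balance-based page-to-node assignment. It assumes a hypothetical per-page access table, accesses[page] = list of access counts per NUMA node; how such counts would be collected (e.g., by memory tracing) is outside the sketch, and the balance policy here is just a simple greedy heuristic used for illustration.

```python
# Sketch of the two basic data mapping strategies described in the abstract.
# Assumption: accesses maps each page to a list of per-node access counts.

def locality_policy(accesses):
    """Locality: map each page to the node that accesses it the most."""
    return {page: max(range(len(counts)), key=lambda n: counts[n])
            for page, counts in accesses.items()}

def balance_policy(accesses):
    """Balance: map pages so each node resolves a similar number of accesses.
    Greedy heuristic: place the most-accessed pages first, each on the
    currently least-loaded node."""
    num_nodes = len(next(iter(accesses.values())))
    load = [0] * num_nodes
    mapping = {}
    for page, counts in sorted(accesses.items(),
                               key=lambda kv: sum(kv[1]), reverse=True):
        node = min(range(num_nodes), key=lambda n: load[n])
        mapping[page] = node
        load[node] += sum(counts)
    return mapping

if __name__ == "__main__":
    # Toy example: 4 pages, 2 NUMA nodes.
    accesses = {0: [90, 10], 1: [80, 20], 2: [5, 15], 3: [60, 40]}
    print("locality:", locality_policy(accesses))
    print("balance: ", balance_policy(accesses))
```

In the toy example the locality policy places pages 0, 1 and 3 on node 0 (hurting balance), while the greedy balance policy spreads the load across both nodes at the cost of some remote accesses; the paper's point is that policies considering both aspects avoid the downside of either extreme.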