Online scalability characterization of data-parallel programs on many cores

Younghyun Cho, S. Oh, Bernhard Egger
Published in: 2016 International Conference on Parallel Architecture and Compilation Techniques (PACT)
Publication date: 2016-09-11
DOI: 10.1145/2967938.2967960 (https://doi.org/10.1145/2967938.2967960)
Citations: 9

Abstract

We present an accurate online scalability prediction model for data-parallel programs on NUMA many-core systems. Memory contention is considered to be the major limiting factor of program scalability, because data parallelism limits the amount of synchronization and the number of data dependencies between parallel work units. Reflecting the architecture of NUMA systems, contention is modeled at the last-level caches of the compute nodes and at the memory nodes, using a two-level queuing model to estimate the mean service time of the individual memory nodes. Scalability predictions for individual or co-located parallel applications are based solely on data obtained during a short sampling period at runtime; this allows the presented model to be employed in a variety of scenarios. The proposed model has been implemented in an open-source OpenCL runtime and in the GNU OpenMP runtime, and evaluated on a 64-core AMD system. For a wide variety of parallel workloads and configurations, the evaluations show that the model is able to predict the scalability of data-parallel kernels with high accuracy.
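The abstract does not give the model's equations, so the following is only a rough illustration of how a queuing model can turn sampled measurements into a scalability estimate. It is not the authors' two-level model: it collapses everything into a single memory queue and uses textbook Mean Value Analysis for a closed network. The function names and the two inputs (`think_time`, the average compute time between memory requests, and `service_time`, the average time the memory node needs per request) are assumptions made for this sketch; in an online setting they would be estimated during the sampling period.

```python
def mva_throughput(n_cores, think_time, service_time):
    """Memory-request throughput of n_cores sharing one memory queue.

    Exact Mean Value Analysis for a closed network with one queueing
    station (the memory node) and one delay station (the compute phase).
    This single-queue setup is a simplifying assumption of the sketch.
    """
    queue_len = 0.0   # mean queue length seen with one core fewer
    throughput = 0.0
    for n in range(1, n_cores + 1):
        # An arriving request waits for the requests already queued.
        response = service_time * (1.0 + queue_len)
        throughput = n / (think_time + response)
        queue_len = throughput * response  # Little's law
    return throughput


def predicted_speedup(n_cores, think_time, service_time):
    """Estimated speedup over one core, assuming a fixed request count."""
    base = mva_throughput(1, think_time, service_time)
    return mva_throughput(n_cores, think_time, service_time) / base


if __name__ == "__main__":
    # Hypothetical sampled values, in arbitrary time units.
    for cores in (1, 4, 16, 64):
        print(cores, round(predicted_speedup(cores, 1.0, 0.5), 3))
```

In this simplified setting the predicted speedup grows almost linearly while the memory queue is lightly loaded and then flattens toward `(think_time + service_time) / service_time`, which is the qualitative behavior one expects when memory contention limits scalability.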