HOMP: Automated Distribution of Parallel Loops and Data in Highly Parallel Accelerator-Based Systems

2017 IEEE International Parallel and Distributed Processing Symposium (IPDPS). Pub Date: 2017-05-01. DOI: 10.1109/IPDPS.2017.99

Yonghong Yan, Jiawen Liu, K. Cameron, M. Umar


Citations: 6

Abstract

Heterogeneous computing systems, e.g., those that pair host CPUs with accelerators, offer accelerated performance for a variety of workloads. However, most parallel programming models require platform-dependent, time-consuming hand-tuning to use all the resources in a system collectively and efficiently. In this work, we explore the use of OpenMP parallel language extensions that let users design applications that automatically and simultaneously leverage CPUs and accelerators, further optimizing the use of available resources. We believe such automation will be key to ensuring that codes adapt to increases in the number and diversity of accelerator resources in future computing systems. The proposed system combines language extensions to OpenMP, load-balancing algorithms and heuristics, and a runtime system for loop distribution across heterogeneous processing elements. We demonstrate the effectiveness of our automated approach for programming systems with multiple CPUs, GPUs, and MICs.
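To make the idea of loop distribution across heterogeneous processing elements concrete, the following is a minimal sketch in C of one simple load-balancing heuristic: splitting an iteration range into contiguous chunks sized in proportion to a per-device weight. It is not HOMP's algorithm or syntax, which the abstract does not specify. The names `chunk_t` and `split_iterations`, and the notion of a relative-throughput weight per device, are assumptions introduced only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Half-open iteration range [begin, end) assigned to one device. */
typedef struct {
    size_t begin;
    size_t end;
} chunk_t;

/*
 * Split n loop iterations across ndev devices in proportion to weight[d].
 * The weights are hypothetical relative throughputs (for example, measured
 * in a profiling run); they are an assumption of this sketch, not part of
 * the HOMP paper's abstract.
 * Returns 0 on success and -1 on invalid arguments.
 */
int split_iterations(size_t n, const double *weight, size_t ndev, chunk_t *out)
{
    if (ndev == 0 || weight == NULL || out == NULL)
        return -1;

    double total = 0.0;
    for (size_t d = 0; d < ndev; d++) {
        if (weight[d] < 0.0)
            return -1;
        total += weight[d];
    }
    if (total <= 0.0)
        return -1;

    size_t begin = 0;
    double cumulative = 0.0;
    for (size_t d = 0; d < ndev; d++) {
        cumulative += weight[d];
        /* The last device takes the remainder so no iteration is lost
           to floating-point rounding. */
        size_t end = (d == ndev - 1)
                         ? n
                         : (size_t)((double)n * (cumulative / total));
        if (end > n)
            end = n;
        if (end < begin)
            end = begin;
        out[d].begin = begin;
        out[d].end = end;
        begin = end;
    }

    /* Invariant: the chunks cover [0, n) exactly once. */
    assert(begin == n);
    return 0;
}
```

Each device would then execute only its own `[begin, end)` slice of the loop. A static split like this only balances well when the weights reflect real device speed for the given loop; a runtime system can instead adjust chunk sizes dynamically as devices finish work.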

