Analyzing the impact of programming models for efficient communication overlap in high-speed networks

2014 International Conference on High Performance Computing & Simulation (HPCS), Pub Date: 2014-07-21, DOI: 10.1109/HPCSim.2014.6903689

G. Utrera, Marisa Gil, X. Martorell

Pages: 218-225

Citations: 3

Abstract



Exascale applications in civil engineering, simulation, and other active research fields make intensive use of large sparse matrices. With these matrices it is difficult to balance communication and computation, so even when the two phases are overlapped the application does not scale well overall and instead loses performance. Several proposals address this drawback with hybrid programming models, using MPI for communication and threads for computation (mainly OpenMP, but also Cilk, CUDA, or OpenCL, to adapt to new heterogeneous platforms). In this work, we evaluate the impact of providing task-based parallelism instead of fork-join parallelism. On the communication side, the appearance of faster networks with specific optimizations and internal protocol characteristics makes it worthwhile to analyze and evaluate how these networks influence execution performance. We evaluate our results on two communication networks: 10 Gigabit Ethernet and InfiniBand. For the evaluation we run the miniFE mini-application from the Mantevo benchmark suite on a homogeneous supercomputer platform based on Intel Sandy Bridge processors. Experimental results show how network behavior can affect performance and how it can be managed with task-based models: relative to a hybrid MPI/OpenMP version that overlaps communication and computation, our task-based MPI/OmpSs proposal obtains up to 60% improvement.
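
The overlap of communication and computation mentioned in the abstract can be illustrated with a small sketch. This is a hypothetical Python illustration, not the authors' MPI/OmpSs code: the halo exchange is simulated by a function that sleeps and returns a neighbour's values, and a thread pool stands in for a task runtime. The names `exchange_halo`, `row_dot`, and `spmv_overlapped`, and the dictionary-based sparse row format, are assumptions made for the example. The idea shown is a sparse matrix-vector product in which the communication runs as a task, rows that need only local data are computed in the meantime, and rows that depend on remote data wait for that task.

```python
from concurrent.futures import ThreadPoolExecutor
import time


def exchange_halo(remote_values, delay=0.01):
    """Simulated halo exchange (stand-in for real message passing).

    Sleeps to mimic network latency, then returns the neighbour's values.
    """
    time.sleep(delay)
    return list(remote_values)


def row_dot(row, x):
    """Dot product of one sparse row, stored as {column: value}, with x."""
    return sum(value * x[col] for col, value in row.items())


def spmv_overlapped(rows, x_local, remote_values):
    """Compute y = A * x while the simulated halo exchange is in flight.

    Columns 0..len(x_local)-1 refer to local entries of x; higher column
    indices refer to halo entries owned by a neighbour.
    """
    n_local = len(x_local)

    # Split rows into those needing only local data and those needing halo data.
    interior, boundary = [], []
    for i, row in enumerate(rows):
        (interior if all(c < n_local for c in row) else boundary).append(i)

    y = [0.0] * len(rows)
    with ThreadPoolExecutor(max_workers=1) as pool:
        # Communication task: starts immediately and runs in the background.
        halo_task = pool.submit(exchange_halo, remote_values)

        # Overlapped computation: interior rows do not depend on the halo.
        for i in interior:
            y[i] = row_dot(rows[i], x_local)

        # Dependency point: boundary rows can start only once the halo arrives.
        x_full = list(x_local) + halo_task.result()
        for i in boundary:
            y[i] = row_dot(rows[i], x_full)
    return y
```

For example, with three local rows of a tridiagonal matrix, `x_local = [1.0, 0.0, 2.0]`, and one halo value `[5.0]`, only the last row touches column 3 and therefore waits for the exchange. In a fork-join version the exchange would have to complete before or after the whole parallel loop; expressing it as a task with a data dependency lets a runtime schedule the independent rows around it.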

