Efficient On-Demand Connection Management Mechanisms with PGAS Models over InfiniBand

Abhinav Vishnu, M. Krishnan
Published in: 2010 10th IEEE/ACM International Conference on Cluster, Cloud and Grid Computing, 2010-05-17
DOI: 10.1109/CCGRID.2010.58 (https://doi.org/10.1109/CCGRID.2010.58)
Citations: 13

Abstract

In the last decade or so, clusters have seen a tremendous rise in popularity due to their excellent price-to-performance ratio. A variety of interconnects have been proposed during this period, with InfiniBand leading the way due to its high performance and open standard. At the same time, multiple programming models have emerged to meet the requirements of various applications. To support the requirements of multiple programming models, InfiniBand provides multiple transport semantics, ranging from unreliable connectionless to reliable connected. Among them, the reliable connection (RC) semantics is widely used due to its high performance and support for novel features like Remote Direct Memory Access (RDMA), hardware atomics and network fault tolerance. However, the pairwise, connection-oriented nature of the RC transport semantics limits its scalability and usage at increasing processor counts. In this paper, we design and implement on-demand connection management approaches in the context of Partitioned Global Address Space (PGAS) programming models, which provide a shared memory abstraction and one-sided communication semantics and have led to the development of multiple languages (UPC, X10, Chapel) and libraries (Global Arrays, MPI-RMA). Using Global Arrays as the research vehicle, we implement this approach with the Aggregate Remote Memory Copy Interface (ARMCI), the runtime system of Global Arrays. We evaluate our approach, ARMCI-On Demand Connection Management (ARMCI-ODCM), using various microbenchmarks, benchmarks (LU Factorization, Random-Access and Lennard-Jones simulation) and an application (Subsurface Transport Over Multiple Phases (STOMP)). In a performance evaluation on up to 4096 processors, we achieve a multi-fold reduction in connection memory with negligible degradation in performance. With STOMP at 4096 processors, ARMCI-ODCM reduces the overall connection memory by a factor of 66 with no performance degradation. To the best of our knowledge, this is the first design, implementation and evaluation of on-demand connection management with InfiniBand using PGAS models.
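The general idea of on-demand connection management can be illustrated with a small sketch: a process creates a connection to a peer only when the first one-sided operation targets that peer, rather than connecting to every process at startup. The code below is a toy model under stated assumptions, not the ARMCI-ODCM implementation. The names `Connection`, `OnDemandConnectionTable` and `static_connection_count` are hypothetical, no InfiniBand calls are made, and the static baseline is assumed to hold one connection per remote process.

```python
from dataclasses import dataclass, field


@dataclass
class Connection:
    """Stand-in for a reliable-connection endpoint (hypothetical; no real network calls)."""
    local_rank: int
    remote_rank: int


@dataclass
class OnDemandConnectionTable:
    """Creates a connection to a peer only when the first operation targets it."""
    rank: int
    connections: dict = field(default_factory=dict)

    def get(self, remote_rank: int) -> Connection:
        conn = self.connections.get(remote_rank)
        if conn is None:
            # First communication with this peer: set up the connection now.
            conn = Connection(self.rank, remote_rank)
            self.connections[remote_rank] = conn
        return conn

    def put(self, remote_rank: int, data: bytes) -> int:
        """Pretend one-sided write; returns the number of bytes that would be sent."""
        self.get(remote_rank)
        return len(data)


def static_connection_count(num_procs: int) -> int:
    """Connections one process holds if it connects to every other process up front
    (an assumed baseline for comparison)."""
    return num_procs - 1


def demo() -> None:
    table = OnDemandConnectionTable(rank=0)
    # Rank 0 only ever talks to three distinct peers in this made-up access pattern.
    for peer in [1, 2, 1, 5]:
        table.put(peer, b"payload")
    print("on-demand connections:", len(table.connections))
    print("static connections at 4096 processes:", static_connection_count(4096))


if __name__ == "__main__":
    demo()
```

In this toy model the number of connections a process holds tracks the number of distinct peers it actually communicates with, which is why applications with sparse communication patterns can save connection memory. The access pattern in `demo` is invented for illustration and says nothing about the measured results reported in the abstract.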