{"title":"用于扩展MPI应用的HCA复用端点:用uDAPL进行设计和性能评估","authors":"Jasjit Singh, Yogeshwar Sonawane","doi":"10.1109/CLUSTER.2010.22","DOIUrl":null,"url":null,"abstract":"With an ever increasing demand for computing power, number of nodes to be deployed in a cluster based supercomputer is increasing. Limited hardware resources such as Endpoints (equivalent to Queue Pairs) on a Host Channel Adapter (HCA) of a high speed interconnect limit the scalability of a parallel application based on MPI that sets up reliable connections between every process pair using endpoints, prior to communication. In this paper, we propose a novel approach of multiplexing hardware endpoints (hweps) to extend scalability. (a) We discuss critical design issues with the multiplexing technique that differentiates a hwep from its software counterpart (swep) and enables sharing of hwep by multiple sweps. (b) We introduce the concept of Virtual Identifier (VID) which ensures that the connection between hardware endpoints is strictly one-to-one. (c) We also present static mapping scheme that offsets the overheads incurred due to multiplexing. User Direct Access Programming Library (uDAPL) defines a single set of APIs for all RDMA capable transports. We have incorporated the proposed multiplexing technique as a part of uDAPL implementation. Using this approach, we are able to scale MPI applications beyond the limit imposed by HCA and with no visible performance degradation.","PeriodicalId":152171,"journal":{"name":"2010 IEEE International Conference on Cluster Computing","volume":"16 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2010-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":"{\"title\":\"Multiplexing Endpoints of HCA for Scaling MPI Applications: Design and Performance Evaluation with uDAPL\",\"authors\":\"Jasjit Singh, Yogeshwar Sonawane\",\"doi\":\"10.1109/CLUSTER.2010.22\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"With an ever increasing demand for computing power, number of nodes to be deployed in a cluster based supercomputer is increasing. Limited hardware resources such as Endpoints (equivalent to Queue Pairs) on a Host Channel Adapter (HCA) of a high speed interconnect limit the scalability of a parallel application based on MPI that sets up reliable connections between every process pair using endpoints, prior to communication. In this paper, we propose a novel approach of multiplexing hardware endpoints (hweps) to extend scalability. (a) We discuss critical design issues with the multiplexing technique that differentiates a hwep from its software counterpart (swep) and enables sharing of hwep by multiple sweps. (b) We introduce the concept of Virtual Identifier (VID) which ensures that the connection between hardware endpoints is strictly one-to-one. (c) We also present static mapping scheme that offsets the overheads incurred due to multiplexing. User Direct Access Programming Library (uDAPL) defines a single set of APIs for all RDMA capable transports. We have incorporated the proposed multiplexing technique as a part of uDAPL implementation. Using this approach, we are able to scale MPI applications beyond the limit imposed by HCA and with no visible performance degradation.\",\"PeriodicalId\":152171,\"journal\":{\"name\":\"2010 IEEE International Conference on Cluster Computing\",\"volume\":\"16 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2010-09-20\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"2\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2010 IEEE International Conference on Cluster Computing\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/CLUSTER.2010.22\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2010 IEEE International Conference on Cluster Computing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CLUSTER.2010.22","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Multiplexing Endpoints of HCA for Scaling MPI Applications: Design and Performance Evaluation with uDAPL
With an ever increasing demand for computing power, number of nodes to be deployed in a cluster based supercomputer is increasing. Limited hardware resources such as Endpoints (equivalent to Queue Pairs) on a Host Channel Adapter (HCA) of a high speed interconnect limit the scalability of a parallel application based on MPI that sets up reliable connections between every process pair using endpoints, prior to communication. In this paper, we propose a novel approach of multiplexing hardware endpoints (hweps) to extend scalability. (a) We discuss critical design issues with the multiplexing technique that differentiates a hwep from its software counterpart (swep) and enables sharing of hwep by multiple sweps. (b) We introduce the concept of Virtual Identifier (VID) which ensures that the connection between hardware endpoints is strictly one-to-one. (c) We also present static mapping scheme that offsets the overheads incurred due to multiplexing. User Direct Access Programming Library (uDAPL) defines a single set of APIs for all RDMA capable transports. We have incorporated the proposed multiplexing technique as a part of uDAPL implementation. Using this approach, we are able to scale MPI applications beyond the limit imposed by HCA and with no visible performance degradation.