Yuichiro Ajima, Takafumi Nose, K. Saga, Naoyuki Shida, S. Sumimoto
{"title":"ACPdl:瘦pgas层上的数据结构和全局内存分配器库","authors":"Yuichiro Ajima, Takafumi Nose, K. Saga, Naoyuki Shida, S. Sumimoto","doi":"10.1145/2832241.2832242","DOIUrl":null,"url":null,"abstract":"HPC systems comprise an increasing number of processor cores towards the exascale computing era. As the number of parallel processes on a system increases, the number of point-to-point connections for each process increases and the memory usage of connections becomes an issue. A new communication library called Advanced Communication Primitives (ACP) is being developed to address the issue by providing communication functions with the Partitioned Global Address Space (PGAS) model that is potentially connection-less. The ACP library is designed to underlie domain-specific languages or parallel language runtimes. The ACP basic layer (ACPbl) comprises a minimum set of functions to abstract interconnect devices and to provide an address translation mechanism. As far as using ACPbl, global address can be granted only to local memory. In this paper, a new set of functions called the ACP data library (ACPdl) including global memory allocator and data-structure library is introduced to improve the productivity of the ACP library. The global memory allocator allocates a memory region of a remote process and assigns global address to it without involving the remote process. The data-structure library uses the global memory allocator internally and provides functions to create, read, update and delete distributed data-structures. Evaluation results of global memory allocator and associative-array data-structure functions show that overhead between the main and communication threads may become a bottleneck when an implementation of ACPbl uses a low latency HPC-dedicated interconnect device.","PeriodicalId":347945,"journal":{"name":"ESPM '15","volume":"105 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2015-11-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"4","resultStr":"{\"title\":\"ACPdl: data-structure and global memory allocator library over a thin PGAS-layer\",\"authors\":\"Yuichiro Ajima, Takafumi Nose, K. Saga, Naoyuki Shida, S. Sumimoto\",\"doi\":\"10.1145/2832241.2832242\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"HPC systems comprise an increasing number of processor cores towards the exascale computing era. As the number of parallel processes on a system increases, the number of point-to-point connections for each process increases and the memory usage of connections becomes an issue. A new communication library called Advanced Communication Primitives (ACP) is being developed to address the issue by providing communication functions with the Partitioned Global Address Space (PGAS) model that is potentially connection-less. The ACP library is designed to underlie domain-specific languages or parallel language runtimes. The ACP basic layer (ACPbl) comprises a minimum set of functions to abstract interconnect devices and to provide an address translation mechanism. As far as using ACPbl, global address can be granted only to local memory. In this paper, a new set of functions called the ACP data library (ACPdl) including global memory allocator and data-structure library is introduced to improve the productivity of the ACP library. The global memory allocator allocates a memory region of a remote process and assigns global address to it without involving the remote process. The data-structure library uses the global memory allocator internally and provides functions to create, read, update and delete distributed data-structures. Evaluation results of global memory allocator and associative-array data-structure functions show that overhead between the main and communication threads may become a bottleneck when an implementation of ACPbl uses a low latency HPC-dedicated interconnect device.\",\"PeriodicalId\":347945,\"journal\":{\"name\":\"ESPM '15\",\"volume\":\"105 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2015-11-15\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"4\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"ESPM '15\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/2832241.2832242\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"ESPM '15","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2832241.2832242","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
ACPdl: data-structure and global memory allocator library over a thin PGAS-layer
HPC systems comprise an increasing number of processor cores towards the exascale computing era. As the number of parallel processes on a system increases, the number of point-to-point connections for each process increases and the memory usage of connections becomes an issue. A new communication library called Advanced Communication Primitives (ACP) is being developed to address the issue by providing communication functions with the Partitioned Global Address Space (PGAS) model that is potentially connection-less. The ACP library is designed to underlie domain-specific languages or parallel language runtimes. The ACP basic layer (ACPbl) comprises a minimum set of functions to abstract interconnect devices and to provide an address translation mechanism. As far as using ACPbl, global address can be granted only to local memory. In this paper, a new set of functions called the ACP data library (ACPdl) including global memory allocator and data-structure library is introduced to improve the productivity of the ACP library. The global memory allocator allocates a memory region of a remote process and assigns global address to it without involving the remote process. The data-structure library uses the global memory allocator internally and provides functions to create, read, update and delete distributed data-structures. Evaluation results of global memory allocator and associative-array data-structure functions show that overhead between the main and communication threads may become a bottleneck when an implementation of ACPbl uses a low latency HPC-dedicated interconnect device.