Jithin Jose, S. Potluri, H. Subramoni, Xiaoyi Lu, Khaled Hamidouche, K. Schulz, H. Sundar, D. Panda
International Conference on Partitioned Global Address Space Programming Models, 2014-10-06. DOI: 10.1145/2676870.2676880 (https://doi.org/10.1145/2676870.2676880)
Designing Scalable Out-of-core Sorting with Hybrid MPI+PGAS Programming Models
While Hadoop holds the current Sort Benchmark record, previous research has shown that MPI-based solutions can deliver similar performance. However, most existing MPI-based designs rely on two-sided communication semantics. The emerging Partitioned Global Address Space (PGAS) programming model presents a flexible way to express parallelism for data-intensive applications. However, not all portions of data analytics applications are amenable to conversion using PGAS models. In this study, we propose a novel design of the out-of-core, k-way parallel sort algorithm that takes advantage of the features of both the MPI and OpenSHMEM PGAS models. To the best of our knowledge, this is the first design of any data-intensive computing application using Hybrid MPI + PGAS models. Our experimental evaluation indicates that our proposed framework outperforms the existing MPI-based design by up to 45% at 8,192 processes. It also achieves a 7X improvement over Hadoop-based sort using the same amount of resources at 1,024 cores.
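For readers unfamiliar with the term, the out-of-core, k-way merge idea mentioned in the abstract can be illustrated with a minimal single-process sketch. This is not the authors' design: it uses no MPI or OpenSHMEM, performs no inter-process data exchange, and all function names (`write_sorted_runs`, `read_run`, `k_way_merge`, `external_sort`) are hypothetical. It assumes integer records and, for brevity, takes its input as an in-memory list, spilling sorted runs to temporary files and then merging them lazily.

```python
import heapq
import os
import tempfile


def write_sorted_runs(values, run_size, directory):
    """Split `values` into chunks of `run_size`, sort each chunk, spill it to disk."""
    paths = []
    for start in range(0, len(values), run_size):
        run = sorted(values[start:start + run_size])
        path = os.path.join(directory, f"run_{len(paths)}.txt")
        with open(path, "w") as f:
            f.writelines(f"{v}\n" for v in run)
        paths.append(path)
    return paths


def read_run(path):
    """Stream one sorted run from disk, one integer per line."""
    with open(path) as f:
        for line in f:
            yield int(line)


def k_way_merge(paths):
    """Lazily merge k sorted runs; roughly one record per run is held at a time
    (plus whatever the file objects buffer)."""
    return heapq.merge(*(read_run(p) for p in paths))


def external_sort(values, run_size):
    """Illustrative driver: create sorted runs on disk, then k-way merge them."""
    with tempfile.TemporaryDirectory() as directory:
        paths = write_sorted_runs(values, run_size, directory)
        # Consume the merge before the temporary directory is removed.
        return list(k_way_merge(paths))
```

Calling `external_sort([5, 3, 9, 1, 7, 2, 8], run_size=3)` writes three sorted runs and merges them into one sorted list. In a distributed setting the runs would live on different processes and the merge inputs would arrive over the network, which this sketch leaves out entirely.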