Adaptive transport service selection for MPI with InfiniBand network
Masamichi Takagi, Norio Yamaguchi, Balazs Gerofi, A. Hori, Y. Ishikawa
Workshop on Exascale MPI, November 15, 2015. DOI: 10.1145/2831129.2831132

Abstract: We propose a method that adaptively selects the InfiniBand transport service used for each pair of source and destination peers, improving performance while limiting the memory consumption of the MPI library. Two major IB transport services are available, Reliable Connection (RC) and Dynamically Connected (DC), and one of them is chosen per source-destination pair. RC is faster than DC for all communication patterns except when there are many active RCs, but it consumes a large amount of memory when there are many processes. DC consumes less memory than RC, but its performance drops when a peer sends messages to many different destinations or when many DCs send messages to the same destination DC. The library should therefore find the best mapping of RCs and DCs to source-destination pairs according to the application's communication pattern. Our method finds a good mapping by comparing the potential latency benefits of candidate mappings. It achieves a 13%-19% latency reduction compared to a DC-only configuration in micro-benchmarks representing communication patterns that are problematic for RC or DC with 64 processes.
Practical resilient cases for FA-MPI, a transactional fault-tolerant MPI
Amin Hassani, A. Skjellum, P. Bangalore, R. Brightwell
Workshop on Exascale MPI, November 15, 2015. DOI: 10.1145/2831129.2831130

Abstract: MPI is insufficient when confronting failures. FA-MPI (Fault-Aware MPI) provides extensions to the MPI standard designed to enable data-parallel applications to achieve resilience without sacrificing scalability. FA-MPI introduces transactions as a novel extension to the MPI message-passing model; transactions support failure detection, isolation, mitigation, and recovery via application-driven policies. Because overlapping communication and I/O with computation through non-blocking operations is increasingly important for extracting the full performance of modern machines, we emphasize fault-tolerant, non-blocking communication operations plus a set of nestable, lightweight transactional TryBlock API extensions able to exploit system and application hierarchy. This strategy enables applications to run to completion with higher probability than they otherwise would. We modified two proxy applications, MiniFE and LULESH, by adding FA-MPI semantics to them, and we present performance and overhead results for 1K MPI processes.
A data streaming model in MPI
I. Peng, S. Markidis, E. Laure, Daniel J. Holmes, Mark Bull
Workshop on Exascale MPI, November 15, 2015. DOI: 10.1145/2831129.2831131

Abstract: The data streaming model is an effective way to tackle the challenges of data-intensive applications. As traditional HPC applications generate large volumes of data and more data-intensive applications move to HPC infrastructures, it is necessary to investigate the feasibility of combining the message-passing and streaming programming models. MPI, the de facto standard for programming HPC systems, cannot intuitively express the communication patterns and functional operations required by streaming models. In this work, we designed and implemented a data streaming library, MPIStream, atop MPI to allocate data producers and consumers, to stream data continuously or irregularly, and to process data at run time. In the same spirit as the STREAM benchmark, we developed a parallel stream benchmark to measure the data processing rate. The performance of the library depends largely on the size of the stream element, the number of data producers and consumers, and the computational intensity of processing one stream element. With 2,048 data producers and 2,048 data consumers in the parallel benchmark, MPIStream achieved a 200 GB/s processing rate on a Blue Gene/Q supercomputer. We illustrate that a streaming library for HPC applications can effectively enable irregular parallel I/O, application monitoring, and threshold collective operations.
{"title":"Overtime: a tool for analyzing performance variation due to network interference","authors":"Ryan E. Grant, K. Pedretti, A. Gentile","doi":"10.1145/2831129.2831133","DOIUrl":"https://doi.org/10.1145/2831129.2831133","url":null,"abstract":"Shared networks create unique challenges in obtaining consistent performance across jobs for large systems when not using exclusive system-wide allocations. In order to provide good system utilization, resource managers allocate system space to multiple jobs. These multiple independent node allocations can interfere with each other through their shared network. This work provides a method of observing and measuring the impact of network contention due to interference from other jobs through a continually running benchmark application and the use of network performance counters. This is the first work to measure network interference using specially designed benchmarks and network performance counters.","PeriodicalId":417011,"journal":{"name":"Workshop on Exascale MPI","volume":"31 2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125758962","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Preparing for exascale: modeling MPI for many-core systems using fine-grain queues
P. Bridges, Matthew G. F. Dosanjh, Ryan E. Grant, A. Skjellum, Shane Farmer, R. Brightwell
Workshop on Exascale MPI, November 15, 2015. DOI: 10.1145/2831129.2831134

Abstract: This paper presents a fine-grain queueing model of MPI point-to-point messaging performance for use in the design and analysis of current and future large-scale computing systems. In particular, the model seeks to capture the key performance behavior of MPI communication on many-core systems. We demonstrate that the model encompasses key MPI performance characteristics, such as the short/long protocol and offload/onload protocol tradeoffs, and demonstrate its use in predicting the potential impact of architectural and software changes on communication performance for many-core systems. We also discuss the limitations of the model and potential directions for enhancing its fidelity.