{"title":"Replay for debugging MPI parallel programs","authors":"Chul-Eui Hong, Bum-Sik Lee, Giwon On, D. Chi","doi":"10.1109/MPIDC.1996.534108","DOIUrl":"https://doi.org/10.1109/MPIDC.1996.534108","url":null,"abstract":"The cyclic debugging approach often fails for parallel programs because parallel programs reveal nondeterministic characteristics due to message race conditions. This paper addresses the execution replay algorithm for debugging MPI parallel programs. The lexical analyzer identifies the MPI events which affect nondeterministic executions, and then an execution is controlled in order to make it equivalent to a reference execution by keeping their orders of events in two executions identical. The proposed replay system uses the logical time stamping algorithm and the derived data types provided by MPI standard. It also presents the method of how to replay the blocking and nonblocking message passing events. The proposed replay system was applied to the bitonic-merge sort and other parallel programs. We found that re-execution has reproducible behavior and the replay system is useful to find the communication errors.","PeriodicalId":432081,"journal":{"name":"Proceedings. Second MPI Developer's Conference","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1996-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130425724","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Design and implementation of MPI on Puma portals","authors":"Ron Brightwell, Lance Shuler","doi":"10.1109/MPIDC.1996.534090","DOIUrl":"https://doi.org/10.1109/MPIDC.1996.534090","url":null,"abstract":"As the successor to SUNMOS, the Puma operating system provides a flexible, lightweight, high performance message passing environment for massively parallel computers. Message passing in Puma is accomplished through the use of a new mechanism known as a portal. Puma is currently running on the Intel Paragon and is being developed for the Intel TeraFLOPS machine. We discuss issues regarding the development of the Argonne National Laboratory/Mississippi State University implementation of the Message Passing Interface standard on top of portals. Included is a description of the design and implementation for both MPI point-to-point and collective communications, and MPI-2 one-sided communications.","PeriodicalId":432081,"journal":{"name":"Proceedings. Second MPI Developer's Conference","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1996-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134221442","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Performance evaluation of MPI implementations and MPI based Parallel ELLPACK solvers","authors":"S. Markus, S. B. Kim, K. Pantazopoulos, A. L. Ocken, E. Houstis, P. Wu, S. Weerawarana, D. Maharry","doi":"10.1109/MPIDC.1996.534109","DOIUrl":"https://doi.org/10.1109/MPIDC.1996.534109","url":null,"abstract":"We are concerned with the parallelization of finite element mesh generation and its decomposition, and the parallel solution of sparse algebraic equations which are obtained from the parallel discretization of second order elliptic partial differential equations (PDEs) using finite difference and finite element techniques. For this we use the Parallel ELLPACK (//ELLPACK) problem solving environment (PSE) which supports PDE computations on several MIMD platforms. We have considered the ITPACK library of stationary iterative solvers which we have parallelized and integrated into the //ELLPACK PSE. This Parallel ITPACK package has been implemented using the MPI, PVM, PICL, PARMACS, nCUBE Vertex and Intel NX message passing communication libraries. It performs very efficiently on a variety of hardware and communication platforms. To study the efficiency of three MPI library implementations, the performance of the Parallel ITPACK solvers was measured on several distributed memory architectures and on clusters of workstations for a testbed of elliptic boundary value PDE problems. We present a comparison of these MPI library implementations with PVM and the native communication libraries, based on their performance on these tests. Moreover we have implemented in MPI, a parallel mesh generator that concurrently produces a semi-optimal partitioning of the mesh to support various domain decomposition solution strategies across the above platforms.","PeriodicalId":432081,"journal":{"name":"Proceedings. Second MPI Developer's Conference","volume":"40 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1996-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123150654","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A thread taxonomy for MPI","authors":"A. Skjellum, B. Protopopov, S. Hebert","doi":"10.1109/MPIDC.1996.534094","DOIUrl":"https://doi.org/10.1109/MPIDC.1996.534094","url":null,"abstract":"In 1994, we presented extensions to MPI and offered an early paper on potential thread extensions to MPI as well as non-blocking collective extensions to MPI. The present paper is a thorough review of thread issues in MPI, including alternative models, their computational uses, and the impact on implementations. A number of issues are addressed: barriers to thread safety in MPI implementations with MPICH as an example and changes of the semantics of non-thread-safe MPI calls, different thread models, their uses, and possible integration. Minimal portable thread management and synchronization mechanisms API extensions for MPI are considered. A tentative design for multi-threaded thread-safe ADI and Channel Device for MPICH is proposed. We consider threads as both an implementation device for MPI and as a user-level mechanism to achieve fine-grain concurrency. The reduction of the process to a simple resource container (as considered by Mach), with the thread as the main named computational unit is suggested. Specific results thus far with Windows NT version of MPICH are mentioned.","PeriodicalId":432081,"journal":{"name":"Proceedings. Second MPI Developer's Conference","volume":"10 3","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1996-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120905182","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Message passing in complex irregular systems","authors":"C. Dionne, M. Nolette, D. Gagné","doi":"10.1109/MPIDC.1996.534101","DOIUrl":"https://doi.org/10.1109/MPIDC.1996.534101","url":null,"abstract":"MPI is a standard that is well adapted to the needs of the parallel computing community, where performance is a primary concern. However, there is a lack of standards and tools well suited to message passing in complex irregular systems. This paper presents a set of message passing requirements for complex irregular systems. These requirements lead to the implementation of a communication library layered over MPI to provide a higher level of abstraction. The implementation of a training system for the crew of maritime patrol aircraft based on this communication library demonstrates its applicability to solve complex irregular problems.","PeriodicalId":432081,"journal":{"name":"Proceedings. Second MPI Developer's Conference","volume":"50 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1996-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125123713","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Parallel solution of the wave equation using higher order finite elements","authors":"M. Kern, S. Mefire","doi":"10.1109/MPIDC.1996.534103","DOIUrl":"https://doi.org/10.1109/MPIDC.1996.534103","url":null,"abstract":"We present a parallel solver for wave propagation problems based on the higher-order explicit finite elements developed by Cohen et al. (1995) These elements were introduce to allow mass-lumping while preserving high accuracy. Our approach is based on a coarse-grain, domain-splitting parallelism, and uses the new MPI (Message Passing Interface) standard as a message passing library. The program currently runs on a network of workstations, on a Cray T3D and on an IBM SP/2.","PeriodicalId":432081,"journal":{"name":"Proceedings. Second MPI Developer's Conference","volume":"6 4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1996-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125147697","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"MPICH on the T3D: a case study of high performance message passing","authors":"R. Brightwell, A. Skjellum","doi":"10.1109/MPIDC.1996.534088","DOIUrl":"https://doi.org/10.1109/MPIDC.1996.534088","url":null,"abstract":"The paper describes the design, implementation and performance of a port of the Argonne National Laboratory/Mississippi State University MPICH implementation of the Message Passing Interface standard to the Cray T3D massively parallel processing system. A description of the factors influencing the design and the various stages of implementation are presented. Performance results revealing superior bandwidth and comparable latency as compared to other full message passing systems on the T3D are shown. Further planned improvements and optimizations, including an analysis of a port to the T3E, are mentioned.","PeriodicalId":432081,"journal":{"name":"Proceedings. Second MPI Developer's Conference","volume":"58 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1996-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123003484","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Generalized communicators in the Message Passing Interface","authors":"E. Demaine, Ian T Foster, C. Kesselman, M. Snir","doi":"10.1109/MPIDC.1996.534093","DOIUrl":"https://doi.org/10.1109/MPIDC.1996.534093","url":null,"abstract":"We propose extensions to the Message Passing Interface (MPI) that generalize the MPI communicator concept to allow multiple communication endpoints per process, dynamic creation of endpoints, and the transfer of endpoints between processes. The generalized communicator construct can be used to express a wide range of interesting communication structures, including collective communication operations involving multiple threads per process, communications between dynamically created threads, and object-oriented applications in which communications are directed to specific objects. Furthermore, this enriched functionality can be provided in a manner that preserves backward compatibility with MPI. We describe the proposed extensions, illustrate their use with examples, and discuss implementation issues.","PeriodicalId":432081,"journal":{"name":"Proceedings. Second MPI Developer's Conference","volume":"575 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1996-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116205349","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"MPI performance evaluation and characterization using a compact application benchmark code","authors":"P. Worley","doi":"10.1109/MPIDC.1996.534110","DOIUrl":"https://doi.org/10.1109/MPIDC.1996.534110","url":null,"abstract":"In this paper the parallel benchmark code PSTSWM is used to evaluate the performance of the vendor-supplied implementations of the MPI message-passing standard on the Intel Paragon, IBM SP2, and Cray Research T3D. This study is meant to complement the performance evaluation of individual MPI commands by providing information on the practical significance of MPI performance on the execution of a communication-intensive application code. In particular three performance questions are addressed: how important is the communication protocol in determining performance when using MPI, how does MPI performance compare with that of the native communication library, and how efficient are the collective communication routines.","PeriodicalId":432081,"journal":{"name":"Proceedings. Second MPI Developer's Conference","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1996-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126418547","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Rendering of numerical flow simulations using MPI","authors":"J. Stone, M. Underwood","doi":"10.1109/MPIDC.1996.534105","DOIUrl":"https://doi.org/10.1109/MPIDC.1996.534105","url":null,"abstract":"Results from a parallel computational fluid dynamics (CFD) code combined with a ray tracing library for run-time visualization are presented. Several factors make in-place rendering of CFD data preferable to the use of external rendering packages or dedicated graphics workstations. In-place rendering avoids significant I/O to disks or to networked graphics workstations, and provides the ability to monitor simulations as they progress. The use of MPI (Message Passing Interface) in both codes helped facilitate their combination into a single application. Also due to the use of MPI, the two separate applications have been run on several different architectures. The parallel architectures include networks of workstations, the Intel iPSC/860, the Intel Paragon, and the IBM SP2.","PeriodicalId":432081,"journal":{"name":"Proceedings. Second MPI Developer's Conference","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1996-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128626821","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}