{"title":"Parallel implementation of 3D FMA using MPI","authors":"E. Lu, D. Okunbor","doi":"10.1109/MPIDC.1996.534102","DOIUrl":"https://doi.org/10.1109/MPIDC.1996.534102","url":null,"abstract":"The simulation of N-body systems has been used extensively in biophysics and chemistry to investigate the dynamics of biomolecules, and in astrophysics to study the chaotic characteristics of the galactic system. However, the long-range force calculation has a time complexity of O(N^2), where N is the number of particles in the system. The fast multipole algorithm (FMA), proposed by Greengard and Rokhlin (1987), reduces the time complexity to O(N). Our goal is to build a parallel FMA library which is portable, scalable and efficient. We use the Message Passing Interface (MPI) as the communication back-end. Also, an effective communication scheme to reduce the communication overhead and a partitioning technique to obtain good load balancing among the processors were implemented into the library.","PeriodicalId":432081,"journal":{"name":"Proceedings. Second MPI Developer's Conference","volume":"37 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1996-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127712890","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Early implementation of Para++ with MPI-2","authors":"O. Coulaud, E. Dillon","doi":"10.1109/MPIDC.1996.534099","DOIUrl":"https://doi.org/10.1109/MPIDC.1996.534099","url":null,"abstract":"Gives an overview of the implementation of Para++'s dynamic process management on top of an early MPI-2 implementation. It is currently implemented on top of LAM 6.0. After a presentation of the Para++ concepts, a presentation of the internal implementation of Para++ on top of both PVM and MPI-2 is made. This implementation uses the dynamic process chapter of MPI-2, and makes intensive use of the inter-communicator operations. The highlights given in this paper should help people to better understand how Para++ 2.0 works as well as giving an exhaustive example of an MPI-2 application.","PeriodicalId":432081,"journal":{"name":"Proceedings. Second MPI Developer's Conference","volume":"85 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1996-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116190688","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"MPI as a coordination layer for communicating HPF tasks","authors":"Ian T Foster, D. Kohr, R. Krishnaiyer","doi":"10.1109/MPIDC.1996.534096","DOIUrl":"https://doi.org/10.1109/MPIDC.1996.534096","url":null,"abstract":"Data-parallel languages such as High Performance Fortran (HPF) present a simple execution model in which a single thread of control performs high-level operations on distributed arrays. These languages can greatly ease the development of parallel programs. Yet there are large classes of applications for which a mixture of task and data parallelism is most appropriate. Such applications can be structured as collections of data-parallel tasks that communicate by using explicit message passing. Because the Message Passing Interface (MPI) defines standardized, familiar mechanisms for this communication model, we propose that HPF tasks communicate by making calls to a coordination library that provides an HPF binding for MPI. The semantics of a communication interface for sequential languages can be ambiguous when the interface is invoked from a parallel language; we show how these ambiguities can be resolved by describing one possible HPF binding for MPI. We then present the design of a library that implements this binding, discuss issues that influenced our design decisions, and evaluate the performance of a prototype HPF/MPI library using a communications microbenchmark and application kernel. Finally, we discuss how MPI features might be incorporated into our design framework.","PeriodicalId":432081,"journal":{"name":"Proceedings. Second MPI Developer's Conference","volume":"76 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1996-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122109443","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Cooperative Data Sharing: a layered approach to an architecture-independent Message-Passing Interface","authors":"D. C. DiNucci","doi":"10.1109/MPIDC.1996.534095","DOIUrl":"https://doi.org/10.1109/MPIDC.1996.534095","url":null,"abstract":"When MPI began to take form as a rather high-level interface with extensive features, it became somewhat less attractive to some benchmarkers and tool builders who required a very efficient low-level portable interface and did not need extensive features targeted toward application development. As a result, the Message Passing Kernel (MPK) project began at NAS. The name changed to the Cooperative Data Sharing (CDS) System when it became clear that the semantics we desired did not require copying (as message-passing does). The document describes the design and implementation of the kernel level of CDS, called CDS1, and some directions we are taking on a higher, MPI-level interface built upon it, called CDS2. The semantics of communication in CDS1 are similar to shared memory in that no copying is required and data sharing and one-sided communication is supported, and similar to message-passing in that regions of contiguous data can be passed from one process to another through queues. A prototype of CDS1 has been demonstrated on an SGI Power Challenge Array and a network of Sun workstations running Solaris.","PeriodicalId":432081,"journal":{"name":"Proceedings. Second MPI Developer's Conference","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1996-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131381852","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Implementing MPI under AP/Linux","authors":"D. Sitsky, P. Mackerras, A. Tridgell, D. Walsh","doi":"10.1109/MPIDC.1996.534092","DOIUrl":"https://doi.org/10.1109/MPIDC.1996.534092","url":null,"abstract":"A preliminary MPI library has been implemented for the Fujitsu AP1000+ multicomputer running the AP/Linux operating system. Under this environment, parallel programs may be dedicated to a fixed partition, or a number of parallel programs may share a partition. Therefore, the MPI library has been constructed so that messaging operations can be driven by polling and/or interrupt techniques. It has been found that polling works well when a single parallel program is running on a given partition, and that interrupt-driven communication makes far better use of the machine when multiple parallel programs are executing. Gang scheduling of multiple parallel programs which use polling was found to be relatively ineffective.","PeriodicalId":432081,"journal":{"name":"Proceedings. Second MPI Developer's Conference","volume":"67 6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1996-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133068944","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"MPICH performance characteristics and considerations","authors":"R. Frost","doi":"10.1109/MPIDC.1996.534115","DOIUrl":"https://doi.org/10.1109/MPIDC.1996.534115","url":null,"abstract":"A study of the Argonne National Laboratory/Mississippi State University \"MPICH\" implementation of MPI is presented. The performance of point-to-point communications on clusters, MPPs, networks of workstations, and SMPs is given. Performance strategies for a range of application and architecture characteristics are discussed. Experiences installing MPICH on these architectures are also discussed.","PeriodicalId":432081,"journal":{"name":"Proceedings. Second MPI Developer's Conference","volume":"208 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1996-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132292814","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"First class communication in MPI","authors":"E. Demaine","doi":"10.1109/MPIDC.1996.534113","DOIUrl":"https://doi.org/10.1109/MPIDC.1996.534113","url":null,"abstract":"We compare three concurrent-programming languages based on message-passing: Concurrent ML (CML), Occam and MPI. The main advantage of the CML extension of Standard ML (SML) is that communication events are first-class just like normal program variables (e.g., integers), that is, they can be created at run-time, assigned to variables, and passed to and returned from functions. In addition, it provides dynamic process and channel creation. Occam, first designed for transputers, is based on a static model of process and channel creation. We examine how these limitations enforce severe restrictions on communication events, and how they affect the flexibility of Occam programs. The MPI (Message Passing Interface) standard provides a common way to access message-passing in C and Fortran. Although MPI was designed for parallel and distributed computation, it can also be viewed as a general concurrent-programming language. In particular most Occam features and several important facilities of CML can be implemented in MPI. For example, MPI-2 supports dynamic process and channel creation, and less general first-class communication events. We propose an extension to MPI which provides the CML choose, wrap, and guard combinators. This would make MPI a strong base for the flexible concurrency available in CML. Assuming that the modifications are incorporated into the standard and its implementations higher-order concurrency and its advantages will become more widespread.","PeriodicalId":432081,"journal":{"name":"Proceedings. Second MPI Developer's Conference","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1996-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133030536","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"MICE: a prototype MPI implementation in Converse environment","authors":"M. Bhandarkar, L. Kalé","doi":"10.1109/MPIDC.1996.534091","DOIUrl":"https://doi.org/10.1109/MPIDC.1996.534091","url":null,"abstract":"The paper describes MICE, a prototype implementation of MPI on the Converse interoperable parallel programming environment. It is based on MPICH, a public-domain implementation of MPI and uses the Abstract Device Interface (ADI) which has been retargeted on top of Converse. MICE makes use of message-managers and allows use of thread-objects to let MPI modules co-exist with other types of computations and communication (such as a library computation in Charm++ or asynchronous computations in multipol) within a single application. It also makes it possible to interoperate PVM (in a restricted form) and MPI modules. Thread-objects make it possible to build multi-threaded MPI programs. This MPI implementation demonstrates that it is possible to provide interoperability without any significant performance degradation.","PeriodicalId":432081,"journal":{"name":"Proceedings. Second MPI Developer's Conference","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1996-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124019948","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Collective communication and communicators in mpi++","authors":"D. Kafura, L. Huang","doi":"10.1109/MPIDC.1996.534097","DOIUrl":"https://doi.org/10.1109/MPIDC.1996.534097","url":null,"abstract":"The paper describes the current version of mpi++, a C++ language binding for MPI, that includes all of the collective services, and services for contexts, groups and communicators as described in Chapters 4 and 5 of the MPI standard. The code for mpi++ has been tested on a Sun Sparc workstation and an Intel Paragon. Segments of an mpi++ program implementing a parallel algorithm are introduced to illustrate the Collective class hierarchy. The paper also shows how mpi++ deals with other collective operations (e.g., reduction), attribute caching, groups, and communicators. The class hierarchy of mpi++ is presented and briefly explained.","PeriodicalId":432081,"journal":{"name":"Proceedings. Second MPI Developer's Conference","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1996-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126024519","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Object Oriented MPI (OOMPI): a class library for the Message Passing Interface","authors":"B. McCandless, J. Squyres, A. Lumsdaine","doi":"10.1109/MPIDC.1996.534098","DOIUrl":"https://doi.org/10.1109/MPIDC.1996.534098","url":null,"abstract":"Using the Message Passing Interface (MPI) in C++ has been difficult up to this point, because of the lack of suitable C++ bindings and C++ class libraries. The existing MPI standard provides language bindings only for C and Fortran 77, precluding their direct use in object-oriented programming. Even the proposed C++ bindings in MPI-2 are at a fairly low-level and are not directly suitable for object-oriented programming. In this paper, we present the requirements, analysis and design for Object-Oriented MPI (OOMPI), a C++ class library for MPI. Although the OOMPI class library is specified in C++, in some sense the specification is a generic one that uses C++ as the program description language. Thus, the OOMPI specification can also be considered as a generic object-oriented class library specification which can thus also form the basis for MPI class libraries in other object-oriented languages.","PeriodicalId":432081,"journal":{"name":"Proceedings. Second MPI Developer's Conference","volume":"961 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1996-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127033647","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}