Leah H. Jamieson, E. Delp, J. N. Patel, Chao-Chun Wang, Ashfaq A. Khokhar
{"title":"A library-based program development environment for parallel image processing","authors":"Leah H. Jamieson, E. Delp, J. N. Patel, Chao-Chun Wang, Ashfaq A. Khokhar","doi":"10.1109/SPLC.1993.365567","DOIUrl":"https://doi.org/10.1109/SPLC.1993.365567","url":null,"abstract":"Cloner is an image processing prototyping environment that helps users design new parallel image processing algorithms for a target machine by building on and modifying existing library algorithms. In this paper we show the Cloner user interface, discuss how guided access is accomplished, and provide an example of how Cloner supports the rapid development of high performance codes. The example demonstrates how menu options and queries are used to guide a user to select an appropriate 2-dimensional FFT algorithm based on image size and available machine resources.","PeriodicalId":146277,"journal":{"name":"Proceedings of Scalable Parallel Libraries Conference","volume":"46 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1993-10-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132522070","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"CMSSL: a scalable scientific software library","authors":"S. Johnsson","doi":"10.1109/SPLC.1993.365582","DOIUrl":"https://doi.org/10.1109/SPLC.1993.365582","url":null,"abstract":"Massively parallel processors introduce new demands on software systems with respect to performance, scalability, robustness and portability. The increased complexity of the memory systems and the increased range of problem sizes for which a given piece of software is used pose serious challenges for software developers. The Connection Machine Scientific Software Library, CMSSL, uses several novel techniques to meet these challenges. The CMSSL contains routines for managing the data distribution and provides data distribution independent functionality. High performance is achieved through careful scheduling of operations and data motion, and through the automatic selection of algorithms at run-time. We discuss some of the techniques used, and provide evidence that CMSSL has reached the goals of performance and scalability for an important set of applications.","PeriodicalId":146277,"journal":{"name":"Proceedings of Scalable Parallel Libraries Conference","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1993-10-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132815856","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The MPI communication library: its design and a portable implementation","authors":"W. Gropp, E. Lusk","doi":"10.1109/SPLC.1993.365571","DOIUrl":"https://doi.org/10.1109/SPLC.1993.365571","url":null,"abstract":"We describe an effort to define a standard message-passing interface. The MPI \"standard\" has now emerged. The paper describes the motivation behind the basic concepts of MPI and very briefly summarizes some of its advanced features. We also outline an implementation strategy and describe a preliminary portable implementation.","PeriodicalId":146277,"journal":{"name":"Proceedings of Scalable Parallel Libraries Conference","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1993-10-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130808521","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Kamala Anupindi, A. Skjellum, P. Coddington, Geoffrey Fox
{"title":"Parallel differential-algebraic equation solvers for power system transient stability analysis","authors":"Kamala Anupindi, A. Skjellum, P. Coddington, Geoffrey Fox","doi":"10.1109/SPLC.1993.365560","DOIUrl":"https://doi.org/10.1109/SPLC.1993.365560","url":null,"abstract":"Real-time or faster-than-real-time power system transient stability simulations will have significant impact on the future design and operations of both individual electrical utility companies and large interconnected power systems. The analysis involves solution of extremely large systems of differential and algebraic equations. Differential-Algebraic Equation (DAE) solvers have been used to solve problems similar in nature to the transient stability analysis (TSA) problem. This paper discusses the possibility of using existing DAE solvers for the transient stability analysis application. We also discuss our research in developing a scalable, parallel DAE solver for use by the power system community and in related applications.","PeriodicalId":146277,"journal":{"name":"Proceedings of Scalable Parallel Libraries Conference","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1993-10-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115273343","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
R. K. Bhargava, G. Fox, Chao-Wei Ou, S. Ranka, V. Singh
{"title":"Scalable libraries for graph partitioning","authors":"R. K. Bhargava, G. Fox, Chao-Wei Ou, S. Ranka, V. Singh","doi":"10.1109/SPLC.1993.365562","DOIUrl":"https://doi.org/10.1109/SPLC.1993.365562","url":null,"abstract":"The key problem in efficiently executing irregular and unstructured data parallel applications is partitioning the data to minimize communication while balancing the load. Partitioning such applications can be posed as a graph-partitioning problem based on the computational graph. The partitioning problem is in the class of NP-complete problems; hence exact solutions are computationally intractable for large problems. However, good suboptimal solutions are sufficient for effective parallelization of a large class of these applications. We are currently developing a library of partitioners based on physical optimization and related methods. In this paper, we describe an outline of the different methods and current status of our library.","PeriodicalId":146277,"journal":{"name":"Proceedings of Scalable Parallel Libraries Conference","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1993-10-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126052632","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"ScaLAPACK++: an object oriented linear algebra library for scalable systems","authors":"J. Dongarra, R. Pozo, D. Walker","doi":"10.1109/SPLC.1993.365563","DOIUrl":"https://doi.org/10.1109/SPLC.1993.365563","url":null,"abstract":"We describe the design of ScaLAPACK++, an object oriented C++ library for implementing linear algebra computations on distributed memory multicomputers. This package, when complete, will support distributed dense, banded, sparse matrix operations for symmetric, positive-definite, and non-symmetric cases. In ScaLAPACK++ we have employed object oriented design methods to enhance scalability, portability, flexibility, and ease-of-use. We illustrate some of these points by describing the implementation of a right-looking LU factorization for dense systems in ScaLAPACK++.","PeriodicalId":146277,"journal":{"name":"Proceedings of Scalable Parallel Libraries Conference","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1993-10-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114309005","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Concurrent DASSL: a second-generation DAE solver library","authors":"Alvin Leung, Anthony Skjellum, Geoffrey","doi":"10.1109/SPLC.1993.365565","DOIUrl":"https://doi.org/10.1109/SPLC.1993.365565","url":null,"abstract":"The goal of the most recent revision of the Concurrent DASSL code is to recast it to conform to the Multicomputer Toolbox's object-oriented design philosophy and to enhance its performance. In this report, we describe in detail improvements in three categories: uniform interfaces, performance enhancement, and error handling. Through these improvements, we are able to achieve a library that is easier for parallel applications to use.","PeriodicalId":146277,"journal":{"name":"Proceedings of Scalable Parallel Libraries Conference","volume":"67 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1993-10-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128241267","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Steven G. Smith, R. Falgout, Charles H. Still, Anthony Skjellum
{"title":"High-level message-passing constructs for Zipcode 1.0: design and implementation","authors":"Steven G. Smith, R. Falgout, Charles H. Still, Anthony Skjellum","doi":"10.1109/SPLC.1993.365572","DOIUrl":"https://doi.org/10.1109/SPLC.1993.365572","url":null,"abstract":"Zipcode is a message passing system that was initially designed for multicomputers and homogeneous networks of computers. The paper describes Zipcode \"invoices,\" which raise the message-passing interface of Zipcode to a higher level of abstraction. The \"gather-send\" and \"receive-scatter\" semantics enable heterogeneous communication. The higher level of abstraction also simplifies message passing and reveals more optimizations. We explain the utility of these features and give examples of the calling sequences that implement them. All of these features are seen as enablers for parallel library development and large applications.","PeriodicalId":146277,"journal":{"name":"Proceedings of Scalable Parallel Libraries Conference","volume":"162 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1993-10-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116638351","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A concurrent, multigroup, discrete ordinates model of neutron transport","authors":"M. Dorr, C. Still","doi":"10.1109/SPLC.1993.365585","DOIUrl":"https://doi.org/10.1109/SPLC.1993.365585","url":null,"abstract":"We present an algorithm for the concurrent solution of the linear system arising from a multigroup, discrete ordinates model of neutron transport. The target architectures consist of distributed memory computers ranging from workstation clusters to massively parallel computers. Based on an analysis of the memory requirement and floating point complexity of matrix-vector multiplication in the iterative solution of the linear system, we propose a data layout and communication strategy designed to achieve scalability with respect to all phase space variables. Numerical results are presented to demonstrate the performance of the algorithm on the nCUBE/2.","PeriodicalId":146277,"journal":{"name":"Proceedings of Scalable Parallel Libraries Conference","volume":"359 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1993-10-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120881883","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
D. Reed, P. Roth, R. Aydt, K. Shields, L. Tavera, R. Noe, B. Schwartz
{"title":"Scalable performance analysis: the Pablo performance analysis environment","authors":"D. Reed, P. Roth, R. Aydt, K. Shields, L. Tavera, R. Noe, B. Schwartz","doi":"10.1109/SPLC.1993.365577","DOIUrl":"https://doi.org/10.1109/SPLC.1993.365577","url":null,"abstract":"Developers of application codes for massively parallel computer systems face daunting performance tuning and optimization problems that must be solved if massively parallel systems are to fulfill their promise. Recording and analyzing the dynamics of application program, system software, and hardware interactions is the key to understanding and the prerequisite to performance tuning, but this instrumentation and analysis must not unduly perturb program execution. Pablo is a performance analysis environment designed to provide unobtrusive performance data capture, analysis, and presentation across a wide variety of scalable parallel systems. Current efforts include dynamic statistical clustering to reduce the volume of data that must be captured and complete performance data immersion via head-mounted displays.","PeriodicalId":146277,"journal":{"name":"Proceedings of Scalable Parallel Libraries Conference","volume":"122 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1993-10-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116934354","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}