{"title":"Adapting to load on workstation clusters","authors":"Robert Brunner, L. Kalé","doi":"10.1109/FMPC.1999.750590","DOIUrl":"https://doi.org/10.1109/FMPC.1999.750590","url":null,"abstract":"Desktop workstations represent a largely untapped source of computational power for parallel computing. Two of the main problems in utilizing these workstations are developing strategies for migrating load so that partially loaded workstations can contribute CPU cycles to the computation, and making dynamically migratable application programs easy to write. This paper describes object arrays, a construct which makes dynamically migratable applications easier to write, and a simple strategy for migrating load on a workstation cluster.","PeriodicalId":405655,"journal":{"name":"Proceedings. Frontiers '99. Seventh Symposium on the Frontiers of Massively Parallel Computation","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1999-02-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123455711","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Java Grande: software infrastructure for HPCC","authors":"G. Fox","doi":"10.1109/FMPC.1999.750606","DOIUrl":"https://doi.org/10.1109/FMPC.1999.750606","url":null,"abstract":"We describe the definition, motivation and current status of Java Grande activities. We introduce 3 roles of Java in Grande programming at client, middleware or backend tiers of a computing system. We start with Java as a language and describe where it is clearly good and where it could be good! The Java Grande Forum has numerical and distributed computing working groups and projects include the study of changes to Java and its runtime to enhance Grande applications and their programming environment community. There is an important activity to define seamless interfaces allowing universal access to general hosts. Benchmarks for all sorts of Grande applications are critical. We discuss Java for Parallel Computing including message passing (MPI) and data parallelism.","PeriodicalId":405655,"journal":{"name":"Proceedings. Frontiers '99. Seventh Symposium on the Frontiers of Massively Parallel Computation","volume":"215 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1999-02-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127606530","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"HPF implementation of ARC3D","authors":"M. Frumkin, J. Yan","doi":"10.1109/FMPC.1999.750587","DOIUrl":"https://doi.org/10.1109/FMPC.1999.750587","url":null,"abstract":"We present an HPF implementation of ARC3D code along with the profiling and performance data on SGI Origin 2000. Advantages and limitations of HPF as a parallel programming language for CFD applications are discussed. For achieving good performance results we used the data distributions optimized for implementation of implicit and explicit operators of the solver and boundary conditions. We compare the results with MPI and directive based implementations.","PeriodicalId":405655,"journal":{"name":"Proceedings. Frontiers '99. Seventh Symposium on the Frontiers of Massively Parallel Computation","volume":"67 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1999-02-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130967235","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Distributed control parallelism for multidisciplinary design of a high speed civil transport","authors":"D. T. Krasteva, C. Baker, L. T. Watson, B. Grossman, W. Mason, R. Haftka","doi":"10.1109/FMPC.1999.750597","DOIUrl":"https://doi.org/10.1109/FMPC.1999.750597","url":null,"abstract":"Large scale multidisciplinary design optimization (MDO) problems often involve massive computation over vast data sets. Regardless of the MDO problem solving methodology, advanced computing technologies and architectures are indispensable. The data parallelism inherent in some engineering problems makes massively parallel architectures a natural choice, but efficiently harnessing the power of massive parallelism requires sophisticated algorithms and techniques. This paper presents an effort to apply massively scalable distributed control and dynamic load balancing techniques to the reasonable design space identification phase of a variable complexity approach to the multidisciplinary design optimization of a high speed civil transport (HSCT). The scalability and performance of two dynamic load balancing techniques, random polling and global round robin with message combining, and two termination detection schemes, token passing and global task count, are studied. The extent to which such techniques are applicable to other MDO paradigms, and to the potential for parallel multidisciplinary design with current large-scale disciplinary codes, is of particular interest.","PeriodicalId":405655,"journal":{"name":"Proceedings. Frontiers '99. Seventh Symposium on the Frontiers of Massively Parallel Computation","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1999-02-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133656273","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The preliminary evaluation of MBP-light with two protocol policies for a massively parallel processor-JUMP-1","authors":"Hiroaki Inoue, K. Anjo, J. Yamamoto, J. Tanabe, Masaki Wakabayashi, M. Sato, H. Amano, K. Hiraki","doi":"10.1109/FMPC.1999.750609","DOIUrl":"https://doi.org/10.1109/FMPC.1999.750609","url":null,"abstract":"A massively parallel processor called JUMP-1 has been developed to build an efficient cache-coherent distributed shared memory (DSM) on a large system with more than 1000 processors. Here, the dedicated processor called MBP (Memory Based Processor)-light, which manages the DSM of JUMP-1, is introduced, and its preliminary performance with two protocol policies, update and invalidate, is evaluated. Simulation results indicate that, under both policies, simple operations like the tag check and the collection/generation of acknowledgment packets are mostly processed by the hardware mechanisms in MBP-light without the aid of the core processor. Also, the buffer-register architecture adopted by the core processor in MBP-light is exploited enough to process a protocol transaction for both policies.","PeriodicalId":405655,"journal":{"name":"Proceedings. Frontiers '99. Seventh Symposium on the Frontiers of Massively Parallel Computation","volume":"77 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1999-02-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127035311","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Latency tolerant algorithms for WAN based workstation clusters","authors":"Bernd Helzer, M. Clement, Q. Snell","doi":"10.1109/FMPC.1999.750584","DOIUrl":"https://doi.org/10.1109/FMPC.1999.750584","url":null,"abstract":"One of the biggest differences between traditional supercomputers and workstation clusters is the latency involved in sending a message between processors. Wide Area Network (WAN) based workstation clusters can experience significant latency between machines at different geographical positions. Improvements in network technology can achieve marginal improvements, but speed-of-light delays cannot be decreased. This research develops stencil algorithms that are more tolerant of latency. These algorithms can be used to solve finite element problems as well as other problems where neighbor communications are used. Latency tolerant algorithms are essential if a large number of machines on the Internet are to be used in performing a parallel computation.","PeriodicalId":405655,"journal":{"name":"Proceedings. Frontiers '99. Seventh Symposium on the Frontiers of Massively Parallel Computation","volume":"53 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1999-02-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116001918","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"New algorithms for efficient mining of association rules","authors":"Li Shen, Hong Shen, Ling Cheng","doi":"10.1109/FMPC.1999.750605","DOIUrl":"https://doi.org/10.1109/FMPC.1999.750605","url":null,"abstract":"Discovery of association rules is an important data mining task. Several algorithms have been proposed to solve this problem. Most of them require repeated passes over the database, which incurs huge I/O overhead and high synchronization expense in parallel cases. A few algorithms try to reduce these costs, but they have weaknesses such as often requiring high pre-processing cost to get a vertical database layout and containing much redundant computation in parallel cases. We propose new association mining algorithms that overcome these drawbacks by minimizing the I/O cost and effectively controlling the computation cost. Experiments on well-known synthetic data show that our algorithms consistently outperform Apriori, one of the best algorithms for association mining, by factors ranging from 2 to 4 in most cases. Our algorithms are also easy to parallelize, and we present a parallelization for them based on a shared-nothing architecture. We observe that our parallel approach exploits parallelism more fully than two of the best existing parallel algorithms.","PeriodicalId":405655,"journal":{"name":"Proceedings. Frontiers '99. Seventh Symposium on the Frontiers of Massively Parallel Computation","volume":"76 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1999-02-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116077796","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Distributed applet-based certifiable processing in client/server environments","authors":"Hongxia Jin, G. Sullivan, G. Masson","doi":"10.1109/FMPC.1999.750583","DOIUrl":"https://doi.org/10.1109/FMPC.1999.750583","url":null,"abstract":"We describe and demonstrate the concept of Distributed Applet-based Certifiable Processing (DACP) in client/server environments for computational result correctness checking. DACP offers a low-overhead framework for Web-based client/server environments in which a server can partition a given computational problem into a set of sub-problems, distribute these sub-problems across a network to clients, and then efficiently certify the correctness of the sub-problem results returned by the clients before assembling them into a final answer for the original computational problem. The resource and time advantages of the DACP methodology are directly related to the effectiveness and efficiency offered by an innovative distributed implementation of the certification-trail approach to computational result checking. As a proof of concept, we apply the DACP methodology to a class of important computationally intensive problems. Our experimental assessment of DACP, performed with the use of Java applets which we have developed, emphatically indicates that DACP offers significant advantages in comparison with other known result correctness checking techniques for reliable distributed computing in client/server environments.","PeriodicalId":405655,"journal":{"name":"Proceedings. Frontiers '99. Seventh Symposium on the Frontiers of Massively Parallel Computation","volume":"57 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1999-02-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123882395","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Parallel simulation of two-phase flow problems using the finite element method","authors":"S. Aliabadi, Khalil Shujaee, T. Tezduyar","doi":"10.1109/FMPC.1999.750591","DOIUrl":"https://doi.org/10.1109/FMPC.1999.750591","url":null,"abstract":"Parallel computation of unsteady, two-phase flow problems is performed using a stabilized finite element method. The finite element formulations are written for fixed meshes and are based on the Navier-Stokes equations and an advection equation governing the motion of the interface function. The interface function, with two distinct values, serves as a marker identifying each fluid. This function is advected with the fluid velocity throughout the computational domain. To increase the accuracy of the method, an interface-sharpening/mass conservation algorithm is designed. The method has been implemented on the CRAY T3E and also the IBM SP/6000 using the MPI libraries. We show the effectiveness of the method in simulating complex 3D problems, such as a two-fluid interface in a centrifuge tube, the operational stability of a partially filled tanker truck driving over a bump, and the hydrodynamic stability of ships.","PeriodicalId":405655,"journal":{"name":"Proceedings. Frontiers '99. Seventh Symposium on the Frontiers of Massively Parallel Computation","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1999-02-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123952406","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Scalability analysis of multidimensional wavefront algorithms on large-scale SMP clusters","authors":"A. Hoisie, O. Lubeck, H. Wasserman","doi":"10.1109/FMPC.1999.750452","DOIUrl":"https://doi.org/10.1109/FMPC.1999.750452","url":null,"abstract":"We develop a model for the parallel performance of algorithms that consist of concurrent, two-dimensional wavefronts implemented in a message passing environment. The model combines the separate contributions of computation and communication wavefronts. We validate the model on three supercomputer systems, with up to 500 processors, using data from an ASCI deterministic particle transport application, although the model is general to any wavefront algorithm implemented on a 2-D processor domain. We also use the model to make estimates of performance and scalability of wavefront algorithms on 100-TFLOPS computer systems expected to be in existence within the next decade. Our model shows that on a 1-billion-cell problem, single-node computation speed (not inter-processor communication performance, as is widely believed) is the bottleneck. Finally, we present preliminary considerations that reveal the additional complexity associated with modeling wavefront algorithms on reduced-connectivity network topologies, such as clusters of SMPs.","PeriodicalId":405655,"journal":{"name":"Proceedings. Frontiers '99. Seventh Symposium on the Frontiers of Massively Parallel Computation","volume":"54 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1999-02-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132177944","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}