{"title":"Large Scale Parallel Structured AMR Calculations Using the SAMRAI Framework","authors":"A. Wissink, R. Hornung, S. Kohn, Steve S. Smith, N. Elliott","doi":"10.1145/582034.582040","DOIUrl":"https://doi.org/10.1145/582034.582040","url":null,"abstract":"This paper discusses the design and performance of the parallel data communication infrastructure in SAMRAI, a software framework for structured adaptive mesh refinement (SAMR) multi-physics applications. We describe requirements of such applications and how SAMRAI abstractions manage complex data communication operations found in them. Parallel performance is characterized for two adaptive problems solving hyperbolic conservation laws on up to 512 processors of the IBM ASCI Blue Pacific system. Results reveal good scaling for numerical and data communication operations but poorer scaling in adaptive meshing and communication schedule construction phases of the calculations. We analyze the costs of these different operations, addressing key concerns for scaling SAMR computations to large numbers of processors, and discuss potential changes to improve our current implementation.","PeriodicalId":325282,"journal":{"name":"ACM/IEEE SC 2001 Conference (SC'01)","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128529845","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Scalable Atomistic Simulation Algorithms for Materials Research","authors":"A. Nakano, R. Kalia, P. Vashishta, T. Campbell, S. Ogata, F. Shimojo, S. Saini","doi":"10.1145/582034.582035","DOIUrl":"https://doi.org/10.1145/582034.582035","url":null,"abstract":"A suite of scalable atomistic simulation programs has been developed for materials research based on space-time multiresolution algorithms. Design and analysis of parallel algorithms are presented for molecular dynamics (MD) simulations and quantum-mechanical (QM) calculations based on the density functional theory. Performance tests have been carried out on 1,088-processor Cray T3E and 1,280-processor IBM SP3 computers. The linear-scaling algorithms have enabled 6.44-billion-atom MD and 111,000-atom QM calculations on 1,024 SP3 processors with parallel efficiency well over 90%. The production-quality programs also feature wavelet-based computational-space decomposition for adaptive load balancing, spacefilling-curve-based adaptive data compression with user-defined error bound for scalable I/O, and octree-based fast visibility culling for immersive and interactive visualization of massive simulation data.","PeriodicalId":325282,"journal":{"name":"ACM/IEEE SC 2001 Conference (SC'01)","volume":"90 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132642144","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Parallel Dedicated Hardware Devices for Heterogeneous Computations","authors":"A. Marongiu, P. Palazzari, V. Rosato","doi":"10.1145/582034.582063","DOIUrl":"https://doi.org/10.1145/582034.582063","url":null,"abstract":"We describe a design methodology which allows fast design and prototyping of dedicated hardware devices to be used in heterogeneous computations. The platforms used in heterogeneous computations consist of a general-purpose COTS architecture which hosts a dedicated hardware device; parts of the computation are mapped onto the former, parts onto the latter, in a way that improves the overall computation efficiency. We report the design and the prototyping of an FPGA-based hardware board to be used in the search for low-autocorrelation binary sequences. The circuit has been designed by using a recently developed Parallel Hardware Generator (PHG) package which produces synthesizable VHDL code starting from the specific algorithm expressed as a System of Affine Recurrence Equations (SARE). The performance of the realized devices has been compared to that obtained for the same numerical application on several computational platforms.","PeriodicalId":325282,"journal":{"name":"ACM/IEEE SC 2001 Conference (SC'01)","volume":"98 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134080937","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multi-teraflops Spin Dynamics Studies of the Magnetic Structure of FeMn/Co Interfaces","authors":"A. Canning, B. Ujfalussy, T. Schulthess, Xiaoguang Zhang, W. Shelton, D. Nicholson, G. M. Stocks, Yang Wang, T. Dirks","doi":"10.1145/582034.582078","DOIUrl":"https://doi.org/10.1145/582034.582078","url":null,"abstract":"We have used the power of massively parallel computers to perform first principles spin dynamics (SD) simulations of the magnetic structure of Iron-Manganese/Cobalt (FeMn/Co) interfaces. These large scale quantum mechanical simulations, involving 2016-atom super-cell models, reveal details of the orientational configuration of the magnetic moments at the interface that are unobtainable by any other means. Exchange bias, which involves the use of an antiferromagnetic (AFM) layer such as FeMn to pin the orientation of the magnetic moment of a proximate ferromagnetic (FM) layer such as Co, is of fundamental importance in magnetic multilayer storage and read head devices. Here the equation of motion of first principles SD is used to perform relaxations of model magnetic structures to the true ground (equilibrium) state. Our code is intrinsically parallel and has achieved a maximum execution rate of 2.46 Teraflops on the IBM SP at the National Energy Research Scientific Computing Center (NERSC).","PeriodicalId":325282,"journal":{"name":"ACM/IEEE SC 2001 Conference (SC'01)","volume":"51 3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132864268","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Jini-Based Computing Portal System","authors":"T. Suzumura, S. Matsuoka, H. Nakada","doi":"10.1145/582034.582058","DOIUrl":"https://doi.org/10.1145/582034.582058","url":null,"abstract":"JiPANG (A Jini-based Portal Augmenting Grids) is a portal system and a toolkit which provides a uniform access interface layer to a variety of Grid systems, and is built on top of Jini distributed object technology. JiPANG performs uniform higher-level management of the computing services and resources being managed by individual Grid systems such as Ninf, NetSolve, Globus, etc. In order to give the user a uniform interface to the Grids, JiPANG provides a set of simple Java APIs called the JiPANG Toolkits, and furthermore, allows the user to interact with Grid systems, again in a uniform way, using the JiPANG Browser application. With JiPANG, users need not install any client packages beforehand to interact with Grid systems, nor be concerned about updating to the latest version. We believe that such uniform, transparent services, available in a ubiquitous manner, are essential for the success of the Grid as a viable computing platform for the next generation.","PeriodicalId":325282,"journal":{"name":"ACM/IEEE SC 2001 Conference (SC'01)","volume":"92 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128641536","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multilevel Algorithms for Generating Coarse Grids for Multigrid Methods","authors":"I. Moulitsas, G. Karypis","doi":"10.1145/582034.582079","DOIUrl":"https://doi.org/10.1145/582034.582079","url":null,"abstract":"Geometric Multigrid methods have gained widespread acceptance for solving large systems of linear equations, especially for structured grids. One of the challenges in successfully extending these methods to unstructured grids is the problem of generating an appropriate set of coarse grids. The focus of this paper is the development of robust algorithms, both serial and parallel, for generating a sequence of coarse grids from the original unstructured grid. Our algorithms treat the problem of coarse grid construction as an optimization problem that tries to optimize the overall quality of the resulting fused elements. We solve this problem using the multilevel paradigm that has been very successful in solving the related grid/graph partitioning problem. The parallel formulation of our algorithm incurs a very small communication overhead, achieves high degree of concurrency, and maintains the high quality of the coarse grids obtained by the serial algorithm.","PeriodicalId":325282,"journal":{"name":"ACM/IEEE SC 2001 Conference (SC'01)","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125024796","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Predictive Performance and Scalability Modeling of a Large-Scale Application","authors":"D. Kerbyson, H. Alme, A. Hoisie, F. Petrini, H. Wasserman, M. Gittings","doi":"10.1145/582034.582071","DOIUrl":"https://doi.org/10.1145/582034.582071","url":null,"abstract":"In this work we present a predictive analytical model that encompasses the performance and scaling characteristics of an important ASCI application. SAGE (SAIC’s Adaptive Grid Eulerian hydrocode) is a multidimensional hydrodynamics code with adaptive mesh refinement. The model is validated against measurements on several systems including ASCI Blue Mountain, ASCI White, and a Compaq Alphaserver ES45 system showing high accuracy. It is parametric - basic machine performance numbers (latency, MFLOPS rate, bandwidth) and application characteristics (problem size, decomposition method, etc.) serve as input. The model is applied to add insight into the performance of current systems, to reveal bottlenecks, and to illustrate where tuning efforts can be effective. We also use the model to predict performance on future systems.","PeriodicalId":325282,"journal":{"name":"ACM/IEEE SC 2001 Conference (SC'01)","volume":"117 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125550730","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An 8.61 Tflop/s Molecular Dynamics Simulation for NaCl with a Special-Purpose Computer: MDM","authors":"T. Narumi, A. Kawai, T. Koishi","doi":"10.1145/582034.582060","DOIUrl":"https://doi.org/10.1145/582034.582060","url":null,"abstract":"We performed a molecular dynamics (MD) simulation of 33 million pairs of NaCl ions with the Ewald summation and obtained a calculation speed of 8.61 Tflop/s. In this calculation we used a special-purpose computer, MDM, which we have developed for the calculation of the Coulomb and van der Waals forces. The MDM enabled us to perform large scale MD simulations without truncating the Coulomb force. It is composed of MDGRAPE-2, WINE-2 and a host computer. MDGRAPE-2 accelerates the calculation of the real-space part of the Coulomb and van der Waals forces. WINE-2 accelerates the calculation of the wavenumber-space part of the Coulomb force. The host computer performs other calculations. With the completed MDM system we performed an MD simulation similar to the one that was the basis of our SC2000 submission for a Gordon Bell prize. With this large scale MD simulation, we can dramatically decrease the fluctuation of the temperature to less than 0.1 Kelvin.","PeriodicalId":325282,"journal":{"name":"ACM/IEEE SC 2001 Conference (SC'01)","volume":"81 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2000-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134112354","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}