N. Wright, Shava Smallen, C. Olschanowsky, J. Hayes, A. Snavely
{"title":"Measuring and Understanding Variation in Benchmark Performance","authors":"N. Wright, Shava Smallen, C. Olschanowsky, J. Hayes, A. Snavely","doi":"10.1109/HPCMP-UGC.2009.72","DOIUrl":"https://doi.org/10.1109/HPCMP-UGC.2009.72","url":null,"abstract":"Runtime irreproducibility complicates application performance evaluation on today’s high performance computers. Performance can vary significantly between seemingly identical runs; this presents a challenge both to benchmarking and to a user who is trying to determine whether a change they made to their code is an actual improvement. In order to gain a better understanding of this phenomenon, we measure the runtime variation of two applications, PARAllel Total Energy Code (PARATEC) and Weather Research and Forecasting (WRF), on three different machines. Key associated metrics are also recorded. The data is then used to 1) quantify the magnitude and distribution of the variations and 2) gain an understanding as to why the variations occur. Using our lightweight framework, Integrated Performance Monitoring (IPM), to understand the performance characteristics of individual runs, and the Inca framework to automate the procedure, measurements were collected over a month’s time. The results indicate that performance can vary by up to 25% and is almost always due to contention for network resources. We also found that the variation differs between machines and is almost always greater on machines with lower performing networks.","PeriodicalId":268639,"journal":{"name":"2009 DoD High Performance Computing Modernization Program Users Group Conference","volume":"408 2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-06-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116242530","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Full Annulus High Fidelity Fan and Compressor Simulations","authors":"S. Gorrell, Jixian Yao, Michael G. List","doi":"10.1109/HPCMP-UGC.2009.13","DOIUrl":"https://doi.org/10.1109/HPCMP-UGC.2009.13","url":null,"abstract":"Challenge Project C3L supports research efforts to design distortion tolerant fans and accurately predict the inlet conditions to the core compressor of gas turbine engines. The technical approach and computational challenges associated with the Challenge Project are presented. The simulations run have produced critical understanding that allows design and performance prediction tools to be improved by being based more on flow physics and less on empiricism. Distortion transfer simulations of two multistage fans show the stage-by-stage transfer and generation of total pressure and total temperature distortion. Fan response for each stage along the circumference showed the first stage and the middle stage performance trajectories did not follow a typical speedline pattern while the last stage performance closely resembled a typical speedline. Simulations of an Air Force Research Laboratory (AFRL) research compressor are presented to show how aerodynamic blockage varies at off design operating conditions.","PeriodicalId":268639,"journal":{"name":"2009 DoD High Performance Computing Modernization Program Users Group Conference","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-06-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126993561","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
B. Elton, S. Samsi, H. Smith, L. Humphrey, S. Ahalt, A. Chalker, Niraj Srivastava, A. H. Abdullah, P. Boyle
{"title":"Using Star-P® on DoD High Performance Computing Systems","authors":"B. Elton, S. Samsi, H. Smith, L. Humphrey, S. Ahalt, A. Chalker, Niraj Srivastava, A. H. Abdullah, P. Boyle","doi":"10.1109/HPCMP-UGC.2009.64","DOIUrl":"https://doi.org/10.1109/HPCMP-UGC.2009.64","url":null,"abstract":"This paper provides a step-by-step demonstration of a Very High Level Language system, Star-P, on Department of Defense (DoD) high performance computing (HPC) systems. Specifically, we demonstrate how to effect parallel computing in MATLAB and Python via Star-P on the DoD Supercomputing Resource Center (DSRC) Army Research Laboratory (ARL) 4,488-core Intel Woodcrest MJM system. We demonstrate how to run various Star-P/MATLAB and Star-P/Python programs in parallel on the ARL DSRC MJM system. The results will focus on the use of the Star-P software platform and how it delivers mission tempo by enabling rapid application prototyping and allowing transparent use of DSRC HPC resources from familiar desktop environments, such as Microsoft Windows and Linux.","PeriodicalId":268639,"journal":{"name":"2009 DoD High Performance Computing Modernization Program Users Group Conference","volume":"07 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-06-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129227491","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
G. Vahala, J. Yepez, M. Soe, L. Vahala, S. Ziegeler
{"title":"Quantum Lattice-Gas Algorithm for Quantum Turbulence - CAP Simulations on 12,288 Cores of Cray XT-5 Einstein at NAVO","authors":"G. Vahala, J. Yepez, M. Soe, L. Vahala, S. Ziegeler","doi":"10.1109/HPCMP-UGC.2009.20","DOIUrl":"https://doi.org/10.1109/HPCMP-UGC.2009.20","url":null,"abstract":"A novel unitary quantum lattice algorithm is developed to explore quantum turbulence. Because of its low memory requirements and its near perfect parallelization to the full 12,288 cores on the Cray XT5, simulations were run up to spatial grids of 5,760^3. The Gross-Pitaevskii equation, which describes the ground state of a Bose-Einstein condensate (BEC), is solved and it is found that the incompressible kinetic energy spectrum exhibits 3 distinct power laws: classical Kolmogorov k^{-5/3} spectrum at scales much larger than the individual quantum vortex cores, and a quantum Kelvin wave cascade spectrum of k^{-3} at scales of the order of the quantum cores. In the adjoining semiclassical regime, there is a steeper spectral decay transitioning between the classical and quantum regimes. However, its spectral exponent does not seem to be universal. This is the first, first-principle simulation yielding the universal quantum Kelvin cascade exponent.","PeriodicalId":268639,"journal":{"name":"2009 DoD High Performance Computing Modernization Program Users Group Conference","volume":"49 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-06-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126315178","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
J. Bernholc, L. Yu, V. Ranjan, M. Nardelli, W. Lu, K. Saha, V. Meunier
{"title":"Electronic Properties of High-Performance Capacitor Materials and Nanoscale Multiterminal Devices","authors":"J. Bernholc, L. Yu, V. Ranjan, M. Nardelli, W. Lu, K. Saha, V. Meunier","doi":"10.1109/HPCMP-UGC.2009.51","DOIUrl":"https://doi.org/10.1109/HPCMP-UGC.2009.51","url":null,"abstract":"Recent advances in theoretical methods combined with the advent of massively-parallel supercomputers allow one to reliably simulate the properties of complex materials and device structures from first principles. We describe applications in two general areas: (i) novel ferroelectric oxide-polymer composites for ultrahigh power density capacitors, necessary for pulsed power applications, such as electric discharges, power conditioning, and dense electronic circuitry, and (ii) electron transport properties of ballistic, multi-terminal molecular devices, which could form the basis for ultraspeed electronics and spintronics. For capacitor materials, we investigate the dielectric properties of PbTiO3 slabs and polypropylene/PbTiO3 nanocomposites. We evaluate both the optical and static local dielectric permittivity profiles for isolated PbTiO3 slabs and across the polypropylene/PbTiO3 interface. For thin ferroelectric slabs, we find that in order to maintain the ferroelectric structure, it is necessary to introduce compensating surface charges. Our results show that: (i) the surface- and interface-induced modifications to dielectric permittivity in polymer/metal-oxide composites are localized to only a few atomic layers; (ii) the interface effects are mainly confined to the metal-oxide side; and (iii) metal-oxide particles larger than a few nanometers retain the average macroscopic value of bulk dielectric permittivity. Turning to nanoelectronic devices, we investigate ballistic electron transport through a paradigmatic four-terminal molecular electronic device. In contrast to a conventional two-terminal setup, the same organic molecule placed between four electrodes exhibits new properties, such as a pronounced negative differential resistance.","PeriodicalId":268639,"journal":{"name":"2009 DoD High Performance Computing Modernization Program Users Group Conference","volume":"44 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-06-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132777581","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Realization of Linear Wave-Propagation Models from HPC Simulations","authors":"S. Ketcham, M. Parker, M. Phan","doi":"10.1109/HPCMP-UGC.2009.57","DOIUrl":"https://doi.org/10.1109/HPCMP-UGC.2009.57","url":null,"abstract":"Modeling of sound propagation in complex environments requires high performance computing (HPC) to simulate three-dimensional wave fields with realistic fidelity. This is especially true for urban areas, where sound waves reflect and diffract due to the built-up infrastructure. HPC can predict these wave fields with desired fidelity, but the computational investment would have greater return if reduced-size models that operate with considerably less computational resources could be produced from the results. The objective of this work is to develop such models. The work applies a modified version of the Eigensystem Realization Algorithm, using Markov parameters from HPC input-output response functions, to generate state-space models that simulate hundreds of thousands of output signals of the HPC wave field. The results include predicted acoustic signals and signatures from realized models, using a source with a different time series than the source used to generate the Markov parameters. We compare wave-field signals from reduced-order models with HPC model signals over a large urban domain, adjusting the model order and accuracy by singular-value cutoff. We conclude that the method produces efficient high-fidelity models of sound propagation in complex environments.","PeriodicalId":268639,"journal":{"name":"2009 DoD High Performance Computing Modernization Program Users Group Conference","volume":"37 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-06-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117228536","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multi-scale Forecasting and Targeting of Tropical Cyclones in the Western Pacific","authors":"J. Doyle, C. Reynolds, Hao Jin, R. Hodur","doi":"10.1109/HPCMP-UGC.2009.44","DOIUrl":"https://doi.org/10.1109/HPCMP-UGC.2009.44","url":null,"abstract":"In support of The Observing-system Research and Predictability Experiment (THORPEX) Pacific Asian Regional Campaign (T-PARC) and the Office of Naval Research (ONR) Tropical Cyclone Structure-08 (TCS-08) experiments, a variety of real-time products were produced at the Naval Research Laboratory during the field campaign that took place from August through early October 2008. In support of the targeted observing objective, large-scale targeting guidance was produced twice daily using singular vectors (SVs) from the Navy Operational Global Atmospheric Prediction System (NOGAPS). For mesoscale models, TC forecasts were produced using a new version of the Coupled Ocean/Atmosphere Mesoscale Prediction System (COAMPS®) developed specifically for tropical cyclone prediction (COAMPS-TC). In addition to the COAMPS-TC forecasts, mesoscale targeted observing products were produced using the COAMPS forecast and adjoint system twice daily, centered on storms of interest.","PeriodicalId":268639,"journal":{"name":"2009 DoD High Performance Computing Modernization Program Users Group Conference","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-06-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116297329","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
S. Arunajatesan, N. Sinha, M. Stanek, J. Grove, Rudy A. Johnson
{"title":"High Performance Computational Modeling of Unsteady Surface Loads in Complex Weapons Bays","authors":"S. Arunajatesan, N. Sinha, M. Stanek, J. Grove, Rudy A. Johnson","doi":"10.1109/HPCMP-UGC.2009.14","DOIUrl":"https://doi.org/10.1109/HPCMP-UGC.2009.14","url":null,"abstract":"This paper presents high performance computing results from applications of the CRAFT CFD® software to complex weapons bay configurations. Several flow control concepts have been developed based on past experience with smaller length-to-depth ratio (L/D) cavities with the intended application being a more complex and longer weapons bay. The goal in this paper is to document work done to build confidence in the modeling capability to evaluate flow control concepts for complex weapons bay configurations. The effects of some fundamental bay geometric changes are examined through Large Eddy Simulations. These simulations are validated against measurements and then used to understand the mechanisms involved. The results demonstrate that the extension of previous generic cavity modeling on shorter bays to the longer geometrically complex bay is not straightforward.","PeriodicalId":268639,"journal":{"name":"2009 DoD High Performance Computing Modernization Program Users Group Conference","volume":"103 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-06-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122315474","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Comparison of Empirical and Theoretical Computations of Velocity for a Cold Spray Nozzle","authors":"S. Dinavahi, V. Champagne, D. Helfritch","doi":"10.1109/HPCMP-UGC.2009.10","DOIUrl":"https://doi.org/10.1109/HPCMP-UGC.2009.10","url":null,"abstract":"Cold spray is a process whereby micron-size particles are accelerated to high velocity through entrainment in a gas undergoing expansion in a rocket nozzle and are subsequently impacted upon a surface. The impacted particles, which can be combinations of metals, ceramics and polymeric materials, form a consolidated structure that can be several centimeters thick. The characteristics of this structure depend on the initial characteristics of the metal powder and upon the impact velocity. Two-dimensional axi-symmetric computations of the flow through a converging, diverging nozzle were performed using the Reynolds-Averaged Navier-Stokes (RANS) code, Computational Fluid Dynamics++ (CFD++), on the Army Research Laboratory, Department of Defense (DoD) Supercomputing Resource Center (ARL DSRC) computers. Aluminum particles of constant diameter were injected at the entrance of a De Laval converging, diverging nozzle. The Eulerian Disperse Phase (EDP) capability in CFD++ was used for these simulations. The EDP model couples the dispersed phase with the fluid dynamics. In addition, one-dimensional (1D), isentropic, gas-dynamic equations were solved for the same geometry and initial conditions. The results from the RANS computations and 1D calculation compared favorably, considering the difference in governing equations.","PeriodicalId":268639,"journal":{"name":"2009 DoD High Performance Computing Modernization Program Users Group Conference","volume":"39 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-06-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131817409","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Analysis of GPU Parallel Computing","authors":"S. Park","doi":"10.1109/HPCMP-UGC.2009.59","DOIUrl":"https://doi.org/10.1109/HPCMP-UGC.2009.59","url":null,"abstract":"Parallel systems are becoming ubiquitous in the world of computing as evidenced by multi-core processors, the heterogeneous Cell Broadband Engine, and highly parallel graphics processing units (GPUs). All parallel systems share a requirement that parallel programming is necessary to leverage multiple cores. As a result of this trend, multi-core CPUs are no longer a clear winner due to their peaked clock frequency and the programming effort involved in parallelizing code for multi-core architectures. Given such drawbacks, data-parallel applications might benefit from GPU-assisted computing. GPUs are the most popular and inexpensive accelerators. To evaluate GPU-based computing, a floating-point intensive algorithm for a radar imaging application is chosen for analysis. The paper attempts to present a fair performance comparison of CPU and GPU implementations.","PeriodicalId":268639,"journal":{"name":"2009 DoD High Performance Computing Modernization Program Users Group Conference","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-06-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130372691","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}