{"title":"TDPSS: A Scalable Time Domain Power System Simulator for Dynamic Security Assessment","authors":"S. Khaitan, J. McCalley","doi":"10.1109/SC.Companion.2012.51","DOIUrl":"https://doi.org/10.1109/SC.Companion.2012.51","url":null,"abstract":"Simulation plays a crucial role in modeling, studying, and experimenting with design innovations proposed for power systems. Since mathematical modeling of power systems leads to tens of thousands of stiff DAEs (differential and algebraic equations), the design of power system simulators involves a trade-off between simulation speed and modeling accuracy. The lack of efficient and detailed simulators forces designers to test their techniques on small test systems; hence, the results obtained from such experiments may not be representative of real-life power systems. In this paper, we present TDPSS, a high-speed time domain power system simulator for dynamic security assessment. TDPSS has been designed using an object-oriented programming framework and is thus modular and extensible. By offering a variety of models of power system components and fast numerical algorithms, it gives the user the flexibility to experiment with different design options efficiently. We discuss the design of TDPSS to give insight into the simulation infrastructure and also discuss the areas where TDPSS can be extended for parallel contingency analysis. We also validate it against commercial power system simulators, namely PSSE and DSA Tools. Further, we compare the simulation speed of TDPSS for different numerical algorithms. 
The results show that TDPSS is accurate and outperforms the widely used commercial simulator PSSE in computational efficiency.","PeriodicalId":6346,"journal":{"name":"2012 SC Companion: High Performance Computing, Networking Storage and Analysis","volume":"14 1","pages":"323-332"},"PeriodicalIF":0.0,"publicationDate":"2012-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80103585","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"In-Situ Feature Tracking and Visualization of a Temporal Mixing Layer","authors":"E. Duque, Daniel E. Hiepler, S. Legensky, C. Stone","doi":"10.1109/SC.Companion.2012.335","DOIUrl":"https://doi.org/10.1109/SC.Companion.2012.335","url":null,"abstract":"The flow field of a temporal mixing layer was analyzed by solving the Navier-Stokes equations with a Large Eddy Simulation code, LESLIE3D, and then visualizing and post-processing the resulting flow features using the prototype visualization and CFD data analysis software system Intelligent In-Situ Feature Detection, Tracking and Visualization for Turbulent Flow Simulations (IFDT). The system utilizes volume rendering with an Intelligent Adaptive Transfer Function that allows the user to train the visualization system to highlight flow features such as turbulent vortices. A feature extractor based on a Prediction-Correction method then tracks and extracts the flow features and determines feature statistics over time. The method executes in situ with the flow solver via a Python Interface Framework to avoid the overhead of saving data to file. The movie submitted for this visualization showcase highlights flow phenomena such as the formation of vortex features, vortex breakdown, the onset of turbulence, and finally fully mixed conditions.","PeriodicalId":6346,"journal":{"name":"2012 SC Companion: High Performance Computing, Networking Storage and Analysis","volume":"1 1","pages":"1593-1593"},"PeriodicalIF":0.0,"publicationDate":"2012-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88864747","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Achieving design targets by stochastic car crash simulations","authors":"T. Yasuki","doi":"10.1109/SC.Companion.2012.350","DOIUrl":"https://doi.org/10.1109/SC.Companion.2012.350","url":null,"abstract":"","PeriodicalId":6346,"journal":{"name":"2012 SC Companion: High Performance Computing, Networking Storage and Analysis","volume":"89 1","pages":"1923-1941"},"PeriodicalIF":0.0,"publicationDate":"2012-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78223865","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Oh, $#*@! Exascale! The Effect of Emerging Architectures on Scientific Discovery","authors":"K. Moreland","doi":"10.1109/SC.Companion.2012.38","DOIUrl":"https://doi.org/10.1109/SC.Companion.2012.38","url":null,"abstract":"The predictions for exascale computing are dire. Although we have benefited from a consistent supercomputer architecture design, even across manufacturers, for well over a decade, recent trends indicate that future high-performance computers will have different hardware structure and programming models to which software must adapt. This paper provides an informal discussion on the ways in which changes in high-performance computing architecture will profoundly affect the scalability of our current generation of scientific visualization and analysis codes and how we must adapt our applications, workflows, and attitudes to continue our success at exascale computing.","PeriodicalId":6346,"journal":{"name":"2012 SC Companion: High Performance Computing, Networking Storage and Analysis","volume":"6 1","pages":"224-231"},"PeriodicalIF":0.0,"publicationDate":"2012-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"72823698","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Hybrid Scheduling Approach for Scalable Heterogeneous Hadoop Systems","authors":"Aysan Rasooli Oskooei, D. Down","doi":"10.1109/SC.Companion.2012.155","DOIUrl":"https://doi.org/10.1109/SC.Companion.2012.155","url":null,"abstract":"The scalability of Cloud infrastructures has significantly increased their applicability. Hadoop, which works based on a MapReduce model, provides for efficient processing of Big Data. This solution is being used widely by most Cloud providers. Hadoop schedulers are critical elements for providing desired performance levels. A scheduler assigns MapReduce tasks to Hadoop resources. There is a considerable challenge to schedule the growing number of tasks and resources in a scalable manner. Moreover, the potential heterogeneous nature of deployed Hadoop systems tends to increase this challenge. This paper analyzes the performance of widely used Hadoop schedulers including FIFO and Fair sharing and compares them with the COSHH (Classification and Optimization based Scheduler for Heterogeneous Hadoop) scheduler, which has been developed by the authors. Based on our insights, a hybrid solution is introduced, which selects appropriate scheduling algorithms for scalable and heterogeneous Hadoop systems with respect to the number of incoming jobs and available resources.","PeriodicalId":6346,"journal":{"name":"2012 SC Companion: High Performance Computing, Networking Storage and Analysis","volume":"39 1","pages":"1284-1291"},"PeriodicalIF":0.0,"publicationDate":"2012-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"72843764","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Executing Optimized Irregular Applications Using Task Graphs within Existing Parallel Models","authors":"Christopher D. Krieger, M. Strout, J. Roelofs, A. Bajwa","doi":"10.1109/SC.Companion.2012.43","DOIUrl":"https://doi.org/10.1109/SC.Companion.2012.43","url":null,"abstract":"Many sparse or irregular scientific computations are memory bound and benefit from locality improving optimizations such as blocking or tiling. These optimizations result in asynchronous parallelism that can be represented by arbitrary task graphs. Unfortunately, most popular parallel programming models with the exception of Threading Building Blocks (TBB) do not directly execute arbitrary task graphs. In this paper, we compare the programming and execution of arbitrary task graphs qualitatively and quantitatively in TBB, the OpenMP doall model, the OpenMP 3.0 task model, and Cilk Plus. We present performance and scalability results for 8 and 40 core shared memory systems on a sparse matrix iterative solver and a molecular dynamics benchmark.","PeriodicalId":6346,"journal":{"name":"2012 SC Companion: High Performance Computing, Networking Storage and Analysis","volume":"40 1","pages":"261-268"},"PeriodicalIF":0.0,"publicationDate":"2012-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86921617","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Neural Circuit Simulation of Hodgkin-Huxley Type Neurons Toward Peta Scale Computers","authors":"Daisuke Miyamoto, T. Kazawa, R. Kanzaki","doi":"10.1109/SC.Companion.2012.314","DOIUrl":"https://doi.org/10.1109/SC.Companion.2012.314","url":null,"abstract":"We ported and optimized the simulation environment \"NEURON\" on the K computer to simulate an insect brain as a multi-compartment Hodgkin-Huxley type model. To use the SIMD units of the SPARC64VIIIfx (the CPU of the K computer), we exchanged the order of the compartment loop and the ion channel loop and applied sector caches. This tuning improved single-core performance from 340 MFLOPS/core to 1560 MFLOPS/core (about 10% efficiency). The spike exchange method of \"NEURON\" (MPI_Allgather) demands a large amount of time at 10,000 cores or more, and a simple asynchronous point-to-point method (MPI_Isend) is not effective either, because of the large number of function calls and the long interconnect pathways. To tackle these problems, we adopted MPI/OpenMP hybrid parallelization to reduce interconnect communication, and we developed a program to optimize the placement of neurons on compute nodes in the 3D torus network. As a result, we obtained 187 TFLOPS with 196,608 CPU cores.","PeriodicalId":6346,"journal":{"name":"2012 SC Companion: High Performance Computing, Networking Storage and Analysis","volume":"160 1","pages":"1541-1541"},"PeriodicalIF":0.0,"publicationDate":"2012-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87053432","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improved Real-Time Computation Engine for a Dispatcher Training Center of the European Transmission Network","authors":"B. Haut, Francois Bouchez, F. Villella","doi":"10.1109/SC.Companion.2012.52","DOIUrl":"https://doi.org/10.1109/SC.Companion.2012.52","url":null,"abstract":"Dispatcher Training Simulators (DTS) are fundamental tools used by Transmission System Operators (TSOs) and Distribution System Operators (DSOs) around the world to train their dispatchers on frequent or uncommon situations. Widely used DTS are generally based on simplified models of the system dynamics (static modeling or highly simplified dynamics) and are limited to small/medium systems due to constraints on computational performance. Taking fast dynamics into account on a large system is a real challenge for the simulation engine of a DTS. Indeed, in order to represent the reaction of the power system effectively, the simulation must be carried out very close to real time. The PEGASE project addressed this challenge on the European Transmission Network (ETN). Many different algorithms were investigated, and several have been implemented in a prototype based on FAST, a full-dynamic DTS simulation engine developed by Tractebel Engineering S.A. and integrated into the Energy Management System (EMS) of several TSOs. 
This paper describes the considered algorithmic improvements and presents numerical results (both in terms of accuracy and efficiency) obtained on a representation of the whole ETN.","PeriodicalId":6346,"journal":{"name":"2012 SC Companion: High Performance Computing, Networking Storage and Analysis","volume":"54 1","pages":"333-340"},"PeriodicalIF":0.0,"publicationDate":"2012-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77781127","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improving Data Analysis Performance for High-Performance Computing with Integrating Statistical Metadata in Scientific Datasets","authors":"Jialin Liu, Yong Chen","doi":"10.1109/SC.Companion.2012.156","DOIUrl":"https://doi.org/10.1109/SC.Companion.2012.156","url":null,"abstract":"Scientific datasets and libraries, such as HDF5, ADIOS, and NetCDF, have been used widely in many data-intensive applications. These libraries have their own file formats and I/O functions to provide efficient access to large datasets. As data sizes keep increasing, these high-level I/O libraries face new challenges. Recent studies have started to utilize database techniques such as indexing, subsetting, and data reorganization to manage the growing datasets. In this work, we present a new approach to boost data analysis performance, namely Fast Analysis with Statistical Metadata (FASM), via data subsetting and the integration of a small amount of statistics into the original datasets. The added statistical information illustrates the data shape and provides knowledge of the data distribution; therefore, the original I/O libraries can utilize these statistical metadata to perform fast queries and analyses. The proposed FASM approach is currently evaluated with PnetCDF on Lustre file systems, but can also be implemented with other scientific libraries. 
FASM can potentially lead to new dataset designs and have an impact on big data analysis.","PeriodicalId":6346,"journal":{"name":"2012 SC Companion: High Performance Computing, Networking Storage and Analysis","volume":"50 1","pages":"1292-1295"},"PeriodicalIF":0.0,"publicationDate":"2012-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87796451","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Poster: MPACK 0.7.0: Multiple Precision Version of BLAS and LAPACK","authors":"Maho Nakata","doi":"10.1109/SC.Companion.2012.183","DOIUrl":"https://doi.org/10.1109/SC.Companion.2012.183","url":null,"abstract":"We are interested in the accuracy of linear algebra operations: the accuracy of solutions of linear equations, eigenvalues and eigenvectors of matrices, etc. This is the reason we have been developing MPACK. MPACK consists of MBLAS and MLAPACK, multiple-precision versions of BLAS and LAPACK, respectively. The features of MPACK are: (i) based on LAPACK 3.x, (ii) providing a reference implementation and API, (iii) written in C++, rewritten from FORTRAN77, (iv) support for GMP, MPFR, DD/QD, and binary128 as multiple-precision arithmetic libraries, and (v) portability. The current version of MPACK is 0.7.0, and it supports 76 MBLAS routines and 100 MLAPACK routines. The matrix-matrix multiplication routine has been accelerated using an NVIDIA C2050 GPU.","PeriodicalId":6346,"journal":{"name":"2012 SC Companion: High Performance Computing, Networking Storage and Analysis","volume":"26 1","pages":"1353-1353"},"PeriodicalIF":0.0,"publicationDate":"2012-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81879453","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}