2013 SC - International Conference for High Performance Computing, Networking, Storage and Analysis (SC): Latest Publications

Guide-copy: Fast and silent migration of virtual machine for datacenters

2013 SC - International Conference for High Performance Computing, Networking, Storage and Analysis (SC) Pub Date : 2013-11-17 DOI: 10.1145/2503210.2503251

Jihun Kim, Dongju Chae, Jangwoong Kim, Jong Kim

Citations: 16

Practical nonvolatile multilevel-cell phase change memory

2013 SC - International Conference for High Performance Computing, Networking, Storage and Analysis (SC) Pub Date : 2013-11-17 DOI: 10.1145/2503210.2503221

D. Yoon, Jichuan Chang, R. Schreiber, N. Jouppi

Citations: 32

Solving the compressible Navier-Stokes equations on up to 1.97 million cores and 4.1 trillion grid points

2013 SC - International Conference for High Performance Computing, Networking, Storage and Analysis (SC) Pub Date : 2013-11-17 DOI: 10.1145/2503210.2503265

I. Bermejo-Moreno, J. Bodart, J. Larsson, Blaise M. Barney, J. Nichols, Steve Jones

Citations: 57

Deterministic scale-free pipeline parallelism with hyperqueues

2013 SC - International Conference for High Performance Computing, Networking, Storage and Analysis (SC) Pub Date : 2013-11-17 DOI: 10.1145/2503210.2503233

H. Vandierendonck, Kallia Chronaki, Dimitrios S. Nikolopoulos

Abstract: Ubiquitous parallel computing aims to make parallel programming accessible to a wide variety of programming areas using deterministic and scale-free programming models built on a task abstraction. However, it remains hard to reconcile these attributes with pipeline parallelism, where the number of pipeline stages is typically hard-coded in the program and defines the degree of parallelism. This paper introduces hyperqueues, a programming abstraction that enables the construction of deterministic and scale-free pipeline parallel programs. Hyperqueues extend the concept of Cilk++ hyperobjects to provide thread-local views on a shared data structure. While hyperobjects are organized around private local views, hyperqueues require shared concurrent views on the underlying data structure. We define the semantics of hyperqueues and describe their implementation in a work-stealing scheduler. We demonstrate scalable performance on pipeline-parallel PARSEC benchmarks and find that hyperqueues provide comparable or up to 30% better performance than POSIX threads and Intel's Threading Building Blocks. The latter are highly tuned to the number of available processing cores, while programs using hyperqueues are scale-free.

Citations: 12

Assessing the effects of data compression in simulations using physically motivated metrics

2013 SC - International Conference for High Performance Computing, Networking, Storage and Analysis (SC) Pub Date : 2013-11-17 DOI: 10.1145/2503210.2503283

D. Laney, S. Langer, Christopher Weber, Peter Lindstrom, Al Wegener

Abstract: This paper examines whether lossy compression can be used effectively in physics simulations as a possible strategy to combat the expected data-movement bottleneck in future high performance computing architectures. We show that, for the codes and simulations we tested, compression levels of 3-5X can be applied without causing significant changes to important physical quantities. Rather than applying signal processing error metrics, we utilize physics-based metrics appropriate for each code to assess the impact of compression. We evaluate three different simulation codes: a Lagrangian shock-hydrodynamics code, an Eulerian higher-order hydrodynamics turbulence modeling code, and an Eulerian coupled laser-plasma interaction code. We compress relevant quantities after each time-step to approximate the effects of tightly coupled compression and study the compression rates to estimate memory and disk-bandwidth reduction. We find that the error characteristics of compression algorithms must be carefully considered in the context of the underlying physics being modeled.

Citations: 57

Predicting application performance using supervised learning on communication features

2013 SC - International Conference for High Performance Computing, Networking, Storage and Analysis (SC) Pub Date : 2013-11-17 DOI: 10.1145/2503210.2503263

Nikhil Jain, A. Bhatele, Michael P. Robson, T. Gamblin, L. Kalé

Citations: 47

Parallel design and performance of nested filtering factorization preconditioner

2013 SC - International Conference for High Performance Computing, Networking, Storage and Analysis (SC) Pub Date : 2013-11-17 DOI: 10.1145/2503210.2503287

Long Qu, L. Grigori, F. Nataf

Citations: 8

Rethinking algorithm-based fault tolerance with a cooperative software-hardware approach

2013 SC - International Conference for High Performance Computing, Networking, Storage and Analysis (SC) Pub Date : 2013-11-17 DOI: 10.1145/2503210.2503226

Dong Li, Zizhong Chen, Panruo Wu, J. Vetter

Citations: 47

Exploring the future of out-of-core computing with compute-local non-volatile memory

2013 SC - International Conference for High Performance Computing, Networking, Storage and Analysis (SC) Pub Date : 2013-11-17 DOI: 10.1145/2503210.2503261

Myoungsoo Jung, E. Wilson, Wonil Choi, J. Shalf, H. Aktulga, Chao Yang, Erik Saule, Ümit V. Çatalyürek, M. Kandemir

Abstract: Drawing parallels to the rise of general purpose graphical processing units (GPGPUs) as accelerators for specific high-performance computing (HPC) workloads, there is a rise in the use of non-volatile memory (NVM) as accelerators for I/O-intensive scientific applications. However, existing works have explored use of NVM within dedicated I/O nodes, which are distant from the compute nodes that actually need such acceleration. As NVM bandwidth begins to out-pace point-to-point network capacity, we argue for the need to break from the archetype of completely separated storage. Therefore, in this work we investigate co-location of NVM and compute by varying I/O interfaces, file systems, types of NVM, and both current and future SSD architectures, uncovering numerous bottlenecks implicit in these various levels in the I/O stack. We present novel hardware and software solutions, including the new Unified File System (UFS), to enable fuller utilization of the new compute-local NVM storage. Our experimental evaluation, which employs a real-world Out-of-Core (OoC) HPC application, demonstrates throughput increases in excess of an order of magnitude over current approaches.

Citations: 25

Scalable parallel graph partitioning

2013 SC - International Conference for High Performance Computing, Networking, Storage and Analysis (SC) Pub Date : 2013-11-17 DOI: 10.1145/2503210.2503280

Shad Kirmani, P. Raghavan

Citations: 29