Title: Diagnosing performance bottlenecks in emerging petascale applications
Authors: Nathan R. Tallent, J. Mellor-Crummey, L. Adhianto, M. Fagan, Mark W. Krentel
DOI: https://doi.org/10.1145/1654059.1654111
Abstract: Cutting-edge science and engineering applications require petascale computing. It is, however, a significant challenge to use petascale computing platforms effectively. Consequently, there is a critical need for performance tools that enable scientists to understand impediments to performance on emerging petascale systems. In this paper, we describe HPCToolkit, a suite of multi-platform tools that supports sampling-based analysis of application performance on emerging petascale platforms. HPCToolkit uses sampling to pinpoint and quantify both scaling and node performance bottlenecks. We study several emerging petascale applications on the Cray XT and IBM BlueGene/P platforms and use HPCToolkit to identify specific source lines, in their full calling context, associated with performance bottlenecks in these codes. Such information is exactly what application developers need to know to improve their applications to take full advantage of the power of petascale systems.

{"title":"Efficient band approximation of Gram matrices for large scale kernel methods on GPUs","authors":"Mohamed E. Hussein, W. Abd-Almageed","doi":"10.1145/1654059.1654091","DOIUrl":"https://doi.org/10.1145/1654059.1654091","url":null,"abstract":"Kernel-based methods require O(N2) time and space complexities to compute and store non-sparse Gram matrices, which is prohibitively expensive for large scale problems. We introduce a novel method to approximate a Gram matrix with a band matrix. Our method relies on the locality preserving properties of space filling curves, and the special structure of Gram matrices. Our approach has several important merits. First, it computes only those elements of the Gram matrix that lie within the projected band. Second, it is simple to parallelize. Third, using the special band matrix structure makes it space efficient and GPU-friendly. We developed GPU implementations for the Affinity Propagation (AP) clustering algorithm using both our method and the COO sparse representation. Our band approximation is about 5 times more space efficient and faster to construct than COO. AP gains up to 6x speedup using our method without any degradation in its clustering performance.","PeriodicalId":371415,"journal":{"name":"Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-11-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128882969","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Title: The cat is out of the bag: cortical simulations with 10^9 neurons, 10^13 synapses
Authors: R. Ananthanarayanan, Steven K. Esser, H. Simon, D. Modha
DOI: https://doi.org/10.1145/1654059.1654124
Abstract: In the quest for cognitive computing, we have built a massively parallel cortical simulator, C2, that incorporates a number of innovations in computation, memory, and communication. Using C2 on LLNL's Dawn Blue Gene/P supercomputer with 147,456 CPUs and 144 TB of main memory, we report two cortical simulations, at unprecedented scale, that effectively saturate the entire memory capacity and refresh it at least every simulated second. The first simulation consists of 1.6 billion neurons and 8.87 trillion synapses with experimentally-measured gray matter thalamocortical connectivity. The second simulation has 900 million neurons and 9 trillion synapses with probabilistic connectivity. We demonstrate nearly perfect weak scaling and attractive strong scaling. The simulations, which incorporate phenomenological spiking neurons, individual learning synapses, axonal delays, and dynamic synaptic channels, exceed the scale of the cat cortex, marking the dawn of a new era in the scale of cortical simulations.

{"title":"Indexing genomic sequences on the IBM Blue Gene","authors":"A. Ghoting, K. Makarychev","doi":"10.1145/1654059.1654122","DOIUrl":"https://doi.org/10.1145/1654059.1654122","url":null,"abstract":"With advances in sequencing technology and through aggressive sequencing efforts, DNA sequence data sets have been growing at a rapid pace. To gain from these advances, it is important to provide life science researchers with the ability to process and query large sequence data sets. For the past three decades, the suffix tree has served as a fundamental data structure in processing sequential data sets. However, tree construction times on large data sets have been excessive. While parallel suffix tree construction is an obvious solution to reduce execution times, poor locality of reference has limited parallel performance. In this paper, we show that through careful parallel algorithm design, this limitation can be removed, allowing tree construction to scale to massively parallel systems like the IBM Blue Gene. We demonstrate that the entire Human genome can be indexed on 1024 processors in under 15 minutes.","PeriodicalId":371415,"journal":{"name":"Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis","volume":"48 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-11-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125669667","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Optimal real number codes for fault tolerant matrix operations","authors":"Zizhong Chen","doi":"10.1145/1654059.1654089","DOIUrl":"https://doi.org/10.1145/1654059.1654089","url":null,"abstract":"It has been demonstrated recently that single fail-stop process failure in ScaLAPACK matrix multiplication can be tolerated without checkpointing. Multiple simultaneous processor failures can be tolerated without checkpointing by encoding matrices using a real-number erasure correcting code. However, the floating-point representation of a real number in today's high performance computer architecture introduces round off errors which can be enlarged and cause the loss of precision of possibly all effective digits during recovery when the number of processors in the system is large. In this paper, we present a class of Reed-Solomon style real-number erasure correcting codes which have optimal numerical stability during recovery. We analytically construct the numerically best erasure correcting codes for 2 erasures and develop an approximation method to computationally construct numerically good codes for 3 or more erasures. Experimental results demonstrate that the proposed codes are numerically much more stable than existing codes.","PeriodicalId":371415,"journal":{"name":"Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis","volume":"50 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-11-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122700098","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"FALCON: a system for reliable checkpoint recovery in shared grid environments","authors":"T. Islam, S. Bagchi, R. Eigenmann","doi":"10.1145/1654059.1654110","DOIUrl":"https://doi.org/10.1145/1654059.1654110","url":null,"abstract":"In Fine-Grained Cycle Sharing (FGCS) systems, machine owners voluntarily share their unused CPU cycles with guest jobs, as long as their performance degradation is tolerable. However, unpredictable evictions of guest jobs lead to fluctuating completion times. Checkpoint-recovery is an attractive mechanism for recovering from such ”failures”. Today's FGCS systems often use expensive, high-performance dedicated checkpoint servers. However, in geographically distributed clusters, this may incur high checkpoint transfer latencies. In this paper we present a system called FALCON that uses available disk resources of the FGCS machines as shared checkpoint repositories. However, an unavailable storage host may lead to loss of checkpoint data. Therefore, we model failures of storage hosts and develop a prediction algorithm for choosing reliable checkpoint repositories. We experiment with FALCON in the university-wide Condor testbed at Purdue and show improved and consistent performance for guest jobs in the presence of irregular resource availability.","PeriodicalId":371415,"journal":{"name":"Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-11-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133902478","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Title: Early performance evaluation of a "Nehalem" cluster using scientific and engineering applications
Authors: S. Saini, Andrey Naraikin, R. Biswas, D. Barkai, T. Sandstrom
DOI: https://doi.org/10.1145/1654059.1654084
Abstract: In this paper, we present an early performance evaluation of a 624-core cluster based on the Intel Xeon Processor 5560 (code named "Nehalem-EP", and referred to as Xeon 5560 in this paper), the third-generation quad-core architecture from Intel. This is the first processor from Intel with a non-uniform memory access (NUMA) architecture managed by an on-chip integrated memory controller. It employs a point-to-point interconnect, the Intel QuickPath Interconnect (QPI), between processors and to the input/output (I/O) hub. It also brings to a quad-core architecture both Intel's Hyper-Threading technology (simultaneous multi-threading, "SMT") and Intel Turbo Boost Technology ("Turbo mode"), which automatically allows processor cores to run faster than the base operating frequency while the processor operates below its rated power, temperature, and current specification limits; it can be engaged with any number of cores or logical processors enabled and active. We critically evaluate these features using the High Performance Computing Challenge (HPCC) benchmarks, NAS Parallel Benchmarks (NPB), and four full-scale scientific applications. We compare and contrast the results of the Xeon 5560 cluster with an SGI Altix ICE 8200EX cluster of quad-core Intel Xeon 5472 Processors ("Xeon 5472" from here on) and another cluster of Intel Xeon 5462 Processors ("Xeon 5462"; the Xeon 5400 Series Processors are previous-generation quad-core Intel processors, code named Harpertown).

Title: A scalable method for ab initio computation of free energies in nanoscale systems
Authors: M. Eisenbach, C.-G. Zhou, D. Nicholson, G. Brown, J. Larkin, T. Schulthess
DOI: https://doi.org/10.1145/1654059.1654125
Abstract: Calculating the thermodynamics of nanoscale systems presents challenges in the simultaneous treatment of the electronic structure, which determines the interactions between atoms, and the statistical fluctuations that become ever more important at shorter length scales. Here we present a highly scalable method that combines an ab initio electronic structure technique, the Locally Self-Consistent Multiple Scattering (LSMS) method, with the Wang-Landau (WL) algorithm to compute free energies and other thermodynamic properties of nanoscale systems. The combined WL-LSMS code is targeted at the study of nanomagnetic systems that have anywhere from about one hundred to a few thousand atoms. The code scales very well on the Cray XT5 system at ORNL, sustaining 1.03 Petaflop/s in double precision on 147,464 cores.

Title: Automating the generation of composed linear algebra kernels
Authors: Geoffrey Belter, E. Jessup, I. Karlin, Jeremy G. Siek
DOI: https://doi.org/10.1145/1654059.1654119
Abstract: Memory bandwidth limits the performance of important kernels in many scientific applications. Such applications often use sequences of Basic Linear Algebra Subprograms (BLAS), and highly efficient implementations of those routines enable scientists to achieve high performance at little cost. However, tuning the BLAS in isolation misses opportunities for memory optimization that result from composing multiple subprograms. Because it is not practical to create a library of all BLAS combinations, we have developed a domain-specific compiler that generates them on demand. In this paper, we describe a novel algorithm for compiling linear algebra kernels and searching for the best combination of optimization choices. We also present a new hybrid analytic/empirical method for quickly evaluating the profitability of each optimization. We report experimental results showing speedups of up to 130% relative to the GotoBLAS on an AMD Opteron and up to 137% relative to MKL on an Intel Core 2.

Title: PFunc: modern task parallelism for modern high performance computing
Authors: P. Kambadur, Anshul Gupta, A. Ghoting, H. Avron, A. Lumsdaine
DOI: https://doi.org/10.1145/1654059.1654103
Abstract: HPC today faces new challenges due to paradigm shifts in both hardware and software. The ubiquity of multi-cores, many-cores, and GPGPUs is forcing traditional serial as well as distributed-memory parallel applications to be parallelized for these architectures. Emerging applications in areas such as informatics are placing unique requirements on parallel programming tools that have not yet been addressed. Although task parallelism appears to be the most promising of the available parallel programming models for meeting these new challenges, current solutions for task parallelism are inadequate. In this paper, we introduce PFunc, a new library for task parallelism that extends the feature set of current solutions with custom task scheduling, task priorities, task affinities, multiple completion notifications, and task groups. These features enable PFunc to naturally and efficiently parallelize a wide variety of modern HPC applications and to support the SPMD model of parallel programming. We present three case studies, demand-driven DAG execution, frequent pattern mining, and iterative sparse solvers, to demonstrate the utility of PFunc's new features.
