2011 International Conference for High Performance Computing, Networking, Storage and Analysis (SC): Latest Papers, Page 2

Wallaby: A scalable semantic configuration service for grids and clouds

2011 International Conference for High Performance Computing, Networking, Storage and Analysis (SC) Pub Date : 2011-11-12 DOI: 10.1145/2063348.2063362

W. C. Benton, Robert H. Rati, Erik J. Erlandson

Job schedulers for grids and clouds can offer great generality and configurability, but they typically do so at the cost of increased administrator complexity. In this paper, we present Wallaby, an open-source, scalable configuration service for compute resources managed by the Condor high-throughput computing system. Wallaby offers several notable advantages over similar systems: it lets administrators write declarative specifications of user-visible functionality on groups of nodes instead of low-level configuration file fragments; it presents a high-level semantic model of Condor features and their interactions and dependencies; it validates configurations before pushing them to nodes; it supports version control, "undo," and configuration differencing; and it includes a networked API that enables extensions and advanced functionality. Wallaby allows administrators to extend pools to include more physical, virtual, or cloud nodes with minimal explicit configuration. Finally, it is scalable, supporting pools consisting of thousands of nodes with hundreds of configuration parameters each.

Citations: 4

System implications of memory reliability in exascale computing

2011 International Conference for High Performance Computing, Networking, Storage and Analysis (SC) Pub Date : 2011-11-12 DOI: 10.1145/2063384.2063445

Sheng Li, Ke Chen, Ming-yu Hsieh, Naveen Muralimanohar, C. Kersey, J. Brockman, Arun Rodrigues, N. Jouppi

Resiliency will be one of the toughest challenges in future exascale systems. Memory errors contribute more than 40% of the total hardware-related failures and are projected to increase in future exascale systems. The use of error correction codes (ECC) and checkpointing are two effective approaches to fault tolerance. While there are numerous studies on ECC or checkpointing in isolation, this is the first paper to investigate the combined effect of both on overall system performance and power. Specifically, we study the impact of various ECC schemes (SECDED, BCH, and chipkill) in conjunction with checkpointing on future exascale systems. Our simulation results show that while chipkill is 13% better for computation-intensive applications, BCH has a 28% advantage in system energy-delay product (EDP) for memory-intensive applications. We also propose to use BCH in tagged memory systems with commodity DRAMs where chipkill is impractical. Our proposed architecture achieves 2.3× better system EDP than state-of-the-art tagged memory systems.
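The abstract above reports its results as energy-delay product (EDP) ratios. As a reading aid, the following minimal sketch shows how such a ratio is conventionally computed, using the standard definition EDP = energy × delay (lower is better). The function names and all numeric values are hypothetical illustrations, not figures or code from the paper, and the paper's own measurement methodology may differ.

```python
def energy_delay_product(energy_joules: float, delay_seconds: float) -> float:
    """Standard EDP metric: energy multiplied by execution time (lower is better)."""
    return energy_joules * delay_seconds


def edp_improvement(baseline_edp: float, candidate_edp: float) -> float:
    """How many times better (smaller) the candidate EDP is than the baseline."""
    return baseline_edp / candidate_edp


# Hypothetical numbers chosen only for illustration.
baseline = energy_delay_product(energy_joules=100.0, delay_seconds=2.0)
candidate = energy_delay_product(energy_joules=80.0, delay_seconds=1.5)
print(f"improvement: {edp_improvement(baseline, candidate):.2f}x")
```

Because EDP multiplies the two quantities, a scheme that saves energy but runs longer can still score worse, which is why the metric suits comparisons that trade performance against power.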

Citations: 53

One stop high performance computing user support at SNL

2011 International Conference for High Performance Computing, Networking, Storage and Analysis (SC) Pub Date : 2011-11-12 DOI: 10.1145/2063348.2063383

J. Greenfield, L. Ice, S. Corwell, K. Haskell, C. Pavlakos, J. Noe

Citations: 0

Integrating multi-touch in high-resolution display environments

2011 International Conference for High Performance Computing, Networking, Storage and Analysis (SC) Pub Date : 2011-11-12 DOI: 10.1145/2063348.2063359

Brandt M. Westing, B. Urick, M. Esteva, Freddy Rojas, Weijia Xu

Citations: 10

Enabling and scaling biomolecular simulations of 100 million atoms on petascale machines with a multicore-optimized message-driven runtime

2011 International Conference for High Performance Computing, Networking, Storage and Analysis (SC) Pub Date : 2011-11-12 DOI: 10.1145/2063384.2063466

Chao Mei, Yanhua Sun, G. Zheng, Eric J. Bohm, L. Kalé, James C. Phillips, Christopher B. Harrison

A 100-million-atom biomolecular simulation with NAMD is one of the three benchmarks for the NSF-funded sustainable petascale machine. Simulating this large molecular system on a petascale machine presents great challenges, including handling I/O, large memory footprint and getting good strong-scaling results. In this paper, we present parallel I/O techniques to enable the simulation. A new SMP model is designed to efficiently utilize ubiquitous wide multicore clusters by extending the Charm++ asynchronous message-driven runtime. We exploit node-aware techniques to optimize both the application and the underlying SMP runtime. Hierarchical load balancing is further exploited to scale NAMD to the full Jaguar PF Cray XT5 (224,076 cores) at Oak Ridge National Laboratory, both with and without PME full electrostatics, achieving 93% parallel efficiency (vs 6720 cores) at 9 ms per step for a simple cutoff calculation. Excellent scaling is also obtained on 65,536 cores of the Intrepid Blue Gene/P at Argonne National Laboratory.
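The abstract above quotes parallel efficiency relative to a smaller baseline run rather than a single core. The sketch below shows the usual strong-scaling calculation under that convention: achieved speedup over the baseline divided by the ideal speedup implied by the core-count ratio. The function name and the timings are hypothetical; the abstract does not give the baseline step time, so no attempt is made to reproduce the paper's figure.

```python
def parallel_efficiency(base_cores: int, base_time: float,
                        cores: int, time: float) -> float:
    """Strong-scaling efficiency relative to a baseline run.

    Same problem size in both runs; times are in the same unit
    (for example, seconds or milliseconds per step).
    """
    achieved_speedup = base_time / time   # how much faster the larger run is
    ideal_speedup = cores / base_cores    # perfect linear scaling
    return achieved_speedup / ideal_speedup


# Hypothetical example: 4x the cores yields a 3.33x speedup.
eff = parallel_efficiency(base_cores=100, base_time=10.0, cores=400, time=3.0)
print(f"parallel efficiency: {eff:.1%}")
```

Measuring against a multi-thousand-core baseline is common when the problem is too large to run on a small configuration, as with a 100-million-atom system.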

Citations: 66

SCMFS: A file system for Storage Class Memory

2011 International Conference for High Performance Computing, Networking, Storage and Analysis (SC) Pub Date : 2011-11-12 DOI: 10.1145/2063384.2063436

XiaoJian Wu, Sheng Qiu, A. Reddy

Citations: 364

Scalable implementations of accurate excited-state coupled cluster theories: Application of high-level methods to porphyrin-based systems

2011 International Conference for High Performance Computing, Networking, Storage and Analysis (SC) Pub Date : 2011-11-12 DOI: 10.1145/2063384.2063481

K. Kowalski, S. Krishnamoorthy, R. M. Olson, V. Tipparaju, E. Aprá

The development of reliable tools for excited-state simulations is very important for understanding complex processes in the broad class of light harvesting systems and optoelectronic devices. Over the last years we have been developing equation of motion coupled cluster (EOMCC) methods capable of tackling these problems. In this paper we discuss the parallel performance of EOMCC codes which provide accurate description of excited-state correlation effects. Two aspects are discussed in detail: (1) a new algorithm for the iterative EOMCC methods based on improved parallel task scheduling algorithms, and (2) parallel algorithms for the non-iterative methods describing the effect of triply excited configurations. We demonstrate that the most computationally intensive non-iterative part can take advantage of 210,000 cores of the Cray XT5 system at the Oak Ridge Leadership Computing Facility (OLCF), achieving over 80% parallel efficiency. In particular, we demonstrate the importance of the computationally demanding non-iterative many-body methods in matching experimental level of accuracy for several porphyrin-based systems.

Citations: 36

The NWSC benchmark suite: Using scientific throughput to measure supercomputer performance

2011 International Conference for High Performance Computing, Networking, Storage and Analysis (SC) Pub Date : 2011-11-12 DOI: 10.1145/2063348.2063358

Rory C. Kelly, Siddhartha S. Ghosh, Si Liu, D. D. Vento, R. Valent

Citations: 1

Hadoop acceleration through network levitated merge

2011 International Conference for High Performance Computing, Networking, Storage and Analysis (SC) Pub Date : 2011-11-12 DOI: 10.1145/2063384.2063461

Yandong Wang, Xinyu Que, Weikuan Yu, Dror Goldenberg, Dhiraj Sehgal

Citations: 123

Efficient data race detection for distributed memory parallel programs

2011 International Conference for High Performance Computing, Networking, Storage and Analysis (SC) Pub Date : 2011-11-12 DOI: 10.1145/2063384.2063452

Chang-Seo Park, Koushik Sen, Paul H. Hargrove, Costin Iancu

In this paper we present a precise data race detection technique for distributed memory parallel programs. Our technique, which we call Active Testing, builds on our previous work on race detection for shared memory Java and C programs and it handles programs written using shared memory approaches as well as bulk communication. Active testing works in two phases: in the first phase, it performs an imprecise dynamic analysis of an execution of the program and finds potential data races that could happen if the program is executed with a different thread schedule. In the second phase, active testing re-executes the program by actively controlling the thread schedule so that the data races reported in the first phase can be confirmed. A key highlight of our technique is that it can scalably handle distributed programs with bulk communication and single- and split-phase barriers. Another key feature of our technique is that it is precise: a data race confirmed by active testing is an actual data race present in the program; however, being a testing approach, our technique can miss actual data races. We implement the framework for the UPC programming language and demonstrate scalability up to a thousand cores for programs with both fine-grained and bulk (MPI style) communication. The tool confirms previously known bugs and uncovers several unknown ones. Our extensions capture constructs proposed in several modern programming languages for High Performance Computing, most notably non-blocking barriers and collectives.

Citations: 45