ACM/IEEE SC 2006 Conference (SC'06): Latest Publications

Revisiting Web Server Workload Invariants in the Context of Scientific Web Sites

ACM/IEEE SC 2006 Conference (SC'06) Pub Date : 2006-11-11 DOI: 10.1145/1188455.1188570

Anne‐Marie Faber, Minaxi Gupta, C. Viecco

Citations: 18

Adaptive Routing in High-Radix Clos Network

ACM/IEEE SC 2006 Conference (SC'06) Pub Date : 2006-11-11 DOI: 10.1145/1188455.1188552

John Kim, W. Dally, D. Abts

Citations: 70

Supporting Dynamic Migration in Tightly Coupled Grid Applications

ACM/IEEE SC 2006 Conference (SC'06) Pub Date : 2006-11-11 DOI: 10.1145/1188455.1188577

Liang Chen, Qian Zhu, G. Agrawal

Abstract: In recent years, there has been a growing trend towards supporting more tightly coupled applications on the grid, including scientific workflows, applications that use pipelined or data-flow like processing, and distributed streaming applications. As availability of resources can vary over time in a grid environment, dynamic reallocation of resources is very important for these applications, particularly because of their long-running nature, and because they often require large-volume data transfers between processing stages. This paper considers the problem of supporting and efficiently implementing dynamic resource allocation for tightly-coupled and pipelined applications in a grid environment. We provide an alternative to basic checkpointing, using the notion of light-weight summary structure (LSS), to enable efficient migration. The idea behind LSS is that at certain points during the execution of a processing stage, the state of the program can be summarized by a small amount of memory. This allows us to perform low-cost process migration, as long as such memory can be identified by an application developer, and migration is performed only at these points. Our implementation and evaluation of LSS based process migration has been in the context of the GATES (grid-based adaptive execution on streams) middleware that we have been developing. We also present an algorithm for dynamic resource allocation, and have shown an architecture for resource monitoring and allocation. We have extensively evaluated our implementation using three stream data processing applications, and show that the use of LSS allows efficient process migration.
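The summary-structure idea described in the abstract can be illustrated with a toy example. The following is only a sketch under stated assumptions: it is not the GATES middleware API, and the names `Summary`, `AveragingStage`, `export_summary`, and `resume` are hypothetical. It assumes a stream stage whose entire state, at the boundary between two items, fits in a small record that can be shipped to another node instead of a full checkpoint.

```python
import json
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class Summary:
    """Hypothetical light-weight summary: all state needed to resume the stage."""
    count: int = 0
    total: float = 0.0


class AveragingStage:
    """Toy stream stage that keeps a running mean of the items it has seen."""

    def __init__(self, summary: Optional[Summary] = None) -> None:
        self.summary = summary if summary is not None else Summary()

    def process(self, item: float) -> None:
        self.summary.count += 1
        self.summary.total += item

    def export_summary(self) -> str:
        # Assumed to be called only at a safe point, i.e. between two items.
        return json.dumps(asdict(self.summary))

    @classmethod
    def resume(cls, payload: str) -> "AveragingStage":
        # Rebuild the stage on the destination from the small summary alone.
        return cls(Summary(**json.loads(payload)))

    def mean(self) -> float:
        return self.summary.total / self.summary.count


# Process part of the stream, "migrate", then continue on the new instance.
stage = AveragingStage()
for x in [1.0, 2.0, 3.0]:
    stage.process(x)

payload = stage.export_summary()        # small string sent to the new node
moved = AveragingStage.resume(payload)
for x in [4.0, 5.0]:
    moved.process(x)

migrated_mean = moved.mean()
print(migrated_mean)
```

The point of the sketch is that only the two-field summary crosses the network, which is why migration is restricted to points where such a summary is valid.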

Citations: 16

Scalable Algorithms for Molecular Dynamics Simulations on Commodity Clusters

ACM/IEEE SC 2006 Conference (SC'06) Pub Date : 2006-11-11 DOI: 10.1145/1188455.1188544

K. Bowers, Edmond Chow, Huafeng Xu, R. Dror, M. Eastwood, Brent A. Gregersen, J. L. Klepeis, I. Kolossváry, Mark A. Moraes, Federico D. Sacerdoti, J. Salmon, Yibing Shan, D. Shaw

Abstract: Although molecular dynamics (MD) simulations of biomolecular systems often run for days to months, many events of great scientific interest and pharmaceutical relevance occur on long time scales that remain beyond reach. We present several new algorithms and implementation techniques that significantly accelerate parallel MD simulations compared with current state-of-the-art codes. These include a novel parallel decomposition method and message-passing techniques that reduce communication requirements, as well as novel communication primitives that further reduce communication time. We have also developed numerical techniques that maintain high accuracy while using single precision computation in order to exploit processor-level vector instructions. These methods are embodied in a newly developed MD code called Desmond that achieves unprecedented simulation throughput and parallel scalability on commodity clusters. Our results suggest that Desmond's parallel performance substantially surpasses that of any previously described code. For example, on a standard benchmark, Desmond's performance on a conventional Opteron cluster with 2K processors slightly exceeded the reported performance of IBM's Blue Gene/L machine with 32K processors running its Blue Matter MD code.

Citations: 2234

Sequoia: Programming the Memory Hierarchy

ACM/IEEE SC 2006 Conference (SC'06) Pub Date : 2006-11-11 DOI: 10.1145/1188455.1188543

K. Fatahalian, D. Horn, T. Knight, L. Leem, M. Houston, Ji Young Park, M. Erez, Manman Ren, A. Aiken, W. Dally, P. Hanrahan

Citations: 541

Sustainable Adaptive Grid Supercomputing: Multiscale Simulation of Semiconductor Processing across the Pacific

ACM/IEEE SC 2006 Conference (SC'06) Pub Date : 2006-11-11 DOI: 10.1145/1188455.1188566

H. Takemiya, Yoshio Tanaka, S. Sekiguchi, S. Ogata, R. Kalia, A. Nakano, P. Vashishta

Abstract: We propose a reservation-based sustainable adaptive grid supercomputing paradigm to enable tightly coupled computations of considerable scale (involving over 1,000 processors) and duration (over tens of continuous days) on a grid of geographically distributed parallel supercomputers. The paradigm is demonstrated for an adaptive multiscale simulation application, in which accurate but compute-intensive quantum mechanical (QM) simulations are embedded within a classical molecular dynamics (MD) simulation only when and where high fidelity is required. Key technical innovations include: 1) an embedded divide-and-conquer algorithmic framework to maximally expose data and computation localities for enhanced scalability; 2) a buffered-cluster hybridization scheme to adaptively adjust MD/QM boundaries to maintain the model accuracy; and 3) a hybrid grid remote procedure call (GridRPC) + message passing interface (MPI) grid application framework to combine flexibility (adaptive resource allocation and migration), fault tolerance (automated fault recovery), and efficiency (scalable management of large computing resources). We have achieved an automated execution of multiscale MD/QM simulation on a Grid consisting of 6 supercomputer centers in Japan and the US (in total of 150 thousand processor hours) for the dynamic simulation of implanted oxygen atoms in a silicon substrate, in which the number of processors changes dynamically on demand and resources are allocated and migrated dynamically according to both reservations and unexpected faults. The simulation results reveal a strong dependence of the oxygen penetration depth on the incident oxygen-beam position, which is useful information to further advance SIMOX (separation by implanted oxygen) technique to fabricate high speed and low power-consumption semiconductor devices.

Citations: 34

Nested OpenMP for Efficient Computation of 3D Critical Points in Multi-Block CFD Datasets

ACM/IEEE SC 2006 Conference (SC'06) Pub Date : 2006-11-11 DOI: 10.1145/1188455.1188553

A. Gerndt, Samuel Sarholz, M. Wolter, Dieter an Mey, C. Bischof, T. Kuhlen

Citations: 13

Hypergraph Partitioning for Automatic Memory Hierarchy Management

ACM/IEEE SC 2006 Conference (SC'06) Pub Date : 2006-11-11 DOI: 10.1145/1188455.1188558

S. Krishnamoorthy, Ümit V. Çatalyürek, J. Nieplocha, A. Rountev, P. Sadayappan

Citations: 25

Adaptive, Transparent Frequency and Voltage Scaling of Communication Phases in MPI Programs

ACM/IEEE SC 2006 Conference (SC'06) Pub Date : 2006-11-11 DOI: 10.1145/1188455.1188567

M. Lim, V. Freeh, D. Lowenthal

Abstract: Although users of high-performance computing are most interested in raw performance, both energy and power consumption have become critical concerns. Some microprocessors allow frequency and voltage scaling, which enables a system to reduce CPU performance and power when the CPU is not on the critical path. When properly directed, such dynamic frequency and voltage scaling can produce significant energy savings with little performance penalty. This paper presents an MPI runtime system that dynamically reduces CPU performance during communication phases in MPI programs. It dynamically identifies such phases and, without profiling or training, selects the CPU frequency in order to minimize energy-delay product. All analysis and subsequent frequency and voltage scaling is within MPI and so is entirely transparent to the application. This means that the large number of existing MPI programs, as well as new ones being developed, can use our system without modification. Results show that the average reduction in energy-delay product over the NAS benchmark suite is 10% - the average energy reduction is 12% while the average execution time increase is only 2.1%.
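As a rough sanity check on the figures quoted in this abstract, the energy-delay product (EDP) is energy multiplied by execution time, so a relative change in each combines multiplicatively. The snippet below is a sketch only: the helper name `edp_change` is hypothetical, and it applies the two reported averages to a single notional run, whereas the paper's 10% figure is an average over the NAS benchmarks and need not match this product exactly.

```python
def edp_change(energy_change: float, time_change: float) -> float:
    """Relative change in E * T given relative changes in E and in T.

    Inputs are fractions, e.g. -0.12 for a 12% reduction.
    """
    return (1.0 + energy_change) * (1.0 + time_change) - 1.0


# Reported averages: 12% less energy, 2.1% longer execution time.
change = edp_change(-0.12, 0.021)
print(f"EDP change: {change:.1%}")  # prints "EDP change: -10.2%"
```

The result, roughly a 10% reduction, is consistent with the average EDP reduction stated in the abstract.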

Citations: 209

CycleMeter: Detecting Fraudulent Peers in Internet Cycle Sharing

ACM/IEEE SC 2006 Conference (SC'06) Pub Date : 2006-11-11 DOI: 10.1145/1188455.1188584

Zheng Zhang, Y. C. Hu, S. Midkiff

Citations: 2