ACM/IEEE SC 2002 Conference (SC'02): Latest Papers

Massive Arrays of Idle Disks For Storage Archives
ACM/IEEE SC 2002 Conference (SC'02) Pub Date : 2002-11-16 DOI: 10.1109/SC.2002.10058
Dennis Colarelli, D. Grunwald
{"title":"Massive Arrays of Idle Disks For Storage Archives","authors":"Dennis Colarelli, D. Grunwald","doi":"10.1109/SC.2002.10058","DOIUrl":"https://doi.org/10.1109/SC.2002.10058","url":null,"abstract":"The declining costs of commodity disk drives is rapidly changing the economics of deploying large amounts of online or near-line storage. Conventional mass storage systems use either high performance RAID clusters, automated tape libraries or a combination of tape and disk. In this paper, we analyze an alternative design using massive arrays of idle disks, or MAID. We argue that this storage organization provides storage densities matching or exceeding those of tape libraries with performance similar to disk arrays. Moreover, we show that with effective power management of individual drives, this performance can be achieved using a very small power budget. In particular, we show that our power management strategy can result in the performance comparable to an always-on RAID system while using 1/15th the power of such a RAID system.","PeriodicalId":302800,"journal":{"name":"ACM/IEEE SC 2002 Conference (SC'02)","volume":"43 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-11-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116896473","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 410
Early Evaluation of the IBM p690
ACM/IEEE SC 2002 Conference (SC'02) Pub Date : 2002-11-16 DOI: 10.1109/SC.2002.10000
P. Worley, T. Dunigan, M. Fahey, James B. White, Arthur S. Bland
{"title":"Early Evaluation of the IBM p690","authors":"P. Worley, T. Dunigan, M. Fahey, James B. White, Arthur S. Bland","doi":"10.1109/SC.2002.10000","DOIUrl":"https://doi.org/10.1109/SC.2002.10000","url":null,"abstract":"Oak Ridge National Laboratory recently received 27 32-way IBM pSeries 690 SMP nodes. In this paper, we describe our initial evaluation of the p690 architecture, focusing on the performance of benchmarks and applications that are representative of the expected production workload.","PeriodicalId":302800,"journal":{"name":"ACM/IEEE SC 2002 Conference (SC'02)","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-11-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116466287","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 9
A 26.58 Tflops Global Atmospheric Simulation with the Spectral Transform Method on the Earth Simulator
ACM/IEEE SC 2002 Conference (SC'02) Pub Date : 2002-11-16 DOI: 10.1109/SC.2002.10053
S. Shingu, H. Takahara, H. Fuchigami, M. Yamada, Yoshinori Tsuda, W. Ohfuchi, Yuji Sasaki, Kazuo Kobayashi, Takashi Hagiwara, S. Habata, M. Yokokawa, Hiroyuki Itoh, K. Otsuka
{"title":"A 26.58 Tflops Global Atmospheric Simulation with the Spectral Transform Method on the Earth Simulator","authors":"S. Shingu, H. Takahara, H. Fuchigami, M. Yamada, Yoshinori Tsuda, W. Ohfuchi, Yuji Sasaki, Kazuo Kobayashi, Takashi Hagiwara, S. Habata, M. Yokokawa, Hiroyuki Itoh, K. Otsuka","doi":"10.1109/SC.2002.10053","DOIUrl":"https://doi.org/10.1109/SC.2002.10053","url":null,"abstract":"A spectral atmospheric general circulation model called AFES (AGCM for Earth Simulator) was developed and optimized for the architecture of the Earth Simulator (ES). The ES is a massively parallel vector supercomputer that consists of 640 processor nodes interconnected by a single stage crossbar network with its total peak performance of 40.96 Tflops. The sustained performance of 26.58 Tflops was achieved for a high resolution simulation (T1279L96) with AFES by utilizing the full 640-node configuration of the ES. The resulting computing efficiency is 64.9% of the peak performance, well surpassing that of conventional weather/climate applications having just 25-50% efficiency even on vector parallel computers. This remarkable performance proves the effectiveness of the ES as a viable means for practical applications.","PeriodicalId":302800,"journal":{"name":"ACM/IEEE SC 2002 Conference (SC'02)","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-11-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126809824","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 87
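The efficiency figure quoted in the abstract above can be checked against the 26.58 Tflops in the paper's title and the 40.96 Tflops peak. The following is a minimal sketch of that arithmetic only; the function and constant names are illustrative assumptions and do not come from the paper.

```python
def sustained_efficiency(sustained_tflops: float, peak_tflops: float) -> float:
    """Return sustained performance as a fraction of peak performance."""
    if peak_tflops <= 0:
        raise ValueError("peak_tflops must be positive")
    return sustained_tflops / peak_tflops


# Figures taken from the title and abstract; the names are hypothetical.
PEAK_TFLOPS = 40.96       # total peak of the 640-node Earth Simulator
SUSTAINED_TFLOPS = 26.58  # sustained rate reported in the title
NODES = 640

efficiency = sustained_efficiency(SUSTAINED_TFLOPS, PEAK_TFLOPS)
peak_per_node_gflops = PEAK_TFLOPS / NODES * 1000

print(f"efficiency: {efficiency:.1%}")
print(f"peak per node: {peak_per_node_gflops:.0f} Gflops")
```

The ratio rounds to the 64.9% stated in the abstract, and dividing the peak evenly across nodes gives about 64 Gflops per node.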
Collaborative Simulation Grid: Multiscale Quantum-Mechanical/Classical Atomistic Simulations on Distributed PC Clusters in the US and Japan
ACM/IEEE SC 2002 Conference (SC'02) Pub Date : 2002-11-16 DOI: 10.1109/SC.2002.10013
H. Kikuchi, R. Kalia, A. Nakano, P. Vashishta, H. Iyetomi, S. Ogata, T. Kouno, F. Shimojo, K. Tsuruta, S. Saini
{"title":"Collaborative Simulation Grid: Multiscale Quantum-Mechanical/Classical Atomistic Simulations on Distributed PC Clusters in the US and Japan","authors":"H. Kikuchi, R. Kalia, A. Nakano, P. Vashishta, H. Iyetomi, S. Ogata, T. Kouno, F. Shimojo, K. Tsuruta, S. Saini","doi":"10.1109/SC.2002.10013","DOIUrl":"https://doi.org/10.1109/SC.2002.10013","url":null,"abstract":"A multidisciplinary, collaborative simulation has been performed on a Grid of geographically distributed PC clusters. The multiscale simulation approach seamlessly combines i) atomistic simulation based on the molecular dynamics (MD) method and ii) quantum mechanical (QM) calculation based on the density functional theory (DFT), so that accurate but less scalable computations are performed only where they are needed. The multiscale MD/QM simulation code has been Grid-enabled using i) a modular, additive hybridization scheme, ii) multiple QM clustering, and iii) computation/communication overlapping. The Gridified MD/QM simulation code has been used to study environmental effects of water molecules on fracture in silicon. A preliminary run of the code has achieved a parallel efficiency of 94% on 25 PCs distributed over 3 PC clusters in the US and Japan, and a larger test involving 154 processors on 5 distributed PC clusters is in progress.","PeriodicalId":302800,"journal":{"name":"ACM/IEEE SC 2002 Conference (SC'02)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-11-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130186306","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 14
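To put the 94% parallel efficiency on 25 PCs from the abstract above in concrete terms, the sketch below converts it into an implied speedup. It assumes the common textbook definition, efficiency = speedup / processor count; the abstract does not say which baseline the authors used, so the result is illustrative rather than a figure from the paper. The function name is hypothetical.

```python
def implied_speedup(efficiency: float, processors: int) -> float:
    """Speedup implied by a parallel efficiency, assuming E = S / P."""
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be in (0, 1]")
    if processors < 1:
        raise ValueError("processors must be at least 1")
    return efficiency * processors


# 94% efficiency on 25 PCs, as quoted in the abstract.
speedup = implied_speedup(0.94, 25)
print(f"implied speedup on 25 PCs: {speedup:.1f}x")
```

Under that assumption, the 25 distributed PCs behave like roughly 23.5 ideal processors.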
Implementation and Evaluation of A QoS-Capable Cluster-Based IP Router
ACM/IEEE SC 2002 Conference (SC'02) Pub Date : 2002-11-16 DOI: 10.1109/SC.2002.10026
P. Pradhan, T. Chiueh
{"title":"Implementation and Evaluation of A QoS-Capable Cluster-Based IP Router","authors":"P. Pradhan, T. Chiueh","doi":"10.1109/SC.2002.10026","DOIUrl":"https://doi.org/10.1109/SC.2002.10026","url":null,"abstract":"A major challenge in Internet edge router design is to support both high packet forwarding performance and versatile and efficient packet processing capabilities. The thesis of this research project is that a cluster of PCs connected by a high-speed system area network provides an effective hardware platform for building routers to be used at the edges of the Internet. This paper describes a scalable and extensible edge router architecture called Panama, which supports a novel aggregate route caching scheme, a real-time link scheduling algorithm whose performance overhead is independent of the number of real-time flows, a highly efficient kernel extension mechanism to safely load networking software extensions dynamically, and an integrated resource scheduler which ensures that real-time flows with additional packet processing requirements still meet their end-to-end performance requirements. This paper describes the implementation and evaluation of the first Panama prototype based on a cluster of PCs and Myrinet.","PeriodicalId":302800,"journal":{"name":"ACM/IEEE SC 2002 Conference (SC'02)","volume":"51 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-11-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127698164","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 6
NAMD: Biomolecular Simulation on Thousands of Processors
ACM/IEEE SC 2002 Conference (SC'02) Pub Date : 2002-11-16 DOI: 10.1109/SC.2002.10019
James C. Phillips, G. Zheng, Sameer Kumar, L. Kalé
{"title":"NAMD: Biomolecular Simulation on Thousands of Processors","authors":"James C. Phillips, G. Zheng, Sameer Kumar, L. Kalé","doi":"10.1109/SC.2002.10019","DOIUrl":"https://doi.org/10.1109/SC.2002.10019","url":null,"abstract":"NAMD is a fully featured, production molecular dynamics program for high performance simulation of large biomolecular systems. We have previously, at SC2000, presented scaling results for simulations with cutoff electrostatics on up to 2048 processors of the ASCI Red machine, achieved with an object-based hybrid force and spatial decomposition scheme and an aggressive measurement-based predictive load balancing framework. We extend this work by demonstrating similar scaling on the much faster processors of the PSC Lemieux Alpha cluster, and for simulations employing efficient (order N log N) particle mesh Ewald full electrostatics. This unprecedented scalability in a biomolecular simulation code has been attained through latency tolerance, adaptation to multiprocessor nodes, and the direct use of the Quadrics Elan library in place of MPI by the Charm++/Converse parallel runtime system.","PeriodicalId":302800,"journal":{"name":"ACM/IEEE SC 2002 Conference (SC'02)","volume":"74 3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-11-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128044572","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 284
Owner Prediction for Accelerating Cache-to-Cache Transfer Misses in a cc-NUMA Architecture
ACM/IEEE SC 2002 Conference (SC'02) Pub Date : 2002-11-16 DOI: 10.5555/762761.762762
M. Acacio, José González, José M. García, J. Duato
{"title":"Owner Prediction for Accelerating Cache-to-Cache Transfer Misses in a cc-NUMA Architecture","authors":"M. Acacio, José González, José M. García, J. Duato","doi":"10.5555/762761.762762","DOIUrl":"https://doi.org/10.5555/762761.762762","url":null,"abstract":"Cache misses for which data must be obtained from a remote cache (cache-to-cache transfer misses) account for an important fraction of the total miss rate. Unfortunately, cc-NUMA designs put the access to the directory information into the critical path of 3-hop misses, which significantly penalizes them compared to SMP designs. This work studies the use of owner prediction as a means of providing cc-NUMA multiprocessors with a more efficient support for cache-to-cache transfer misses. Our proposal comprises an effective prediction scheme as well as a coherence protocol designed to support the use of prediction. Results indicate that owner prediction can significantly reduce the latency of cache-to-cache transfer misses, which translates into speed-ups on application performance up to 12%. In order to also accelerate most of those 3-hop misses that are either not predicted or mispredicted, the inclusion of a small and fast directory cache in every node is evaluated, leading to improvements up to 16% on the final performance.","PeriodicalId":302800,"journal":{"name":"ACM/IEEE SC 2002 Conference (SC'02)","volume":"100 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-11-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134141664","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 63
Distributed Dynamic Hash Tables Using IBM LAPI
ACM/IEEE SC 2002 Conference (SC'02) Pub Date : 2002-11-16 DOI: 10.1109/SC.2002.10041
J. Malard, R. Stewart
{"title":"Distributed Dynamic Hash Tables Using IBM LAPI","authors":"J. Malard, R. Stewart","doi":"10.1109/SC.2002.10041","DOIUrl":"https://doi.org/10.1109/SC.2002.10041","url":null,"abstract":"An asynchronous communication library for accessing and managing dynamic hash tables over a network of Symmetric Multiprocessors (SMP) is presented. A blocking factor is shown experimentally to reduce the variance of the wall clock time. It is also shown that remote accesses to a distributed hash table can be as effective and scalable as the one-sided operations of the low-level communication middleware on an IBM SP.","PeriodicalId":302800,"journal":{"name":"ACM/IEEE SC 2002 Conference (SC'02)","volume":"64 4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-11-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130587568","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 6
On Increasing Architecture Awareness in Program Optimizations to Bridge the Gap between Peak and Sustained Processor Performance — Matrix-Multiply Revisited
ACM/IEEE SC 2002 Conference (SC'02) Pub Date : 2002-11-16 DOI: 10.1109/SC.2002.10054
David Parello, O. Temam, J. Verdun
{"title":"On Increasing Architecture Awareness in Program Optimizations to Bridge the Gap between Peak and Sustained Processor Performance — Matrix-Multiply Revisited","authors":"David Parello, O. Temam, J. Verdun","doi":"10.1109/SC.2002.10054","DOIUrl":"https://doi.org/10.1109/SC.2002.10054","url":null,"abstract":"As the complexity of processor architectures increases, there is a widening gap between peak processor performance and sustained processor performance so that programs now tend to exploit only a fraction of available performance. While there is a tremendous amount of literature on program optimizations, compiler optimizations lack efficiency because they are plagued by three flaws: (1) they often implicitly use simplified, if not simplistic, models of processor architecture, (2) they usually focus on a single processor component (e.g., cache) and ignore the interactions among multiple components, (3) the most heavily investigated components (e.g., caches) sometimes have only a small impact on overall performance. Through the in-depth analysis of a simple program kernel, we want to show that understanding the complex interactions between programs and the numerous processor architecture components is both feasible and critical to design efficient program optimizations.","PeriodicalId":302800,"journal":{"name":"ACM/IEEE SC 2002 Conference (SC'02)","volume":"375 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-11-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131870830","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 28
Parallel Multiscale Gauss-Newton-Krylov Methods for Inverse Wave Propagation
ACM/IEEE SC 2002 Conference (SC'02) Pub Date : 2002-11-16 DOI: 10.5555/762761.762827
V. Akçelik, G. Biros, O. Ghattas
{"title":"Parallel Multiscale Gauss-Newton-Krylov Methods for Inverse Wave Propagation","authors":"V. Akçelik, G. Biros, O. Ghattas","doi":"10.5555/762761.762827","DOIUrl":"https://doi.org/10.5555/762761.762827","url":null,"abstract":"One of the outstanding challenges of computational science and engineering is large-scale nonlinear parameter estimation of systems governed by partial differential equations. These are known as inverse problems, in contradistinction to the forward problems that usually characterize large-scale simulation. Inverse problems are significantly more difficult to solve than forward problems, due to ill-posedness, large dense ill-conditioned operators, multiple minima, space-time coupling, and the need to solve the forward problem repeatedly. We present a parallel algorithm for inverse problems governed by time-dependent PDEs, and scalability results for an inverse wave propagation problem of determining the material field of an acoustic medium. The difficulties mentioned above are addressed through a combination of total variation regularization, preconditioned matrix-free Gauss-Newton-Krylov iteration, algorithmic checkpointing, and multiscale continuation. We are able to solve a synthetic inverse wave propagation problem through a pelvic bone geometry involving 2.1 million inversion parameters in 3 hours on 256 processors of the Terascale Computing System at the Pittsburgh Supercomputing Center.","PeriodicalId":302800,"journal":{"name":"ACM/IEEE SC 2002 Conference (SC'02)","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2002-11-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134100327","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 163