ACM/IEEE SC 2006 Conference (SC'06) Latest Papers

Preliminary Investigation of Advanced Electrostatics in Molecular Dynamics on Reconfigurable Computers
ACM/IEEE SC 2006 Conference (SC'06) Pub Date : 2006-11-11 DOI: 10.1145/1188455.1188550
R. Scrofano, V. Prasanna
Abstract: Scientific computing is marked by applications with very high performance demands. As technology has improved, reconfigurable hardware has become a viable platform to provide application acceleration, even for floating-point-intensive scientific applications. Now, reconfigurable computers - computers with general purpose microprocessors, reconfigurable hardware, memory, and high performance interconnect - are emerging as platforms that allow complete applications to be partitioned into parts that execute in software and parts that are accelerated in hardware. In this paper, we study molecular dynamics simulation. Specifically, we study the use of the smooth particle mesh Ewald technique in a molecular dynamics simulation program that takes advantage of the hardware acceleration capabilities of a reconfigurable computer. We demonstrate a 2.7-2.9× speed-up over the corresponding software-only simulation program. Along the way, we note design issues and techniques related to the use of reconfigurable computers for scientific computing in general.
Citations: 26
Benchmarking XML Processors for Applications in Grid Web Services
ACM/IEEE SC 2006 Conference (SC'06) Pub Date : 2006-11-11 DOI: 10.1145/1188455.1188581
Michael R. Head, M. Govindaraju, Robert A. van Engelen, Wei Zhang
Abstract: Web services based specifications have emerged as the underlying architecture for core grid services and standards, such as WSRF. XML is inextricably intertwined with Web services based specifications, and as a result the design and implementation of XML processing tools play a significant role in grid applications. These applications use XML in a wide variety of ways, including workflow specifications, WS-Security based documents, service descriptions in WSDL, and on-the-wire format in SOAP-based communication. The application characteristics also vary widely in the use of XML messages in their performance, memory, size, and processing requirements. Numerous XML processing tools exist today, each of which is optimized for specific features. To make the right decisions, grid application and middleware developers must thus understand the complex dependencies between XML features and the application. We propose a standard benchmark suite for quantifying, comparing, and contrasting the performance of XML processors under a wide range of representative use cases. The benchmarks are defined by a set of XML schemas and conforming documents. To demonstrate the utility of the benchmarks and to provide a snapshot of the current XML implementation landscape, we report the performance of many different XML implementations on the benchmarks and draw conclusions about their current performance characteristics. We also present a brief analysis of the current shortcomings and the critical design changes required for multi-threaded XML processing tools to run efficiently on emerging multi-core architectures.
Citations: 52
Grid Capacity Planning with Negotiation-based Advance Reservation for Optimized QoS
ACM/IEEE SC 2006 Conference (SC'06) Pub Date : 2006-11-11 DOI: 10.1145/1188455.1188563
M. Siddiqui, A. Villazón, T. Fahringer
Abstract: Advance reservation of grid resources can play a key role in enabling grid middleware to deliver on-demand resource provision with significantly improved quality-of-service (QoS). However, in the grid, advance reservation has been largely ignored due to the dynamic grid behavior, underutilization concerns, multi-constrained applications, and lack of support for agreement enforcement. These issues force the grid middleware to make resource allocations at run-time with reduced QoS. To remedy these, we introduce a new, 3-layered negotiation protocol for advance reservation of grid resources. We model resource allocation as an online strip packing problem and introduce a new mechanism that optimizes resource utilization and QoS constraints while generating contention-free solutions. The mechanism supports open reservations to deal with the dynamic grid and provides a practical solution for agreement enforcement. We have implemented a prototype and performed experiments to demonstrate the effectiveness of our approach.
Citations: 129
Large Image Correction and Warping in a Cluster Environment
ACM/IEEE SC 2006 Conference (SC'06) Pub Date : 2006-11-11 DOI: 10.1145/1188455.1188539
Vijay S. Kumar, B. Rutt, T. Kurç, Ümit V. Çatalyürek, J. Saltz, S. Chow, S. Lamont, M. Martone
Abstract: This paper is concerned with efficient execution of a pipeline of data processing operations on very large images obtained from confocal microscopy instruments. We describe parallel, out-of-core algorithms for each operation in this pipeline. One of the challenging steps in the pipeline is the warping operation using inverse mapping based methods. We propose and investigate a set of algorithms to handle the warping computations on storage clusters. Our experimental results show that the proposed approaches are scalable both in terms of number of processors and the size of images.
Citations: 17
Estimating Query Result Sizes for Proxy Caching in Scientific Database Federations
ACM/IEEE SC 2006 Conference (SC'06) Pub Date : 2006-11-11 DOI: 10.1145/1188455.1188562
T. Malik, R. Burns, N. Chawla, A. Szalay
Abstract: In a proxy cache for federations of scientific databases it is important to estimate the size of a query before making a caching decision. With accurate estimates, near-optimal cache performance can be obtained. At the other extreme, inaccurate estimates can render the cache totally ineffective. We present classification and regression over templates (CAROT), a general method for estimating query result sizes, which is suited to the resource-limited environment of proxy caches and the distributed nature of database federations. CAROT estimates query result sizes by learning the distribution of query results, not by examining or sampling data, but from observing workload. We have integrated CAROT into the proxy cache of the National Virtual Observatory (NVO) federation of astronomy databases. Experiments conducted in the NVO show that CAROT dramatically outperforms conventional estimation techniques and provides near-optimal cache performance.
Citations: 9
The Potential Energy Efficiency of Vector Acceleration
ACM/IEEE SC 2006 Conference (SC'06) Pub Date : 2006-11-11 DOI: 10.1145/1188455.1188537
Christophe Lemuet, Jack Sampson, Jean-Francois Collard, Norm Jouppi
Abstract: Energy efficiency of computation is quickly becoming a key problem from the chip through the data center. This paper presents the first quantitative study of the potential energy efficiency of vector accelerators. We propose and study a vector accelerator architecture suitable for implementation in a 70 nm technology. The vector architecture has a high-bandwidth on-chip cache system coupled to 16 independent memory channels. We show that such an accelerator can achieve speedups of 10X or more on loop kernels in comparison to a quad-issue superscalar uniprocessor, while using less energy. We also introduce run-ahead lanes, a complexity- and energy-efficient means of tolerating variable latency from crossbar contention, cache bank conflicts, cache misses, and the memory system. Run-ahead lanes only synchronize on dependencies or when explicitly directed.
Citations: 26
Performance Modeling and Optimization of a High Energy Colliding Beam Simulation Code
ACM/IEEE SC 2006 Conference (SC'06) Pub Date : 2006-11-11 DOI: 10.1145/1188455.1188557
H. Shan, E. Strohmaier, J. Qiang, D. Bailey, K. Yelick
Abstract: An accurate modeling of the beam-beam interaction is essential to maximizing the luminosity in existing and future colliders. BeamBeam3D was the first parallel code that can be used to study this interaction fully self-consistently on high-performance computing platforms. Various all-to-all personalized communication (AAPC) algorithms dominate its communication patterns, for which we developed a sequence of performance models using a series of micro-benchmarks. We find that for SMP-based systems the most important performance constraint is node-adapter contention, while for 3D-torus topologies good performance models are not possible without considering link contention. The best average model prediction error is very low on SMP-based systems, at 3% to 7%. On torus-based systems errors are higher, at 29%, but optimized performance can again be predicted within 8% in some cases. These excellent results across five different systems indicate that this methodology for performance modeling can be applied to a large class of algorithms.
Citations: 8
Design and Implementation of a One-Sided Communication Interface for the IBM eServer Blue Gene
ACM/IEEE SC 2006 Conference (SC'06) Pub Date : 2006-11-11 DOI: 10.1145/1188455.1188580
M. Blocksome, C. Archer, T. Inglett, P. McCarthy, M. Mundy, J. Ratterman, A. Sidelnik, B. Smith, G. Almási, J. Castaños, D. Lieber, J. Moreira, S. Krishnamoorthy, V. Tipparaju, J. Nieplocha
Abstract: This paper discusses the design and implementation of a one-sided communication interface for the IBM Blue Gene/L supercomputer. This interface facilitates ARMCI and the Global Arrays toolkit and can be used by other one-sided communication libraries. New protocols, interrupt driven communication, and compute node kernel enhancements were required to enable these libraries. Three possible methods for enabling ARMCI on the Blue Gene/L software stack are discussed. A detailed look into the development process shows how the implementation of the one-sided communication interface was completed. This was accomplished on a compressed time scale with the collaboration of various organizations within IBM and open source communities. In addition to enabling the one-sided libraries, bandwidth enhancements were made for communication along a diagonal on the Blue Gene/L torus network. The maximum bandwidth improved by a factor of three. This work will enable a variety of one-sided applications to run on Blue Gene/L.
Citations: 16
A Performance Comparison Through Benchmarking and Modeling of Three Leading Supercomputers: Blue Gene/L, Red Storm, and Purple
ACM/IEEE SC 2006 Conference (SC'06) Pub Date : 2006-11-11 DOI: 10.1145/1188455.1188534
A. Hoisie, Greg Johnson, D. Kerbyson, M. Lang, S. Pakin
Abstract: This work provides a performance analysis of three leading supercomputers that have recently been deployed: Purple, Red Storm and Blue Gene/L. Each of these machines is architecturally diverse, with very different performance characteristics. Each contains over 10,000 processors and has a system peak of over 40 Teraflops. We analyze each system using a range of micro-benchmarks which include communication performance as well as quantifying the impact of the operating system. The achievable application performance is compared across the systems. The application performance is confirmed via the use of detailed application models which use the underlying performance characteristics as measured by the micro-benchmarks. We also compare the machines in a realistic production scenario in which each machine is used so as to maximize its memory usage with the applications executed in a weak-scaling mode. The results also help illustrate that achievable performance is not directly related to the peak performance.
Citations: 74
Modeling Pulse Propagation and Scattering in a Dispersive Medium: Performance of MPI/OpenMP Hybrid Code
ACM/IEEE SC 2006 Conference (SC'06) Pub Date : 2006-11-11 DOI: 10.1145/1188455.1188555
R. Rosenberg, G. Norton, J. Novarini, W. Anderson, M. Lanzagorta
Abstract: Accurate modeling of pulse propagation and scattering is of great importance to the Navy. In a non-dispersive medium a fourth order in time and space 2-D finite difference time domain (FDTD) scheme representation of the linear wave equation can be used. However, when the medium is dispersive, one is required to take into account the frequency dependent attenuation and phase velocity. Using a theory first proposed by Blackstock, the linear wave equation has been modified by adding an additional term (the derivative of the convolution between the causal time domain propagation factor and the acoustic pressure) that takes into account the dispersive nature of the medium. This additional term transforms the calculation from one suitable to a workstation into one very much suited to a large-scale computational platform, both in terms of computation and memory. With appropriate distribution of data, good scaling can be achieved up to thousands of processors. Due to the simple structure of the code, it is easily parallelized using three different techniques: pure MPI, pure OpenMP and a hybrid MPI/OpenMP. We use this real-life application to evaluate the performance of the latest multi-CPU/multicore platforms available from the DoD HPCMP.
Citations: 9