ACM/IEEE SC 2000 Conference (SC'00): Latest Papers

Improving Fine-Grained Irregular Shared-Memory Benchmarks by Data Reordering
ACM/IEEE SC 2000 Conference (SC'00) Pub Date: 2000-11-01 DOI: 10.1109/SC.2000.10009
Y. C. Hu, A. Cox, W. Zwaenepoel
Abstract: We demonstrate that data reordering can substantially improve the performance of fine-grained irregular shared-memory benchmarks, on both hardware and software shared-memory systems. In particular, we evaluate two distinct data reordering techniques that seek to co-locate in memory objects that are in close proximity in the physical system modeled by the computation. The effects of these techniques are increased spatial locality and reduced false sharing. We evaluate the effectiveness of the data reordering techniques on a set of five irregular applications from SPLASH-2 and Chaos. We implement both techniques in a small library, allowing us to enable them in an application by adding fewer than 10 lines of code. Our results on one hardware and two software shared-memory systems show that, with data reordering during initialization, the performance of these applications is improved by 12%-99% on the Origin 2000, 30%-366% on TreadMarks, and 14%-269% on HLRC.
Citations: 48
Parallel Algorithms for Radiation Transport on Unstructured Grids
ACM/IEEE SC 2000 Conference (SC'00) Pub Date: 2000-11-01 DOI: 10.1109/SC.2000.10030
S. Plimpton, B. Hendrickson, S. Burns, William C. McLendon
Abstract: The method of discrete ordinates is commonly used to solve the Boltzmann radiation transport equation for applications ranging from simulations of fires to weapons effects. The equations are most efficiently solved by sweeping the radiation flux across the computational grid. For unstructured grids this poses several interesting challenges, particularly when implemented on distributed-memory parallel machines where the grid geometry is spread across processors. We describe an asynchronous, parallel, message-passing algorithm that performs sweeps simultaneously from many directions across unstructured grids. We identify key factors that limit the algorithm’s parallel scalability and discuss two enhancements we have made to the basic algorithm: one to prioritize the work within a processor’s subdomain and the other to better decompose the unstructured grid across processors. Performance results are given for the basic and enhanced algorithms implemented within a radiation solver running on hundreds of processors of Sandia’s Intel Tflops machine and DEC-Alpha CPlant cluster.
Citations: 55
A scalable SNMP-based distributed monitoring system for heterogeneous network computing
ACM/IEEE SC 2000 Conference (SC'00) Pub Date: 2000-11-01 DOI: 10.1109/SC.2000.10058
R. Subramanyan, J. Miguel-Alonso, J. Fortes
Abstract: Traditional centralized monitoring systems do not scale to present-day large, complex, network-computing systems. Based on recent SNMP standards for distributed management, this paper addresses the scalability problem through distribution of monitoring tasks, applicable for tools such as SIMONE (SNMP-based monitoring prototype implemented by the authors). Distribution is achieved by introducing one or more levels of a dual entity called the Intermediate Level Manager (ILM) between a manager and the agents. The ILM accepts monitoring tasks described in the form of scripts and delegated by the next higher entity. The solution is flexible and can be integrated into an SNMP tool without altering other system components. A testbed of up to 1024 monitoring elements is used to assess scalability. Noticeable improvements in the round-trip delay (from seconds to less than a tenth of a second) were observed when more than 200 monitoring elements are present and as few as 2 ILMs are used.
Citations: 46
Data Access Performance in a Large and Dynamic Pharmaceutical Drug Candidate Database
ACM/IEEE SC 2000 Conference (SC'00) Pub Date: 2000-11-01 DOI: 10.1109/SC.2000.10049
Zina Ben-Miled, Yang Liu, D. Powers, O. Bukhres, Michael Bem, Robert Jones, Robert J. Oppelt, Sam A. Milosevich
Abstract: An explosion in the amount of data generated through chemical and biological experimentation has been observed in recent years. This rapid proliferation of vast amounts of data has led to a set of cheminformatics and bioinformatics applications that manipulate dynamic, heterogeneous, and massive data. An example of such applications in the pharmaceutical industry is the computational process involved in the early discovery of lead drug candidates for a given target disease. This computational process includes repeated sequential and random accesses to a drug candidate database. Using the above pharmaceutical application, an experimental study was conducted in this paper that shows that for optimal performance, the degree of parallelism exploited in the application should be adjusted according to the drug candidate database instance size and the machine size. Additionally, different degrees of parallelism should be used depending on whether the access to the drug candidate database is random or sequential.
Citations: 2
The Failure of TCP in High-Performance Computational Grids
ACM/IEEE SC 2000 Conference (SC'00) Pub Date: 2000-08-01 DOI: 10.1109/SC.2000.10039
Wu-chun Feng, P. Tinnakornsrisuphap
Abstract: Distributed computational grids depend on TCP to ensure reliable end-to-end communication between nodes across the wide-area network (WAN). Unfortunately, TCP performance can be abysmal even when buffers on the end hosts are manually optimized. Recent studies blame the self-similar nature of aggregate network traffic for TCP’s poor performance because such traffic is not readily amenable to statistical multiplexing in the Internet, and hence computational grids. In this paper, we identify a source of self-similarity previously ignored, a source that is readily controllable: TCP. Via an experimental study, we examine the effects of the TCP stack on network traffic using different implementations of TCP. We show that even when aggregate application traffic ought to smooth out as more applications’ traffic is multiplexed, TCP induces burstiness into the aggregate traffic load, thus adversely impacting network performance. Furthermore, our results indicate that TCP performance will worsen as WAN speeds continue to increase.
Citations: 115
Extending OpenMP For NUMA Machines
ACM/IEEE SC 2000 Conference (SC'00) Pub Date: 2000-08-01 DOI: 10.1109/SC.2000.10019
John Bircsak, Peter Craig, RaeLyn Crowell, Z. Cvetanovic, Jonathan Harris, C. A. Nelson, Carl D. Offner
Abstract: This paper describes extensions to OpenMP that implement data placement features needed for NUMA architectures. OpenMP is a collection of compiler directives and library routines used to write portable parallel programs for shared-memory architectures. Writing efficient parallel programs for NUMA architectures, which have characteristics of both shared-memory and distributed-memory architectures, requires that a programmer control the placement of data in memory and the placement of computations that operate on that data. Optimal performance is obtained when computations occur on processors that have fast access to the data needed by those computations. OpenMP, designed for shared-memory architectures, does not by itself address these issues. The extensions to OpenMP Fortran presented here have been mainly taken from High Performance Fortran. The paper describes some of the techniques that the Compaq Fortran compiler uses to generate efficient code based on these extensions. It also describes some additional compiler optimizations, and concludes with some preliminary results.
Citations: 129
Scalable Algorithms for Adaptive Statistical Designs
ACM/IEEE SC 2000 Conference (SC'00) Pub Date: 2000-08-01 DOI: 10.1155/2000/508081
R. Oehmke, J. Hardwick, Q. Stout
Abstract: We present a scalable, high-performance solution to multidimensional recurrences that arise in adaptive statistical designs. Adaptive designs are an important class of learning algorithms for a stochastic environment, and we focus on the problem of optimally assigning patients to treatments in clinical trials. While adaptive designs have significant ethical and cost advantages, they are rarely utilized because of the complexity of optimizing and analyzing them. Computational challenges include massive memory requirements, few calculations per memory access, and multiply-nested loops with dynamic indices. We analyze the effects of various parallelization options, and while standard approaches do not work well, with effort an efficient, highly scalable program can be developed. This allows us to solve problems thousands of times more complex than those solved previously, which helps make adaptive designs practical. Further, our work applies to many other problems involving neighbor recurrences, such as generalized string matching.
Citations: 7
Scalable Molecular Dynamics for Large Biomolecular Systems
ACM/IEEE SC 2000 Conference (SC'00) Pub Date: 2000-08-01 DOI: 10.1155/2000/750827
R. Brunner, James C. Phillips, L. Kalé
Abstract: We present an optimized parallelization scheme for molecular dynamics simulations of large biomolecular systems, implemented in the production-quality molecular dynamics program NAMD. With an object-based hybrid force and spatial decomposition scheme, and an aggressive measurement-based predictive load balancing framework, we have attained speeds and speedups that are much higher than any reported in the literature so far. The paper first summarizes the broad methodology we are pursuing, and the basic parallelization scheme we used. It then describes the optimizations that were instrumental in increasing performance, and presents performance results on benchmark simulations.
Citations: 35
Self-Consistent Langevin Simulation of Coulomb Collisions in Charged-Particle Beams
ACM/IEEE SC 2000 Conference (SC'00) Pub Date: 2000-05-01 DOI: 10.1109/SC.2000.10047
J. Qiang, R. Ryne, S. Habib
Abstract: In many plasma physics and charged-particle beam dynamics problems, Coulomb collisions are modeled by a Fokker-Planck equation. In order to incorporate these collisions, we present a three-dimensional parallel Langevin simulation method using a Particle-In-Cell (PIC) approach implemented on high-performance parallel computers. We perform, for the first time, a fully self-consistent simulation, in which the friction and diffusion coefficients are computed from first principles. We employ a two-dimensional domain decomposition approach within a message-passing programming paradigm along with dynamic load balancing. Object-oriented programming is used to encapsulate details of the communication syntax as well as to enhance reusability and extensibility. Performance tests on the SGI Origin 2000, IBM SP RS/6000 and the Cray T3E-900 have demonstrated good scalability. As a test example, we demonstrate the collisional relaxation to a final thermal equilibrium of a beam with an initially anisotropic velocity distribution.
Citations: 10
Using High-Speed WANs and Network Data Caches to Enable Remote and Distributed Visualization
ACM/IEEE SC 2000 Conference (SC'00) Pub Date: 2000-04-18 DOI: 10.1109/SC.2000.10002
W. Bethel, B. Tierney, Jason R. Lee, D. Gunter, Stephen Lau
Abstract: Visapult is a prototype application and framework for remote visualization of large scientific datasets. We approach the technical challenges of tera-scale visualization with a unique architecture that employs high-speed WANs and network data caches for data staging and transmission. This architecture allows for the use of available cache and compute resources at arbitrary locations on the network. High data throughput rates and network utilization are achieved by parallelizing I/O at each stage in the application, and by pipelining the visualization process. On the desktop, the graphics interactivity is effectively decoupled from the latency inherent in network applications. We present a detailed performance analysis of the application, and improvements resulting from field-test analysis conducted as part of the DOE Combustion Corridor project.
Citations: 141