ACM/IEEE SC 1999 Conference (SC'99): Latest Publications

Evaluating Titanium SPMD Programs on the Tera MTA
ACM/IEEE SC 1999 Conference (SC'99) Pub Date: 1999 DOI: 10.1145/331532.331575
Carleton Miyamoto, Chang Lin
While the common trend in building large-scale multiprocessors is to use commodity compute nodes that are increasingly powerful and have deep memory hierarchies, the Tera MTA uses a different design point, with a relatively flat memory system, no processor caches, and hardware support for light-weight multithreading, which is used to mask memory latency. In this paper we explore the implementation of Titanium, a language with coarse-grained SPMD parallelism, onto the MTA. The major concerns in obtaining high performance on the MTA are sufficient degrees of parallelism, good load balance, and low synchronization overhead. We show that by adding loop level parallelism, Titanium applications have sufficient parallelism for the MTA, and as expected, application writers do not need to orchestrate data layout. We evaluate multiple implementations of the Titanium synchronization constructs, which include barriers and monitors. We then explore several scheduling strategies, and find that the distinction between SPMD and loop level parallelism proves to be surprisingly useful. The two-level parallelism structure can be used to throttle thread migration, which lowers thread creation overhead and synchronization. We use a combination of micro-benchmarks and applications to demonstrate these results.
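The two-level structure the abstract describes, coarse-grained SPMD workers plus loop-level parallelism coordinated by barriers, can be pictured with a small sketch. The following is not Titanium or Tera MTA code; it is a hypothetical Python threading illustration in which each SPMD worker splits its local loop across inner threads and then meets the others at a shared barrier, the kind of synchronization construct whose implementations the paper evaluates.

```python
# Hypothetical illustration of two-level (SPMD + loop-level) parallelism with
# a barrier, loosely modeled on the structure described in the abstract.
# Plain Python threading, not Titanium or Tera MTA code.
import threading

NUM_SPMD = 4          # coarse-grained SPMD workers
INNER_THREADS = 2     # loop-level parallelism added within each worker
N = 1_000_000

data = list(range(N))
partials = [0] * NUM_SPMD
barrier = threading.Barrier(NUM_SPMD)   # all SPMD workers meet here between phases

def loop_chunk(lo, hi, out, idx):
    # inner loop-level work: sum one slice of the data
    out[idx] = sum(data[lo:hi])

def spmd_worker(rank):
    # phase 1: each SPMD worker parallelizes its own loop range
    lo = rank * N // NUM_SPMD
    hi = (rank + 1) * N // NUM_SPMD
    step = (hi - lo) // INNER_THREADS
    results = [0] * INNER_THREADS
    inner = [threading.Thread(target=loop_chunk,
                              args=(lo + i * step,
                                    hi if i == INNER_THREADS - 1 else lo + (i + 1) * step,
                                    results, i))
             for i in range(INNER_THREADS)]
    for t in inner: t.start()
    for t in inner: t.join()
    partials[rank] = sum(results)
    # phase 2: barrier, then one worker performs the reduction
    barrier.wait()
    if rank == 0:
        print("total =", sum(partials))

threads = [threading.Thread(target=spmd_worker, args=(r,)) for r in range(NUM_SPMD)]
for t in threads: t.start()
for t in threads: t.join()
```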
Citations: 3
Adaptive Performance Prediction for Distributed Data-Intensive Applications
ACM/IEEE SC 1999 Conference (SC'99) Pub Date: 1999 DOI: 10.1145/331532.331568
M. Faerman, Alan Su, R. Wolski, F. Berman
The computational grid is becoming the platform of choice for large-scale distributed data-intensive applications. Accurately predicting the transfer times of remote data files, a fundamental component of such applications, is critical to achieving application performance. In this paper, we introduce a performance prediction method, AdRM (Adaptive Regression Modeling), to determine file transfer times for network-bound distributed data-intensive applications. We demonstrate the effectiveness of the AdRM method on two distributed data applications, SARA (Synthetic Aperture Radar Atlas) and SRB (Storage Resource Broker), and discuss how it can be used for application scheduling. Our experiments use the Network Weather Service [36, 37], a resource performance measurement and forecasting facility, as a basis for the performance prediction model. Our initial findings indicate that the AdRM method can be effective in accurately predicting data transfer times in wide-area multi-user grid environments.
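The abstract does not give the exact AdRM regression form, so the sketch below only illustrates the general approach as an assumption: fit a simple linear model of transfer time against size divided by measured bandwidth, using past observations of the kind the Network Weather Service supplies, and predict from a fresh measurement. All numbers and the model form are made up.

```python
# Minimal regression-based transfer-time prediction in the spirit of AdRM.
# The actual AdRM model is not specified in the abstract; we simply assume
# transfer_time ~ a * (file_size / measured_bandwidth) + b and fit a, b by
# least squares over hypothetical NWS-style measurements.
import numpy as np

# hypothetical history: (file size in MB, measured bandwidth in MB/s, observed seconds)
history = [
    (100.0, 2.0, 55.0),
    (200.0, 2.5, 84.0),
    (150.0, 1.8, 90.0),
    (300.0, 3.0, 105.0),
    ( 50.0, 2.2, 26.0),
]

x = np.array([size / bw for size, bw, _ in history])   # "ideal" size/bandwidth time
y = np.array([obs for _, _, obs in history])           # observed transfer time

a, b = np.polyfit(x, y, 1)   # fit y ≈ a * x + b

def predict_transfer_time(size_mb, current_bandwidth_mbps):
    """Predict transfer time from a fresh bandwidth measurement."""
    return a * (size_mb / current_bandwidth_mbps) + b

print("predicted time for a 250 MB file at 2.4 MB/s:",
      round(predict_transfer_time(250.0, 2.4), 1), "s")
```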
Citations: 99
The Diesel Combustion Collaboratory: Combustion Researchers Collaborating over the Internet
ACM/IEEE SC 1999 Conference (SC'99) Pub Date: 1999 DOI: 10.1145/331532.331596
C. Pancerella, L. Rahn, Christine L. Yang
The Diesel Combustion Collaboratory (DCC) is a pilot project to develop and deploy collaborative technologies to combustion researchers distributed throughout the DOE national laboratories, academia, and industry. The result is a problem-solving environment for combustion research. Researchers collaborate over the Internet using DCC tools, which include: a distributed execution management system for running combustion models on widely distributed computers, including supercomputers; web-accessible data archiving capabilities for sharing graphical experimental or modeling data; electronic notebooks and shared workspaces for facilitating collaboration; visualization of combustion data; and video-conferencing and data-conferencing among researchers at remote sites. Security is a key aspect of the collaborative tools. In many cases, we have integrated these tools to allow data, including large combustion data sets, to flow seamlessly, for example, from modeling tools to data archives. In this paper the authors describe the work of a larger collaborative effort to design, implement and deploy the DCC.
Citations: 29
Terascale Spectral Element Algorithms and Implementations
ACM/IEEE SC 1999 Conference (SC'99) Pub Date: 1999 DOI: 10.1145/331532.331599
H. Tufo, P. Fischer
We describe the development and implementation of an efficient spectral element code for multimillion gridpoint simulations of incompressible flows in general two- and three-dimensional domains. Key to this effort has been the development of scalable solvers for elliptic problems and a stabilization scheme that admits full use of the method's high-order accuracy. We review these and other recently developed algorithmic underpinnings that have resulted in good parallel and vector performance on a broad range of architectures and that, with sustained performance of 319 GFLOPS on 2048 nodes of the Intel ASCI-Red machine at Sandia, readies us for the multithousand node terascale computing systems now coming on line at the DOE labs.
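As a rough illustration of the "scalable solvers for elliptic problems" the abstract mentions, the sketch below runs a generic conjugate gradient iteration on a small 1D Poisson system. It is not the authors' spectral element solver; it only shows, under simplified assumptions, the flavor of Krylov iteration that iterative elliptic solvers of this kind are typically built around.

```python
# Generic illustration of an iterative elliptic solve: unpreconditioned
# conjugate gradients on a 1D finite-difference Poisson problem. This is not
# the paper's spectral element solver.
import numpy as np

n = 64
h = 1.0 / (n + 1)
# standard second-order finite-difference Laplacian (SPD), Dirichlet BCs
A = (np.diag(2.0 * np.ones(n)) - np.diag(np.ones(n - 1), 1)
     - np.diag(np.ones(n - 1), -1)) / h**2
b = np.ones(n)   # right-hand side f(x) = 1

def conjugate_gradient(A, b, tol=1e-10, max_iter=1000):
    x = np.zeros_like(b)
    r = b - A @ x
    p = r.copy()
    rs_old = r @ r
    for _ in range(max_iter):
        Ap = A @ p
        alpha = rs_old / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs_old) * p
        rs_old = rs_new
    return x

u = conjugate_gradient(A, b)
print("residual norm:", np.linalg.norm(b - A @ u))
```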
Citations: 137
Scheduling Constrained Dynamic Applications on Clusters
ACM/IEEE SC 1999 Conference (SC'99) Pub Date: 1999 DOI: 10.1145/331532.331578
K. Knobe, James M. Rehg, A. Chauhan, R. Nikhil, U. Ramachandran
There is an emerging class of computationally demanding multimedia applications involving vision, speech and interaction with the real world (e.g., CRL's Smart Kiosk). These applications are highly parallel and require low latencies for good performance. They are well-suited for implementation on clusters of SMPs, but they require efficient scheduling of application tasks. General purpose schedulers produce high latencies because they lack knowledge of the dependencies between tasks. Previous research in optimal scheduling has been limited to static problems. In contrast, our application is highly dynamic as the optimal schedule depends upon the behavior of the kiosk's customers. We observe that the dynamism of our application class is constrained, in that there are a small number of operating regimes which are determined by the state of the application. We present a framework for optimal scheduling of constrained dynamic applications. The results of an experimental comparison with a hand-tuned schedule are promising.
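The notion of constrained dynamism, a small number of operating regimes each with its own precomputed schedule, can be sketched as follows. This is not the paper's framework; the regimes, tasks, and node names are hypothetical, and regime detection is reduced to a single observable.

```python
# Minimal sketch of regime-based scheduling: schedules are computed offline,
# one per operating regime, and at run time we only select the schedule that
# matches the current application state. All names below are made up.
SCHEDULES = {
    "idle":       {"camera": ["node0"], "tracking": ["node0"], "speech": ["node0"]},
    "one_person": {"camera": ["node0"], "tracking": ["node1"], "speech": ["node2"]},
    "crowd":      {"camera": ["node0"], "tracking": ["node1", "node2"], "speech": ["node3"]},
}

def current_regime(num_people_detected):
    # the regime is determined by observable application state
    if num_people_detected == 0:
        return "idle"
    if num_people_detected == 1:
        return "one_person"
    return "crowd"

def dispatch(task, node):
    print(f"run {task} on {node}")

def schedule_frame(num_people_detected):
    regime = current_regime(num_people_detected)
    for task, nodes in SCHEDULES[regime].items():
        for node in nodes:
            dispatch(task, node)

schedule_frame(num_people_detected=3)   # selects the "crowd" schedule
```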
Citations: 14
H-RMC: A Hybrid Reliable Multicast Protocol for the Linux Kernel
ACM/IEEE SC 1999 Conference (SC'99) Pub Date: 1999 DOI: 10.1145/331532.331540
P. McKinley, R. Rao, R. F. Wright
This paper describes H-RMC, a reliable multicast protocol designed for implementation in the Linux kernel. H-RMC takes advantage of IP multicast and is primarily a NAK-based protocol. To accommodate low-loss environments, where feedback in the form of NAKs is scarce, H-RMC receivers return periodic update messages in the absence of other reverse traffic. H-RMC uses a combination of rate-based and window-based flow control. The sender maintains minimal information about each receiver so that buffered data is not released prematurely, and polls receivers in case it has not heard from them at the time of buffer release. Combined, these techniques produce a reliable multicast data stream with a relatively low rate of feedback. Performance results show that adequate kernel buffer space, combined with a two-stage rate control method and polling, are effective in minimizing feedback from receivers and thereby in maintaining reasonable throughputs.
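The NAK-based core of such a protocol, detecting sequence-number gaps and requesting only the missing packets rather than acknowledging every one, can be illustrated with a toy receiver. This is not the H-RMC wire format or its kernel implementation; the class and its behavior are a simplified, hypothetical sketch of loss detection only.

```python
# Toy NAK-based loss detection: the receiver tracks the next expected
# sequence number, buffers out-of-order packets, and reports only the
# missing ones. Not the H-RMC protocol itself.
class NakReceiver:
    def __init__(self):
        self.next_expected = 0      # next in-order sequence number
        self.buffered = {}          # out-of-order packets held until the gap fills

    def on_packet(self, seq, payload):
        """Handle one multicast data packet; return the sequence numbers to NAK."""
        naks = []
        if seq == self.next_expected:
            self.deliver(seq, payload)
            self.next_expected += 1
            # drain any buffered packets that are now in order
            while self.next_expected in self.buffered:
                self.deliver(self.next_expected, self.buffered.pop(self.next_expected))
                self.next_expected += 1
        elif seq > self.next_expected:
            # gap detected: request retransmission of everything missing
            naks = list(range(self.next_expected, seq))
            self.buffered[seq] = payload
        # seq < next_expected is a duplicate and is silently dropped
        return naks

    def deliver(self, seq, payload):
        print(f"deliver #{seq}: {payload}")

rx = NakReceiver()
print("NAKs:", rx.on_packet(0, "a"))    # in order -> no NAKs
print("NAKs:", rx.on_packet(3, "d"))    # packets 1, 2 lost -> NAK [1, 2]
print("NAKs:", rx.on_packet(1, "b"))
print("NAKs:", rx.on_packet(2, "c"))    # gap filled, 2 and 3 delivered in order
```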
Citations: 11
DeepView: A Channel for Distributed Microscopy and Informatics
ACM/IEEE SC 1999 Conference (SC'99) Pub Date: 1999 DOI: 10.1145/331532.331597
B. Parvin, John R. Taylor, G. Cong, M. O'Keefe, M. Barcellos-Hoff
This paper outlines the requirements, architecture, and design of a "Microscopy Channel" over the wide area network. A microscopy channel advertises a listing of available online microscopes, where users can seamlessly participate in an experiment, acquire expert opinions, collect and process data, and store this information in their electronic notebook. The proposed channel is a collaborative problem solving environment (CPSE) that allows for both synchronous and asynchronous collaboration. Our testbed includes several unique electron and optical microscopes with applications ranging from material science to cell biology. We have studied current commercial CORBA services and concluded that three basic services are needed to meet the extensibility and functionality constraints. These include: Instrument Services (IS), Exchange Services (ES), and Computational Services (CS). These services sit on top of CORBA and its enabling services (naming, trading, security, and notification). IS provide a layer of abstraction for controlling any type of microscope. ES provide a common set of utilities for information management and transaction. CS provide the analytical capabilities needed for online microscopy and PSE.
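The "layer of abstraction for controlling any type of microscope" that the Instrument Services provide is easiest to picture as a common interface with instrument-specific back ends. The real DeepView services are defined as CORBA interfaces; the Python rendering below, with made-up method names, only illustrates the abstraction-layer idea, not the actual IDL.

```python
# Hypothetical sketch of an instrument-service abstraction: clients program
# against a common Instrument interface, never against a concrete microscope.
# Method names and implementations are invented for illustration.
from abc import ABC, abstractmethod

class Instrument(ABC):
    """Common control interface, independent of the underlying microscope."""

    @abstractmethod
    def move_stage(self, x: float, y: float) -> None: ...

    @abstractmethod
    def acquire_image(self) -> bytes: ...

class OpticalMicroscope(Instrument):
    def move_stage(self, x, y):
        print(f"optical: moving stage to ({x}, {y})")

    def acquire_image(self):
        print("optical: acquiring frame from CCD")
        return b"\x00" * 16          # placeholder image data

class ElectronMicroscope(Instrument):
    def move_stage(self, x, y):
        print(f"electron: moving stage to ({x}, {y})")

    def acquire_image(self):
        print("electron: acquiring scan")
        return b"\xff" * 16

def run_experiment(scope: Instrument):
    # the experiment logic is the same regardless of the microscope type
    scope.move_stage(1.0, 2.0)
    return scope.acquire_image()

for scope in (OpticalMicroscope(), ElectronMicroscope()):
    run_experiment(scope)
```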
Citations: 15
Compiler-supported simulation of highly scalable parallel applications
ACM/IEEE SC 1999 Conference (SC'99) Pub Date: 1999 DOI: 10.1145/331532.331533
Vikram S. Adve, R. Bagrodia, E. Deelman, T. Phan, R. Sakellariou
In this paper, we propose and evaluate practical, automatic techniques that exploit compiler analysis to facilitate simulation of very large message-passing systems. We use a compiler-synthesized static task graph model to identify the control-flow and the subset of the computations that determine the parallelism, communication and synchronization of the code, and to generate symbolic estimates of sequential task execution times. This information allows us to avoid executing or simulating large portions of the computational code during the simulation. We have used these techniques to integrate the MPI-Sim parallel simulator at UCLA with the Rice dHPF compiler infrastructure. The integrated system can simulate unmodified High Performance Fortran (HPF) programs compiled to the Message-Passing Interface standard (MPI) by the dHPF compiler, and we expect to simulate MPI programs as well. We evaluate the accuracy and benefits of these techniques for three standard benchmarks on a wide range of problem and system sizes. Our results show that the optimized simulator has errors of less than 17% compared with direct program measurement in all the cases we studied, and typically much smaller errors. Furthermore, it requires factors of 5 to 2000 less memory and up to a factor of 10 less time to execute than the original simulator. These dramatic savings allow us to simulate systems and problem sizes 10 to 100 times larger than is possible with the original simulator.
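The central idea, advancing per-process virtual clocks by compiler-estimated task times instead of executing the computational code, and only modeling the communication events, can be shown in miniature. The event lists, cost estimates, and latency below are hypothetical and far simpler than the MPI-Sim/dHPF machinery.

```python
# Toy task-graph simulation: compute tasks are skipped and replaced by their
# estimated durations; sends and receives are matched to order the virtual
# clocks. All programs and numbers are made up for illustration.
COMM_LATENCY = 0.005   # assumed per-message network cost (seconds)

# Each process is a list of events from a (hypothetical) static task graph:
#   ("compute", estimated_seconds) | ("send", dest) | ("recv", src)
programs = {
    0: [("compute", 1.2), ("send", 1), ("compute", 0.3), ("recv", 1)],
    1: [("compute", 0.4), ("recv", 0), ("compute", 2.0), ("send", 0)],
}

def simulate(programs):
    clocks = {p: 0.0 for p in programs}
    msg_ready = {}                              # (src, dst) -> message arrival time
    pending = {p: list(evts) for p, evts in programs.items()}
    progress = True
    while progress:
        progress = False
        for p, evts in pending.items():
            while evts:
                kind, arg = evts[0]
                if kind == "compute":
                    clocks[p] += arg            # skip the real work, add its estimate
                elif kind == "send":
                    clocks[p] += COMM_LATENCY
                    msg_ready[(p, arg)] = clocks[p]
                elif kind == "recv":
                    if (arg, p) not in msg_ready:
                        break                   # block until the matching send is simulated
                    clocks[p] = max(clocks[p], msg_ready.pop((arg, p)))
                evts.pop(0)
                progress = True
    return clocks

print(simulate(programs))   # predicted completion time of each simulated process
```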
Citations: 21
Running EveryWare on the Computational Grid
ACM/IEEE SC 1999 Conference (SC'99) Pub Date: 1999 DOI: 10.1145/331532.331538
R. Wolski, J. Brevik, C. Krintz, Graziano Obertelli, N. Spring, Alan Su
The Computational Grid [10] has recently been proposed for the implementation of high-performance applications using widely dispersed computational resources. The goal of a Computational Grid is to aggregate ensembles of shared, heterogeneous, and distributed resources (potentially controlled by separate organizations) to provide computational "power" to an application program. In this paper, we provide a toolkit for the development of Grid applications. The toolkit, called EveryWare, enables an application to draw computational power transparently from the Grid. The toolkit consists of a portable set of processes and libraries that can be incorporated into an application so that a wide variety of dynamically changing distributed infrastructures and resources can be used together to achieve supercomputer-like performance. We provide our experiences gained while building the EveryWare toolkit prototype and the first true Grid application.
Citations: 41
A Personal Supercomputer for Climate Research
ACM/IEEE SC 1999 Conference (SC'99) Pub Date: 1999 DOI: 10.1145/331532.331591
J. Hoe, C. Hill, A. Adcroft
We describe and analyze the performance of a cluster of personal computers dedicated to coupled climate simulations. This climate modeling system performs comparably to state-of-the-art supercomputers and yet is affordable by individual research groups, thus enabling more spontaneous application of high-end numerical models to climate science. The cluster's novelty centers around the Arctic Switch Fabric and the StarT-X network interface, a system-area interconnect substrate developed at MIT. A significant fraction of the interconnect's hardware performance is made available to our climate model through an application-specific communication library. In addition to reporting the overall application performance of our cluster, we develop an analytical performance model of our application. Based on this model, we define a metric, Potential Floating-Point Performance, which we use to quantify the role of high-speed interconnects in determining application performance. Our results show that a high-performance interconnect, in conjunction with a light-weight application-specific library, provides efficient support for our fine-grain parallel application on an otherwise general-purpose commodity system.
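The paper's analytical model and the precise definition of its Potential Floating-Point Performance metric are not reproduced in the abstract, so the sketch below only gestures at the same kind of reasoning under stated assumptions: compare the delivered FLOP rate under a simple compute-plus-communication cost model with what the same nodes could deliver over an ideal interconnect. Every number is made up.

```python
# Back-of-the-envelope compute-plus-communication model, in the spirit of the
# analytical model the abstract mentions. Not the paper's actual metric or
# numbers; all quantities below are invented for illustration.
flops_per_step   = 4.0e9      # floating-point work per model time step
compute_time     = 0.80       # seconds spent computing per step
bytes_exchanged  = 64e6       # halo-exchange volume per step (bytes)
bandwidth        = 100e6      # interconnect bandwidth (bytes/s)
per_msg_overhead = 0.002      # software + latency cost per message (seconds)
messages         = 24

comm_time = messages * per_msg_overhead + bytes_exchanged / bandwidth
achieved  = flops_per_step / (compute_time + comm_time)   # delivered FLOP/s
potential = flops_per_step / compute_time                 # ideal, zero-cost interconnect

print(f"communication time per step: {comm_time:.3f} s")
print(f"achieved:  {achieved / 1e9:.2f} GFLOP/s")
print(f"potential: {potential / 1e9:.2f} GFLOP/s")
print(f"fraction of potential realized: {achieved / potential:.2%}")
```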
Citations: 3