Proceedings of the IEEE/ACM SC95 Conference最新文献_第4页

The Emperor Has No Clothes: What HPC Users Need to Say and HPC Vendors Need to Hear 皇帝没有衣服:HPC用户需要说什么和HPC供应商需要听到什么

Proceedings of the IEEE/ACM SC95 Conference Pub Date : 1995-12-08 DOI: 10.1145/224170.224172

C. Pancake

引用次数: 6

Storm Watch: A Tool for Visualizing Memory System Protocols Storm Watch:一个可视化内存系统协议的工具

Proceedings of the IEEE/ACM SC95 Conference Pub Date : 1995-12-08 DOI: 10.1145/224170.224287

Trishul M. Chilimbi, T. Ball, S. Eick, J. Larus

{"title":"Storm Watch: A Tool for Visualizing Memory System Protocols","authors":"Trishul M. Chilimbi, T. Ball, S. Eick, J. Larus","doi":"10.1145/224170.224287","DOIUrl":"https://doi.org/10.1145/224170.224287","url":null,"abstract":"Recent research has offered programmers increased options for programming parallel computers by exposing system policies (e.g., memory coherence protocols) or by providing several programming paradigms (e.g. message passing and shared memory) on the same platform. Increased flexibility can lead to higher performance, but it is also a double-edged sword that demands a programmer understand his or her application and system at a more fundamental level. Our system, Tempest, allows a programmer to select or implement communication and memory coherence policies that fit an application's communication patterns. With it, we have achieved substantial performance gains without making major changes in programs. However, the process of selecting, designing, and implementing coherence protocols is difficult and time consuming, without tools to supply detailed information about an application's behavior and interaction with the memory system. StormWatch is a new visualization tool that aids a programmer through four mechanisms: tightly-coupled bidirectionally linked views, interactive filters, animation, and performance slicing. Multiple views present several aspects of program behavior simultaneously and show the same phenomenon from different perspectives. Real-time linking between views enables a programmer to explore levels of abstraction by changing a view and observing the effect on other views. Interactive filters, along with bidirectional linking, can isolate the effects of statements, loops, procedures, or files. StormWatch can also animate a program's dynamic behavior to show the evolution of program execution and communication. Finally, performance slicing captures causality among events. The examples in the paper illustrate how StormWatch helped us substantially improve the performance of two applications.","PeriodicalId":269909,"journal":{"name":"Proceedings of the IEEE/ACM SC95 Conference","volume":"255 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1995-12-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133231841","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 20

Wide-Area Gigabit Networking: Los Alamos HIPPI-SONET Gateway 广域千兆网络:洛斯阿拉莫斯hipi - sonet网关

Proceedings of the IEEE/ACM SC95 Conference Pub Date : 1995-12-08 DOI: 10.1145/224170.224313

W. S. John, D. DuBois

引用次数: 1

The Synergetic Effect of Compiler, Architecture, and Manual Optimizations on the Performance of CFD on Multiprocessors 编译器、体系结构和人工优化对多处理器CFD性能的协同效应

Proceedings of the IEEE/ACM SC95 Conference Pub Date : 1995-12-08 DOI: 10.1145/224170.224426

M. Kuba, C. Polychronopoulos, K. Gallivan

{"title":"The Synergetic Effect of Compiler, Architecture, and Manual Optimizations on the Performance of CFD on Multiprocessors","authors":"M. Kuba, C. Polychronopoulos, K. Gallivan","doi":"10.1145/224170.224426","DOIUrl":"https://doi.org/10.1145/224170.224426","url":null,"abstract":"This paper discusses the comprehensive performance profiling, improvement and benchmarking of a Computational Fluid Dynamics code, one of the Grand Challenge applications, on three popular multiprocessors. In the process of analyzing performance we considered language, compiler, architecture, and algorithmic changes and quantified each of them and their incremental contribution to bottom-line performance. We demonstrate that parallelization alone cannot result in significant gains if the granularity of parallel threads and the effect of parallelization on data locality are not taken into account. Unlike benchmarking studies that often focus on the performance or effectiveness of parallelizing compilers on specific loop kernels, we used the entire CFD code to measure the global effectiveness of compilers and parallel architectures. We probed the performance bottlenecks in each case and derived solutions which eliminate or neutralize the performance inhibiting factors. The major conclusion of our work is that overall performance is extremely sensitive to the synergetic effects of compiler optimizations, algorithmic and code tuning, and architectural idiosyncrasies.","PeriodicalId":269909,"journal":{"name":"Proceedings of the IEEE/ACM SC95 Conference","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1995-12-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114780606","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

A Case Study in Parallel Scientific Computing: The Boundary Element Method on a Distributed-Memory Multicomputer 并行科学计算的实例研究:边界元法在分布式存储多计算机上的应用

Proceedings of the IEEE/ACM SC95 Conference Pub Date : 1995-12-08 DOI: 10.1145/224170.224277

R. Natarajan, D. Krishnaswamy

引用次数: 16

Analysis of Multilevel Graph Partitioning 多层图划分分析

Proceedings of the IEEE/ACM SC95 Conference Pub Date : 1995-12-08 DOI: 10.1145/224170.224229

G. Karypis, Vipin Kumar

引用次数: 360

Efficient Algorithms for Atmospheric Correction of Remotely Sensed Data 遥感数据大气校正的有效算法

Proceedings of the IEEE/ACM SC95 Conference Pub Date : 1995-12-08 DOI: 10.1145/224170.224194

Hassan Fallah-Adl, J. JáJá, S. Liang, Y. Kaufman, J. Townshend

引用次数: 9

Mobile Robots Teach Machine-Level Programming 移动机器人教授机器级编程

Proceedings of the IEEE/ACM SC95 Conference Pub Date : 1995-12-08 DOI: 10.1145/224170.224205

P. Teller, T. Dunning

{"title":"Mobile Robots Teach Machine-Level Programming","authors":"P. Teller, T. Dunning","doi":"10.1145/224170.224205","DOIUrl":"https://doi.org/10.1145/224170.224205","url":null,"abstract":"We feel strongly that a contemporary introductory course in machine organization and assembly language should focus on the essentials of how computers execute programs, and not be distracted by the complications of the extraordinarily sophisticated microprocessors that are available today. These essentials should form a strong base of knowledge from which students can draw as they continue their education in computer science. Ideally these goals should be attained in an environment that fosters experimentation and cooperation, and with the aid of projects that generate interest and enthusiasm among the students. We have developed and are currently teaching a course at New Mexico State University that meets many of these goals. The course concentrates on a simple but relatively complete microprocessor architecture, that of the Motorola 68HC11 processor. Three different teaching techniques are used to encourage experimentation and team work: learning sessions, simulator labs, and microprocessor labs. New concepts are introduced in learning sessions, which combine traditional lecturing with student exploration. The understanding of these new concepts is strengthened through labs and assignments. Simulator labs and assignments, which require interaction with a simulator of the Motorola 68HC11 microprocessor, focus on the 68HC11's instruction set architecture. Microprocessor labs and assignments, which essentially are designing and building sessions, focus on the use of a 68HC11 microprocessor to control a motorized vehicle. During microprocessor labs students populate printed circuit cards, build motorized vehicles (or other roboticized exotica), and design and implement assembly language programs that provide communication between a personal computer and a 68HC11 processor, and a 68HC11 processor and a motorized vehicle. We have found that the costs of running this course are minimal and the results are very favorable in terms of student enthusiasm and achievement.","PeriodicalId":269909,"journal":{"name":"Proceedings of the IEEE/ACM SC95 Conference","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1995-12-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123556845","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 3

Controlling Application Grain Size on a Network of Workstations 在工作站网络上控制应用程序粒度

Proceedings of the IEEE/ACM SC95 Conference Pub Date : 1995-12-08 DOI: 10.1145/224170.224497

B. Siegell, P. Steenkiste

引用次数: 7

Architecture-Adaptable Finite Element Modelling: A Case Study Using an Ocean Circulation Simulation 建筑适应性有限元模型:使用海洋环流模拟的案例研究

Proceedings of the IEEE/ACM SC95 Conference Pub Date : 1995-12-08 DOI: 10.1145/224170.224501

S. Kumaran, Robert N. Miller, M. J. Quinn

引用次数: 3