The Sixth Distributed Memory Computing Conference, 1991. Proceedings最新文献_第9页

Performance and Assembly Language Programming of the iPSC/860 System iPSC/860系统的性能与汇编语言编程

The Sixth Distributed Memory Computing Conference, 1991. Proceedings Pub Date : 1991-04-28 DOI: 10.1109/DMCC.1991.633312

D. Scott, G. Withers

引用次数: 9

Characterizing the Balance of Parallel 1/0 Systems 描述并行1/0系统的平衡

The Sixth Distributed Memory Computing Conference, 1991. Proceedings Pub Date : 1991-04-28 DOI: 10.1109/DMCC.1991.633363

J. French

{"title":"Characterizing the Balance of Parallel 1/0 Systems","authors":"J. French","doi":"10.1109/DMCC.1991.633363","DOIUrl":"https://doi.org/10.1109/DMCC.1991.633363","url":null,"abstract":"High pzrformance 110 subsystems are a key element in parallel computer systems that intend to compete with traditional supercomputers in the solution of large scientific and engineering problems. The hardware and software organization and perfomiance of these U0 subsystems are fundamental issues in parallel computer systems that have not been explored in depth. Two commercially available parallel file systems are the Intel iPSC/2 Concurrent File System (CFS) and the NCUBE/ten NChannel board and disk farm. Both systcms are aimed at support of high volume, large block I/O of the type typical of large scientific computations. The evaluation of these systems has proved difficult. There are many parameters affecting performance and the system dynamics are quite complex. In this paper we examine a method of quantifying the balance of an 1/0 system, that is, how well it services I/O rcquests with respect to fairness and distribution of overheads. One may gauge the degree of balance in a systcrn by asking: When resources become saturated, is the bottleneck felt equally by each process or are some processes given preferential service? This paper explores a simple yardstick of system balance. 1. Quantifying and Measuring Parallel I/O Suppose that we have p processes reading (writing) a file of N bytes in parallel. Each process i reads (writes) N n bytes in time ti where n = -. The individual data P transfer rate of a particular processor i is given by ri = z. The average individual data transfer rate is ti given by 7 = 1 firi. P I = There are at least two reasonable measures of the aggregate data transfer rate of the p processors. In the first case, we sum the data rates of the individual processors. This gives rise to the quantity ri called the maximum sustained aggregate rate (ma-SAR). We call this I = 4 tThis research was supported in part by JPL Contract #957721 and by the Department of Energy under Grant DE-FG05-88ER25063. measure the “maximum” rate because, by construction, it assumes that each processor i contributes a rate ri and all processors contribute during the same time instant, however brief. From the definition of F above, we see that max-SAR = ri = p 7 . (1) I = f i This interpretation is illustrated in Figure l(a). In the second case, we consider that all N bytes move through the system in z = max ti time units. That is, the entire file is not transferred until the slowest processor finishes reading (writing) its partition of the file. This gives rise to the quantity called the minimum sustained aggregate rate (min-SAR). We call this a “minimum” rate since this is the rate that an outside observer will perceive as the rate at which the entire processor ensemble is operating. From the definitions above, we see that 1","PeriodicalId":313314,"journal":{"name":"The Sixth Distributed Memory Computing Conference, 1991. Proceedings","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1991-04-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115500873","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 4

Routing between Subcubes in a Hypercube 超多维数据集中的子数据集之间的路由

The Sixth Distributed Memory Computing Conference, 1991. Proceedings Pub Date : 1991-04-28 DOI: 10.1109/DMCC.1991.633149

S. Padmanabhan

引用次数: 3

Efficient Communication Primitives on Circuit-Switched Hypercubes 电路交换超立方体上的高效通信原语

The Sixth Distributed Memory Computing Conference, 1991. Proceedings Pub Date : 1991-04-28 DOI: 10.1109/DMCC.1991.633172

Ching-Tien Ho, M. Raghunath

{"title":"Efficient Communication Primitives on Circuit-Switched Hypercubes","authors":"Ching-Tien Ho, M. Raghunath","doi":"10.1109/DMCC.1991.633172","DOIUrl":"https://doi.org/10.1109/DMCC.1991.633172","url":null,"abstract":"We give practical algorithms, complexity analysis and implementation for all-to-all personalized communication and matrix transpose (with two-dimensional partitioning of the matrix) on hypercubes. We assume the following communication characteristics: circuitswitched e-cube routing and one-port communication model. For all-to-all personalized communication, we propose a hybrid algorithm that combines the well-known recursive doubling algorithm [22,12] and a direct-route algorithm [26,23]. Our hybrid algorithm balances between data transfer time and start-up time of these two algorithms, and its communication complexity is estimated to be better than the two previous algorithms for a range of machine parameters. For matrix transpose with two-dimensional partitioning of the matrix, our algorithm is measured to be better than the recursive transpose algorithm [8] by n nearest-neighbor communications [12]. Our algorithm takes advantage of circuit-switched routing and is congestion-free for a hypercube with e-cube routing. We also suggest a way of storing the matrix such that the transpose operation can take advantage of the routing of the machine.","PeriodicalId":313314,"journal":{"name":"The Sixth Distributed Memory Computing Conference, 1991. Proceedings","volume":"236 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1991-04-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121069900","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 9

Mapping Precedence-Constrained Simulation Tasks for a Parallel Environment 并行环境下映射优先约束仿真任务

The Sixth Distributed Memory Computing Conference, 1991. Proceedings Pub Date : 1991-04-28 DOI: 10.1109/DMCC.1991.633067

J. Sartor, G. Lamont, R. Hammell, T. Hartrum

{"title":"Mapping Precedence-Constrained Simulation Tasks for a Parallel Environment","authors":"J. Sartor, G. Lamont, R. Hammell, T. Hartrum","doi":"10.1109/DMCC.1991.633067","DOIUrl":"https://doi.org/10.1109/DMCC.1991.633067","url":null,"abstract":"The Mapping Problem Classical results on the deterministic precedence- constrained scheduling problem are almost exclusively concerned with a single iteration of the task system. This paper explores the problem of mapping deter- ministic tasks to processors in a parallel simulation environment, with each task iterating multiple times. Counterexamples are shown to demonstrate that mul- tiple passes through an optimal mapping for one iter- ation of a task system may produce less-than-optimal results when compared to mappings based on the it- erative nature of the simulation. A level strategy for assigning iterative tasks to processors is developed, and theoretical and experimental results are discussed for different mapping strategies in a VHDL simulation. This paper examines the classical multiprocessor scheduling problem for application to deterministic simulation systems. The tasks in these systems are characterized by iterative executions: each task exe- cutes more than once in the course of a simulation run. The general task scheduling problem and its relation- ship to the mapping problem for simulation tasks are introduced. The problem space is constrained, lim- iting the scope of the study to systems which map equal-execution time tasks into identical processors. A theoretical basis for the level strategy of iterative task assignment is summarized, and a polynomial- time algorithm based on this strategy is given. The results of hypercube experiments based on different mapping strategies are discussed with application to VHDL logic simulation.","PeriodicalId":313314,"journal":{"name":"The Sixth Distributed Memory Computing Conference, 1991. Proceedings","volume":"55 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1991-04-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133487266","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

ALDIMS - A Language for Programming Distributed Memory Multiprocessors 分布式内存多处理器编程语言

The Sixth Distributed Memory Computing Conference, 1991. Proceedings Pub Date : 1991-04-28 DOI: 10.1109/DMCC.1991.633132

K. G. Kumar, D. Kulkarni, A. Basu, A. Paulraj

引用次数: 0

Static Program Assignment in Circuit Switched Multiprocessors 电路开关多处理器中的静态程序分配

The Sixth Distributed Memory Computing Conference, 1991. Proceedings Pub Date : 1991-04-28 DOI: 10.1109/DMCC.1991.633136

J. Lindberg

引用次数: 1

A Parallel Approach To Solving a 3-D Finite Element Problem on a Distributed Memory MIMD Machine 分布式存储器MIMD机三维有限元问题的并行求解方法

The Sixth Distributed Memory Computing Conference, 1991. Proceedings Pub Date : 1991-04-28 DOI: 10.1109/DMCC.1991.633157

A. Amin, A. Chaudhary, P. Sadayappan

引用次数: 2

Multidimensional Spreadsheets in a Graphical Symbolic Debugger for the Ncube 多维电子表格中的图形符号调试器的Ncube

The Sixth Distributed Memory Computing Conference, 1991. Proceedings Pub Date : 1991-04-28 DOI: 10.1109/DMCC.1991.633125

A. Couch, D.W. Krumme

引用次数: 1

Processor-Time Tradeoffs for Cayley Graph Interconnection Networks Cayley图互连网络的处理器时间权衡

The Sixth Distributed Memory Computing Conference, 1991. Proceedings Pub Date : 1991-04-28 DOI: 10.1109/DMCC.1991.633348

Marc Baumslagt, A. Rosenberg

引用次数: 8