Conference on Hypercube Concurrent Computers and Applications: Latest Papers

Statistical gravitational lensing on the Mark III hypercube
Conference on Hypercube Concurrent Computers and Applications Pub Date : 1989-01-03 DOI: 10.1145/63047.63050
J. Apostolakis, C. Kochanek
Abstract: We describe a parallel algorithm for the nonlinear optics problem of gravitational lensing. The method is a “ray-tracing” method which studies the statistical properties of the image population associated with a gravitational lens. A parallel computer is needed because the spatial resolution requirements of the problem make the program too large to run on conventional machines. The program is implemented on the Mark III hypercube to take maximum advantage of this machine's 128 Mbytes of memory. The concurrent implementation uses a scattered domain decomposition and the CrOS III communications routines. The communications in the problem are so irregular that no completely satisfactory implementation was made in terms of the execution time of the program: the maximum speed-up relative to a sequential implementation is a factor of 4 on a 32 node machine. However, the goal of efficiently using all of the Mark III's memory was achieved, and the execution time was not the limiting factor in the problem. If the crystal router were used, the implementation would be much more efficient. Development of the program was terminated at this stage, however, because we were able to extract the physics of interest without the more sophisticated communications routines.
Citations: 1
Comparison of two-dimensional FFT methods on the hypercube
Conference on Hypercube Concurrent Computers and Applications Pub Date : 1989-01-03 DOI: 10.1145/63047.63099
C. Chu
Abstract: Complex two-dimensional FFTs up to size 256 x 256 points are implemented on the Intel iPSC/System 286 hypercube with emphasis on comparing the effects of data mapping, data transposition or communication needs, and the use of distributed FFTs. Two new implementations of the 2D-FFT include the Local-Distributed method, which performs local FFTs in one direction followed by distributed FFTs in the other direction, and a Vector-Radix implementation that is derived from decimating the DFT in two dimensions instead of one. In addition, the Transpose-Split method, involving local FFTs in both directions with an intervening matrix transposition, and the Block 2D-FFT, involving distributed FFT butterflies in both directions, are implemented and compared with the other two methods. Timing results show that on the Intel iPSC/System 286 there is hardly any difference between the methods, with the only differences arising from the efficiency or inefficiency of communication. Since the Intel cannot overlap communication and computation, the user is forced to buffer data. In some of the methods, this causes processor blocking during communication. Issues of vectorization, communication strategies, data storage and buffering requirements are investigated. A model is given that compares vectorization and communication complexity. While timing results show that the Transpose-Split method is in general slightly faster, our model shows that the Block method and Vector-Radix method have the potential to be faster if the communication difficulties were taken care of. Therefore, if communication could be “hidden” within computation, the latter two methods can become useful, with the Block method vectorizing the best and the Vector-Radix method having 25% fewer multiplications than row-column 2D-FFT methods. Finally, the Local-Distributed method is a good hybrid method requiring no transposing and can be useful in certain circumstances. This paper provides some general guidelines for evaluating parallel distributed 2D-FFT implementations and concludes that while different methods may be best suited for different systems, better implementation techniques as well as faster algorithms still perform better when communication becomes more efficient.
Citations: 20
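The Transpose-Split idea mentioned in the abstract above (one-dimensional FFTs along rows, a matrix transposition, then one-dimensional FFTs along rows again) can be illustrated with a minimal single-process sketch. This is not the paper's iPSC code: the function names `fft`, `transpose`, `fft2_transpose_split` and `dft2_direct` are hypothetical, the radix-2 recursion assumes power-of-two sizes, and on a real hypercube the transposition would be the interprocessor communication step.

```python
import cmath


def fft(x):
    """Recursive radix-2 FFT; assumes len(x) is a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def transpose(m):
    """Swap rows and columns (the communication step in a distributed run)."""
    return [list(col) for col in zip(*m)]


def fft2_transpose_split(grid):
    """2D FFT as: row FFTs, transpose, row FFTs, transpose back."""
    rows_done = [fft(row) for row in grid]
    cols_done = [fft(row) for row in transpose(rows_done)]
    return transpose(cols_done)


def dft2_direct(grid):
    """Direct 2D DFT from the definition, used only as a reference check."""
    rows, cols = len(grid), len(grid[0])
    out = [[0j] * cols for _ in range(rows)]
    for u in range(rows):
        for v in range(cols):
            out[u][v] = sum(
                grid[r][c] * cmath.exp(-2j * cmath.pi * (u * r / rows + v * c / cols))
                for r in range(rows)
                for c in range(cols)
            )
    return out
```

Because each row FFT touches only data held in one row, the arithmetic can stay local to a node and all data movement is concentrated in the two transpositions, which is consistent with the abstract's remark that the methods differ mainly in communication cost.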
A distributed hypercube file system
Conference on Hypercube Concurrent Computers and Applications Pub Date : 1989-01-03 DOI: 10.1145/63047.63093
R. Flynn, H. Hadimioglu
Abstract: For the hypercube, an autonomous physically interconnected file system is proposed. The resulting distributed file system consists of an I/O organization and a software interface. The system is loosely coupled architecturally, but from the operating system's point of view a tightly coupled system is formed in which interprocessor messages are handled differently from file accesses. A matrix multiplication algorithm is given to show how the distributed file system is utilized.
Citations: 7
Implementing the beam and warming method on the hypercube
Conference on Hypercube Concurrent Computers and Applications Pub Date : 1989-01-03 DOI: 10.1145/63047.63061
J. Bruno, P. Cappello
Abstract: Numerical simulation of a wide range of physical phenomena typically involves enormous amounts of computation and, for scores of practical problems, these simulations cannot be carried out even on today's fastest supercomputers. The economic and scientific importance of many of these problems is driving the explosive research in computer architecture, especially the work aimed at achieving ultra high-speed computation by exploiting concurrent processing. Correspondingly, there is great interest in the design and analysis of numerical algorithms which are suitable for implementation on concurrent processor systems.
In this paper we consider the implementation of the Beam and Warming implicit factored method on a hypercube concurrent processor system. We present a set of equations and give the numerical method in sufficient detail to illustrate and analyze the problems which arise in implementing this numerical method. We show that there are mappings of the computational domain onto the nodes of a hypercube concurrent processor system which maintain the efficiency of the numerical method. We also show that better methods do not exist.
Citations: 45
LU decomposition of banded matrices and the solution of linear systems on hypercubes
Conference on Hypercube Concurrent Computers and Applications Pub Date : 1989-01-03 DOI: 10.1145/63047.63124
D. Walker, T. Aldcroft, A. Cisneros, G. Fox, W. Furmanski
Abstract: We describe the solution of linear systems of equations, Ax = b, on distributed-memory concurrent computers whose interconnect topology contains a two-dimensional mesh. A is assumed to be an M×M banded matrix. The problem is generalized to the case in which there are nb distinct right-hand sides, b, and can thus be expressed as AX = B, where X and B are both M×nb matrices. The solution is obtained by the LU decomposition method, which proceeds in three stages: (1) LU decomposition of the matrix A, (2) forward reduction, (3) back substitution. Since the matrix A is banded, a simple rectangular subblock decomposition of the matrices A, X, and B over the nodes of the ensemble results in excessive load imbalance. A scattered decomposition is therefore used to decompose the data. The sequential and concurrent algorithms are described in detail, and models of the performance of the concurrent algorithm are presented for each of the three stages of the algorithm. In order to ensure numerical stability, the algorithm is extended to include partial pivoting. Performance models for the pivoting case are also given. Results from a 128-node Caltech/JPL Mark II hypercube are presented, and the performance models are found to be in good agreement with these data. Indexing overhead was found to contribute significantly to the total concurrent overhead.
Citations: 18
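The load-imbalance argument in the abstract above can be made concrete with a small counting model. The sketch below is a simplification under stated assumptions, not the paper's performance model: it distributes whole rows over a one-dimensional set of processors (the paper uses a two-dimensional mesh), ignores pivoting and communication, and charges each elimination step the row count of the busiest processor. The names `parallel_steps`, `block_time` and `scattered_time`, and the sizes `M`, `W`, `P`, are hypothetical.

```python
def parallel_steps(n_rows, half_bandwidth, n_procs, owner):
    """Estimate parallel cost of banded elimination.

    At step k only rows k+1 .. k+half_bandwidth are updated. Each step costs
    as much as the busiest processor's share of those rows.
    """
    total = 0
    for k in range(n_rows - 1):
        last = min(k + half_bandwidth, n_rows - 1)
        load = [0] * n_procs
        for row in range(k + 1, last + 1):
            load[owner(row)] += 1
        total += max(load)
    return total


M, W, P = 64, 8, 4  # matrix order, half bandwidth, processor count (illustrative)

# Block decomposition: processor p owns a contiguous slab of M // P rows.
block_time = parallel_steps(M, W, P, lambda row: row // (M // P))

# Scattered decomposition: rows are dealt out cyclically, row -> row mod P.
scattered_time = parallel_steps(M, W, P, lambda row: row % P)

print("block:", block_time, "scattered:", scattered_time)
```

With a block layout the narrow band of active rows usually falls inside one processor's slab, so the other processors idle; the cyclic layout spreads the same rows over all processors at every step.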
Implemention of a divide and conquer cyclic reduction algorithm on the FPS T-20 hypercube
Conference on Hypercube Concurrent Computers and Applications Pub Date : 1989-01-03 DOI: 10.1145/63047.63111
C. Cox
Abstract: A simple variant of the odd-even cyclic reduction algorithm for solving tridiagonal linear systems is presented. The target architecture for this scheme is a parallel computer with nodes which are vector processors, such as the Floating Point Systems T-Series hypercube. Of particular interest is the case where the number of equations is much larger than the number of processors. The matrix system is partitioned into local subsystems, with the partitioning governed by a parameter which determines the amount of redundancy in computations. After the distribution of local systems, the algorithm proceeds with independent computations, an all-to-all broadcast of a small number of equations from each processor, solution of this subsystem, more independent computations, and output of the solution. Some redundancy in calculations between neighboring processors results in minimized communication costs. One feature of this approach is that computations are well balanced, as each processor executes an identical algebraic routine.
A brief description of the standard cyclic reduction algorithm is given. Then the divide and conquer strategy is presented along with some estimates of speedup and efficiency. Finally, an Occam program for this algorithm which runs on the FPS T-20 computer is discussed along with experimental results.
Citations: 3
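For orientation, the standard odd-even cyclic reduction that the abstract above takes as its starting point can be sketched sequentially as follows. This is the textbook scheme, not the paper's divide and conquer variant or its Occam program. The function name `cyclic_reduction` is hypothetical, and the sketch assumes a system (for example a diagonally dominant one) in which no diagonal entry becomes zero during reduction. The convention is `a[i]*x[i-1] + b[i]*x[i] + c[i]*x[i+1] = d[i]` with `a[0] = 0` and `c[-1] = 0`.

```python
def cyclic_reduction(a, b, c, d):
    """Solve a tridiagonal system by recursive odd-even cyclic reduction."""
    n = len(b)
    if n == 1:
        return [d[0] / b[0]]

    # Eliminate the even-indexed unknowns, leaving a tridiagonal system
    # of about half the size in the odd-indexed unknowns.
    ra, rb, rc, rd = [], [], [], []
    for i in range(1, n, 2):
        has_next = i + 1 < n
        alpha = a[i] / b[i - 1]
        gamma = c[i] / b[i + 1] if has_next else 0.0
        ra.append(-alpha * a[i - 1])
        rb.append(b[i] - alpha * c[i - 1] - (gamma * a[i + 1] if has_next else 0.0))
        rc.append(-gamma * c[i + 1] if has_next else 0.0)
        rd.append(d[i] - alpha * d[i - 1] - (gamma * d[i + 1] if has_next else 0.0))

    x = [0.0] * n
    x[1::2] = cyclic_reduction(ra, rb, rc, rd)

    # Back-substitute: each even unknown depends only on its odd neighbours.
    for i in range(0, n, 2):
        left = a[i] * x[i - 1] if i > 0 else 0.0
        right = c[i] * x[i + 1] if i + 1 < n else 0.0
        x[i] = (d[i] - left - right) / b[i]
    return x
```

Within one reduction level every odd-indexed equation is updated independently of the others, which is the property that makes the method attractive for vector and parallel nodes.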
Gauss-Jordan inversion with pivoting on the Caltech Mark II hypercube
Conference on Hypercube Concurrent Computers and Applications Pub Date : 1989-01-03 DOI: 10.1145/63047.63123
P. Hipes, A. Kuppermann
Abstract: The performance of a parallel Gauss-Jordan matrix inversion [1,2] algorithm on the Mark II hypercube [3] at Caltech is discussed. We will show that parallel Gauss-Jordan inversion is superior to parallel Gaussian elimination for inversion, and discuss the reasons for this. Empirical and theoretical efficiencies for parallel Gauss-Jordan inversion as a function of matrix dimension for different numbers and configurations of processors are presented. The theoretical efficiencies are in quantitative agreement with the empirical efficiencies.
Citations: 20
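As a point of reference for the abstract above, a sequential Gauss-Jordan inversion can be sketched in a few lines. This is not the paper's parallel algorithm. The abstract says only "with pivoting", so the use of partial (row) pivoting here is an assumption, as are the function name `gauss_jordan_inverse` and the singularity tolerance.

```python
def gauss_jordan_inverse(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with row pivoting."""
    n = len(matrix)
    # Work on the augmented matrix [A | I]; the right half becomes A^-1.
    aug = [
        [float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
        for i, row in enumerate(matrix)
    ]
    for col in range(n):
        # Choose the row with the largest entry in this column as the pivot.
        pivot_row = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot_row][col]) < 1e-12:  # tolerance is an assumption
            raise ValueError("matrix is singular to working precision")
        aug[col], aug[pivot_row] = aug[pivot_row], aug[col]

        # Scale the pivot row so the pivot becomes 1.
        pivot = aug[col][col]
        aug[col] = [v / pivot for v in aug[col]]

        # Clear this column in every other row, above and below the pivot.
        for r in range(n):
            if r != col and aug[r][col] != 0.0:
                factor = aug[r][col]
                aug[r] = [rv - factor * pv for rv, pv in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]
```

Unlike Gaussian elimination, every row is updated at every step, so the amount of work per step does not shrink as the elimination proceeds. That is a plausible reason for the good parallel efficiency reported, though the abstract itself does not spell out the reasons.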
Chess on a hypercube
Conference on Hypercube Concurrent Computers and Applications Pub Date : 1988-04-11 DOI: 10.1145/63047.63088
E. Felten, S. Otto
Abstract: We report our progress on computer chess last described at the Second Conference on Hypercubes. Our program follows the strategy of currently successful sequential chess programs: searching of an alpha-beta pruned game tree, iterative deepening, transposition and history tables, specialized endgame evaluators, and so on. The search tree is decomposed onto the hypercube (an NCUBE) using a recursive version of the principal-variation-splitting algorithm. Roughly speaking, subtrees are searched by teams of processors in a self-scheduled manner.
A crucial feature of the program is the global hashtable. Hashtables are important in the sequential case, but are even more central for a parallel chess algorithm. The table not only stores knowledge but also makes the decision at each node of the chess tree whether to stay sequential or to split up the work in parallel. In the language of Knuth and Moore, the transposition table decides whether each node of the chess tree is a type 2 or a type 3 node and acts accordingly. For this data structure the hypercube is used as a shared-memory machine. Multiple writes to the same location are resolved using a priority system which decides which entry is of more value to the program. The hashtable is implemented as “smart” shared memory.
Search times for related subtrees vary widely (up to a factor of 100), so dynamic reconfiguration of processors is necessary to concentrate on such “hot spots” in the tree. A first version of the program with dynamic load balancing has recently been completed and out-performs the non-load-balancing program by a factor of three. The current speedup of the program is 101 out of a possible 256 processors.
The program has played in several tournaments, facing both computers and people. Most recently it scored 2-2 in the ACM North American Computer Chess Championship.
Citations: 36
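The sequential building block named in the abstract above, alpha-beta pruning of a game tree, can be sketched on a toy tree. This illustrates only the pruning rule; it does not model the paper's principal-variation splitting, hashtable or load balancing. The tree encoding (nested lists with integer leaf scores) and the names `minimax`, `alphabeta` and `counter` are hypothetical.

```python
import math


def minimax(node, maximizing, counter):
    """Plain minimax; counter[0] counts the leaves evaluated."""
    if isinstance(node, int):
        counter[0] += 1
        return node
    values = [minimax(child, not maximizing, counter) for child in node]
    return max(values) if maximizing else min(values)


def alphabeta(node, alpha, beta, maximizing, counter):
    """Minimax with alpha-beta pruning; returns the same value as minimax."""
    if isinstance(node, int):
        counter[0] += 1
        return node
    if maximizing:
        best = -math.inf
        for child in node:
            best = max(best, alphabeta(child, alpha, beta, False, counter))
            alpha = max(alpha, best)
            if alpha >= beta:  # the opponent will never allow this line
                break
        return best
    best = math.inf
    for child in node:
        best = min(best, alphabeta(child, alpha, beta, True, counter))
        beta = min(beta, best)
        if alpha >= beta:  # we already have a better option elsewhere
            break
    return best


tree = [[3, 5], [2, 9], [0, 7]]  # root maximizes, its children minimize
full, pruned = [0], [0]
value_full = minimax(tree, True, full)
value_pruned = alphabeta(tree, -math.inf, math.inf, True, pruned)
print(value_full, value_pruned, full[0], pruned[0])
```

The bounds `alpha` and `beta` are tightened as siblings are searched in order, and that sequential dependence is what makes the search difficult to split across processors without wasted work.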
Rapid prototyping of a parallel operating system for a generalized hypercube
Conference on Hypercube Concurrent Computers and Applications Pub Date : 1900-01-01 DOI: 10.1145/62297.62339
E. Gehringer, Brian D. Harry
Abstract: B-HIVE is an experimental multiprocessor system under construction at North Carolina State University. Its operating system is derived from XINU, an operating system designed for teaching purposes. XINU was chosen because it is unusually well documented and supplied most of the features that were necessary at the outset of the project. Among the few changes made to XINU are a supervisor state and an interprocessor communication system.
Citations: 0
A dynamic load balancer on the Intel hypercube
Conference on Hypercube Concurrent Computers and Applications Pub Date : 1900-01-01 DOI: 10.1145/62297.62328
J. Koller
Abstract: A class of commonly encountered problems requires dynamic load balancing for efficient use of concurrent processors. We are developing a test bed for dynamic load balancing studies, and have chosen the MOOSE operating system and the Intel iPSC as our environment. We discuss these choices, and how we are implementing a general purpose dynamic load balancer.
Citations: 17