{"title":"Hues in control? (massively parallel computers)","authors":"D. Schaefer, R. Portee","doi":"10.1109/FMPC.1992.234938","DOIUrl":"https://doi.org/10.1109/FMPC.1992.234938","url":null,"abstract":"A methodology for the description and analysis of massively parallel computers is presented. Massively parallel structures are modeled with a data path graph, a precedence graph, and a control structure. The control structure, specified with colored Petri nets, employs nomenclature that provides the concise representation of thousands of Petri places and transitions.","PeriodicalId":117789,"journal":{"name":"[Proceedings 1992] The Fourth Symposium on the Frontiers of Massively Parallel Computation","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1992-10-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115139603","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Efficient algorithms for locating a core of a tree network with a specified length","authors":"S. Peng, W. Lo","doi":"10.1109/FMPC.1992.234904","DOIUrl":"https://doi.org/10.1109/FMPC.1992.234904","url":null,"abstract":"The authors present efficient algorithms for finding a core of a tree with a specified length for both sequential and parallel computational models. The algorithms can be readily extended to a tree network in which arcs have nonnegative integer lengths. The authors also present a parallel version of the algorithm on an EREW PRAM (parallel random access machine) model. The results presented might provide a basis for the study of other facility shapes such as trees and forests of fixed sizes.","PeriodicalId":117789,"journal":{"name":"[Proceedings 1992] The Fourth Symposium on the Frontiers of Massively Parallel Computation","volume":"158 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1992-10-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133399409","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A scalable multicast service for mesh networks","authors":"H. Xu, P. McKinley, L.M. Ni","doi":"10.1109/FMPC.1992.234893","DOIUrl":"https://doi.org/10.1109/FMPC.1992.234893","url":null,"abstract":"The authors investigate the scalability of a multicast algorithm designed for wormhole-routed mesh networks. The algorithm, known as the U-mesh algorithm, is shown to scale well in four ways: with the dimension of the mesh, with the number of destinations, with the system size, and with the problem size. It is demonstrated that the only factor that affects the multicast latency is the number of destinations and that, for a given number of destinations, the number of time steps required to perform a multicast operation is minimal. Performance measurements of implementations on a 64-node nCUBE-2 and a 168-node Symult 2010 are given.","PeriodicalId":117789,"journal":{"name":"[Proceedings 1992] The Fourth Symposium on the Frontiers of Massively Parallel Computation","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1992-10-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116148685","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Staggered distribution: a loop allocation scheme for dataflow multiprocessor systems","authors":"J. T. Lim, A. Hurson, B. Lee, B. Shirazi","doi":"10.1109/FMPC.1992.234944","DOIUrl":"https://doi.org/10.1109/FMPC.1992.234944","url":null,"abstract":"The authors present a staggered distribution scheme for DOACROSS loops. The scheme uses heuristics to distribute the loop iterations unevenly among processors in order to mask the delay caused by data dependencies and inter-PE (processing element) communication. Simulation results have shown that this scheme is effective for loops that have a large degree of parallelism among iterations. The scheme, due to its nature, distributes loop iterations among PEs based on architectural characteristics of the underlying organization, i.e. processor speed and communication cost. The maximum speedup attained is very close to the maximum speedup possible for a particular loop even in the presence of inter-PE communication cost. This scheme utilizes processors more efficiently, since, relative to the equal distribution approach, it requires fewer processors to attain maximum speedup. Although this scheme produces an unbalanced distribution among processors, this can be remedied by considering other loops when making the distribution to produce a balanced load among processors.","PeriodicalId":117789,"journal":{"name":"[Proceedings 1992] The Fourth Symposium on the Frontiers of Massively Parallel Computation","volume":"126 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1992-10-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124503261","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Establishing an MPP guidepost","authors":"S. Nelson","doi":"10.1109/FMPC.1992.234878","DOIUrl":"https://doi.org/10.1109/FMPC.1992.234878","url":null,"abstract":"It is noted that no system has appeared which has been recognized by the high-performance computing community as the guidepost for massively parallel processing (MPP). The author describes what is necessary for a system to become such a guidepost. A comparison is made to the CRAY-1, a system which serves as a guidepost for vector computing and serves as a standard by which all types of high-performance computing systems are measured. It is claimed that establishing an MPP guidepost requires building a system that delivers on the promised potential of scalable parallel computing, by providing high sustained performance on a wide variety of production applications, with very little programming effort.","PeriodicalId":117789,"journal":{"name":"[Proceedings 1992] The Fourth Symposium on the Frontiers of Massively Parallel Computation","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1992-10-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114419029","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Optimal implementation of parallel divide-and-conquer algorithms on de Bruijn networks","authors":"Xiaoxiong Zhong, V. Lo, S. Rajopadhye","doi":"10.1109/FMPC.1992.234914","DOIUrl":"https://doi.org/10.1109/FMPC.1992.234914","url":null,"abstract":"Studies the problem of optimal implementation of parallel divide-and-conquer algorithms on binary de Bruijn networks. A divide-and-conquer algorithm is modeled as a temporal complete binary tree computation structure. An important contraction property between two successive binary de Bruijn networks is revealed. A twice-size complete binary tree is mapped to a de Bruijn network. Two nodes in the complete binary tree are mapped to a single node. The mapping is of dilation one, communication contention free and of good load balance.","PeriodicalId":117789,"journal":{"name":"[Proceedings 1992] The Fourth Symposium on the Frontiers of Massively Parallel Computation","volume":"69 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1992-10-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123852068","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Compiling Fortran 77D and 90D for MIMD distributed-memory machines","authors":"A. Choudhary, G. Fox, S. Ranka, S. Hiranandani, K. Kennedy, C. Koelbel, C. Tseng","doi":"10.1109/FMPC.1992.234911","DOIUrl":"https://doi.org/10.1109/FMPC.1992.234911","url":null,"abstract":"The authors present an integrated approach to compiling Fortran 77D and Fortran 90D programs for efficient execution on MIMD (multiple-instruction multiple-data) distributed-memory machines. The integrated Fortran D compiler relies on two key observations. First, array constructs may be scalarized into FORALL loops without loss of information. Second, loop fusion, partitioning, and sectioning optimizations are essential for both Fortran D dialects. A portable run-time library can also reduce the complexity and machine-dependence of the compiler. All optimizations except coarse-grain pipelining and data prefetching have been implemented in the current Fortran D compiler prototype.","PeriodicalId":117789,"journal":{"name":"[Proceedings 1992] The Fourth Symposium on the Frontiers of Massively Parallel Computation","volume":"125 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1992-10-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122375583","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Radiation-magnetohydrodynamics of plasmas on parallel supercomputers","authors":"O. Yasar, G. Moses, T. Tautges","doi":"10.1109/FMPC.1992.234915","DOIUrl":"https://doi.org/10.1109/FMPC.1992.234915","url":null,"abstract":"Presents a parallel computational model to simulate plasmas in the radiation-magnetohydrodynamics (R-MHD) framework. The solution of the radiation field usually dominates the R-MHD computation. The authors solve the linear Boltzmann equation for the radiation field intensity, using the deterministic S_N discrete ordinates method. Choosing an energy-domain decomposition, the authors have implemented the S_N method on a parallel processor, the Intel iPSC/860, and the speedups are very favorable. Increasing almost linearly with the number of processors, the speedup reaches 14 on 16 processors. A comparison of timing measurements between a single-processor CRAY Y-MP and a 16-processor iPSC/860 implementation strongly favors parallelism by a factor of 3.7.","PeriodicalId":117789,"journal":{"name":"[Proceedings 1992] The Fourth Symposium on the Frontiers of Massively Parallel Computation","volume":"41 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1992-10-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123101671","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Perspectives on massively parallel computation","authors":"H. Siegel, L. Valiant, P. Woodward, M. J. Flynn, L.M. Ni","doi":"10.1109/FMPC.1992.234875","DOIUrl":"https://doi.org/10.1109/FMPC.1992.234875","url":null,"abstract":"The areas of algorithms, applications, architectures, and system software are discussed with reference to massively parallel computation.","PeriodicalId":117789,"journal":{"name":"[Proceedings 1992] The Fourth Symposium on the Frontiers of Massively Parallel Computation","volume":"233 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1992-10-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114992491","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Embedding multilevel structures into massively parallel hypercubes-connection machine results for computer vision algorithms","authors":"Sotirios G. Ziavras","doi":"10.1109/FMPC.1992.234913","DOIUrl":"https://doi.org/10.1109/FMPC.1992.234913","url":null,"abstract":"Investigates the problem of embedding multilevel structures into hypercubes. The widely used pyramid belongs to the class of multilevel structures. Although several algorithms have been proposed for embedding pyramids into hypercubes, there do not exist algorithms for embedding general multilevel structures. For the special case of the pyramid, this research carries out a comparative analysis that involves four embedding algorithms. Results for a Connection Machine system CM-2 containing 16384 processors are presented, including the general case.","PeriodicalId":117789,"journal":{"name":"[Proceedings 1992] The Fourth Symposium on the Frontiers of Massively Parallel Computation","volume":"360 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1992-10-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122770144","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}