{"title":"MPPs, Amdahl's law, and comparing computers","authors":"M. Annaratone","doi":"10.1109/FMPC.1992.234879","DOIUrl":"https://doi.org/10.1109/FMPC.1992.234879","url":null,"abstract":"The author examines Amdahl's law in the context of parallel processing and provides some arguments as to what the applicability of this law really is. Amdahl's law establishes an upper bound on the available parallelism given the fraction of sequential code present in an application. In this paper, Amdahl's law is revisited to derive a formulation which allows one to carry out some quantitative analysis. The claim that MPPs (massively parallel processors) are special-purpose systems is also addressed.","PeriodicalId":117789,"journal":{"name":"[Proceedings 1992] The Fourth Symposium on the Frontiers of Massively Parallel Computation","volume":"163 11-12","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1992-10-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114037958","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Out of core dense solvers on Intel parallel supercomputers","authors":"D. Scott","doi":"10.1109/FMPC.1992.234876","DOIUrl":"https://doi.org/10.1109/FMPC.1992.234876","url":null,"abstract":"Certain engineering problems give rise to systems of linear equations AX=B, where A is a large, dense matrix. ProSolver-DES is an out-of-core dense solver for the Intel iPSC/860 system which was specifically designed for such problems. Recently Intel introduced an alternate slab-style solver which does full column pivoting. The performance of this solver is slightly less than that of ProSolver-DES, but still obtains well over 4 Gflops on a 128-node system. The Paragon-XP/S is Intel's next-generation parallel supercomputer. It provides a quantum leap in performance over the iPSC/860 in all design dimensions. At the same time, the programming model fully supports the message passing model of the iPSC/860, and programs written for the iPSC/860 will run unmodified on the Paragon system. The dense solvers will be available on Paragon and will run significantly faster.","PeriodicalId":117789,"journal":{"name":"[Proceedings 1992] The Fourth Symposium on the Frontiers of Massively Parallel Computation","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1992-10-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128177696","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Algorithm for scheduling independent jobs on partitionable hypercubes","authors":"B. Narahari, Ramesh Krishnamurti","doi":"10.1109/FMPC.1992.234928","DOIUrl":"https://doi.org/10.1109/FMPC.1992.234928","url":null,"abstract":"The authors consider the problem of nonpreemptive scheduling of w independent tasks on an n processor partitionable hypercube system. Each task can be executed on a subcube of any dimension, with a smaller execution time on larger subcubes. The schedule must determine the subcube to be allocated to each task, with the objective of minimizing the overall finishing time. The authors present a polynomial time approximation algorithm which generates a schedule whose finishing time is within twice that of the optimal schedule.","PeriodicalId":117789,"journal":{"name":"[Proceedings 1992] The Fourth Symposium on the Frontiers of Massively Parallel Computation","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1992-10-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125609969","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Optical interconnects for multiprocessors cost performance trade-offs","authors":"P. Lalwaney, L. Zenou, A. Ganz, I. Koren","doi":"10.1109/FMPC.1992.234948","DOIUrl":"https://doi.org/10.1109/FMPC.1992.234948","url":null,"abstract":"The authors demonstrate the performance advantages of wavelength division multiplexing (WDM) based optical interconnects in the face of partial structures dictated by the hardware restrictions of the currently available technology. Because the cost of optical communication hardware for WDM-star-based interconnects may be high, reduced cost structures have been introduced. The performance of the optical implementations of the reduced cost structures is compared to that of the electronic implementations for the hypercube topology. The performance is compared in terms of the communication overhead in implementing two commonly used algorithms on these structures. Results indicate that, in most situations, the optically implemented reduced cost variations perform better than the electronic implementations. Moreover, the hardware cost-performance tradeoffs show that among the optically implemented schemes, the performance degradation of the reduced cost variations is not significant in view of the hardware savings involved.","PeriodicalId":117789,"journal":{"name":"[Proceedings 1992] The Fourth Symposium on the Frontiers of Massively Parallel Computation","volume":"42 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1992-10-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114287904","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Performance prediction of message passing SIMD multiprocessor systems","authors":"S. Noh, A. Agrawala","doi":"10.1109/FMPC.1992.234922","DOIUrl":"https://doi.org/10.1109/FMPC.1992.234922","url":null,"abstract":"The paper focuses on two points: (1) the prediction of the execution signature of massively parallel applications prior to execution/implementation based on a more informative characterization of the workload, and (2) the definition of a more general form of speedup and efficiency. The systems considered are of SIMD message passing paradigm.","PeriodicalId":117789,"journal":{"name":"[Proceedings 1992] The Fourth Symposium on the Frontiers of Massively Parallel Computation","volume":"40 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1992-10-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132892254","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Massively parallel simulation of a class of discrete event systems","authors":"P. Vakili, Levent Mollamustafaoglu, Yu-Chi Ho","doi":"10.1109/FMPC.1992.234886","DOIUrl":"https://doi.org/10.1109/FMPC.1992.234886","url":null,"abstract":"The authors describe a novel approach for the simulation of discrete-event systems on massively parallel computers. In spite of considerable partial parallelism that exists in discrete-event systems, the simulation of a single discrete-event system is intrinsically asynchronous and highly data dependent, and its implementation on massively parallel SIMD (single-instruction multiple-data) computers is particularly difficult. The authors propose to simulate several such systems simultaneously, and in parallel. These are variants of a nominal system with different system parameter values or operating policies. A single clock mechanism is used that drives all the variants in parallel, synchronizes their trajectories, and is the basis of the SIMD implementation. The authors describe a general discrete-event simulator that is developed for implementation on the SIMD MasPar MP-1 computer, and they address the computational issues related to their approach in this context.","PeriodicalId":117789,"journal":{"name":"[Proceedings 1992] The Fourth Symposium on the Frontiers of Massively Parallel Computation","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1992-10-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128731358","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Parallel algorithms for all maximal equally-spaced collinear sets and all maximal regular lattices","authors":"L. Boxer, R. Miller","doi":"10.1109/FMPC.1992.234905","DOIUrl":"https://doi.org/10.1109/FMPC.1992.234905","url":null,"abstract":"The authors present parallel solutions to the AMESCS (all maximal equally-spaced collinear subset) and AMRSS (all maximal regularly-spaced subset) problems and show how their solutions to the latter generalize to the AMRSDLS (all maximal regularly-spaced D-dimensional lattice subsets) problem. Their algorithms differ significantly from the optimal sequential algorithms presented in A.B. Kahng and G. Robins (1991), which do not scale well to (massively) parallel machines. The optimality of the authors' Arbitrary CRCW PRAM (parallel random access machine) algorithms is open; however, the algorithms they present are within a logarithmic factor of optimal. Further, the algorithms are optimal for the mesh-connected computer.","PeriodicalId":117789,"journal":{"name":"[Proceedings 1992] The Fourth Symposium on the Frontiers of Massively Parallel Computation","volume":"247 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1992-10-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134179427","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Representations of Borel Cayley graphs","authors":"K. W. Tang, Bruce W. Arden","doi":"10.1109/FMPC.1992.234888","DOIUrl":"https://doi.org/10.1109/FMPC.1992.234888","url":null,"abstract":"It is shown that all degree-4 Borel Cayley graphs can also be represented by more restrictive chordal rings (CRs) through a constructive proof. All bidirectional, degree-4 Borel Cayley graphs have the more restrictive CR representations, and hence Hamiltonian cycles always exist for these graphs. A step-by-step algorithm to transform any degree-4 Borel Cayley graph into CR graphs is provided. Examples are used to illustrate this concept.","PeriodicalId":117789,"journal":{"name":"[Proceedings 1992] The Fourth Symposium on the Frontiers of Massively Parallel Computation","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1992-10-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121300371","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The concurrent execution of non-communicating programs on SIMD processors","authors":"P. Wilsey, D. Hensgen, N. Abu-Ghazaleh, C.E. Slusher, D.Y. Hollinden","doi":"10.1109/FMPC.1992.234908","DOIUrl":"https://doi.org/10.1109/FMPC.1992.234908","url":null,"abstract":"This paper explores the use of SIMD (single-instruction multiple-data) (or SIMD-like) hardware to support the efficient interpretation of concurrent, noncommunicating programs. This approach places compiled programs into the local memory space of each distinct processing element (PE). Within each PE, a local program contour is initialized, and the instructions are interpreted in parallel across all of the PEs by control signals emanating from the central control unit. Initial experiments have been conducted with two distinct software architectures (MINTABs and MIPS R2000) on the MasPar MP-1 and two distinct applications (program mutation analysis and Monte Carlo simulation). While these experiments have shown only marginal performance improvement, it appears that, with several minor hardware modifications, SIMD-like hardware can be constructed that will cost-effectively support both SIMD and MIMD (multiple-instruction multiple-data) processing.","PeriodicalId":117789,"journal":{"name":"[Proceedings 1992] The Fourth Symposium on the Frontiers of Massively Parallel Computation","volume":"524 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1992-10-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123063464","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Data-parallel visualisation using multi-dimensional transformations","authors":"G. Vezina, P. K. Robertson","doi":"10.1109/FMPC.1992.234954","DOIUrl":"https://doi.org/10.1109/FMPC.1992.234954","url":null,"abstract":"The authors show how a flexible resampling approach can be embedded within massively parallel implementations of multidimensional transformation algorithms based on one-dimensional resampling operations. They provide a consistent solution to the resampling requirements across visualization applications. Based on this framework, two applications are outlined: a surface perspective viewing algorithm with hidden-surface removal and a volume rendering algorithm. These algorithms include regular and irregular resampling requirements. The algorithm considered here is well-suited to data-parallel SIMD (single-instruction multiple-data) processing, and performance on surface and volume visualization is sufficient to achieve interactive manipulation on large SIMD arrays.","PeriodicalId":117789,"journal":{"name":"[Proceedings 1992] The Fourth Symposium on the Frontiers of Massively Parallel Computation","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1992-10-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130390687","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}